Monday, March 8, 2010

Object change tracking

Suppose you have to write audit-proof software that can easily be reset to a previous, fully operable state
for all objects managed by the application. In this post I describe how I solved that problem for my
assignment.

One reason for resetting a software system to a previous state is the ability to trace changes made at a
specific time by a known or unknown user. Features like this are usually implemented through a logging
component with a more or less fine resolution. Unfortunately, the captured information gives only a very
limited view of the context in which the changes were made. Writing all the required information into a single
log entry would be a quick but brute-force solution, and it would make the log unreadable to auditors.

A second aspect, and besides traceability the driving force in my project, is the ability to automatically
replay all changes made to objects between two defined timestamps and to provision these modifications to
attached subsystems. A simple logging mechanism cannot achieve this, because in addition to the data directly
involved in an action, transitively affected information in the context of these objects is required as well.
A more complex and more complete solution is therefore needed, one that is still manageable on all system
levels, from the abstract UI level down to the database level.

A very common approach to recording changes in datasets is to save delta information representing the
transition between two states, instead of creating a copy of the original state with the changes applied. In
my case, where I treat the whole object context as the dataset, the latter would bloat the whole database with
redundant data.

To avoid redundant information and an explosion in the number of managed data objects, I decided to save only
delta information covering this dataset.
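
To illustrate the idea of storing deltas instead of full copies, here is a minimal Python sketch. It assumes
an object state can be represented as a flat dictionary of attributes; the names make_delta and apply_delta are
made up for this example and are not taken from my project.

```python
_MISSING = object()  # sentinel so that a stored None still counts as a value


def make_delta(old, new):
    """Record only what differs between two states (hypothetical helper)."""
    changed = {k: v for k, v in new.items() if old.get(k, _MISSING) != v}
    removed = [k for k in old if k not in new]
    return {"changed": changed, "removed": removed}


def apply_delta(state, delta):
    """Rebuild the newer state from the older state plus the delta."""
    result = dict(state)
    result.update(delta["changed"])
    for key in delta["removed"]:
        result.pop(key, None)
    return result


old = {"name": "pump", "pressure": 4}
new = {"name": "pump", "pressure": 5, "unit": "bar"}
d = make_delta(old, new)
```

Only the changed and removed attributes end up in the delta, so the unchanged name attribute is not stored a
second time.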

In my project I have to manage a lot of objects that are linked to each other, and changing an attribute value
of a linked object can automatically change the semantics of the object that links to it. Since the changed
object could be linked by n other objects, a complete and stable way of saving the state transition would be to
store the changed object as well as all objects linking to it. This would not be as much data as writing a
whole copy of the object's context, but it still produces unnecessary redundant information.

I decided on a much simpler way to solve this problem. Every object, and each of its earlier versions (its
ancestors), can be identified at any time through the unique object id and a version stamp. All links between
objects are designed so that they reference the unique object id as well as the version stamp.
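
A rough Python sketch of such an identification scheme could look like this. The class names Ref and Record,
the in-memory store and the use of an integer as version stamp are assumptions made for the example; the real
system keeps this information in a database.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Ref:
    """Identifies exactly one version of one object (hypothetical type)."""
    object_id: str
    version: int


@dataclass
class Record:
    ref: Ref
    attrs: dict
    links: list = field(default_factory=list)  # list of Ref, each pinned to a version


store = {}  # assumed in-memory stand-in for the database: Ref -> Record


def put(record):
    store[record.ref] = record


def history(object_id):
    """All known versions of an object, oldest first."""
    refs = (ref for ref in store if ref.object_id == object_id)
    return sorted(refs, key=lambda ref: ref.version)


put(Record(Ref("valve", 1), {"size": 10}))
put(Record(Ref("valve", 2), {"size": 12}))
put(Record(Ref("pump", 1), {}, links=[Ref("valve", 1)]))
```

Because a link carries both the id and the version stamp, following the link of the pump always leads to the
valve exactly as it looked when the link was created.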

If someone changes an object, a new version is created, while all existing links keep pointing to the
version-controlled ancestor. Objects that are linked to the changed one in the future only see the most
current version. This leads to an overall consistent state.
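
The behaviour can be sketched in a few lines of Python. VersionStore and its methods are hypothetical names for
this example, version stamps are assumed to be simple counters starting at 1, and links are modelled as
(source id, target id, target version) tuples.

```python
class VersionStore:
    """Toy in-memory model of versioned objects and pinned links."""

    def __init__(self):
        self._versions = {}  # object id -> list of attribute dicts, oldest first
        self.links = []      # (source id, target id, target version)

    def create(self, object_id, **attrs):
        self._versions[object_id] = [dict(attrs)]
        return 1

    def latest(self, object_id):
        return len(self._versions[object_id])

    def get(self, object_id, version):
        return self._versions[object_id][version - 1]

    def change(self, object_id, **attrs):
        # A change never touches old versions; it appends a new one.
        current = self.get(object_id, self.latest(object_id))
        self._versions[object_id].append({**current, **attrs})
        return self.latest(object_id)

    def link(self, source_id, target_id):
        # A new link is pinned to the version that is current right now.
        pinned = (source_id, target_id, self.latest(target_id))
        self.links.append(pinned)
        return pinned


vs = VersionStore()
vs.create("valve", size=10)
old_link = vs.link("pump", "valve")   # pinned to version 1
vs.change("valve", size=12)           # creates version 2
new_link = vs.link("tank", "valve")   # pinned to version 2
```

The link from the pump still resolves to the valve with size 10, whereas the link created after the change
resolves to the new version.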

But this concept is too basic to cover all requirements. Sometimes changes to attribute values are so minor
that they have no influence on the link semantics at all. It must therefore be possible to mark a change so
that no new version is created and the most current one is modified instead.
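
One way to express such a marker is a flag on the change operation, as in the following Python sketch. The
function name apply_change and the minor parameter are assumptions for illustration only.

```python
def apply_change(versions, changes, minor=False):
    """Apply changes to a version list (oldest first); return the current version number.

    minor=True modifies the most current version in place,
    minor=False (the default) appends a new version.
    """
    if minor:
        versions[-1].update(changes)
    else:
        versions.append({**versions[-1], **changes})
    return len(versions)


versions = [{"label": "Vlave", "size": 10}]
after_typo_fix = apply_change(versions, {"label": "Valve"}, minor=True)
after_resize = apply_change(versions, {"size": 12})
```

Fixing the typo in the label leaves the version number untouched, so every link that points to version 1 sees
the corrected label, while the resize produces version 2.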

There is the other extreme as well. Sometimes changes made to an object have such an impact that they must be
propagated to all linking objects too. In this case all links to former versions are upgraded to the most
current one.
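
Upgrading the links could look roughly like this in Python, assuming links are stored as (source id, target id,
target version) tuples. The function upgrade_links is a hypothetical helper for this example, not the actual
implementation.

```python
def upgrade_links(links, target_id, latest_version):
    """Return the links with every reference to target_id re-pointed to latest_version."""
    return [
        (source, target, latest_version if target == target_id else version)
        for (source, target, version) in links
    ]


links = [("pump", "valve", 1), ("tank", "valve", 2), ("pump", "motor", 1)]
upgraded = upgrade_links(links, "valve", 3)
```

Links to other objects, such as the one from the pump to the motor, are left untouched.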
