Sunday, March 21, 2010

FetchType.EAGER - Cannot simultaneously fetch multiple bags

I just came across a quite interesting problem that gives a hint on how Hibernate works internally. Assume you have the following object association:


  • A references B

  • A has a one-to-many relationship with C and the fetch type is set to FetchType.EAGER

  • B has a one-to-many relationship with D and the fetch type is set to FetchType.EAGER



If you try to initialize Hibernate with such a layout, you will receive an error message saying something like "cannot simultaneously fetch multiple bags".

My mappings were okay until I added a test case for B, and the relationships of B looked correct at first sight, so I started to search for a possible reason. I came across different solutions, among them this one from Eyal Lupu. It did not directly solve my problem, but it let me analyze the code from a different point of view.

Eyal writes that it is not allowed to have two mapped lists in one entity marked with FetchType.EAGER, because the generated SQL statements would load the desired entity incorrectly. If you look at my example above, there is just one list per entity to be loaded eagerly, so why does it fail? The answer lies in the first statement of my example: A references B.

If one entity refers to another without an explicit fetch type, the default for such a to-one reference is FetchType.EAGER. Hibernate therefore joins the tables that contain A and B. Since the relationships A - C and B - D are also marked to be loaded eagerly, they are incorporated into the generated SQL statement as well. Guess what operation is used to compute the entities of these relationships: JOIN. So loading A ends up fetching two bags in one statement, and that leads us back to Eyal's post.
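To see why two eagerly joined collections are a problem, it helps to look at the rows such a join produces. The following is a plain Java simulation, not Hibernate code: the class JoinFanOut and its methods are made up for illustration, and it only assumes that A has two Cs and B has three Ds.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java simulation of the row set that a single JOIN over two
// collections produces. No Hibernate involved; all names are hypothetical.
class JoinFanOut {

    // One row of "A join B join C join D", reduced to the child identifiers.
    static final class Row {
        final String c;
        final String d;

        Row(String c, String d) {
            this.c = c;
            this.d = d;
        }
    }

    // Joining both collections in one statement pairs every C with every D.
    static List<Row> join(List<String> cs, List<String> ds) {
        List<Row> rows = new ArrayList<>();
        for (String c : cs) {
            for (String d : ds) {
                rows.add(new Row(c, d));
            }
        }
        return rows;
    }

    // Reads the C column row by row, the way an unindexed bag would be filled.
    static List<String> readBagOfC(List<Row> rows) {
        List<String> bag = new ArrayList<>();
        for (Row row : rows) {
            bag.add(row.c);
        }
        return bag;
    }

    public static void main(String[] args) {
        List<Row> rows = join(Arrays.asList("c1", "c2"), Arrays.asList("d1", "d2", "d3"));
        System.out.println("rows: " + rows.size());           // rows: 6
        System.out.println("C column: " + readBagOfC(rows));  // [c1, c1, c1, c2, c2, c2]
    }
}
```

A bag allows duplicates and has no index, so nothing in these six rows tells whether c1 is really contained three times or was merely repeated by the join. An index column removes exactly that ambiguity.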

Eyal suggests adding an @IndexColumn annotation, which replaces the bag semantics with list semantics and thereby gives Hibernate a hint on how to handle elements of that type. Another solution - and that's the one I chose - is to mark the connection between A and B as being fetched lazily, for example with fetch = FetchType.LAZY on the reference. In case you need the relationship between A and B resolved on entity loading, a third solution would be to mark one or both of the one-to-many relationships as being fetched lazily.

Apart from the FetchType.EAGER problem, it is a good idea to have some of the relationships mentioned in the example above marked as being loaded lazily. If all associated objects are loaded eagerly and a relationship represents, for example, a chain of ancestors, you end up with a large set of loaded objects although you need only one.

Monday, March 8, 2010

Object change tracking

Assume you have the task to write audit-proof software that can easily be reset to a previous, fully
operable state for all objects managed by the application. I am going to describe how I solved this
issue for my assignment.

One reason for resetting a software system to a previous state is the ability to trace back
changes made at a specific time by a known or unknown user. Usually features like this are implemented through a
logging component with a more or less high resolution. Unfortunately the information captured gives only a very
limited view of the context the changes were made in. Writing all required information into a single log entry
would be a quick but brute-force solution and would make the log unreadable for auditors.

A second aspect - and that's the driving force besides the traceability of changes in my project - is the
possibility to automatically replay all changes made to objects between two defined timestamps and to
provision these modifications into attached sub-systems. These tasks cannot be achieved through a simple
logging mechanism since, next to the data directly involved in certain actions, transitively affected information
within the context of these objects is required as well. Therefore a more complex and more complete solution
must be found which is still manageable on all system levels - from the abstract UI level down to the database level.

A very common approach for recording changes within datasets is to save delta information representing the
transition between two states instead of creating a full copy of the original state with the changes
applied. In my case, where I regard the whole object context as the dataset, the latter would redundantly
enlarge the whole database.

To avoid redundant information and an explosion of managed data objects, I decided to follow the idea of saving
only delta information covering the named dataset.
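A minimal sketch of what saving delta information could look like, assuming an object state is represented as a map from attribute names to values. The class DeltaSketch, the method names and the attribute names are hypothetical; this is not the code from my project. As a simplification it assumes attribute values are never null, because null is used in the delta to mark a removed attribute.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: an object state is a map of attribute name -> value.
class DeltaSketch {

    // Returns only the attributes whose value differs between the two states.
    // Assumption: values are never null; null in the delta marks a removal.
    static Map<String, Object> diff(Map<String, Object> before, Map<String, Object> after) {
        Map<String, Object> delta = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : after.entrySet()) {
            boolean isNew = !before.containsKey(e.getKey());
            if (isNew || !Objects.equals(before.get(e.getKey()), e.getValue())) {
                delta.put(e.getKey(), e.getValue());
            }
        }
        for (String key : before.keySet()) {
            if (!after.containsKey(key)) {
                delta.put(key, null);
            }
        }
        return delta;
    }

    // Rebuilds the later state from the earlier state plus the delta.
    static Map<String, Object> apply(Map<String, Object> before, Map<String, Object> delta) {
        Map<String, Object> result = new LinkedHashMap<>(before);
        for (Map.Entry<String, Object> e : delta.entrySet()) {
            if (e.getValue() == null) {
                result.remove(e.getKey());
            } else {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> before = new LinkedHashMap<>();
        before.put("name", "router-1");
        before.put("owner", "alice");

        Map<String, Object> after = new LinkedHashMap<>(before);
        after.put("owner", "bob");

        Map<String, Object> delta = diff(before, after);
        System.out.println(delta);                              // {owner=bob}
        System.out.println(apply(before, delta).equals(after)); // true
    }
}
```

Only the changed attribute is stored, and the later state can still be rebuilt from the earlier one.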

In my project I have to manage a lot of objects that are linked to each other, and changing an attribute
value of a linked object can automatically change the semantics of the object that links to it. Since the
changed object could be linked by n other objects, a complete and stable approach to saving the state
transition would be to store the changed object as well as all objects linking to it. This would not be as much
data as writing a whole copy of the object's context, but it still produces unnecessary redundant information.

I decided to follow a much simpler way to solve this problem. Every object and its ancestors, i.e. its
earlier versions, can be identified at any time through the unique object id and a version stamp. All
links between objects are designed such that they reference the unique object id as well as the version stamp.

If someone changes an object, a new version is created, while all existing links keep pointing to the
version-controlled ancestor. Other objects that are linked with the changed one in the future only see the most
current version. This leads to an overall consistent state.
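The idea can be sketched in a few lines, assuming an in-memory store and plain strings as object values. VersionedStore, Ref and the method names are hypothetical and only illustrate the principle; unknown ids are not handled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory sketch of links that reference id + version stamp.
class VersionedStore {

    // A link target: unique object id plus version stamp.
    static final class Ref {
        final String id;
        final int version;

        Ref(String id, int version) {
            this.id = id;
            this.version = version;
        }

        @Override
        public String toString() {
            return id + "@v" + version;
        }
    }

    // All versions per object id; list index + 1 is the version stamp.
    private final Map<String, List<String>> versions = new HashMap<>();

    // Stores a new version and returns a reference to it.
    Ref save(String id, String value) {
        List<String> history = versions.computeIfAbsent(id, k -> new ArrayList<>());
        history.add(value);
        return new Ref(id, history.size());
    }

    // Reference to the most current version, used when a new link is created.
    Ref latest(String id) {
        return new Ref(id, versions.get(id).size());
    }

    // Resolves a link to exactly the version it was created against.
    String resolve(Ref ref) {
        return versions.get(ref.id).get(ref.version - 1);
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        Ref oldLink = store.save("device-7", "owner=alice"); // link created against v1
        store.save("device-7", "owner=bob");                 // the change creates v2
        Ref newLink = store.latest("device-7");              // future links see v2

        System.out.println(oldLink + " -> " + store.resolve(oldLink)); // device-7@v1 -> owner=alice
        System.out.println(newLink + " -> " + store.resolve(newLink)); // device-7@v2 -> owner=bob
    }
}
```

The old link keeps resolving to the state it was created against, while every link created afterwards picks up the new version.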

But this concept is too basic to cover all requirements. Sometimes changes to object attribute values are
so minor that they have no influence on the link semantics at all. Therefore it must be possible
to mark changes in such a way that no new version is created and the most current one is modified instead.

The opposite extreme exists as well. Sometimes changes made to objects have such an impact that they
must be propagated to all linked objects too. In this case all links to former versions are upgraded to the most
current one.
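The three kinds of changes can be put side by side in a small sketch. The enum ChangeKind with its values MINOR, NORMAL and MAJOR, the class ChangePolicySketch and the in-memory bookkeeping of links are all assumptions made for illustration, not the actual design of my project.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of minor, normal and major changes to a versioned object.
class ChangePolicySketch {

    enum ChangeKind { MINOR, NORMAL, MAJOR }

    // A link held by some other object; it pins one version of the target.
    static final class Link {
        final String targetId;
        int targetVersion;

        Link(String targetId, int targetVersion) {
            this.targetId = targetId;
            this.targetVersion = targetVersion;
        }
    }

    // All versions per object id; list index + 1 is the version stamp.
    private final Map<String, List<String>> versions = new HashMap<>();
    private final List<Link> links = new ArrayList<>();

    void create(String id, String value) {
        List<String> history = new ArrayList<>();
        history.add(value);
        versions.put(id, history);
    }

    // New links always point to the most current version.
    Link linkToLatest(String id) {
        Link link = new Link(id, versions.get(id).size());
        links.add(link);
        return link;
    }

    void change(String id, String value, ChangeKind kind) {
        List<String> history = versions.get(id);
        if (kind == ChangeKind.MINOR) {
            // No new version: the most current one is modified in place.
            history.set(history.size() - 1, value);
            return;
        }
        history.add(value); // NORMAL and MAJOR both create a new version
        if (kind == ChangeKind.MAJOR) {
            // Upgrade every existing link to the most current version.
            for (Link link : links) {
                if (link.targetId.equals(id)) {
                    link.targetVersion = history.size();
                }
            }
        }
    }

    String resolve(Link link) {
        return versions.get(link.targetId).get(link.targetVersion - 1);
    }

    public static void main(String[] args) {
        ChangePolicySketch store = new ChangePolicySketch();
        store.create("device-7", "owner=alice");
        Link link = store.linkToLatest("device-7");

        store.change("device-7", "owner=Alice", ChangeKind.MINOR);
        System.out.println(link.targetVersion + ": " + store.resolve(link)); // 1: owner=Alice

        store.change("device-7", "owner=bob", ChangeKind.NORMAL);
        System.out.println(link.targetVersion + ": " + store.resolve(link)); // 1: owner=Alice

        store.change("device-7", "owner=carol", ChangeKind.MAJOR);
        System.out.println(link.targetVersion + ": " + store.resolve(link)); // 3: owner=carol
    }
}
```

Keeping a list of all links in memory is the simplest way to show the upgrade step; with a database the same effect would be an update of the version stamp on all referencing rows.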