Tuesday 2 February 2010

Using database views in Hibernate (or, anything view can do, HQL can do better)

It's very easy to map database views using Hibernate, but there are a few considerations you might want to er... consider before you do it. I suspect the main reason people do it is for performance, but I think that a lot of the gains are not that great and might stem from lack of knowledge about what HQL can do.
Certainly, loading an entity and then iterating through child entities to calculate aggregate values is not going to perform as well as the equivalent view. However, as I will explain below you may be able to achieve the same result using the equivalent features in HQL.
Finally, I do admit that some databases enable indexing of views and that might be faster than the equivalent operations in Hibernate as a result.
First, I'll just look at how you go about mapping a view and then discuss some of the details a bit later.
You can map a view exactly as if it was a table. Just put the view name in the table attribute. Optionally, you can specify that the class is immutable as in most cases you won't be able to update a view-backed entity.

 
 ...
 
The first consideration with this is portability. Not all databases support views so you would be limited to supporting those that do.
However, you can use the subselect element to largely the same effect, although obviously this wouldn't benefit from indexing in the same way that a "proper" view might.

 
     
       select p.Name, sum(s.Bits) from Project p inner join Stuff s etc
     
 ...
 
However, at this point you may need to consider how your views will interact with the 2nd-level cache. The good news is that you can enable this as you would a normal entity and you can cache HQL queries on this view. I tend to do this in the Hibernate configuration file. Note again that it's worth specifying that the usage is "read-only".

The bad news is that you may get some unexpected results on your cached views when you update the tables that are joined by the view. Basically, the view queries won't reflect the current contents of the database immediately as the level 2 cache will hold stale data. Why is this data not evicted from the cache? Because Hibernate can't really guess which entities are related to your view unless you tell it.
You can let Hibernate know which other tables are linked to your view as below:

 
     <synchronize table="dbo.Project" />
     <synchronize table="dbo.Stuff" />
     <synchronize table="dbo.Things" />
     ...
 
One thing to watch for is that Hibernate is very fussy about how you specify the table attribute. It is case sensitive, and in the case of Microsoft SQL Server 2005 also requires the name to be qualified with the schema. If you get it wrong, Hibernate doesn't complain - but it also won't automatically evict your views from level 2 cache. You just need to fail your way to success, unfortunately.
By now you have got your views working with Hibernate and it is playing nice with the level 2 cache.
But did you need to do it in the first place?
I can think of two scenarios where a view might be used. One would be to conveniently load a "flattened" representation of an object graph in one efficient join query, thereby avoiding the famous "N+1 query" problem of loading object graphs in Hibernate. Two would be to calculate aggregate values on an object graph (such as a sum of a field in a child object), again in a single efficient query.
However, both of these things can be achieved in HQL. The first can be achieved using a "join fetch" query, that causes an object graph to be loaded using a single join query. The second can be achieved using the support for aggregate functions in HQL that can return scalar values in a result set.
Here is how to "eagerly" join fetch a child object using HQL, which is done in a single query.
select o from Timesheet o left join fetch o.activities where o.employee.id=? and o.date=? 
Aggregates can be easily calculated on child objects using a HQL query like this:
select timesheet, sum(activity.hours) from Timesheet timesheet join timesheet.activities activity where timesheet.timesheetId = 1 group by timesheet.timesheetId, timesheet.date, timesheet.employee.employeeId
This query needs slightly different handling in code as it returns a List of Object[] arrays, rather than a List of entities.
Query query = session.createQuery("select timesheet, sum(activity.hours) from Timesheet timesheet join timesheet.activities activity where timesheet.timesheetId = 1 group by timesheet.timesheetId, timesheet.date, timesheet.employee.employeeId");
List<Object[]> results = query.list();
for (Object[] result : results) {
  Timesheet timesheet = (TimesheetData)result[0];
  BigDecimal hours = (BigDecimal)result[1];
}
So my conclusion - views can be used in Hibernate and can be made to work well with the level 2 cache. But I would strongly advise not doing it if you can avoid it, as it "clutter" your domain model with redundant psuedo-entities. It's like the object-oriented version of denormalised data!

2 comments:

  1. Thank you very much! This article has shed light on my problem with 2nd level caching of a view and in general it just spoon-feeds such a quite difficult topic

    ReplyDelete