ReportingDatabase

2 April 2004

database · application architecture

tags:

If I'm using a domain model, how do I support ad hoc SQL queries?

Part of the point of a domain model is that it adds significant behavior to the data in the application. If you want reports against that data, the domain model can often do a great deal to help. However many reporting tools exist that expect to talk to a SQL database - they just can't deal with domain models. So what is one to do?

The first thing to do is to question the need for ad hoc reports. Too often a request for ad hoc reports is merely a symptom that nobody's bothered to dig into the requirements properly. By doing so you main find that the reports are rather more constrained than you thought and they can be supported perfectly well by writing code against the domain model. A lot of the time the real need is for reports to be produced quickly, the customer doesn't care how; and isn't inclined to type in SQL themselves anyway.

This isn't true in all cases. Sometimes there are power users who are happy to use some SQL based reporting tool and want to use that directly. In which case a good strategy is to produce a reporting database. Such a database is a separate database to the one that actually holds the operational data. This database is populated by code run against the domain model. and can thus insert derived data from the domain model into the reporting database. This has several advantages.

The downside to a reporting database is that the data has to be kept up to date, timeliness may cause complications. The easiest case is when you do something like use an overnight run to populate the reporting database. This often works quite well since many reporting needs work perfectly well with yesterday's data. If you need more timely data you can use a messaging system so that any changes to the operational database are forwarded to the reporting database. This is more complicated, but the data can be kept fresher. Often most reports can use slightly stale data and you can produce special case reports for things that really need to have this second's data.

A variation on this is to use views. This encapsulates the operational data and allows you to denormalize. It doesn't allow you to separate the operational load from the reporting load. More seriously you are limited to what views can derive and you can't take advantage of behavior in a domain model.

Although I've motivated reporting databases with a domain model example, this approach is also applicable for any case where you want to encapsulate your database, which many people see as one of the purposes of a Service Oriented Architecture.

See some further discussion summarized by Eric Evans.