martinfowler.com logo Home Blog Articles Books About Me Contact Me ThoughtWorks

ReportingDatabase design 2 April 2004 Reactions

If I'm using a domain model, how do I support ad hoc SQL queries?

Part of the point of a domain model is that it adds significant behavior to the data in the application. If you want reports against that data, the domain model can often do a great deal to help. However many reporting tools exist that expect to talk to a SQL database - they just can't deal with domain models. So what is one to do?

The first thing to do is to question the need for ad hoc reports. Too often a request for ad hoc reports is merely a symptom that nobody's bothered to dig into the requirements properly. By doing so you main find that the reports are rather more constrained than you thought and they can be supported perfectly well by writing code against the domain model. A lot of the time the real need is for reports to be produced quickly, the customer doesn't care how; and isn't inclined to type in SQL themselves anyway.

This isn't true in all cases. Sometimes there are power users who are happy to use some SQL based reporting tool and want to use that directly. In which case a good strategy is to produce a reporting database. Such a database is a separate database to the one that actually holds the operational data. This database is populated by code run against the domain model. and can thus insert derived data from the domain model into the reporting database. This has several advantages.

  • Since it's populated via the domain model, you can derive data in the domain model and put it in the reporting database.
  • You don't need to normalize a reporting database, because it's read-only.
  • The structure of the reporting database can be specifically designed to make it easier to write reports.
  • The development team can refactor the operational database without affecting the reporting database.
  • Queries run against the reporting database won't affect the performance of the operational database.

The downside to a reporting database is that the data has to be kept up to date, timeliness may cause complications. The easiest case is when you do something like use an overnight run to populate the reporting database. This often works quite well since many reporting needs work perfectly well with yesterday's data. If you need more timely data you can use a messaging system so that any changes to the operational database are forwarded to the reporting database. This is more complicated, but the data can be kept fresher. Often most reports can use slightly stale data and you can produce special case reports for things that really need to have this second's data.

A variation on this is to use views. This encapsulates the operational data and allows you to denormalize. It doesn't allow you to separate the operational load from the reporting load. More seriously you are limited to what views can derive and you can't take advantage of behavior in a domain model.

Although I've motivated reporting databases with a domain model example, this approach is also applicable for any case where you want to encapsulate your database, which many people see as one of the purposes of a Service Oriented Architecture.

See some further discussion summarized by Eric Evans.


Links
home
bliki
feed 
Translations
Japanese
Spanish
Korean
Chinese
Thai
Categories
agile
design
dsl
leisure
refactoring
ruby
thoughtWorks
tools
uml
writing
Blog Roll
ThoughtBlogs
TW Alumni
Nicholas Carr
Steve Cook
Brian Foote
Simon Harris
Gregor Hohpe
/\ndy Hunt
Ralph Johnson
Patrick Logan
David Ing
Brian Marick
Jeremy Miller
Jimmy Nilsson
Samuel Pepys
Keith Ray
Johanna Rothman
Kathy Sierra
Dave Thomas

martinfowler.com logo mingle logo thoughtworks logo

© Copyright Martin Fowler, all rights reserved