How standard is Standard UML?

(From Martin Fowler's column originally published in Distributed Computing, Spring 1999)

A few years ago, one of the most tiresome aspects of my consulting life was the methodology wars: furious arguments about which notation to use. These rows always generated a lot of heat, and almost all the time they were pointless - it didn't matter which we picked. Thanks to the UML, most of that is gone and forgotten. We now have a standard notation. But as people are discovering, the standard doesn't settle all the questions.

The questions that rise up now tend to be more detailed issues: how to represent some particular thing in a UML model, or what a certain UML construct means for our implementation environment. The thing that surprises many people is that for many of these issues, there is no cut and dried answer.

Why is this? A large part of the reason is tradition. Modeling methods, whether OO or not, were never very formal affairs. Definitions were left to intuition, and there were many gaps. People with a formal methods background criticized these methods for this. They argued that the lack of rigor meant that too much ambiguity crept into the models. But the reality is that many people found that, despite this lack of formality, the informal methods were generally more useful than the formal methods. In practice it seems that informality is an advantage.

There are formal elements to the UML. The center of these is the meta-model: a UML model that describes the structure of the UML. But even a meta-model leaves many gaps. It may define an attribute as a kind of structural feature whose type is a classifier, but what does that really mean when I program? In Java, does it mean a field, a pair of get and set operations, or a parameter to an operation? If you ask ten UML experts what the answer to that question is, the only agreement you're likely to get is that the answer includes the words "it depends". The meta-model's primary purpose is to define a well-formed model so that CASE tools can someday communicate. It's a long way from a truly rigorous approach.
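To make the question concrete, here is a minimal sketch of three ways a single UML attribute might be read in Java. The class and attribute names (a Customer with a name of type String) are hypothetical, invented purely for illustration, and treating a constructor argument as "a parameter to an operation" is itself one assumption among several possible ones. None of the three is the official reading.

```java
// Hypothetical model: a UML class "Customer" with one attribute "name : String".

// Interpretation 1: the attribute is a plain field.
class CustomerAsField {
    String name;
}

// Interpretation 2: the attribute is a pair of get and set operations.
// How the value is stored is an implementation detail the model does not show.
class CustomerAsAccessors {
    private String storedName;

    String getName() { return storedName; }
    void setName(String name) { this.storedName = name; }
}

// Interpretation 3: the attribute is a parameter to an operation
// (assumed here to be the constructor) and cannot be changed afterwards.
class CustomerAsParameter {
    private final String name;

    CustomerAsParameter(String name) { this.name = name; }
    String getName() { return name; }
}

class AttributeInterpretations {
    public static void main(String[] args) {
        CustomerAsField byField = new CustomerAsField();
        byField.name = "Ada";

        CustomerAsAccessors byAccessors = new CustomerAsAccessors();
        byAccessors.setName("Ada");

        CustomerAsParameter byParameter = new CustomerAsParameter("Ada");

        // All three hold the same information; only the Java shape differs.
        System.out.println(byField.name);
        System.out.println(byAccessors.getName());
        System.out.println(byParameter.getName());
    }
}
```

All three classes could sit behind the same box on a class diagram, which is exactly why "it depends" is the honest answer.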

Dealing with multiple interpretations

So what does this mean for you? For a start it means that you'll see a number of interpretations of the UML. These interpretations will appear in books, training courses, and project documentation. You need to be aware of this. There is much less variation than there was between methods in the past, but variation will be there. Within a team you'll have to come up with a consensus interpretation for these issues as they come up. As you do this, beware of anyone who states with great certainty that one interpretation is right or wrong. Some things are fairly cut and dried, but many are not.

How do you choose an interpretation? The overriding principle is to understand what you are using the UML for. If you are using a CASE tool to generate code, then the interpretation to use will inevitably be that of the CASE tool's code generator. If not, then the overriding purpose will be that of communication. So you want to choose an interpretation that will communicate best with other people. This is an important point about the standard: it encourages a common interpretation of our diagrams.

Here I think the issues of interpretation will parallel those of a natural language (like English) rather than a defined language (like ANSI C). There won't be one single body that will pronounce on UML interpretations. If such a body does exist, it is the OMG, through its analysis and design task force, which holds the key to the UML standard. But it will never pronounce on all the issues. And even where it does pronounce, common practice may well differ from the official line. The situation is similar to that of the French language academy, which outlawed such phrases as "faire du shopping" but couldn't stop them passing into common parlance.

So just as with English, an important part of steering common usage will come from the UML mavens, those people who set themselves up to comment on correct and tasteful UML. In this picture the three amigos will be particularly important mavens. Their writings will carry a lot of weight, but won't necessarily be absolute.

As well as variations in interpretation, we will also find downright different notations. Many people will perceive an important gap in the UML and propose some new extensions to fix the problem. Some of these will fit in smoothly through the UML extension features (stereotypes and the like); others will be outright deviations from the OMG documents. Again, these will rise and fall depending on whether other people pick them up.

Choosing Interpretations

So when you are using the UML to communicate, and you are choosing an interpretation or considering an extension - how should you choose? Look at the influential sources and see if they say anything. If they do, that should influence your decision, but it shouldn't settle the decision. In the end you have to choose what is most useful for you. It may be that one interpretation makes much more sense for your project - it leads to less clutter, or is more visually apparent. You will have to make clear to your readers that you are taking a less traveled course, but sometimes it can be worthwhile. So try to stick to the common usage, but don't let these pronouncements rule you. The UML is there to help you, not the other way round.

This view of the UML as a natural language also suggests how it may evolve. Natural languages evolve by people pushing the language in various directions. Those pushes that are useful pass into common usage, and then gradually become more standard; those that are not fade into memory. I think we'll see a fair bit of pushing at the UML's edges over the next few years. This will be a good thing, for it will keep the UML vibrant and allow it to develop. This edge-pushing happened with pre-UML methods, and many ideas found their way into the UML. The fact that the variations have a common base will make this process easier, and perhaps even accelerate the UML's development. So don't expect it to be static, and do expect the official standard to be a couple of steps behind the best usage.

The impact on CASE tools

CASE tools have a particular role to play in this standards world, and their decisions will affect the way you use the UML. Essentially CASE tools must make a choice about how they will play the UML game. The great advantage of the UML for the CASE industry is that it frees them from the necessity of supporting multiple methods. They can now focus their energies on doing something useful with the UML.

In order to support the UML they have to support the UML meta-model, but all the interpretation issues still apply. As an example, we'll see different decisions made with code generators. Not only will they generate different code, they'll make different decisions about the intention of the code they generate. So you'll find that the UML in one CASE tool doesn't mean the same as the UML in another.

CASE tools talk about their ability to check your diagrams for you, to prevent you from making mistakes. They see it as a feature, but I'm not so sure. I know what I want to do, and I find it very irritating when a CASE tool doesn't let me do something I need. People see it as a boon for less experienced developers, but I think that a good review process is far more valuable than mechanical checking. Tools miss the subtleties of communication. So don't let CASE tools be the arbiter of what you draw. Your brain can beat a CASE tool any day.