Martin Fowler's Bliki
A cross between a blog and wiki of my partly-formed ideas on software development
| SchoolsOfSoftwareDevelopment |
agile |
12 April 2008 |
|
For the nth, and I'm sure not the last, time, I'm sliding into a
conversation about defining practices, labeling some of them as
"best", and probably the C-word (certification). It's a familiar
discussion, and although we've barely started it, I can predict much
of where it will go. It's driven by a perfectly reasonable desire to
identify who are the better software developers, and how existing
developers can improve their abilities. When people get into these conversations, they usually end up in
trouble. Either the group gets into heated discussions and cracks up,
or the group doesn't have heated discussions and produces something
that others deride. The heart of why this happens, and why I don't see
any single, widely-recognized certification program for software
development coming soon, is that there is no single, well-agreed way to
develop software effectively.
Instead what we see is a situation where there are several schools
of software development, each with its own definitions and statements
of good practice. As a profession we need to recognize that multiple
schools exist, and that their approaches to software development
are quite different. Different to the point that what one school
considers exemplary is considered by other schools to be
incompetent. Furthermore, we don't know which schools are right (in
part because we CannotMeasureProductivity) although each
school thinks of itself as right, with varying degrees of tolerance
for the others. I'm using "school" here in the style of this definition:
4 a: a group of persons who hold a common doctrine or follow the
same teacher (as in philosophy, theology, or medicine) <the
Aristotelian school>; also : the doctrine or practice of such a group
b: a group of artists under a common influence c: a group of persons
of similar opinions or behavior; also : the shared opinions or
behavior of such a group <other schools of thought>
--Merriam-Webster
I came across this notion explicitly from the Context-Driven
School of Software Testing (see James Bach and Brett
Pettichord). I like their way of looking at this because it's a
model that explains why intelligent software developers have such different
approaches. The Context-Driven folks have done some looking at different
schools within the testing world, but I don't know of any good attempt
to classify the schools within the broader world of software
development. I feel a sense of belonging to a school, one that for me
is rooted in the people I met through OOPSLA in the
90's. Object-orientation is a key practice of this school, as are agile
methods. You could reasonably argue that this is the agile school,
except I think that agile methods are a core component of this
school's thinking but not the whole picture. The leaders of this
school include people like Ward Cunningham, Ralph Johnson, Kent Beck,
and Robert Martin. ThoughtWorks is, on the whole, an organization that
follows this school (which is why I'm comfortable here).
But despite this sense of a somewhat coherent school, there are still
many open questions. Is it best to think of the agile world as one
school or many (are Scrum and XP different schools or part of the
same)? What are the major schools out there? What exactly defines a
school of thought? I don't have much of an answer to these questions, but the key
point to remember is that there are multiple schools of thought about
how to develop software effectively. We may not think much of
schools other than our own, but we would be foolish not to
recognize that they exist.
|
| UpcomingTalks |
writing |
8 April 2008 |
|
My next appearance at a conference will be at the JAOO conference
in Australia - specifically Brisbane and Sydney. This is JAOO's first
venture into Australia and I have high hopes for it. One thing I've
learned from my recent visits to our offices in Oz is that Australia
really lacks for decent conferences. JAOO does an excellent job, so it
should do well.
I have three talks on my list, two of which I'll be doing with my
colleague Erik Dörnenburg.
The first one to mention is our tutorial
on Test Driven Development. We've done this a couple of times in the
past. It's a very informal tutorial - Erik codes up an example from
scratch using TDD, which we use as a vehicle for explaining how the
process works. We show the basic red-green-refactor cycle, and go into
the use of state and interaction based verification. It's very much an
introductory tutorial, aimed at people who haven't done TDD before.
Our second effort will be a keynote on Simplicity in Design. Erik
is both a proponent and exponent of simple architectural designs, as
shown by his work for The Guardian. The problem with simplicity is
that it's a complicated subject to talk about, but I think we shall be
able to give some useful principles to think about. I'm very much looking forward to the keynote, as I've wanted to do
a keynote with Erik for a long time. I really enjoy giving joint
keynotes, and have now managed to do so with Neal, Dan, and Jim. Erik and I just haven't found
a chance to team up before.
My final talk is a solo. I'll be talking about patterns in
enterprise software: the role I see patterns
playing in software design, and some of the more
important patterns that I've written up.
|
| CheaperTalentHypothesis |
design |
8 February 2008 |
|
One of the commonly accepted beliefs in the software world is
that talented programmers are more productive. Since we
CannotMeasureProductivity this is a belief that cannot be
proven, but it seems reasonable. After all just about every human
endeavor shows some people better than others, often markedly
so. It's also commonly observed by programmers themselves, although
it always seems to be remarked on by those who consider themselves to be
in the better talented category. Naturally better programmers cost more, either as full-time hires
or in contracting. But the interesting question is, despite this,
are more expensive programmers actually cheaper? On the face of it, this seems a silly question. How can a more
expensive resource end up being cheaper? The trick, as it is so
often, is to think about the broader picture of cost and value. Although the technorati generally agree that talented programmers
are more productive than the average, the impossibility of
measurement means they cannot come up with an actual figure. So
let's invent one for argument's sake: 2. If you can find a factor-2
talented programmer for less than twice the salary of an average
programmer, then that programmer ends up being cheaper.
To state this more generally: If the cost premium for
a more productive developer is less than the higher productivity of
that developer, then it's cheaper to hire the more expensive
developer. The cheaper talent hypothesis is that the cost
premium is indeed less, and thus it's cheaper to hire more
productive developers even if they are more expensive.
In case anyone hasn't noticed, this hypothesis is a key part of
our philosophy at ThoughtWorks and is one of the main reasons why I
ended up switching from being an independent consultant to join. We
believe we actually end up cheaper for our clients, even though our
rates are higher. Of course, we do have difficulty persuading many
clients that this is true - that lack of objective productivity
measures strikes again. I still remember a meeting with one
prospective client complaining about how our rates were higher than
a company who had made a previous, failed, attempt at the system we
were bidding on. We had to politely point out that paying lower rates
for a project that delivered no value was hardly a financially
prudent strategy.
There are some notable consequences to the cheaper talent
hypothesis. Most notable is that it follows a positive
scaling effect - the bigger the team the bigger the benefits of
cheaper talent. Let's assume we actually have put together a team of
ten talented developers to run a project in some alternative
universe where we have actual measures showing that they are twice as
productive as the average - and thus do cost exactly twice as much
to hire. In this case you might naturally assume that a rival team
of average programmers would be a team of twenty.
The trouble is that this assumes productivity scales
linearly with team size, which again observation indicates isn't the
case. Software development depends very much on communication
between team members. The biggest issue on software teams is making
sure everyone understands what everyone else is doing. As a result
productivity scales a good bit less than linearly with team size. As
usual we have no clear measure, but I'm inclined to guess at it
being closer to the square root. If we use my evidence-free guess as
the basis then to get double the productivity we need to quadruple
the team size. So our average talent team needs to have forty people
to match our ten talented people - at which point it costs twice as much. Another factor that plays a role here is time-to-market. Let's
assume two teams of four people, one talented and one average. To
stack the deck of our argument against our talented team, discount
the previous paragraphs, and assume the talented team is only twice
as productive as the average team. If the talented team charges
twice as much then can we assume that it doesn't matter financially
which team we pick? I'm afraid the talented team wins again. They'll complete the
project in half of the time of the average team, which means that
the customer will start yielding value from the delivered software
earlier. This earlier value, compounded by the time value of
money, represents a financial gain for picking the talented team,
even though their cost per output is the same.
Agile development further accelerates this effect. A talented
team has a faster cycle time than an average team. This allows the
full team to explore options faster: building, evaluating,
optimizing. This accelerates producing better software, thus
generating higher value. This compounds the time-to-market
effect. (And it's natural to assume that a talented team is more
likely to produce better software in any case.) Faster cycle time leads to a better external product, but perhaps
the greatest contribution a talented team can make is to produce
software with greater internal quality. It strikes me that the
productivity difference between a talented programmer and an average
programmer is probably less than the productivity difference
between a good code-base and an average code-base. Since talented
programmers tend to produce good code-bases, this implies that the
productivity advantages compound over time due to internal quality too. All this sounds, at least to me, like a highly compelling
argument. It's also one that's widely accepted (at least by
programmers who consider themselves talented). But it's far off
being accepted by the software industry as a whole. We can tell this
because the premium for talented developers (in terms of
salary/contracting fees) is less than the
productivity difference. Probably the major reason for this is the
inability to objectively measure productivity. A hirer cannot have
objective proof that a more expensive programmer is actually more
productive. Only the higher cost is objective. As a result a hirer
has to match a subjective judgment of higher value against an objective higher
cost. Many hirers, even if they personally believe the talented programmer is
worthwhile, aren't prepared to justify the full higher
cost to managers, HR, and purchasing.
This effect is compounded by the difficulty in making even a
subjective assessment. At ThoughtWorks we rely on peer assessment -
developers' abilities are assessed by fellow team members. The result
is hardly pinpoint precision, but it's the best anyone can do. Which all points out that hiring and retaining talented
programmers is hard work. Hiring and assessment are hard work. You
have to deal with people with very individual desires, which are
even more important to track as they are effectively underpaid. So
a hirer is faced with certain extra work and higher costs versus
only a judgment call for higher productivity. So I understand the situation but don't accept it. I believe that
if the software industry is to fulfill its potential it needs to
recognize the cheaper talent hypothesis and close the gap between
high productivity and higher compensation.
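The team-size and time-to-market arguments in this entry are simple arithmetic, and a small sketch can make them concrete. This is only an illustration under stated assumptions, not a measurement: the square-root scaling is the evidence-free guess from the text, and the build times, monthly value, horizon, and discount rate below are hypothetical figures chosen for the example. The function names are likewise invented for this sketch.

```python
def matching_team_size(talented_size, productivity_ratio, scaling_exponent=0.5):
    """Average-team size needed to match a talented team.

    Assumes team output = individual productivity * size ** scaling_exponent.
    scaling_exponent=0.5 is the entry's square-root guess (an assumption);
    1.0 would be the naive linear assumption.
    """
    return talented_size * productivity_ratio ** (1 / scaling_exponent)


def net_present_value(cash_flows, monthly_rate):
    """Discount a list of month-end cash flows back to today."""
    return sum(cf / (1 + monthly_rate) ** month
               for month, cf in enumerate(cash_flows, start=1))


def project_cash_flows(build_months, monthly_cost, monthly_value, horizon_months):
    """Pay the team while building, then collect value until the horizon."""
    return [-monthly_cost if month <= build_months else monthly_value
            for month in range(1, horizon_months + 1)]


# Team-size effect: ten talented developers, twice as productive, twice the price.
linear_size = matching_team_size(10, 2.0, scaling_exponent=1.0)  # 20.0
sqrt_size = matching_team_size(10, 2.0)                          # 40.0
cost_ratio = (sqrt_size * 1.0) / (10 * 2.0)                      # 2.0

# Time-to-market effect: two teams of four with the same total cost.
# Hypothetical figures: the average team builds for 12 months at 4 units/month,
# the talented team for 6 months at 8 units/month; the software then yields
# 5 units/month until month 36, discounted at 1% per month.
average_npv = net_present_value(project_cash_flows(12, 4.0, 5.0, 36), 0.01)
talented_npv = net_present_value(project_cash_flows(6, 8.0, 5.0, 36), 0.01)

print(linear_size, sqrt_size, cost_ratio, talented_npv > average_npv)
```

With these inputs the average team needs forty people and costs twice as much, and the talented team of four comes out ahead on net present value purely because the value starts arriving six months earlier. Changing `scaling_exponent` shows how sensitive the conclusion is to the scaling guess.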
|
| PreferDesignSkills |
agile |
17 January 2008 |
|
Imagine a hiring situation. There are two candidates, both with a few
years of experience. In the blue corner we have someone with good
broad design skills in the style of design that you favor (in my case
that would be things like DRY, judicious use of patterns, TDD,
communicative code etc, but the actual list isn't important - just
that it's what you favor). However she knows nothing of the particular
platform technology that you're using. In the red corner we have
someone who has little knowledge (or interest) in those issues, but
knows your platform really well - edge cases in the language, what
libraries are available, fingers move naturally over the
tools. Assume all else about them is equal (which it never is except for
thought experiments like this) and that your team doesn't have any
gaping holes that this candidate might fill. Which one would you
prefer?
My answer is simple: I'd take the one with broad design
skills. I've always held the view that a good programmer should be
able to pick up a new platform relatively quickly. Learning basic
design aesthetics is both harder and carries over better to new
platforms. Good design practices that matter in Java are equally
valuable in .NET. Not being familiar with the platform does slow you
down (how do I get a literal class name in C# again?), but producing
well designed code is what really makes a difference. Broad design skills aren't completely portable. Java and .NET are
mostly equivalent as languages - moving to Ruby, however, changes
more. Moving to a significantly different beast, like functional
languages, is a bigger shift. In any case, you can't just blindly
replicate all design habits in a new environment. But if you're aware
of the new world, an awful lot does carry over. We've seen this principle prove itself at ThoughtWorks. In our
early days with Java, we found the skills experienced developers had
learned in Forte gave us excellent instincts for working with Java. We
moved away early from the EJB-dominant thinking, and I think it was
experience with other platforms that guided us. We saw it even more
strongly with .NET. Time and again we saw that good developers with a
Java background were rapidly more effective than those with a longer
.NET or Microsoft background who lacked those skills. The difference
was visible in weeks, not months (and sometimes days). At the moment we see this shift most notably in Ruby. We've had
quite the run of Ruby projects this year, and often we turn to people
with long experience in curly-brace languages to fill the need. Again
we've seen the value that broad design skills give us.
It's not always a sure thing. I have seen cases where someone
experienced in another platform just doesn't desire to get in and
learn the new one. Desire to learn is a necessary component here - I'd take
the single platform specialist if he wanted to learn broad design and
the broad designer didn't want to learn the new platform. It's also
essential to have someone on the team who knows the platform well. I'd say most people at ThoughtWorks prefer design skills over
platform knowledge. Many clients don't share that point of view -
which can lead to some difficult pragmatic and ethical choices.
What happens if you have someone you want to bring onto the team
with strong design skills and no platform background, yet the client
insists on at least two years' experience on the platform? In your
professional judgment, the broad candidate is going to be more
productive than anyone else available. You need to be honest with your
client, but on the other hand he is paying you for your professional
judgment. Does this change if the client has given you responsibility
for delivery of the project? For us these questions are more charged because there is a
financial interest involved. If we add a ThoughtWorker to a team, then
we bill for that person. If a client hires a platform specialist
contractor, we don't get that income. For many people this is a
crucial fact in the situation, yet I expect our project managers are
wise enough to know that the risk of adding the wrong person is much
more important than one billable income. Consider another case where you're open with the client and the
client demands a reduced rate for the broad design person due to her
lack of platform knowledge as she'll be learning on the job. You're
sure that she, despite that lack, will be more productive than the
competing platform expert due to those design skills. Should you
accept a reduced rate? It's the nature of our, and most other, professions that
you learn on the job. A platform specialist also has to learn broad
design skills if he's going to produce maintainable code. Here
it's important to remember that not only is it usually harder to learn
design than platforms, it's also less certain. Given a motivated
broad-designer, I can be pretty sure she'll pick up a platform in
time. But there's no guarantee the other way around. Some people are
good at learning details of a platform, but never figure out how to
write clear code. I've talked here about broad design skills - and I do believe this
makes a difference on the technical axis. But there are other
dimensions of broadness too. Most risk in software projects lies in
the communication between businesspeople and programmers, so a
candidate who can communicate well with users brings a great deal to a
team. A similar issue is knowledge of the problem domain. Often clients
want people who already know their business, yet are surprised when
someone rapidly gains enough understanding to be useful. I've long
held that it's the ability to collaborate with others which is central
here. By collaborating with a domain expert, or a platform expert,
someone with broad skills can become effective very
quickly. Knowledge of other domains often introduces surprising
insights into a project and similarities often crop up in unexpected
places. It's remarkable how often things like core accounting patterns
crop up in places that don't look like accounting on the surface. In
the end it's the ability to work with others, coupled with being a
fast learner, that is the critical skill.
I'm not dismissing deep platform knowledge here. In an ideal world
every team member would be an excellent broad programmer with several
years' platform experience and good familiarity with the problem domain,
and would have written similar systems at least twice before. But we all know how
far our world is from that ideal. You need some platform knowledge on
a team, and if it were a gap I would reach for the platform
specialist to fill it. But that doesn't alter my default position to
prefer broad design skills most of the time.
|
| RepositoryBasedCode |
design |
14 January 2008 |
|
An alternative to SourceBasedCode is the idea that the
core definition of a system should be held in a model and edited
through projections.
To talk about this style of environment I find it handy to
think in terms of multiple representations of the system:
- editable representation: what you edit in order to change the
system.
- storage representation: the persistent record of the system
definition.
- executable representation: what is executed to make the system run
(the executable).
- abstract representation: used to manipulate and reason about the
system definition.
- visualization representation: a non-editable view of the system
definition.
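To make the list above concrete for a source based environment, here is a minimal sketch that uses Python's standard `ast` module as a stand-in. The mapping of each stage to a representation is an interpretation for illustration only; the `area` function and the `<sketch>` filename are invented for the example.

```python
import ast

# Editable and storage representation: the same text is both edited and persisted.
source = "def area(width, height):\n    return width * height\n"

# Abstract representation: a syntax tree, normally transitory during compilation.
tree = ast.parse(source)

# Visualization representation: a read-only view derived from the tree,
# here just the names of the functions that the source defines.
function_names = [node.name for node in ast.walk(tree)
                  if isinstance(node, ast.FunctionDef)]

# Executable representation: a code object produced from the tree.
code = compile(tree, filename="<sketch>", mode="exec")

namespace = {}
exec(code, namespace)
result = namespace["area"](3, 4)

print(function_names, result)
```

The tree is thrown away once the code object exists, which is the sense in which the source text, and not the abstract representation, remains the core definition in this style.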
A source based system combines the editable and storage
representations in the source file. It executes the source by
transforming the source into an executable representation either in
one observable step (interpretation) or multiple steps via a
compiler. In order to do this it usually transforms the source into
an abstract representation as an intermediate step, but this
abstract representation is transitory and only around during
compilation. The source is seen as the core definition of the
system.
With a repository based system the abstract representation is the
core definition of the system. A tool manipulates the abstract
representation and projects multiple editable representations for
the programmer to change the definition of the system. The tool
persists the abstract representation in a storage representation,
but this is entirely separated from any of the editable
representations that it projects. The relationship to the executable
representation is pretty much the same - the executable is produced
through a series of transformations from the abstract
representation.  An important difference between repository and
source based environments is the split between persistent storage
and editing. Repositories can use any persistence mechanism
they choose, while source systems need to have some universal
storage mechanism - which is why they are almost always text files.
The abstract representation may be edited through multiple
projections; each projection can show a limited amount of the total
information which isn't tied to the actual structure of the abstract
representation. Repository systems thus usually show a wider range
of editing environments - including graphical and tabular structures
- rather than just a textual form. Sophisticated source based IDEs also show multiple projections -
for instance a side pane showing a list of methods for a class with
graphical annotations to indicate their
AccessModifiers. However these projections are usually
very much secondary to a source editor, and often the projections
can't be edited directly - you have to change the source and see the
projection update. Such PostIntelliJ IDEs do this by creating an abstract
representation when they load the source files (which is why they
can take a while to start up). They also use the abstract
representation to perform lots of other code-assistance features
such as contextual code completion and refactoring.
A significant pragmatic problem with repository based systems is
the fact that there is no generally accepted format for the
storage representation. The fact that programmer-readable text is
the universal choice for source files means that a whole slew of
tools can be built to process them: editors, source-code control,
difference visualizers etc. Repositories have to do all this
themselves, which is why these things are often lacking. In
particular many repository based environments suffer greatly because
they don't have a decent configuration control system, which makes
it much harder for multiple people to collaborate on the same system
definition. This is a big contrast to source based environments that
have a plethora of source code control systems to do this task. Repository based systems are closely connected with Model-Driven
Development (MDD), although I don't think the two are entirely
synonymous. In an MDD context the abstract representation is usually
referred to as the model. Certainly almost all MDD tools are
repository based, but many repository based tools, e.g.
Microsoft Access, would not consider themselves to be MDD.
(I first explored this way of looking at environments in my essay
on Language Workbenches. I've described it here because I think the notion
of repository based environments is broader than just Language
Workbenches.)
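As a rough sketch of the repository based style described in this entry, the following toy keeps an abstract model as the core definition, edits it through two projections, and persists it separately. Everything here is hypothetical: the `Model`, `OutlineProjection`, and `TableProjection` names are invented for the example, JSON is an arbitrary choice of storage format, and no real tool is being modeled.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Model:
    """Abstract representation: the core definition of the system."""
    entities: dict = field(default_factory=dict)  # entity name -> {field name: type name}

    def to_storage(self) -> str:
        """Storage representation: separate from anything the programmer edits."""
        return json.dumps(self.entities, sort_keys=True)

    @classmethod
    def from_storage(cls, blob: str) -> "Model":
        return cls(entities=json.loads(blob))


class OutlineProjection:
    """Editable projection that shows only entity names, hiding field detail."""

    def __init__(self, model: Model):
        self.model = model

    def names(self):
        return sorted(self.model.entities)

    def add_entity(self, name: str):
        self.model.entities.setdefault(name, {})


class TableProjection:
    """Editable projection that shows one entity as rows of (field, type)."""

    def __init__(self, model: Model, entity: str):
        self.model, self.entity = model, entity

    def rows(self):
        return sorted(self.model.entities[self.entity].items())

    def set_type(self, field_name: str, type_name: str):
        self.model.entities[self.entity][field_name] = type_name


model = Model()
outline = OutlineProjection(model)
outline.add_entity("Customer")

table = TableProjection(model, "Customer")
table.set_type("name", "string")
table.set_type("credit_limit", "money")

# Round-trip through storage; neither projection ever sees the JSON.
restored = Model.from_storage(model.to_storage())
print(outline.names(), table.rows(), restored == model)
```

Both projections edit the same model, and each shows only part of it. The sketch also hints at the pragmatic problem raised above: the stored blob is a private format, so generic text tools such as diff and merge have nothing standard to work with.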
|
| TestCancer |
design |
6 December 2007 |
|
As my career has turned into full-time authorship, I often worry
about distancing myself from the realities of day-to-day software
development. I've seen other well-known figures lose contact with
reality, and I fear the same fate. My greatest source of resistance
to this is ThoughtWorks, which acts as a regular dose of reality to
keep my feet on the ground. ThoughtWorks also acts as a source of ideas from the field, and I
enjoy writing about useful things that my colleagues have discovered
and developed. Usually these are helpful ideas that I hope some
of my readers will be able to use. My topic today isn't such a
pleasant topic. It's a problem and one that we don't have an answer
for. The scenario runs like this. We carry out a project for a client
and hand over a shiny new piece of software. As is our habit these
days, we also hand over a bevy of automated tests for this software
(typically there are as many lines of code of tests as there are of
functional code). These tests are usually a mix of unit tests and
broader ranging functional and acceptance tests. Either way the tests
act as an active description of what the software does and a bug
detector to quickly find problems as we evolve the software. We
treasure these tests, they are a key to our success in building
software systems. Some months later the happy customer calls us back to do some
further work on the software, adding new features and
capabilities. We come in, keen to work on a code base that may have
faults - but at least they are our faults. Then we make an
unpleasant discovery. The tests no longer run. Sometimes the tests are excluded from the build scripts, and
haven't been run in months. Sometimes the "tests" are run, but a
good proportion of them are commented out. Either way our precious
tests are afflicted with a nasty cancer that is time-consuming and
frustrating to eradicate. We ask what happened and are told things like "we made a change
and some tests broke, so we removed the tests". You can look at this
as our failing - we haven't managed to fully teach the client
teams about the value of the tests. We need to do more to pass on
that failing tests need to be investigated, not simply ignored. But
whatever anyone says, we've discovered that cancer of the tests is a
common disease. We don't think that the fact that Test Cancer appears is a reason
against writing tests. Even if a particularly virulent strain wipes
them all out the day after we leave, we still got value from them
while we were building the system. And tests don't always get
cancer. We recently spoke to a developer who had become a convert to
TDD after maintaining a system we'd handed over a few years ago. The
tests made our code much easier to work with than code that other
firms had added later.
|
| BookCode |
writing |
4 December 2007 |
|
I don't write much production code these days, but I still
spend quite a few hours writing code. This code is a particular
form of code, meant for explaining ideas in books. Book code isn't
quite like real code; there are some different forces to consider
when writing it.
One question is the choice of language. I need to write code in a
language that as many of my readers as possible can read and follow. I try to
write about ideas that are platform independent, but I need code to
be precise. So I need to pick some widely readable language that
people can follow.
In my early days the two largest OO languages were Smalltalk and
C++. Both had faults: Smalltalk was too weird for non-smalltalkers
and C++ was too full of sharp edges to get right. Java was a godsend
for me. Anyone with a C/C++ background could read it. Even most
smalltalkers could hold their noses and understand what I was
coding. That's why the refactoring book was in Java.
Later on .NET appeared. The nice thing here was that C# was mostly
the same as Java, so I could use the two pretty interchangeably. I
liked to use both to reinforce that the ideas I was writing about
were useful in either case. That situation is getting more difficult these days, particularly
with writing about DomainSpecificLanguages. Java and C#
are diverging, and some things I want to illustrate require features
that neither of them have. I do much of my personal programming in
Ruby, which is very well suited to DSLs, so I will use Ruby as my
first choice for this situation. But other languages have
contributions too. I need to balance illustrating various language
capabilities for DSLs against letting the book become a hodgepodge
of language tidbits. Another issue with book code is to beware of using obscure
features of the language, where obscure means for my general reader
rather than even someone fluent in the language I'm using. A good
example of this is that when I write examples using Ruby, I've often
shied away from using CollectionClosureMethods, even
though I use them heavily in my own Ruby code. This is because I
consider that programmers from a curly-brace background will find it
harder to understand those kinds of expressions. So I sacrifice
fluent Ruby in order to reach those readers. Again this is much harder for a DSL book. Internal DSLs tend to
rely on abusing the native syntax in order to get readability. Much
of this abuse involves quirky corners of the language. Again I have
to balance showing readable DSL code against wallowing in quirk.
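The readability trade-off around CollectionClosureMethods mentioned earlier in this entry can be illustrated with a small comparison. The sketch is in Python and not Ruby, so `map` and `filter` with lambdas stand in for Ruby's block-taking collection methods; the `prices` data and the selection rule are invented for the example.

```python
prices = [12, 7, 30, 5]

# Explicit loop: longer, but readable from almost any language background.
doubled_loop = []
for price in prices:
    if price > 10:
        doubled_loop.append(price * 2)

# Closure style: compact and fluent once you know the idiom,
# but harder to follow for readers who have never passed a function around.
doubled_closures = list(map(lambda price: price * 2,
                            filter(lambda price: price > 10, prices)))

print(doubled_loop, doubled_closures)
```

Both forms compute the same list; the choice in book code is about which one the widest audience can read without stopping.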
|