I’ve been off the conference trail for a while but this week I will be presenting a couple of times. I’ve left the slides until the last minute, as usual, and I’ve spent some of the weekend thinking about the right topic. I’ve opted for “The Central Infrastructure Dilemma”, prompted a little by a slide that Tim from Software AG put up this week at an in-house show we ran at OeE on the Sun/Software AG DIS box.
Tim put up a slide from Fred Brooks – author of The Mythical Man-Month, essential reading for anyone in project management, let alone software project management (and a book I first read, I think, in 1992). Amazon has only 4 copies left. The slide – which I roughly cribbed as he talked – was Brooks’ two-by-two of program, programming product, programming system and programming systems product.
The essence of the point is that writing a program is not that hard, but making it into something that is distributable (in the commercial sense, not the architectural one), i.e. can be reused by others, and that is supportable for the long term is roughly 3x harder on each axis. So it takes 3x the effort to take a program and make it integratable, and 3x the original effort again to make sure that it is documented, tested and maintainable. To do both of those things is therefore at least 9x harder. For every day of effort you put in to writing a neat bit of code that solves *your* problem, it’s going to take nine to make it solve other people’s problems the same way (if you are planning to do that at arm’s length, i.e. in a scalable fashion). I guess we all know intuitively that this is true; whether we thought it was 5x harder or 10x harder, the point is that it’s a significant amount of extra work.
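As a back-of-the-envelope illustration (the 3x multipliers are Brooks’ rough rule of thumb as described above, not measured figures), the arithmetic looks something like this:

```python
# Brooks' rough rule of thumb: each step away from "just a program"
# multiplies the original effort by about three.
program = 1                        # solves *your* problem, on your machine
programming_product = program * 3  # documented, tested, maintainable by others
programming_system = program * 3   # integratable: defined interfaces to other components
programming_systems_product = program * 3 * 3  # both at once - the thing you can hand over

print(programming_systems_product)  # 9 - nine days of effort for every day spent on the original
```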
Brooks is the guy who managed the development of IBM’s System/360 and its operating system, OS/360 – one of the earliest and biggest mainframe operating systems. It’s a project that consumed more resources than almost anything else you care to think of. I’ve heard it said that Windows 2000 (or was it Longhorn?) cost more than NASA spent putting a man on the moon, and I suspect the original version of that claim was made about the IBM of the early 1960s. And, of course, the project was late – just like pretty much every (or, more likely, every) IT project since then. The cause of this lateness was perhaps what Brooks called the “second system effect” – i.e. if you do something small and sexy pretty well first time, you have a tendency to include (in your next version) wildly grandiose ideas that are beyond your capability. And that will be your downfall.
“As he designs the first work, frill after frill and embellishment after embellishment occur to him. These get stored away to be used “next time”. Sooner or later the first system is finished, and the architect, with firm confidence and a demonstrated mastery of that class of systems, is ready to build a second system.
This second is the most dangerous system a man ever designs.”
So, how does that get me to central infrastructure (CI)? My thinking runs like this: when we first built a CI component in the UK, the Government Gateway, the challenge was to design something that worked well for the three early adopter departments and that would continue to work well for later ones. We knew what we were working on – a discrete set of functionality in a “black box” with well-documented APIs to the outside world of the Internet and also to the inside world of government. We built something that extended from the world of “program” – the top left of Brooks’ square – to the world of a documented, integratable, repeatable programming systems product, i.e. the bottom right.
As new departments came to join the CI party, though, their expectations were different. They wanted it to be all things to all people, i.e. to cater for their every need – because each and every one of them was a little different in the way that they looked at things. So previously simple requirements, e.g. “match this postcode”, became more involved and had to cater for people, say, who owned 5 properties and therefore would need to match all of them. Requirements naturally increase – but the job of CI is to focus on the core, not to be all things to all people. If anything, CI’s job is to be “a few things for most people”. A few weeks ago I posted some graphs here on this theme.
The aim was to show that CI faces a dilemma: does it go for scale and cater for the largest customers in the most robust and resilient way, or does it try to stay at the leading edge, with innovative new requirements well catered for? Go big and heavy, or light and smart? Doing either risks alienating those at one end of the wave or the other, so the dilemma inevitably resolves itself by jostling around the middle – trying to grab some innovation at the bottom of the wave while seeking to provide scale and capability at the top end.
There is nothing about this that is only 9x harder … if anything it’s the cube of 3, and so 27 times harder. But the thesis of CI is, and always has been, that as long as the right “few things” are concentrated on for the right “many”, it is cheaper to do those few things in one place than it would be for government as a whole to do them separately and individually, provided a consistent standard is applied. That’s where comparisons get difficult. Those who would build a “program” whilst claiming it is a “system” or a “product” will be expending Z amount of effort when, for maximum reuse and supportability, they’d need to spend at least 3Z or, more likely, 9Z. If the comparisons are done with the figures from the Z camp, then the economics don’t stack up until you have at least 28 users (i.e. one more than 3 cubed).
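A minimal sketch of that break-even sum, taking the Z-camp figure at face value (Z here stands for whatever one department claims it would cost to build its own “program”, and the 27x is the cube-of-3 figure above, not a measured cost):

```python
# Illustrative break-even arithmetic using the figures from the paragraph above.
Z = 1.0                  # one department's claimed cost of building its own "program"
central_cost = 27 * Z    # building it once, properly: roughly 3 cubed

def separate_cost(departments: int) -> float:
    """What the Z-camp figures imply: every department builds and runs its own."""
    return departments * Z

# The smallest number of departments for which doing it once centrally is
# cheaper than everyone doing it separately, at the Z camp's own prices.
break_even = next(n for n in range(1, 1000) if separate_cost(n) > central_cost)
print(break_even)  # 28 - one more than 3 cubed
```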
The dilemma then is really “how to take IT in government forward” – and I mean *any* government here; I think it’s safe to say that all are under pressure to deliver more for less, with an ever-greater need for accountability and demonstration of success.
Those working in the top left-hand box include most government entities. They build code for themselves and only for themselves. All for one and everyone for themselves. Why would they do otherwise? They get control, they get rapid response to changes in requirements and they get what they need. Or, at least, they should. It doesn’t work that way, though. Unless you are working in the bottom right-hand box, every change you make layers more complexity onto each bit of code, and every change needs more documentation. So unless all of the processes and procedures are in place, you’re in the 9x box pretty quickly – but, because you didn’t do it right in the first place, you might actually be in an 18x, 27x, 54x or 81x box. You’ll never know: the cost of all that will be hidden in the cost of the changes and the cost of running your organisation. There’s more from Brooks on this very point:
All repairs tend to destroy the structure, to increase the entropy and disorder of the system. Less and less effort is spent on fixing original design flaws; more and more is spent on fixing flaws introduced by earlier fixes. As time passes, the system becomes less and less well ordered. Sooner or later the fixing ceases to gain any ground. Each forward step is matched by a backward one.
All this doesn’t mean that I’m an avowed centrist – in architectural terms. It does mean that there is an urgent need to decide up front which bits are going to be reused, which are going to be thrown away after a single iteration and which bits are going to be shared across entities. Then you have a chance to apply the right disciplines up front. I also posted a slide on this a while ago.
It shows my thinking on the way forward for IT. A few well-thought-through components must reside at the top of the pyramid. These will be “a few things for most people” – they will never match the speed of innovation that some want, nor will they have the scale that some others want, because it’s too big a span to cross. They will, however, be stable, resilient, robust, well documented, well tested and easy to integrate. This will be the true central infrastructure. One copy and only one copy, in one place (one logical place, two physical places).
In the middle of the pyramid there is room for people who truly can build the 9x systems. These are systems that will be shared by many entities – they will be products within government for government’s use. They may be off-the-shelf, “known” names, but they will be set up in such a way that government need only configure them, not customise them. They will work rapidly and with minimal overhead, because they have been designed in a 9x way. To make use of these, though, government entities will have to align their business processes. They will be able to make some changes to the technology, but to speed the process of upgrades and the realisation of benefits from the technology, they won’t want to stray too far from the base version. Otherwise, they are in a space where entropy reigns. This space will become known as “collaborative infrastructure” – systems that are built by entities that link up and co-operate. There are already examples of this in some local authorities and there will be in the NHS. I’m conscious that I now have two CIs, but the only good synonyms for collaborative all appear to start with “C” too.
Finally, at the bottom, we have the “isolated infrastructure” – the bits that truly are required by entities for their day-to-day business and that have no inputs from or outputs to the outside world. This is an endangered species. Someone told me the other day that the average bank has 15 processes. The average local authority? Perhaps 700. You get from 700 to 15 by upping the ante on collaborative infrastructure – first internally and then externally. And the sooner you start the latter, the sooner you can make the most of the real opportunity.