The Legacy Challenge

What’s a legacy system? Some would say that it’s a system running on components that are no longer supported – old versions of software, databases that are impossible to upgrade, operating systems released before the Internet was a thing, or servers that still have Pentiums inside. Another way of looking at it: a legacy system is one where the number of people who know how it works is fewer than the fingers on one hand, and you are worried that if any one of them retires, you will no longer be able to make changes to it.

Those are all true. There are plenty of corporations and government entities running systems with some or all of those characteristics. Much of the backbone of the UK government’s technology is based on systems built in the 70s, 80s and 90s. Our tax is collected, our benefits paid and our customs transactions policed by such systems.

There is, though, more to a legacy system than that. It could even be said that a legacy system is one that went live yesterday – because unless you have a plan to invest in your shiny new IT asset, all it will do by itself is decay and rust.

As systems get bigger and more complicated, whole areas of code are looked at less and less frequently, and familiarity with how that code works decreases. Personnel churn, whether of in-house developers or supplier staff, means that new people have less experience with the code than those who went before. Wholesale code reviews are rare. Code optimisation – revisiting already-working code to refactor it – as a way of teaching new staff how the whole thing works seems a forgotten discipline.

We talk now of technical debt – code that was fine when it was launched but is really holding back development of new capability now. It’s too complicated. It’s hard coded. It has dead ends. It doesn’t interface to new tools. This code can be a day old too.

IT systems are strategic assets, like bridges, tunnels, dams and roads. The internals need to be inspected, cleaned, operated and polished. Sure, you can stand back and look at them and say how nice they are and how impressive the construction is. But the day you slow down on your maintenance is the day they become part of your legacy estate.

Put another way, it takes a project to get your system live. But if you think that’s the end of the story, you’ll find that it’s only the beginning. Too many projects move on to the next thing, leaving others to pick up what they’ve left behind … with no budget, no support and no chance.

You can pay now, or you can pay later, but you’re going to pay.

Computer Says No

The FT has a front page story today saying that Ulster Bank is absorbing the cost of negative interest rates (on money it has deposited at the ECB) because its systems can’t handle a minus sign. Doubtless whoever wrote the code, maybe in the 80s, never thought rates would fall below zero.

We had a similar problem at a bank in the 90s when our COBOL-based general ledger couldn’t handle the number of zeros in the Turkish lira; we wrote to their central bank and PM to see if they wouldn’t mind lopping a couple off so that we could continue to process transactions. History does not record the answer, but I suspect there came none.
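
Both stories come down to data representation: a lot of older batch systems hold rates and amounts in fixed-width, unsigned numeric fields, so there is no position for a minus sign and no spare digits when a currency inflates. Below is a minimal Python sketch of that failure mode – the field names, widths and values are invented for illustration (a real system would typically define the layout in COBOL copybooks), but the behaviour is the point.

```python
# Illustrative only: an invented fixed-width record layout of the sort many
# older batch systems use. Field names and widths are made up for this sketch.

RATE_FIELD_WIDTH = 5      # e.g. "00050" = 0.50% in basis points: digits only, no sign
AMOUNT_FIELD_WIDTH = 11   # amount in minor units, e.g. "00001250000"

def encode_rate(rate_bps: int) -> str:
    """Encode a rate (in basis points) into an unsigned, zero-padded field."""
    if rate_bps < 0:
        # The layout has no sign position, so negative rates cannot be stored.
        raise ValueError("negative rate cannot be represented in this field")
    return str(rate_bps).zfill(RATE_FIELD_WIDTH)

def encode_amount(minor_units: int) -> str:
    """Encode an amount into a fixed-width field; too many digits overflow it."""
    digits = str(minor_units)
    if len(digits) > AMOUNT_FIELD_WIDTH:
        # The Turkish lira problem: more zeros than the field was designed to hold.
        raise ValueError("amount has more digits than the field can hold")
    return digits.zfill(AMOUNT_FIELD_WIDTH)

for attempt in (lambda: encode_rate(50),           # 0.50% deposit rate: fine
                lambda: encode_amount(1_250_000),  # ordinary amount: fine
                lambda: encode_rate(-50),          # negative ECB rate: rejected
                lambda: encode_amount(10**12)):    # hyper-inflated currency: rejected
    try:
        print(attempt())
    except ValueError as err:
        print("rejected:", err)
```

Fixing a single field is trivial; the trouble is that layouts like these are baked into decades of downstream programs, files and interfaces, which is what makes the minus sign so expensive.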

Legacy systems were in the news in government IT this week as it was stated that there was no central register of such systems, that they are blocking data sharing and that there’s no plan to move off them. GDS, says Alison Pritchard, the interim leader, will be looking for money in the next spending review to deal with the problem.

This is, of course, an admirable aim. The trouble is, departments have been trying to deal with these systems for two decades – borders, immigration, farm payments, student loans, benefits, PAYE, customs etc all sit on systems coded in the 70s, 80s and early 90s. Legacy aka stuff that works. Just not the way we need it to work now.

Every department can point at one, and sometimes several, attempts to get off these systems … and yet the success rate is poor. Otherwise why would they still be around?

The agile world does not lend itself well to legacy replacement. Few businesses would accept the idea that their fully functional system would be replaced in a year or two with a less functional MVP. What would make the grade? How would everything else be handled? Could you run both in sync?

In the early 2000s a few of us tried to convince departments to adopt an “Egg” model and build a new business inside the existing business – one that was purely internet facing and that would have less capability than the existing systems but that would grow fast. Once someone (business or person) was inside the system, we would support them in that new system, whatever it took – but it would be a one way ticket. We would gradually migrate everyone into that system, adding functionality and moving ever more complicated customers as the capability grew.

It’s a challenging strategy. It would have been easier in the 2000s. Harder now. Much harder. But possible. With commitment. And a lot of planning.

All Numbers Are Made Up

Anyone who has worked with me for even a short while will recall a time when I have prefixed or suffixed a number with “this is made up” and usually followed up with “all numbers are made up.”

This is usually in one of two contexts:

  1. “This project will cost £50m (made up number), what does that mean for us, where will the costs fall, what should we worry about in terms of over-runs, where are our risks?” – the aim is to stimulate debate about the project as a whole and give everyone a number to play around with to help get to a (much) better number later.
  2. “I’ve heard that the consequences of this going wrong could be (made up number) £100m.” In this context I have no idea what the right number is, but I want to know what other people think, so I throw a number out and see how they react, on the basis that we need to start somewhere.

My general theory is that all numbers you hear quoted are made up – sometimes with a bit of science behind them, but sometimes they’re purely Wild Assed Guesses. The problem is that few admit to the numbers being made up, and so there’s somehow a belief that the fact that someone has stated a number must mean it’s true.

Only a few days ago, there was speculation that the cost of repatriating the Thomas Cook passengers (and, at that, just the UK citizens – no one talks about those from other countries who have to find ways home) would be £600m. It was unsourced and plainly completely made up, but very few dared admit that it was nonsense.

Which brings me to today’s story on Government Notify saving £175m in the next five years (my italics).

We don’t, therefore, know how much it has saved in the last 4 years, though we do know that it has been used to send “more than 500 million messages.”

We do know that it’s supposed to save £35m a year for each of the next five years, so 5 * 35 gives us £175m.

There are essentially three ways to get to that number, and it could be a combination or sum of all of them (the first two are sketched, with their assumptions spelled out, in the code after this list):

  1. Assume that everyone would have to spend some money to build or buy the equivalent capability to Notify. There are commercial equivalents, of course. There are costs to integrate with either solution which, one could argue, are the same either way and so can be left out; but there are also operating costs (the commercial services are often SaaS-based, so there would be no build costs). That might be (Wild Assed Guess) £25,000 per service, and we know that there are 1,200 services using it, so that would be £30m for the services to date (who would, of course, not save further money from here because they are already inside the Notify world, unless there is a cost of operation that they need to pay; we don’t know either way)
  2. There’s an arbitrage saving between low-scale and high-scale users, where message costs are cheaper for the latter, so bundling together lots of government (to get to 500m messages over 4 years, for instance) would result in a unit cost saving per message, provided you hit the volumes you commit to. That might save (made up number) 4p per text message, and so we would get £20m of savings for all of the messages sent to date.
  3. There’s a business cost saving if, for instance, you implement text message reminders for, say, doctors’ appointments, track the before and after attendance rates and see that 25% more people attend at the right time when they’re reminded. That might save (made up number) £250/appointment … multiply that by the 25% extra and you have a savings figure. In the world of truly Wild Assed Guesses, this could be tens or hundreds of millions, depending on how many GPs, hospitals and others use the service. But the organisations with the highest number of users are Cabinet Office, Ministry of Justice and Home Office, followed, oddly, by MoD, DfE, DWP and HMCTS. No NHS. So we don’t know.
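
To make the made-up-ness explicit, here are routes (1) and (2) as a back-of-envelope calculation, with every figure declared as a named assumption rather than buried in prose – the inputs are either the published headline numbers or, as ever, made up.

```python
# Back-of-envelope sketch of routes (1) and (2) above. Every figure here is a
# stated assumption (i.e. made up), which is rather the point of the exercise.

services_using_notify = 1_200           # published figure
build_or_buy_cost_per_service = 25_000  # Wild Assed Guess, in pounds
messages_sent_to_date = 500_000_000     # "more than 500 million messages"
unit_saving_per_message = 0.04          # made up: 4p saved per message via bulk rates

avoided_build_cost = services_using_notify * build_or_buy_cost_per_service
bulk_rate_saving = messages_sent_to_date * unit_saving_per_message

print(f"Route 1, avoided build/buy cost: £{avoided_build_cost:,.0f}")  # £30,000,000
print(f"Route 2, bulk-rate saving:       £{bulk_rate_saving:,.0f}")    # £20,000,000

# Route 3 (business savings from, say, fewer missed appointments) depends on
# per-appointment costs and uptake figures we simply don't have, so it isn't
# modelled here. None of this gets you to a flat £35m a year for five years
# without further, unstated assumptions.
```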

What’s perhaps odd about the £175m of savings is that it is quoted as £35m/year for each of the next 5 years. That is, it doesn’t increase (or decrease). That suggests that there will be no more users, no more services and no more messages sent (which, at least, is better than reducing any of those). That would be a shame – it’s a well-used service filling a clear need, and one would imagine that, given that many services are still not online, and many of those that are don’t use Notify, there is a market opportunity.

All told, it means we have no idea what to think about the savings and whether they are real, or entirely made up (note: all numbers are made up).

This is an interesting topic, because using my patented eDt Time Machine (TM), I can go back to 2002/3 and look at the case we put together for why we should build a notification engine (we called it “Notifications” – clever, hey?) and we looked at (1) and (2) for our own case and (3) to help the services who might adopt it figure out what the benefits for them would be. We worked with a partner to do the heavy lifting and integrated it with our existing capabilities – HMRC were amongst the first users (sending text messages to say that your tax return had been received for instance).

Thinking out loud about this led to articles, such as one by Charles Arthur in May 2002, that speculated about exam results being sent by text, and another by John Lettice; the Guardian also picked it up (it may well have been a slow news week). For the record, it came true in 2009, as I noted right here on this blog. And now, it appears that it’s even more true, although we may never know what the real numbers are.

If it’s a made up number, just say what the assumptions were when you came up with it. Transparency and all that.

Biggest Change Since 1976

When your regular train company emails you to tell you it’s making the biggest timetable change since 1976 in just a couple of months, it’s hard not to be fearful. I use GWR a lot – this is just the last 5 years of travel (and doesn’t take into account that as soon as I could use their app reliably, I switched to QR code tickets):

GWR promise:

Faster, more frequent services and more flexibility. All current train times will change and some new services may not stop at the stations they currently call at

That’s a lot of changes.

Especially in the context of the last timetable change, in May 2019 (sample headline: “British railways are reduced to chaos by a botched timetable change”).

Some would call it potentially transformational. Uh-oh.

How Much is IT?

Maybe 15 or 16 years ago I sat in a room with a few people from the e-Delivery team and we tried to figure out how much the total IT spend in central government was. All the departments and agencies (including the NHS) were listed on a board and we plugged in the numbers that we knew about (based on contracts that had been let or spend that we were familiar with). Some we proxied based on “roughly the same size as.”

After a couple of hours work, we came up with a total of about £14bn. That’s an annual spend figure. Of course, some would be capital, and some operating costs, but we didn’t include major projects (which would tend towards capital) so it’s likely that 70-80% of that spend was “keep the lights on”, i.e. servers/hosting, operational support, maintenance changes and refreshes.

That number may be wrong today given 10 years of tighter budget management and significant reductions in the staff counts for many large departments. It might be that the £6.3bn in 2014/15 published in an April 2016 report is now more accurate (total government spend that year was c£745bn). A 2011 report suggests £7.5bn. Much depends on the definition of central government (is the NHS in? MoD? Agencies such as RPA, Natural England etc?) and what’s included in the spend total (steady state versus project, pure IT versus consultancy on IT transformation and change projects).

Maybe our number was wrong, maybe the cost has fallen as departments have shrunk. Or maybe it’s hard to get to the right number.

IT is both “a lot” of money and “not much” – public pensions will be some £160bn this year, health and social care roughly the same, Defence as much as £50bn and Social Security perhaps £125bn.

But how much is the right number? It’s useful to know how much is being spent for at least a couple of reasons:

  1. Are we getting more efficient at spending, reducing the cost of keeping the lights on and “getting more, or at least the same, for less”?
  2. Are we pushing the planned 25% (or 33% for the 2020 target) of spend towards SMEs?

It would be more useful to know what the breakdown of spending was, e.g. how much are we spending on:

  • Hosting?
  • Legacy system support?
  • Infrastructure refreshes?
  • Application maintenance?
  • Application development?
  • And so on

Knowing those figures, department by department, would let us explore some more interesting topics:

  • How much are we spending overall and how does that number sit versus other expenses? And versus private sector companies?
  • How are we doing with the migration to cloud (and the cloud first policy) and how much is there left to do?
  • What are our legacy systems really costing us to host, support and enhance? And when we compare those hosting costs with cloud costs, is there a strong case for making the switch (sooner rather than later)?
  • What is the opportunity available if we close down some legacy systems and replace them with more modern ones (with the aim of reducing the costs of hosting, upgrades and refreshes, as well as the future cost of introducing new policy)?
  • If we don’t take action to replace some of our old systems, what kind of costs are we in for over the next 5 and 10 years, and does that help frame the debate about the best way ahead?

Lindsay Smith, aka @Insadly, produces some detailed and useful insight on G-Cloud spend; for instance, based on data for April to July 2019, spend on cloud hosting appears to have fallen from £94m in the same quarter in 2018 to £78m this year (he notes that there are some data anomalies that may make this figure less useful – I’ve commented on the problems with the G-Cloud data before and agree with him that it can be unhelpful).

It’s possible that this is a sign of departments getting smart about their hosting – spinning down machines that are unused, using cloud capacity to deal with peaks and then reverting to a lower base capacity, consolidating environments and using better tools to manage different workloads. It could also be a reflection of seeking lower cost suppliers.

Or it could be a sign that there are fewer new projects starting that use the cloud from day one (because my overall sense is that the bulk of cloud projects are new builds, not migrations of existing systems), or that departments are struggling to manage cloud environments and so have experimented and pulled back. Alternatively, it could be that departments are Capex-rich and, because cloud hosting is an Opex spend, they’re actually buying servers again.

Some broad analysis that showed the trends in spending across departments would improve transparency, highlight areas that need attention, help suppliers figure out where to make product investments and help departmental CIOs figure out where their spending was different from their peers. On the journey away from legacy it would also show where the work needed to be done.

Venture Capital Project Management Model

Projects fail. Big projects, arguably, fail more often and with bigger consequences. The more projects in the hopper, the more failures you will see, in absolute if not percentage terms. Government, by its very nature, has thousands of projects underway at once. Back in the 2000s when I worked on mission critical projects with the good folks at Number 10 and (Sir) Peter Gershon, I think we had 120 or so in the first list we came up with. I would be surprised if the count is any smaller now.

Venture capital (VC) investments fail. Perhaps big investments fail more often. The important differentiator is that VC investments are usually made little and often – projects receive small amounts early on (usually from angel investors, who precede VCs) and then more money is gradually invested, at higher valuations, as the company/idea grows and reaches various proof points. The last few years have seen this model strained, as huge investments can be put into late-stage companies (think WeWork … or maybe not).

This is quite different from how most projects are run. Projects go through lengthy due diligence phases up front, sometimes lasting a year or more (longer still when the project concerns physical infrastructure – railways, bridges, nuclear power plants etc). The output of that DD is a business case – and then the go button is pressed. Procurement is carried out, suppliers are selected, contracts are signed and government is on the hook for 5, 10 or even more years.

Agile projects can be different in that contracts are shorter, but the business case generally supposes success (hence “optimism bias” as a key metric – if it goes wrong, then it just needs more money). But they can still carry a momentum with them, which means that they carry on long after failure has become inevitable.

VC companies are more ruthless. They know, after years of measurement across the industry, that only one or two of their investments in any given period will account for the vast bulk of their returns. They call this “the hit rate” (Fred Wilson writes brilliantly about this, and many other things). Poor performers are culled early on – they don’t get additional funding. Sometimes the investment is in the team, and they are able to change their business idea (that is, “pivot”) and get to continue, but often the company is shuttered and the team scatter and move on to new ideas.

This brings a tendency to look for huge winners (or the potential for them) – the VC knows that they need to win big, so they look for ideas and teams who will produce those big returns. If they strike out, well, perhaps 8 out of 10 were going to break even or lose money anyway.

  • Data from Correlation Ventures suggests about half of investments made by VCs fail, and about 4% generate a return of 10x or greater

Is there, then, a case for treating government projects the same way? Perhaps we could back multiple, competing projects in the same space, and fund the ones that were proving the most successful? We would have to change the contracting model and include break clauses (not termination clauses as that, in the current vernacular, implies failure – we know that some projects are going to fail, we just don’t know which ones).

Sure, we would “waste” some money doing it this way. But we already do – we think we are wasting it right near the end, when the project has consumed all of its budget and has nowhere left to go, but, in reality, we’ve been pouring money into something that wasn’t going to succeed for months or years beforehand.

We could also copy the way some VCs back teams – that is, find teams who have successfully delivered and work to keep them together, moving them on to the next idea, because it may just be that success breeds success. Teams who have proven capable at £10m of project spend should get to play with £50m, then £100m and £200m. We could rotate new team members in to give them exposure to what success looks like, before splitting successful teams and giving them more to run.

In a VC-style mode (the staging logic is sketched in code after this list):

  • Projects would receive initial funding based on their outline thinking – enough to get them through discovery
  • Senior leaders from unrelated projects would be appointed to the board of the project to help navigate early issues and think around corners
  • Additional money would be released stage by stage, with the size of the investment increasing as the project reached predefined proof points
  • Pivots – changes in approach – would be embraced as providing recognition that there was a different way to achieve the same, or a related, outcome, even if there was some loss of investment
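
Here is a minimal sketch of that staging logic. The project names, tranche sizes and proof-point outcomes are all invented; the only point being illustrated is that funding is released in growing tranches as proof points are met, and that a project which misses one stops drawing money rather than carrying on regardless.

```python
# Illustrative sketch of VC-style staged funding for a portfolio of projects.
# Names, tranche sizes (in £m) and "proof point" results are all invented.

from dataclasses import dataclass

@dataclass
class Project:
    name: str
    proof_points_met: list  # one bool per completed stage, in order
    funded: float = 0.0
    active: bool = True

# Tranches grow as confidence grows: discovery, alpha, beta, scale.
TRANCHES = [0.5, 2.0, 10.0, 50.0]

def run_portfolio(projects):
    for stage, tranche in enumerate(TRANCHES):
        for p in projects:
            if not p.active:
                continue
            p.funded += tranche  # money released for this stage
            met = stage < len(p.proof_points_met) and p.proof_points_met[stage]
            if not met:
                p.active = False  # missed the proof point: no further funding
    return projects

portfolio = [
    Project("borders-replacement", [True, True, True, True]),
    Project("legacy-ledger-rebuild", [True, False]),    # culled after alpha
    Project("notifications-v2", [True, True, False]),   # culled after beta
]

for p in run_portfolio(portfolio):
    status = "still funded" if p.active else "stopped"
    print(f"{p.name:24s} £{p.funded:5.1f}m  {status}")
```

In this toy portfolio, the two culled projects absorb £2.5m and £12.5m respectively before they are stopped, rather than the £62.5m each would have consumed had it run to the end – which is the whole argument for paying in stages.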

There is an obvious challenge here. Moving away from IT, let’s say we were trying to build a bridge. It’s hard to fund that in stages (once you’re past feasibility and construction planning). It’s even harder to pivot – once you’re halfway across the chasm, you can’t change from suspension to cantilever, or from bridge to tunnel. Suppliers and partners to government like to know how big the funding envelope is and how long the project will last so that they can plan resources, notify the markets, invest in new capabilities etc. Departments like to do the same – they have team costs to cover, after all. This will require some negotiation across government and industry, some changes to procurement thinking, and the establishment of a portfolio approach in which the funding envelope sits at the portfolio level, so that funds, people, suppliers and scope can move between projects within the portfolio.

We shouldn’t be afraid of losing money because, just as in VC portfolios, not every plan is going to succeed; but we should be afraid to keep losing money if the plan isn’t working.

The current model, even with all the agile changes in the last decade, isn’t working as well as it could. There’s a reason that VC companies manage a portfolio – they know that they have to spread their capital quite wisely. Our project management approach feels more like passive investment in an index, rather than active management of a portfolio. We need to make some changes.

10 Year Iterations

Opportunities for the transformation of government come along roughly once every 10 years, it seems. There have been two significant efforts to drive real transformation in UK government in the last two decades:

  • The early 2000s saw several waves of major IT change initiatives, including NHS IT, border technology, ID cards etc. Electronic government was also targeted, building on the previous administration’s “Government Direct” paper. The original 1997 pronouncement said:
    • “by 2002, 25% of dealings with Government should be capable of being done by the public electronically, that 50% of dealings should be capable of electronic delivery by 2005 and 100% by 2008”
  • 2010 brought the creation of GDS, a big focus on agile and digital exemplars, along with spend controls. Key pronouncements there included:
    • “Government ICT is vital for the delivery of efficient, cost-effective public services which are responsive to the needs of citizens and businesses.”
    • “Digitising transactional services will save people and businesses time and money; by making transactions faster, reducing the number of failed transactions and simplifying the end-to-end process.”

Now, it could be that big waves of change come with changes of Administration – the Blair government in 1997 and the Coalition in 2010.

Or it could be that large organisations, such as the civil service, can only handle significant change on a decade long cycle – it takes time to define goals, communicate, shed previous work, mobilise new teams, align funding and get into delivery.

At the beginning of the change cycle, there is enthusiasm and excitement, occasionally more heat than light perhaps, but nonetheless, a desire to get things done. As the decade draws to a close, everyone is tired, change is harder and less and less is done. You can perhaps see some of that tiredness in Gerry Gavigan’s “short history of government digital strategies” published in 2012.

As this decade draws to a close and the new one begins, we might be ready for a new approach.

One that brings a clear strategy and plan, but that also recognises the need to work out how to keep momentum going over a 10 year cycle, avoiding the big bang start and the drawn-out slowdown.

One that mobilises teams across government and its industry and third sector partners, and that allows for inevitable fatigue, rotation of staff and the funding cycle.

One that sets long term goals, but also makes clear the steps along the way that will be required to meet those goals and that, in so doing, will demonstrate the progress that is being made on a monthly, quarterly and annual cycle.