The Legacy Replacement Caveat

Yesterday I wrote about the difficulty of replacing existing systems, the challenges of meshing waterfall and agile (with reference to a currently running project) and proposed some options that could help move work forward. There is, though, one big caveat.

Some legacy systems are purely “of the moment” – they process a transaction and then, other than for reporting or audit purposes, forget about it and move on to the next transaction.

But some, perhaps the majority, need to keep hold of that transaction and carry out actions far into the future. For instance:

– A student loan survives over 30 years (unless paid back early). The system needs to know the policy conditions under which that loan was made (interest rate, repayment terms, amount paid back to date, balance outstanding etc)

– Payments made to a farmer under Environmental Stewardship rules can extend up to a decade – the system retains what work has been agreed, how much will be paid (and when) and what the inspection regime looks like

In the latter case, the system that handles these payments (originally for Defra, then for Natural England and now, I believe, for the RPA) is called Genesis. It had a troubled existence but as of 2008 was working very well. The rules for the schemes that the system supports are set every 7 years by the EU; they are complicated and whilst there is early sight of the kind of changes that will be made, the final rules, and the precise implementation of them, only become clear close to the launch date.

Some years ago, in the run up to the next 7 year review, GDS took on the task, working with the RPA, of replacing Genesis by bundling it with the other (far larger in aggregate, but simpler in rules and shorter in duration) payments made by the RPA. As a result, Defra took the costs of running Genesis out of its budget from the new launch date (again, set by the EU and planned years in advance). Those with a long memory will remember how the launch of the RPA schemes, in the mid-2000s, went horribly wrong with many delays and a large fine levied by the EU on the UK.

The trouble was, the plan was to provide for the new rules. Not the old ones. An agreement could be made with a farmer a week before the new rules were in place, and that agreement would survive for 10 years – and so the new system would have to inherit the old agreements and keep paying. Well, new agreements could have been stopped ahead of a transition to the new system, you might say. And, sure, that’s right – but an agreement made a year before would still have 9 years to go; one made 2 years before would have 8 years to go. On being told about this, GDS stripped out the Genesis functionality from the scope of the new system, and so Genesis continues to run, processing new agreements, also with 10-year lives … and one day it will have to be replaced, by which time it will be knocking on 20 years old.

Those with good memories will also know that the new system also had its troubles, with many of the vaunted improvements not working, payments delayed, and manual processes put in place to compensate. And, of course, Defra is carrying the running costs of the old system as well as the new one, and not getting quite the anticipated benefits.

IT is hard. Always has been. It’s just that the stakes are often higher now.

When replacing legacy systems where the transactions have a long life, sometimes there is a pure data migration (as there might be, say, for people arriving in the UK where what’s important is the data describing the route that they took, their personal details and any observations – all of which could be moved from an old system to a new system and read by that new system, even if it collected additional data or carried out different processing from the old system). But sometimes, as described above, there’s a need for the new system to inherit historic transactions – not just the data, but the rules and the process(es) by which those transactions are administered.

My sense is that this is one of the two main reasons why legacy systems survive (the other, by the by, is the tangled, even Gordian, knot of data exchanges, interfaces and connections to other systems).

There are still options, but none are easy:

– Can the new system be made flexible enough to handle old rules and new rules, without compromising the benefits that would accrue from having a completely new system and new processes? (There’s a rough sketch of what this might look like just after this list.)

– Can the transactions be migrated and adjusted to reflect the new rules, without breaching legal (or other) obligations?

– Can the old system be maintained, and held static, with the portfolio of transactions it contains run down, perhaps with an acceleration from making new agreements with individuals or businesses under the new rules? This might involve “buying” people out of the old contracts, a little like those who choose to swap their defined benefit pension for a defined contribution deal, in return for a lump sum.

– Can a new version of the old system be created, in a modern way, that will allow it to run much more cheaply, perhaps on modern infrastructure, but also with modern code? This could help shave costs from the original system and keep it alive long enough for a safe transition to happen.
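To make that first option a little more concrete, here is a minimal sketch (in Python, with made-up scheme labels, amounts and calculation logic – it does not describe how any real RPA or Defra system works) of what “handling old rules and new rules in one system” might look like: each agreement records the rule set it was made under, and payments are dispatched to whatever engine is registered for that version, rather than assuming the current rules apply to every agreement on the books.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict

@dataclass
class Agreement:
    agreement_id: str
    start: date
    end: date               # stewardship-style agreements can run for a decade
    rules_version: str      # e.g. "ES-2007" or "CS-2015" (hypothetical labels)
    annual_amount: float

def old_scheme_payment(a: Agreement, year: int) -> float:
    # Old agreements are honoured on their original terms until they expire.
    return a.annual_amount if a.start.year <= year <= a.end.year else 0.0

def new_scheme_payment(a: Agreement, year: int) -> float:
    # New agreements might be calculated differently; the detail doesn't matter
    # here, only that a different rule set applies to them.
    return a.annual_amount if a.start.year <= year <= a.end.year else 0.0

# One engine per rule set; a new scheme means a new entry, not a rewrite.
RULE_ENGINES: Dict[str, Callable[[Agreement, int], float]] = {
    "ES-2007": old_scheme_payment,
    "CS-2015": new_scheme_payment,
}

def payment_due(agreement: Agreement, year: int) -> float:
    try:
        engine = RULE_ENGINES[agreement.rules_version]
    except KeyError:
        raise ValueError(f"No rules registered for {agreement.rules_version}")
    return engine(agreement, year)

# An agreement signed a week before the rules changed still runs for ten years
# under the old rules, alongside agreements made under the new ones.
legacy = Agreement("A-001", date(2014, 12, 20), date(2024, 12, 19), "ES-2007", 5_000.0)
print(payment_due(legacy, 2020))   # 5000.0
```

The point is not the code but the design decision it embodies: the old rules become data that the new system carries alongside the transaction, which is precisely the extra burden that a “new rules only” plan quietly leaves out.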

Some of these will work; some won’t. The important thing is to go in with your eyes open about what you are trying to replace, and to recognise that when you reach from the front end into the back end, things get much harder – forget that at your peril. Put another way, Discovery is not just about how it should be, but about how it was and how it is … so you can be sure you’re not missing anything.

Agile Meets Waterfall

This morning I’ve been looking at the release schedule for a major project launching next year. Chances are that you will be affected by it somehow. No names disclosed though. It’s a replacement for an existing capability and will use an off the shelf, quasi-cloud product (that is, software that is already in use by others and so can, in theory, be configured but that, in practice, will need some development to make it do what is really needed).

The top level project plan looks like this:

Spring 2019 – Discovery
Summer 2019 – Design and Build
Spring 2020 – Testing
Summer 2020 – Go-live
Spring 2021 – Transition completed
Thereafter – Continuous improvement

Plainly the team want to be agile. They want to iterate, adding features and capabilities and seeing how they land before improving them – going round the discovery / alpha / beta / live process repeatedly. But, at the same time, they have a major existing system to replace and they can’t replace just part of that.

They also, I suspect, know that it’s going to be harder than everyone expects, which is why they’re sticking to those nebulous seasonal timeframes rather than talking about particular months, let alone actual dates. The rhythmic cadence of Spring/Summer milestones perhaps suggests an entirely made up plan.

Ever since I first joined the Inland Revenue and heard a team saying that they “hoped to be up to speed by the Autumn” I’ve laughed whenever seasons are mentioned in the context of delivery. It’s as if there are special watches issued with 4 zones marked on them – one for each season.

What do you do when you have to mesh the need to replace an existing, widely used system – one that is no longer doing what it needs to do and that needs radical upgrades – with a new capability that can’t be introduced a bit at a time? This is not a start-up launching a new product into the market. They’re not Monzo, which was able to start with a pre-paid debit card before moving into current accounts, credit cards etc.

These kinds of projects present the real challenge in government, and in any large organisation:

How do you replace your ageing systems with new ones whilst not losing (too much) capability, keeping everyone happy and not making an enormous mistake?

At the same time, how do you build a big system with lots of capability without it going massively over-budget and falling months or years behind plan?

There are some options to consider:

  • Is a NewCo viable? Can a new capability be set up in parallel with the old, and customers, users or businesses migrated to that new capability, recognising that it doesn’t do everything? This is ideal MVP territory – how much do we need to have to satisfy a given group of customers? Ideally that group has less complicated needs than the full set of customers, and we can shore up the NewCo with some manual or semi-automated processes, or even by reaching into the old system
  • Can we connect old and new together and gradually take components away from the old system, building them in the new world? (There’s a rough sketch of this just after the list.) This might particularly work where a new front end can be built that presents data far more clearly and adds in data from other sources. It might require some re-engineering of the old system, which perhaps isn’t used to presenting data via APIs and has never been described as loosely coupled.
  • Is there standalone capability that will make for a better experience, perhaps using data from the old system (which can be extracted in lots of different ways from cumbersome through to smooth), with that new capability gradually expanded?
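The second option is essentially a “strangler” approach: a thin facade sits in front of both systems and routes each capability to whichever side currently owns it, so the old system shrinks component by component instead of being switched off in one go. Here is a minimal sketch in Python, with invented component names and clients – nothing here describes a real system:

```python
# Capabilities that have been rebuilt in the new world so far (hypothetical names).
MIGRATED = {"view_statement", "update_address"}

class LegacyClient:
    def handle(self, capability: str, payload: dict) -> dict:
        # In reality this might be a screen-scrape, a batch file or a SOAP call
        # into a system that was never designed to expose tidy APIs.
        return {"handled_by": "legacy", "capability": capability, **payload}

class NewServiceClient:
    def handle(self, capability: str, payload: dict) -> dict:
        # The rebuilt component, e.g. a clean JSON API behind the new front end.
        return {"handled_by": "new", "capability": capability, **payload}

class Facade:
    """Routes each request to the system that currently owns the capability."""
    def __init__(self) -> None:
        self.legacy = LegacyClient()
        self.new = NewServiceClient()

    def handle(self, capability: str, payload: dict) -> dict:
        target = self.new if capability in MIGRATED else self.legacy
        return target.handle(capability, payload)

facade = Facade()
print(facade.handle("view_statement", {"customer": "12345"}))   # handled_by: new
print(facade.handle("calculate_award", {"customer": "12345"}))  # handled_by: legacy
```

As each component is rebuilt it is added to the routing table and the old code path quietly stops being used; the hard part, as ever, is the re-engineering needed on the legacy side to expose each capability cleanly enough to route to.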

None are easy, of course, but they are all likely better than a big bang or the risk of a lengthy project that just gets longer and more expensive – especially as scope increases with time (people who have time to think about what they want will think of more things, and industry will evolve around them giving them more things to think about).

There are, importantly, two fundamental questions underpinning all of these options (as well as any others you might come up with):

1. What do we do today that we don’t need to do tomorrow? Government systems are full of approaches to edge cases. They are sedimentary in nature – built over the years (and decades) with layer after layer of policy change. Today’s system, and all of the capability it has, is not the same as what is needed. This allows a jackhammer to be taken to the sedimentary layers so that the new capability isn’t trying to replicate everything that went before.

2. What do we need to do tomorrow that we don’t do today? This gives the policy arm the chance to shape new capabilities, including simpler policies that could have fewer edge cases, but still cater for everyone. It will also allow new thinking about what the right way to do something is, considering everything that has gone before but also what everyone else is doing (in other industries, other government departments and other countries).

Asking those questions, and working the answers really hard, will help ensure that the solution reflects the true, current need, and will also help get everyone off the page of “we need to replace what we have” when, in reality, that’s the last thing that anyone should be trying to do.

There is, though, one giant caveat to all of this which I will take a look at tomorrow.

The Race To 5G

Jonathan Margolis, the FT’s erudite technology editor, claimed in the How to Spend It magazine this weekend that “one of the surprises about the iPhone 11 … was that it does not have 5G” and then went on to laud the recently launched Samsung Galaxy S10 5G.

I found that an odd view. Indeed I’m surprised that he thought that it was a surprise. Apple has rarely added network features until they were widely available. That’s both because there needs to be a viable market for a service (if I turn on my expensive 5G phone and don’t see, immediately, a 5G symbol in the top left or right, I will surely feel let down and perhaps worse) and also because Apple will want to be sure components that provide the service are robust, reliable and ready for delivery to as many as 200m phone buyers a year.

Every few years we go through the xG hype cycle. I was closely involved in the 4G rollout some years ago and, despite best efforts, reality was far behind PR for many of the network operators (arguably only EE got delivery done in line with their PR and that was because of some clever reuse of spectrum).

There are plenty of 5G challenges ahead, starting with a lack of agreed standards. This, in turn, means risk in buying components which may not be entirely compatible with the final versions. It also means that components are not yet optimised – they are much larger than their 4G equivalents, run much hotter, and take up valuable space that is needed for batteries (and battery life, given those same components, will be shorter). And, besides, we know that heat and batteries don’t mix.

There are other challenges ahead for 5G, particularly for operators. Network equipment manufacturing is in the hands of only a few suppliers, installation is a lengthy and cumbersome process involving many 3rd parties and a dozen visits to each mast, the spectrum available is at a far greater mix of frequencies than previous rollouts (which means modelling and planning are more complicated and that there will be a need for infill masts, especially in highly populated areas) and so on.

The truth is that 5G is still some years away, at a general population level. There will be exceptions – specific use cases – of course. But for most of us, it is 2-3 years away.

So, no, it’s no surprise that Apple hasn’t shipped 5G compatible iPhones. It’s also no surprise that other companies have – some like to be first, some like to claim “me as well”, some want to trial new things to differentiate otherwise lacklustre offerings, and some want to get a sense of how the technology works in the field. This is what makes a market. The early adopters adopt. Some wait a while. Some wait much longer.

Cornish Lithium

A while ago I looked at the ingredients the UK has to support a successful EV design and build industry. I noted that whilst we were not Australia or Chile, the largest sources of lithium for batteries, we did have some in Cornwall.

I read recently that Cornish Lithium, the aptly named company with mining rights over much of the county, has raised £1.4m so that it can begin drilling at test sites. It doesn’t sound like a lot of money, but doubtless if they find enough, the next funding round will be bigger and will let them explore more. This comes on top of previous funding rounds and also some government grants.

Lithium production may be profitable, depending on prices at the time. Battery making, so far, isn’t, especially since China (the largest current market for EVs) cut subsidies. Like many industries, there will be consolidation and it will become a scale game. Local production, done ethically, could be a valuable addition for the right brand though. It will take more than lithium, of course – there’s still cobalt and various other rare earth elements needed.

I wish Cornish Lithium good luck, first in finding enough and then in getting sufficient funding to extract it safely and ethically. We don’t want this to turn into Sirius Minerals part 2.

Diagnostic Parsimony. Or Not.

Occam’s razor says, roughly anyway, that the simplest solution is likely the correct one. Hickam’s dictum says, roughly, and particularly in the medical world, don’t simplify too quickly as there may be multiple causes of your problem.

We have a tendency to be overwhelmed by the latter – so many things to look at, understand and do – and so reach quickly for the former: be agile. Go to cloud. Adopt product X. Do a reorganisation.

The real trick is to apply both, teasing out what the main contributors are, and then applying the solution that makes most sense and that will give the best return for effort expended.

The big problem comes when Occam’s razor and the oxymoronic phrase “quick wins” are applied together. There are few simple solutions, they’re not quick and there isn’t much to win. If there were, they’d already have been done.

Ethics, Old Software and Negative Interest Rates

From today’s FT Letters:

In a world where interest rates had, for centuries, been positive, it’s not hard to see why a programmer would put some validation into code to check for a positive number. Even now when I read about “fat finger” errors where a trader mistakenly buys or sells a number of shares with several more zeros than expected, I wonder why there isn’t more validation (or some secondary control that routes unusual transactions to a second person for checking). BombMoscow might, of course, have needed several levels of such controls, whether as a parameter or hard-coded.
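Here is a rough sketch of both controls mentioned above – validation that no longer assumes rates must be positive, just plausibly bounded, and a “four eyes” check that routes unusually large orders to a second person rather than letting a fat-fingered trade straight through. The thresholds and function names are invented purely for illustration:

```python
# Hypothetical thresholds – any real system would set these by policy.
MIN_RATE, MAX_RATE = -0.05, 0.25           # accept modestly negative rates
SECOND_PAIR_OF_EYES_ABOVE = 1_000_000      # shares; bigger orders need sign-off

def validate_rate(rate: float) -> None:
    # The old check might simply have been `rate > 0`; bounding it instead
    # guards against nonsense without hard-coding a positive-rate world.
    if not (MIN_RATE <= rate <= MAX_RATE):
        raise ValueError(f"Rate {rate:.4f} is outside the plausible range")

def route_order(quantity: int) -> str:
    if quantity <= 0:
        raise ValueError("Quantity must be positive")
    if quantity > SECOND_PAIR_OF_EYES_ABOVE:
        return "held for secondary approval"   # a second person must confirm it
    return "submitted"

validate_rate(-0.004)                  # a negative rate now passes validation
print(route_order(5_000))              # submitted
print(route_order(50_000_000))         # held for secondary approval
```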

September Summary

I was away for the first few days of September so posted some pictures of what I’ve come to call Deergital Transformation, including this one:

Male fallow deer, aka bucks, a few days after losing their antlers

For much of the rest of the month I looked at the struggle to deliver projects, particularly ones that we sometimes mislabel as transformational, and how we might think about those in different ways:

  • We tend to approach projects as if they are always going to be successful. We go all in, often on giant projects. And yet real world experience says otherwise – in films, for instance, only 2% of those made get to the cinema and only a third of those are profitable.
  • Similarly, Venture Capital companies know that they are going to kiss a lot of frogs before they find their prince or princess. They back new companies in rounds – seed, series A, series B etc – putting in more money as the principles are proven and the company moves from concept to demo to beta to live and to scale. Bad bets are starved of funds, or “pivoted” where the team is backed to do something different.
  • We, all of us, are quick to suggest numbers – a project will cost £100m, or it will take 48 months, or it will save £1bn – but we are rarely open about the assumptions and, yes, the pure and simple Wild Assed Guesses. In short, all numbers are made up; treat them with caution unless the rationale is published.
  • We all like to set targets, but we don’t always think about the things that have to be done to achieve that goal. By 2040 “we will climb Everest” is fine as an aim, but the extraordinary preparatory work to achieve it needs to be laid out, to avoid the “hockey stick” problem where you get close to the date when you expected to realise the aim, only to find there’s not enough time left. As a regular half and full marathon runner, I know that if I haven’t put the time in before the race, it’s going to hurt and I’m going to let myself down.
  • Replacing legacy systems is hard. The typical transformational project – where we take what we have had for the last 20+ years, replace it with something new and add lots more functionality (to catch up with all of the things that we haven’t been able to do for the last couple of decades) – is fraught with risk and rarely pays off. The typical agile model of MVP and rapid iteration doesn’t always align with the policy aspiration, or with what the users want, because, on day one, they get less than they have today. New models are needed – though, really, they’re old models.

October has started on much the same path, though let’s hope that the real storms seen at the end of last month have gone and that the only October storms are of the digital kind.

Storm over Devon, September 28th

Withered / Weathered Technology

Whilst Shigeru Miyamoto, the public face of Nintendo, is rightly regarded as the leading light of the video game industry, there is another, unsung, hero, also of Nintendo: Gunpei Yokoi.

He pioneered what we loosely translate as “lateral thinking with withered (or possibly weathered) technology” – taking electronic components (be they chips, LCD screens or whatever) that were no longer leading edge and were, in fact, far from that position, and using them to create affordable, mass produced gadgets.

Gunpei Yokoi was behind some extraordinary hits including Game and Watch and then the Game Boy (an 8 bit, black and white, low resolution handheld gaming console released at a time when every other company, including Atari and Sega, was already moving to colour, high resolution displays – if you know the story, you know that the Game Boy dominated the market for years; total unit sales of some 120m and 26 million copies of Tetris, alone).

Arguably that very thinking is behind more recent products – perhaps Nintendo’s Wii and Apple’s iPod shuffle.

In the modern rush to harness new technology capability – be it blockchain, machine learning and artificial/augmented intelligence, new databases, new coding languages, new techniques, voice recognition etc – we sometimes forget that there are proven technologies and capabilities that work well, are widely understood and that could be delivered at lower risk.

Real delivery in government requires large scale systems that are highly reliable – you’re front page news if it goes down after all – and that do what’s needed.

Is there, then, a case for putting the new and shiny to one side – experimenting with it, of course, to assess its potential, but not relying on it to be at the core of your new capability until it’s ready?

The core systems at the heart of government are definitely both withered and weathered; they’ve been there for some decades. They need to be replaced, but what should they be replaced with?

Technology at the very leading edge, where skills are in short supply and risks are high, or something further back from the bleeding edge, where there is a large pool of capability, substantial understanding of performance and security, and many existing implementations to compare notes with?

Dealing With Legacy Systems

Legacy systems – that is, systems that work (and have worked for a couple of decades or longer in many cases) – both do the lion’s share of the transactional work in government and hold back the realisation of many policy aspirations.

Our legacy systems are so entwined in our overall architecture, with dozens (even hundreds) of interfaces and connections, and complicated code bases that few understand, that changes are carefully handled and shepherded through a rigorous process whenever work needs to be done. We’ve seen what goes wrong when this isn’t handled with the utmost care – the problems at TSB, or at NatWest, RBS and Tesco Bank, for instance.

The big problem we are facing looks like this:

Our policy teams, and indeed our IT teams, have much bigger aspirations for what could be achieved than the current capability of systems.

We want to replace those systems, but trying to deliver everything that we can do today, as well as even more capability, is a high risk, big bang strategy. We’ve seen what goes wrong when we try to do everything in a single enormous project, whether that be the Emergency Services Network, the e-Borders programme or Universal Credit.

But we also know that the agile, iterative approach results in us getting very much less than we have today, with the promise that we will get more over future releases – though the delivery timetable could stretch out, with some uncertainty, for quite some time.

The agile approach is an easy sell if you aren’t replacing anything that exists today. Monzo, the challenger bank, for instance, launched with a pre-paid debit card and then worked to add current accounts and other products. It didn’t try and open a full bank on day one – current accounts, debit cards, credit cards, loans, mortgages etc would have taken years to deliver, absorbed a fortune and delayed any chance to test the product in the market.

It’s not, then, either/or but somehow both/and. How do we deliver more policy capability whilst replacing some of what we have and do so at a risk that is low enough (or manageable enough) to make for a good chance of success?

Here’s a slide that I put up at a conference in 2000 that looked at some ways that we might achieve that. I think there’s something here still – where we build (at the left hand end), some thin horizontal layers that hide the complexity of government … and on the right hand side we build some narrow, top to bottom capabilities and gradually build those out.

It’s certainly not easy. But it’s a way ahead.

Why Do Change Projects Fail?

One aphorism has it that culture eats strategy for breakfast. If you don’t get the people across your organisation on side, your change project will fail, no matter how much you throw at it.

When I’ve reviewed major change projects during implementation, it’s typically hard to find anyone who isn’t on board and committed to making it happen.

When I’ve reviewed other projects, after the fact, when the dust is still settling but failure is clear, it’s often hard to find anyone who didn’t think it was doomed to failure.

Why don’t those same people speak up before the project is finalised, or while it’s underway?

This could be because of something called “preference falsification”, a reluctance to contradict your peer group and call out the problem. If everyone around you is on board, then being the odd one out is difficult. Of course, if everyone around you thinks the same as you, and is reluctant to call it out, then you all end up agreeing that it will fail but unwilling to say so.

If one person is honest, the dam can break. This, really, is the theory behind The Emperor’s New Clothes.

Establishing an environment where ideas are exchanged freely, and where criticism from inside and outside is reflected on and addressed directly (whether that results in a change in approach or not), is difficult. But it is essential. Not everything works. Not every idea is right. The experience around you in any organisation often knows what will and won’t work and, properly harnessed, it is the difference between success and failure.