Getting A Grip: Special Report on the AWS Special Report

Late last week a seemingly comprehensive takedown of Amazon, titled “Amazon’s extraordinary grip on British data”, appeared in the Telegraph, written by Harry de Quetteville.

Read quickly, it suggests that Amazon, by means fair and foul, has secured too great a share of the UK Government’s cloud business, and that this poses an increasingly systemic risk to digital services and, inevitably, to consumer data.

Read more slowly, the article brings together some old allegations and some truths and joins them together so as to get to the point where I ask “ok, so what do you want to do about it?”, but it doesn’t suggest any particular action. That’s not to say that there’s no need for action, just that this isn’t the place to find the argument.

The main points of the Telegraph’s case are seemingly based on “figures leaked” (as far as I know, all of this data is public) to the newspaper:

  • Amazon doesn’t pay tax (figures from 2018 are quoted showing it paid £10m of tax on £1.9bn of revenues, using offshore (Luxembourg) vehicles). For comparison, the article says, AWS apparently sold £15m of cloud services to HMRC.
  • There is a “revolving door” where senior civil servants move to work for Amazon “within months of overseeing government cloud contracts.” Three people are referenced: Liam Maxwell (former Government deputy CIO and CTO), Norman Driskell (Home Office CDO) and Alex Holmes (DD Cyber at DCMS).
  • Amazon lowballs prices which then spiral … and “even become a bar to medical research.” This is backed up by a beautifully done Amazon-smile graphic showing that DCLG signed a contract in 2017 estimated at £959,593 that turned out to cost £2,611,563 (an uplift of 172% – see the quick check after this list)
  • There is a government bias towards AWS giving it “an unfair competitive advantage that has deprived British companies of contracts and cost job[s]”.
  • A neat infographic says “1/3 of government information is stored on AWS” (including sensitive biometric details and tax records) and that 80% of cloud contracts are “won by large firms like AWS”
  • Amazon’s “leading position with … departments like the Home Office, DWP, Cabinet Office, NHS Digital and the NCA” is also entrenched.
  • Figures obtained by the Sunday Telegraph suggest that AWS has captured more than a third of the UK public sector market with revenues of more than £100m in the last financial year.
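
A quick check of that DCLG figure, using nothing beyond the numbers quoted above:

```python
# Quick check of the DCLG figures quoted in the article.
original_estimate = 959_593
final_cost = 2_611_563
uplift = final_cost / original_estimate - 1
print(f"{uplift:.0%}")  # ~172%, matching the article's figure
```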

Let’s start by setting out the wider context of the cloud market:

  • AWS is a fast growing business, roughly 13% of Amazon’s total sales (as of fiscal Q1 2019). Just 15 years old, it has quickly come to represent the bulk of Amazon’s profits (and is sometimes the only part of Amazon that is in profit – though Amazon would say that they choose not to make the retail business profitable, preferring to reinvest).
  • Microsoft’s Azure is regularly referred to as a smaller, but faster growing, business than AWS. Google is smaller still. It’s hard to be sure though – getting like for like comparisons is difficult. AWS’ revenues in Q2 2019 were $7.7bn, and Microsoft’s cloud (which includes Office 365 and other products) had $9.6bn in revenues. AWS’ growth rate was 41%, Azure’s was 73% – both rates are down year on year. Google’s cloud (known as GCP) revenue isn’t broken out separately but is included in a line that also covers G-Suite, Google Play and Nest, totalling $5.45bn, up 25%.
  • Amazon, as first mover, has built quite the lead, with various figures published, including those in the Telegraph article, suggesting it has as much as 50% of the nascent cloud market. Other sources quote Azure at between 22% and 30% and Google at less than 10%.

There’s an almost “by the by” figure quoted that I can’t source, where Lloyd’s of London apparently said that “even a temporary shutdown at a major cloud provider like AWS could wreak almost $20bn in business losses.” The Lloyd’s report I downloaded says:

  • A cyber incident that took a “top three cloud provider” offline in the US for 3-6 days would cost between $6.9bn and $14.7bn (much of which is uninsured, with insured losses running $1.5-2.8bn)

What’s clear from all of the figures is that the cloud market is expanding quickly, that Amazon has seized a large share of that market but is under pressure from growing rivals, and that there is an increasing concentration of workloads deployed to the cloud.

It’s also true that governments generally, but particularly UK government, are a long way from a wholesale move to the cloud, with few front-line, transactional services deployed. Most of those services are still stuck in traditional data centres, anchored by legacy systems that are slow to change and that will resist, for years to come, a move to a cloud environment. Instead, work will likely be sliced away from them, a little at a time, as new applications are built and the various transformation projects see at least some success.

The Crux

When the move to cloud started, government was still clinging to the idea that its data somehow needed protection beyond that used by banks, supermarkets and retailers. There was a vast industry propping up the IL3 / Restricted classification (where perhaps 75-80% of government data sat, mostly emails asking “what’s for lunch?”). This classification made cloud practically impossible – IL3 data could not sit on the same servers or storage as lower (or higher) classified data, it needed to be in the UK and secured in data centres that Tom Cruise and the rest of the Mission Impossible team couldn’t get into. Let’s not even get into IL4. And, yes, I recognise that the use of IL3 and IL4 in regard to data isn’t quite right, but it was by far the most common way of referring to that data.

Then, in 2014, after some years of work, government made a relatively sudden, and dramatic, switch. 95% of data was “Official” and could be handled with commercial products and security. A small part was “Official Sensitive” which required additional handling controls, but no change in the technical environment.

And so the public cloud market became a viable option for government systems – all of them, not just websites and transactional front ends but potentially anything that government did (that didn’t fall into the 5% of things that are secret and above).

Government was relatively slow to recognise this – after all, there was a vast army of people who had been brought up to think about data in terms of the “restricted” classification, and such a seismic change would take time. There are still some departments that insist on a UK presence, but there are many who say “official is official” and anywhere in the UK is fine.

It was this, more than anything, that blew the doors off the G-Cloud market. You can see the rise in Lot 1/IaaS cloud spend from April 2014 onwards. That was not just broad awareness of cloud as an option, but the recognition that the old rules no longer applied.

The UK’s small and medium companies had built infrastructures based around the IL3 model. It was more expensive, took longer, and forced them through the formal accreditation model. Few made it through; only those with strong engineering standards and good process discipline and, perhaps, relatively deep pockets. But once “official” came along, much of that work was over the top, driving cost and overhead into the model and it wasn’t enough of a moat to keep the scale players out.

TL;DR

I’ve let contracts worth several hundred million pounds in total and worked with people who have done 5, 10 or 20x that amount. I’ve never met anyone in government who bought something because of a relationship with a former colleague or because of any bias for or against any supplier. Competition is fearsome. Big players can outspend small players. They can compete on price and features. Small players can still win. Small players can become big players. Skate where the puck is going, not where it was.

How does a government department choose a cloud provider?

Whilst the original aim of G-Cloud was to be able to type in a specification of what was wanted and have the system spit out some costs (along with iTunes-style reviews), the reality is that getting a quote is more complicated than that. The assumption, then, was perhaps that cloud services would be a true commodity, paying by the minute, hour or day for servers, storage and networks. That largely isn’t the case today.

There are three components to a typical evaluation:

1) How much will it cost?

2) What is the range of products that I can deploy and how easily can I make that happen? Is the supplier seen by independent bodies as a leader or a laggard?

3) Do I, or my existing partners, already have the skills needed to manage this environment?

Most customers will likely start with (3), move to (2) and then evaluate (1) for the suppliers that make it through.

Is there a bias here? With AWS having close to 50% market share of the entire cloud market, the market will be full of people with AWS skills, followed closely by those with Azure skills (given the predominance of Microsoft environments – Active Directory, email and so on – in government). Departments will look at their existing staff, or that of their suppliers, or who they can recruit, and pick their strategy based on the available talent.

Departments will also look at Gartner, or Forrester, and see who is in the lead. They will talk to a range of supplier partners and see who is using what. They will consult their peers and see who is doing what.

But there’s no bias against, or for, any given supplier. We can see that when we read about companies who have been hauled over the coals by one department and the very next week they get a new contract from a different department. Don’t read conspiracy into anything government ever does; it’s far more likely to be cockup.

Is there a revolving door?

People come into government from the outside world and people leave government to go to the outside world. In the mid-2000s there was a large influx of very senior Accenture people joining government; did Accenture benefit? If anything, they probably lost out as the newcomers were overcautious rather than overzealous.

Government departments don’t choose a provider because a former colleague or Cabinet Office power broker is employed by the supplier. As anywhere, relationships persist for a period – not as long as you would think – and so some suppliers are better able to inform potential customers of the range of their offer, but this is not a simple relationship. Some people are well liked, some are well respected and some are neither. There are 17,000 people in government IT. They all play a role. Some will stay, some will go. Some make decisions, some don’t.

Also, a bid informed by a former colleague could be better written than one uninformed. This advantage doesn’t last beyond a few weeks. I’ve worked on a lot of bids (both as buyer and seller) and I’m still amazed how many suppliers fail to answer the question, don’t address the scoring criteria, or waffle away beyond the word count. If you’ve been a buyer, you will likely be able to teach a supplier how to write a bid; but there are any number of people who can do that.

There is little in the way of inside information about what government is or isn’t doing or what its strategy will look like. Spend a couple of hours with an architect or bid manager in any Systems Integrator that has worked for several departments and you will know as much about government IT strategy as anyone on the inside.

Do costs escalate (and are suppliers lowballing)?

Once a contract is signed, and proved to be working, it would be unusual if more work was not put through that same contract.

What’s different about cloud is mostly a function of the shift from capex to opex. Servers largely sit there and rust. The cost is the cost. Maybe they’re 10% used for most of their lives, with occasional higher spikes. But the cost for them doesn’t change. Any fluctuations in power are wrapped into a giant overhead number that isn’t probed too closely.

Cloud environments consume cash all the time though. Spin up a server and forget to spin it down and it will cost you money. Fire up more capacity than you need, and it will cost you money. Set up a development environment for a project and, when the project start is delayed by governance questions, don’t spin it down, and it will cost you money. Plan for more capacity than you need and don’t dynamically adjust it, and it will cost you money. Need some more security? That’s extra. Different products? That’s more as well. If you don’t know what you need when you set out, it will certainly cost more than you expected when you’re done.
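
As a minimal sketch of the housekeeping this forces on you (not anything a department necessarily runs), here is one way to list long-running AWS instances so someone can ask whether they are still needed. It assumes boto3 is installed and AWS credentials are configured; the “Environment” tag is hypothetical.

```python
# List EC2 instances that have been running for more than 30 days - the
# "forgotten development environment" problem described above.
# Assumes boto3 and AWS credentials are configured; pagination omitted for brevity.
from datetime import datetime, timezone

import boto3

ec2 = boto3.client("ec2", region_name="eu-west-2")  # London region

resp = ec2.describe_instances(
    Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
)

now = datetime.now(timezone.utc)
for reservation in resp["Reservations"]:
    for instance in reservation["Instances"]:
        age_days = (now - instance["LaunchTime"]).days
        if age_days > 30:
            tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
            print(
                instance["InstanceId"],
                instance["InstanceType"],
                f"{age_days} days old",
                tags.get("Environment", "untagged"),  # hypothetical tag
            )
```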

Many departments will have woken up to this new cost model when they received their first bill and it was 3x or 5x what they expected. Cost disciplines will then have been imposed, probably unsuccessfully. Over time these will improve, but there are still going to be plenty of cases of sticker shock, both for new and existing cloud customers, I’m sure.

But if the service is working, more projects will be put through the same vehicle, sometimes with additional procurement checks, sometimes without. The Inland Revenue’s original contract with EDS was valued, in 1992, at some £200m/year. 10 years later it was £400m and not long after that, with the addition of HMCE (to form HMRC), and the transition to CapGemini, it was easily £1bn.

Did EDS lowball the cost? Probably. And it probably hurt them for a while until new business began to flow through the contract – in 1992, the IR did not have a position on Internet services, but as it began to add them in the late 90s, its costs would have gone up, without offsetting reductions elsewhere.

Do suppliers lowball the cost today? Far less so, because the old adage “price it low and make it up on change control” is difficult to pull off now; with unit costs available, and many services or goods being bought at a unit cost rate, it would be difficult to pull the wool over the eyes of a buyer.

Is tax paid part of the evaluation?

For thirty years until the cloud came along, most big departments relied on their outsourced suppliers to handle technology – they bought servers, cabled them up, deployed products, patched them (sometimes) and fed and watered them. Many costs were capitalised and nearly everything was bought through a managed services deal because VAT could be reclaimed that way.

Existing contracts were used because it avoided new procurements and ensured that there was “one throat to choke”, i.e. one supplier on the hook for any problems. Most of these technology suppliers were (and are) based outside of the UK and their tax affairs are not considered in the evaluation of their offers.

HMRC, some will recall, did a deal with a property company registered in Bermuda, called Mapeley, that doesn’t pay tax in the UK.

Tax just isn’t part of the evaluation, for any kind of contract. Supplier finances are – that is, the ability of a company to scale to support a government customer, or to withstand the loss of a large customer.

Is 1/3rd of government information stored in AWS?

No. Next question.

IaaS expenditure is perhaps £10-12m/month (through end of 2018). Total government IT spend, as I’ve covered here before, is somewhere between £7bn and £14bn/year. In the early days of the Crown Hosting business case, hosting costs were reckoned to be up to 25% of that cost. Some 70% of the spend is “keep the lights on” for existing systems.
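
Spend isn’t a direct measure of where data sits, but a rough back-of-the-envelope using only the figures above shows the scale gap:

```python
# Back-of-the-envelope using the approximate figures quoted above.
iaas_per_year = 12e6 * 12                  # ~£10-12m/month of IaaS spend, taking the top end
total_it_low, total_it_high = 7e9, 14e9    # total government IT spend per year, range quoted above
print(f"{iaas_per_year / total_it_low:.1%}")   # ~2.1% of the £7bn estimate
print(f"{iaas_per_year / total_it_high:.1%}")  # ~1.0% of the £14bn estimate
```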

Most government data is still stored on servers and storage owned by government or its integrators and sits in data centres, some owned by government, but most owned by those integrators. Web front ends, email, development and test environments are increasingly moving to the cloud, but the real data is still a long way from being cloud ready.

Are 80% of contracts won by large providers?

Historically, no. UKcloud revenues over the life of G-Cloud are £86m with AWS at around £63m (through end of 2018). AWS’ share is plainly growing fast though – because of skills in the marketplace, independent views of the range of products and supportability, and because of price.

Momentum suggests that existing contracts will get larger and it will be harder (and harder) for contracts to move between providers, because of the risk of disruption during transition, the lack of skill and the difficulty of making a benefits case for incurring the cost of transition when the savings probably won’t offset that cost.

So what should we do?

It’s easy to say “nothing.” Government doesn’t pick winners and has rarely been successful in trying to skew the market. The cloud market is still new, but growing fast, and it’s hard to say whether today’s winners will still be there tomorrow.

G-Cloud contracts last only two years and, in theory, there is an opportunity to recompete then – see what’s new in the market, explore new pricing options and transition to the new best in class (or Most Economically Advantageous Tender, as it’s known).

But transition is hard, as I wrote here in March 2014. And see this one, talking about mobile phones, from 2009 (with excerpts from a 2003 piece). If services aren’t designed to transition, then it’s unlikely to ever happen.

That suggests that we, as government customers, should:

1) Consciously design services to be portable, recognising that will likely increase costs up front (which will make the business case harder to get through), but that future payback could offset those costs; if the supplier knows you can’t transition, you’re in a worse position than if you have choices

2) Build tools and capabilities that support multiple cloud environments so that we can pick the right cloud for the problem we are trying to solve (see the sketch after this list). If you have all of your workloads with one supplier and in one region, you are at risk if there is a problem there, be it fat fingers or a lightning strike.

3) Train our existing teams and keep them up to date with new technologies and services. Encourage them to be curious about what else is out there. Of course they will be more valuable to others, including cloud companies, when you do this, but that’s a fact of life. You will lose people (to other departments and to suppliers) and also gain people (from other departments and from suppliers).
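
Here is a minimal sketch of what point 2 might look like in practice – application code written against a thin storage interface rather than a single provider’s SDK. The class, bucket and key names are hypothetical, and a second cloud provider’s implementation would slot in behind the same interface.

```python
# Application code depends on a small ObjectStore interface, not on any one
# cloud's SDK, so workloads can move between providers. Names are invented.
from pathlib import Path
from typing import Protocol

import boto3


class ObjectStore(Protocol):
    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class S3Store:
    """AWS implementation of the interface."""

    def __init__(self, bucket: str) -> None:
        self._bucket = bucket
        self._s3 = boto3.client("s3")

    def put(self, key: str, data: bytes) -> None:
        self._s3.put_object(Bucket=self._bucket, Key=key, Body=data)

    def get(self, key: str) -> bytes:
        return self._s3.get_object(Bucket=self._bucket, Key=key)["Body"].read()


class LocalStore:
    """Stand-in for a second provider (or on-premise storage) behind the same interface."""

    def __init__(self, root: str) -> None:
        self._root = Path(root)

    def put(self, key: str, data: bytes) -> None:
        path = self._root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self._root / key).read_bytes()


def archive_case(store: ObjectStore, case_id: str, document: bytes) -> None:
    # The calling code neither knows nor cares which cloud it is talking to.
    store.put(f"cases/{case_id}.pdf", document)
```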

And, as government suppliers, we should:

1) Recognise that big players exist in big markets and that special treatment is rarely available. They may not pay tax in this jurisdiction, but that’s a matter for law, not procurement. They may hire people from government; you have already done the same and you will continue to look out for the opportunity. Don’t bleat, compete.

2) Go where the big players aren’t going. Offer more, for less, or at least for the same. Provide products that compound your customers’ investment – they’re no longer buying assets with capex, but they will want increased benefit for their spend, so offer new things.

3) Move up the stack. IaaS was always going to be a tough business to compete in. With big players able to sweat their assets 24/7, anyone not able to swap workloads between regions and attract customers from multiple sectors that can better overlap peak workloads is going to struggle. So don’t go there, go where the bigger opportunities are. Government departments aren’t often buying Dropbox, for instance, so what’s your equivalent?

But, don’t

1) Expect government to intervene and give you preferential treatment because you are small and in the UK. Expect to do well, though, if you have a better product, at a better price, that gets closest to solving the specific problem that the customer has.

2) Expect government to break up a bigger business, or change its structure so that you can better compete. It might happen, sure, but your servers will have long since rusted away by the time that happens.

Laws From Before (the Internet)

Years ago, I spent a happy three years living in Paris. I’d moved there via Germany, then Austria. I didn’t take much with me and the one thing I was happiest to leave behind was my TV. I didn’t own a TV for perhaps a decade.

Each European country I lived in had some quirky laws – that’s quirky when compared with the UK equivalents. For instance, shops in Vienna closed at lunchtime on Saturday and didn’t open on Sunday. The one exception was a store that mostly sold CDs and DVDs, right near the Hofburg (the old royal palace) that had apparently earned the right to stay open, when it sold milk and other essentials, direct to the royal family. It seemed that the law protected that right, even though there was no royal family and it didn’t sell milk.

I was perhaps not surprised to read recently that there are plenty of anachronistic laws covering French TV. For instance:

  • National broadcasters can’t show films on Wednesday, Friday or Saturday
  • Those same broadcasters also can’t run ads for books, movies or sales at retailers
  • And they’re not allowed to focus any ads they do show on particular locations or demographics

The French government is considering changing these laws, but not until the end of 2020. Plainly the restrictions don’t apply to Youtube, Netflix or Amazon Prime. Netflix, alone, has 5m users in France. TV is struggling already, and it’s hobbled even more by such laws.

There are, of course, plenty of other more important issues going on that demand the attention of any country’s executive, and so perhaps it’s not a surprise that, even in 2019, laws such as these exist.

But in the digital world where, for instance, in the UK, we legislated for digital signatures to be valid as far back as 2000, it’s interesting to look at the barriers that other countries have in place, for historical reasons, to making progress in the next decade.

Building New Legacy

How do we know the systems we are building today aren’t tomorrow’s legacy? Are we consciously working to ensure that the code we write isn’t spaghetti-like? That interfaces can be easily disassembled? That modules of capability can be unplugged and replaced by other, newer and richer ones?

I’ve seen some examples recently that show this isn’t always the case. One organisation, barely five years old, has already found that its architecture is wholly unsuitable for how its business looks today, let alone what it will need to look like as its industry goes through some big changes.

Sometimes this is the result of moving too quickly – the opportunity that the business plan said needs to be exploited is there right now and so first mover advantage theory says that you have to be there now. Any problems can be fixed later, goes the thinking. Except they can’t, because once the strings and tin cans are in place, there are new opportunities to exploit. There’s just no time to fix the underlying flaws, so they’re built on, with sedimentary layer after layer of new, often equally flawed, technology.

Is the choice, then, to move more slowly? To not get there first? Sometimes that doesn’t help either – move too slowly and costs go up whilst revenues don’t begin soon enough to offset those losses. Taking too long means competitors exploit the opportunity you were after – sure they may be stacking up issues for themselves later, but maybe they have engineered their capability better, or maybe they’re going so fast they don’t know what issues they’re setting up.

There’s no easy answer. Just as there never is. The challenge is how you maintain a clear vision of capability that will support today’s known business need as well as tomorrow’s.

How you disaggregate capability and tie systems together is important too. The bigger the system and the more capability you wrap into it, the harder it will be to disentangle.

Alongside this, the fewer controls you put around the data that enters the system (including formats, error checking, recency tests etc), the harder it will be to administer the system – and to transfer the data to any new capability.
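
A minimal sketch of what such controls at the point of entry might look like – the field names and thresholds are invented for illustration:

```python
# Validate format, ranges and recency before a record is allowed into the system.
from datetime import date
from typing import List


def validate_payment_record(record: dict) -> List[str]:
    """Return a list of problems; an empty list means the record can be accepted."""
    errors = []

    # Format check: references follow an agreed pattern (invented here).
    if not str(record.get("reference", "")).startswith("AG-"):
        errors.append("reference must start with 'AG-'")

    # Range check: amounts must be positive numbers.
    amount = record.get("amount")
    if not isinstance(amount, (int, float)) or amount <= 0:
        errors.append("amount must be a positive number")

    # Recency check: reject agreements dated more than a year ago (illustrative threshold).
    agreed_on = record.get("agreed_on")
    if not isinstance(agreed_on, date):
        errors.append("agreed_on must be a date")
    elif (date.today() - agreed_on).days > 365:
        errors.append("agreement date fails the recency check (over a year old)")

    return errors


print(validate_payment_record(
    {"reference": "AG-0001", "amount": 1250.0, "agreed_on": date.today()}
) or "record accepted")
```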

Sometimes you have to look at what’s in front of you and realise that “you can’t get there from here”, slow down the burn and figure out how you start again, whilst keeping everything going in the old world.

The Legacy Replacement Caveat

Yesterday I wrote about the difficulty of replacing existing systems, the challenges of meshing waterfall and agile (with reference to a currently running project) and proposed some options that could help move work forward. There is, though, one big caveat.

Some legacy systems are purely “of the moment” – they process a transaction and then, apart from for reporting or audit reasons, forget about it and move on to the next transaction.

But some, perhaps the majority, need to keep hold of that transaction and carry out actions far into the future. For instance:

– A student loan survives over 30 years (unless paid back early). The system needs to know the policy conditions under which that loan was made (interest rate, repayment terms, amount paid back to date, balance outstanding etc)

– Payments made to a farmer under Environmental Stewardship rules can extend up to a decade – the system retains what work has been agreed, how much will be paid (and when) and what the inspection regime looks like

In the latter case, the system that handles these payments (originally for Defra, then for Natural England and now, I believe, for the RPA) is called Genesis. It had a troubled existence but as of 2008 was working very well. The rules for the schemes that the system supports are set every 7 years by the EU; they are complicated and whilst there is early sight of the kind of changes that will be made, the final rules, and the precise implementation of them, only become clear close to the launch date.

Some years ago, in the run up to the next 7 year review, GDS took on the task, working with the RPA, of replacing Genesis by bundling it with the other (far larger in aggregate, but simpler in rules and shorter in duration) payments made by the RPA. As a result, Defra took the costs of running Genesis out of its budget from the new launch date (again, set by the EU and planned years in advance). Those with a long memory will remember how the launch of the RPA schemes, in the mid-2000s, went horribly wrong with many delays and a large fine levied by the EU on the UK.

The trouble was, the plan was to provide for the new rules. Not the old ones. An agreement could be made with a farmer a week before the new rules were in place, and that agreement would survive for 10 years – and so the new system would have to inherit the old agreements and keep paying. Well, new agreements could have been stopped ahead of a transition to the new system you might say. And, sure, that’s right – but an agreement made a year before would still have 9 years to go; one made 2 years before would have 8 years to go. On being told about this, GDS stripped out the Genesis functionality from the scope of the new system, and so Genesis continues to run, processing new agreements, also with 10 year lives … and one day it will have to be replaced, by which time it will be knocking on 20 years old.

Those with good memories will also know that the new system also had its troubles, with many of the vaunted improvements not working, payments delayed, and manual processes put in place to compensate. And, of course, Defra is carrying the running costs of the old system as well as the new one, and not getting quite the anticipated benefits.

IT is hard. Always has been. It’s just that the stakes are often higher now.

When replacing legacy systems where the transactions have a long life, sometimes there is a pure data migration (as there might be, say, for people arriving in the UK, where what’s important is the data describing the route that they took, their personal details and any observations – all of which could be moved from an old system to a new system and read by that new system, even if it collected additional data or carried out different processing from the old system). But sometimes, as described above, there’s a need for the new system to inherit historic transactions – not just the data, but the rules and the process(es) by which those transactions are administered.

My sense is that this is one of the two main reasons why legacy systems survive (the other, by the by, is the tangled, even Gordian, knot of data exchanges, interfaces and connections to other systems).

There are still options, but none are easy:

– Can the new system be made flexible enough to handle old rules and new rules, without compromising the benefits that would accrue from having a completely new system and new processes? (See the sketch after this list.)

– Can the transactions be migrated and adjusted to reflect the new rules, without breaching legal (or other) obligations?

– Can the old system be maintained, and held static, with the portfolio of transactions it contains run down, perhaps with an acceleration from making new agreements with individuals or businesses under the new rules? This might involve “buying” people out of the old contracts, a little like those who choose to swap their defined benefit pension for a defined contribution deal, in return for a lump sum.

– Can a new version of the old system be created, in a modern way, that will allow it to run much more cheaply, perhaps on modern infrastructure, but also with modern code? This could help shave costs from the original system and keep it alive long enough for a safe transition to happen.
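
A minimal sketch of what the first of these might look like – each agreement carries the rule version it was made under, and the system dispatches to the matching rule set rather than assuming everything follows the newest scheme. The scheme names and rates are invented.

```python
# Agreements carry their rule version; payment calculations dispatch on it.
from dataclasses import dataclass


@dataclass
class Agreement:
    holder: str
    annual_amount: float
    scheme_version: str  # e.g. "ES-2007" for a legacy agreement, "ES-2014" for a new one


def payment_due_legacy(agreement: Agreement) -> float:
    # Old rules: flat annual payment, no adjustment (illustrative).
    return agreement.annual_amount


def payment_due_current(agreement: Agreement) -> float:
    # New rules: illustrative 2% uplift applied to the agreed amount.
    return agreement.annual_amount * 1.02


RULES = {
    "ES-2007": payment_due_legacy,
    "ES-2014": payment_due_current,
}


def payment_due(agreement: Agreement) -> float:
    return RULES[agreement.scheme_version](agreement)


old = Agreement("Hill Farm", 10_000.0, "ES-2007")   # made under the old rules, still live
new = Agreement("Vale Farm", 10_000.0, "ES-2014")
print(payment_due(old), payment_due(new))
```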

Some of these will work; some won’t. The important thing is to be eyes open about what you are trying to replace and recognise that when you reach from the front end into the back end, things get much harder and you forget that at your peril. Put another way, Discovery is not just about how it should be, but about how it was and how it is … so you can be sure you’re not missing anything.

Agile Meets Waterfall

This morning I’ve been looking at the release schedule for a major project launching next year. Chances are that you will be affected by it somehow. No names disclosed though. It’s a replacement for an existing capability and will use an off the shelf, quasi-cloud product (that is, software that is already in use by others and so can, in theory, be configured but that, in practice, will need some development to make it do what is really needed).

The top level project plan looks like this:

Spring 2019 – Discovery
Summer 2019 – Design and Build
Spring 2020 – Testing
Summer 2020 – Go-live
Spring 2021 – Transition completed
Thereafter – Continuous improvement

Plainly the team want to be agile. They want to iterate, adding features and capabilities and seeing how they land before improving them – going round the discovery / alpha / beta / live process repeatedly. But, at the same time, they have a major existing system to replace and they can’t replace just part of that.

They also, I suspect, know that it’s going to be harder than everyone expects, which is why they’re sticking to those nebulous seasonal timeframes rather than talking about particular months, let alone actual dates. The rhythmic cadence of Spring/Summer milestones perhaps suggests an entirely made up plan.

Ever since I first joined the Inland Revenue and heard a team saying that they “hoped to be up to speed by the Autumn” I’ve laughed whenever seasons are mentioned in the context of delivery. It’s as if there are special watches issued with 4 zones marked on them – one for each season.

What do you do when meshing the need to replace an existing, widely used system that is no longer doing what it needs to do and that needs radical upgrades, with a new capability that can’t be introduced a bit at a time? This is not a start-up launching a new product into the market. They’re not Monzo, who were able to start with a pre-paid debit card before moving into current accounts, credit cards and so on.

These kinds of projects present the real challenge in government, and in any large organisation:

How do you replace your ageing systems with new ones whilst not losing (too much) capability, keeping everyone happy and not making an enormous mistake?

At the same time, how do you build a big system with lots of capability without it going massively over-budget and falling months or years behind plan?

There are some options to consider:

  • Is a NewCo viable? Can a new capability be set up in parallel with the old, and customers, users or businesses migrated to that new capability, recognising that it doesn’t do everything? This is ideal MVP territory – how much do we need to have to satisfy a given group of customers? Ideally they have less complicated needs than the full set of customers, and we can shore up the NewCo with some manual or semi-automated processes, or even by reaching into the old system.
  • Can we connect old and new together and gradually take components away from the old system, building them in the new world (see the sketch after this list)? This might particularly work where a new front end can be built that presents data far more clearly and adds in data from other sources. It might require some re-engineering of the old system, which perhaps isn’t used to presenting data via APIs and has never been described as loosely coupled.
  • Is there standalone capability that will make for a better experience, perhaps using data from the old system (which can be extracted in lots of different ways from cumbersome through to smooth), with that new capability gradually expanded?
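
A minimal sketch of the second option – a thin routing layer in front of the old system that sends the capabilities rebuilt so far to the new service and everything else to the legacy one. The endpoints and capability names are invented.

```python
# Strangler-style routing: migrated capabilities go to the new service,
# everything else falls through to the legacy system. Names are invented.
import requests

LEGACY_BASE = "https://legacy.example.internal"       # hypothetical endpoint
NEW_BASE = "https://new-service.example.internal"     # hypothetical endpoint

# Capabilities that have been rebuilt in the new world so far.
MIGRATED = {"view-statement", "update-address"}


def handle(capability: str, params: dict) -> dict:
    base = NEW_BASE if capability in MIGRATED else LEGACY_BASE
    response = requests.get(f"{base}/{capability}", params=params, timeout=10)
    response.raise_for_status()
    return response.json()


# Over time more capabilities move into MIGRATED until the legacy system is empty, e.g.
# handle("view-statement", {"customer": "12345"})  # routed to the new service
# handle("close-account", {"customer": "12345"})   # still routed to the legacy system
```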

None are easy, of course, but they are all likely better than a big bang or the risk of a lengthy project that just gets longer and more expensive – especially as scope increases with time (people who have time to think about what they want will think of more things, and industry will evolve around them giving them more things to think about).

There are, importantly, two fundamental questions underpinning all of these options (as well as any others you might come up with):

1. What do we do today that we don’t need to do tomorrow? Government systems are full of approaches to edge cases. They are sedimentary in nature – built over the years (and decades) with layer after layer of policy change. Today’s system, and all of the capability it has, is not the same as what is needed. Asking this question allows a jackhammer to be taken to the sedimentary layers so that the new capability isn’t trying to replicate everything that went before.

2. What do we need to do tomorrow that we don’t do today? This gives the policy arm the chance to shape new capabilities, including simpler policies that could have fewer edge cases but still cater for everyone. It will also allow new thinking about what the right way to do something is, considering everything that has gone before but also what everyone else is doing (in other industries, other government departments and other countries).

Asking those questions, and working the answers really hard, will help ensure that the solution reflects the true, current need, and will also help get everyone off the page of “we need to replace what we have” when, in reality, that’s the last thing that anyone should be trying to do.

There is, though, one giant caveat to all of this which I will take a look at tomorrow.

Diagnostic Parsimony. Or Not.

Occam’s razor says, roughly anyway, the simplest solution is likely the correct one. Hickam’s dictum says, roughly, and particularly in the medical world, don’t simplify too quickly as there may be multiple causes of your problem.

We have a tendency to be overwhelmed by the latter – so many things to look at, understand and do – and so reach quickly for the former: be agile. Go to cloud. Adopt product X. Do a reorganisation.

The real trick is to apply both, teasing out what the main contributors are, and then applying the solution that makes most sense and that will give the best return for effort expended.

The big problem comes when Occam’s Razor and the oxymoronic phrase “quick wins” are applied together. There are few simple solutions, they’re not quick and there isn’t much to win. If there was, they’d already have been done.

September Summary

I was away for the first few days of September so posted some pictures of what I’ve come to call Deergital Transformation, including this one:

Male fallow deer, aka bucks, a few days after losing their antlers

For much of the rest of the month I looked at the struggle to deliver projects, particularly ones that we sometimes mislabel as transformational, and how we might think about those in different ways:

  • We tend to approach projects as if they are always going to be successful. We go all in, often on giant projects. And yet real world experience, in films for instance (2% of films made get to the cinema and only 1/3 of those are profitable), suggests that most attempts don’t pay off.
  • Similarly, Venture Capital companies know that they are going to kiss a lot of frogs before they find their prince or princess. They back new companies in rounds – seed, series A, series B etc – putting in more money as the principles are proven and the company moves from concept to demo to beta to live and to scale. Bad bets are starved of funds, or “pivoted” where the team is backed to do something different.
  • We, all of us, are quick to suggest numbers – a project will cost £100m, or it will take 48 months, or it will save £1bn – but we are rarely open about the assumptions, and, yes, the pure and simple Wild Assed Guesses. In short, all numbers are made up, treat them with caution unless the rationale is published.
  • We all like to set targets, but we don’t always think about the things that have to be done to achieve that goal. By 2040 “we will climb Everest” is fine as an aim, but the extraordinary preparatory work to achieve it needs to be laid out, to avoid the “hockey stick” problem where you get close to the date when you expected to realise the aim, only to find there’s not enough time left. As a regular half and full marathon runner, I know that if I haven’t put the time in before the race, it’s going to hurt and I’m going to let myself down.
  • Replacing legacy systems is hard. The typical transformational project, where we take what we have had for the last 20+ years and replace it with something new, adding lots more functionality (to catch up with all of the things that we haven’t been able to do for the last couple of decades), is fraught with risk and rarely pays off. The typical agile model of MVP and rapid iteration doesn’t always align with the policy aspiration, or with what the users want, because, on day one, they get less than they have today. New models are needed – though, really, they’re old models.

October has started on much the same path, though let’s hope that the real storms seen at the end of last month have gone and that the only October storms are of the digital kind.

Storm over Devon, September 28th

Withered / Weathered Technology

Whilst Shigeru Miyamoto, the public face of Nintendo, is rightly regarded as the leading light of the video game industry, there is another, unsung, hero, also of Nintendo: Gunpei Yokoi.

He pioneered what we loosely translate as “lateral thinking with withered (or possibly weathered) technology” – taking electronic components (be they chips, LCD screens or whatever) that were no longer leading edge and were, in fact, far from that position, and using them to create affordable, mass produced gadgets.

Gunpei Yokoi was behind some extraordinary hits including Game and Watch and then the Game Boy (an 8 bit, black and white, low resolution handheld gaming console released at a time when every other company, including Atari and Sega, was already moving to colour, high resolution displays – if you know the story, you know that the Game Boy dominated the market for years; total unit sales of some 120m and 26 million copies of Tetris, alone).

Arguably that very thinking is behind more recent products – perhaps Nintendo’s Wii and Apple’s iPod shuffle.

In the modern rush to harness new technology capability – be it blockchain, machine learning and artificial/augmented intelligence, new databases, new coding languages, new techniques, voice recognition etc – we sometimes forget that there are proven technologies and capabilities that work well, are widely understood and that could be delivered at lower risk.

Real delivery in government requires large scale systems that are highly reliable – you’re front page news if it goes down after all – and that do what’s needed.

Is there, then, a case for putting the new and shiny to one side whilst experimenting with it (of course) to assess its potential, but not relying on it to be at the core of your new capability until it’s ready?

The core systems at the heart of government are definitely both withered and weathered; they’ve been there for some decades. They need to be replaced, but what should they be replaced with?

Technology at the very leading edge, where skills are in short supply and risks are high, or something further back from the bleeding edge, where there is a large pool of capability, substantial understanding of performance and security, and many existing implementations to compare notes with?

Dealing With Legacy Systems

Legacy systems – that is, systems that work (and have worked for a couple of decades or longer in many cases) – both do the lion’s share of the transactional work in government and hold back the realisation of many policy aspirations.

Our legacy systems are so entwined in our overall architecture, with dozens (even hundreds) of interfaces and connections, and complicated code bases that few understand, that changes are carefully handled and shepherded through a rigorous process whenever work needs to be done. We’ve seen what goes wrong when this isn’t handled with the utmost care – the problems at the TSB, or at Natwest, RBS and Tesco Bank, for instance.

The big problem we are facing looks like this:

Our policy teams, and indeed our IT teams, have much bigger aspirations for what could be achieved than the current capability of systems.

We want to replace those systems, but trying to deliver everything that we can do today, as well as even more capability, is a high risk, big bang strategy. We’ve seen what goes wrong when we try to do everything in a single enormous project, whether that be the Emergency Services Network, the e-Borders programme or Universal Credit.

But we also know that the agile, iterative approach results in us getting very much less than we have today, with the promise that we will get more over future releases, though the delivery timetable could stretch out some way, with some uncertainty.

The agile approach is an easy sell if you aren’t replacing anything that exists today. Monzo, the challenger bank, for instance, launched with a pre-paid debit card and then worked to add current accounts and other products. It didn’t try and open a full bank on day one – current accounts, debit cards, credit cards, loans, mortgages etc would have taken years to deliver, absorbed a fortune and delayed any chance to test the product in the market.

It’s not, then, either/or but somehow both/and. How do we deliver more policy capability whilst replacing some of what we have and do so at a risk that is low enough (or manageable enough) to make for a good chance of success?

Here’s a slide that I put up at a conference in 2000 that looked at some ways that we might achieve that. I think there’s something here still – where we build (at the left hand end), some thin horizontal layers that hide the complexity of government … and on the right hand side we build some narrow, top to bottom capabilities and gradually build those out.

It’s certainly not easy. But it’s a way ahead.

Why Do Change Projects Fail?

One aphorism has it that culture eats strategy for breakfast. If you don’t get the people across your organisation on side, your change project will fail, no matter how much you throw at it.

When I’ve reviewed major change projects during implementation, it’s typically hard to find anyone who isn’t on board and committed to making it happen.

When I’ve reviewed other projects, after the fact, when the dust is still settling but failure is clear, it’s often hard to find anyone who didn’t think it was doomed to failure.

Why don’t those sane people speak up before the project is finalised, or as it’s underway?

This could be because of something called “preference falsification”, a reluctance to contradict your peer group and call out the problem. If everyone around you is on board, then being the odd one out is difficult. Of course, if everyone around you thinks the same as you, and is reluctant to call it out, then you all end up agreeing that it will fail but unwilling to say so.

If one person is honest, the dam can break. This, really, is the theory behind The Emperor’s New Clothes.

Establishing an environment where ideas are exchanged freely, and criticism from inside and outside is reflected on and addressed directly (whether that results in a change in approach or not), is difficult. But it is essential. Not everything works. Not every idea is right. The experience around you in any organisation often knows what will and won’t work and, properly harnessed, is the difference between success and failure.