Automation for IT, not just for everything IT touches

When I was over in San Francisco last week, one of the main themes of my meetings was how to improve the manageability of the various systems that I run for the UK government, whether it’s the Gateway (probably north of 200 servers by now), UKonline (probably about 40 including the test environments) or any of the others. It seemed to me that although we were making great strides in increasing CPU power, adding wonderful new functions and ever-increasing amounts of storage, life was getting harder for the operations people, the systems engineers and so, in the end, for me, the customer.

In the recent past lots of people have wanted to sell me “blades” – lots of processors packed into a rack. I think I could probably get the whole Gateway into perhaps 2, maybe 3, racks (excluding the comms front and back ends) if this stuff really worked. But, I’ve been sceptical (partly because they’re new and partly because having more CPUs in a smaller space increases the risk of cockup – and, per Blackadder, “We’re not at home to Mr. Cockup”, but seem to spend our time endlessly preparing the spare room).

So, I was keen to see people who could do something to help – whether it was better management tools, software that would help deploy common configurations, systems that would reduce our dependence on adding every patch that’s necessary (and reduce the risk of being caught out by something that exploits a patch that’s not yet available) and so on. To that end, Bernie Frieder (late of San Francisco, the dti and now, I’m delighted to say, at OeE) set up some sessions for me, Simon Freeman (the most technically capable person I’ve met) and the e-Envoy himself, Andrew Pinder.

– Naturally we went to see Marc Andreessen at Opsware along with Insik Rhee (who founded Kiva before it was bought by Netscape and knows a thing or three about software) and Ben Horowitz (who ran Loudcloud with Marc and Insik and is now the CEO at Opsware). Marc and I have shared a stage in the past and also had dinner a couple of times. He’s got some good insights into what’s coming next and also keeps a wider brief – from whether the UK will form a department of homeland security equivalent, to a story about how he’s just installed 3 terabytes of storage at home so that he can keep a library of HDTV programmes available! Since EDS bought the hosting arm of Loudcloud, the Opsware folks have been busy making the software deployable on a disk (similar to my plans for ‘DotP on a Disk’, which I ought to cover another time). They already have their first few customers and will be adding more with subsequent releases – progress looks good: the company is well-funded (probably better than almost every other ‘startup’ in the valley), the people are motivated and they have lots of ideas, with the track record to back them up. We already use Opsware to manage UKonline and I was keen to see how it would evolve to also support Linux and Microsoft platforms. There’s a lot coming. Marc is also focused on the issues of how to manage large configurations (as you’d expect given what Opsware does) and recently went to print to state his case. Some quotes that stuck out …

“Servers and applications are glued together using piece-parts, bailing wire and chewing gum … With today’s Web applications requiring dozens, and in some cases hundreds, of servers to run, IT just can’t keep up. The solution is going to be found in automation and utility computing to make IT as easy to use and run–and as hands-off–as the phone system … Those that don’t take action today, when they have the luxury of taking the time to do it right, will find themselves unable to keep up with their competition when the economy starts to grow again”

– Later in the trip we went to visit a “new” company or, at least, one that has recently emerged from stealth mode. The new name for the company (the old name was “company 51”, which I thought was way, way better) is Sana Security. The very smart people there want to make other security systems obsolete – by preventing any attack, whether it is known or unknown. The founder, Steven Hofmeyr, had the idea that if the human immune system worked the same way today’s security systems work then we’d all have been extinct long ago. So he set out to apply some of the same principles and has come up with software that watches what’s going on and determines whether it’s part of the “normal” behaviour for a system – anything that is abnormal can be shut down, flagged with an alert, quarantined etc. I’m purposely saying little here – I’ve checked the website and it is similarly vague. The software is still in its early days, but I think Steven is onto something – I will be waiting eagerly for something that we can try out on our systems.
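
To make the “learn what normal looks like, then flag anything else” idea a bit more concrete, here is a minimal sketch in Python. It is emphatically not Sana’s method (which, as I say, hasn’t been made public); it just shows one generic way of doing it, by remembering every short run of events seen while a system is behaving itself and then reporting any run that was never seen before. The event names, the window size of 3 and the function names are all my own assumptions for illustration.

```python
from collections import deque


def windows(events, n):
    """Yield every run of n consecutive events as a tuple."""
    window = deque(maxlen=n)
    for event in events:
        window.append(event)
        if len(window) == n:
            yield tuple(window)


def learn_normal(trace, n=3):
    """Build a profile: the set of n-event runs seen during normal running."""
    return set(windows(trace, n))


def abnormal_windows(trace, profile, n=3):
    """Return the runs in trace that never appeared in the profile."""
    return [w for w in windows(trace, n) if w not in profile]


# Hypothetical event names, invented purely for this example.
normal_trace = ["open", "read", "write", "close",
                "open", "read", "write", "close"]
profile = learn_normal(normal_trace)

# A trace with an unexpected "exec" in the middle of otherwise normal activity.
suspect_trace = ["open", "read", "exec", "write", "close"]
for run in abnormal_windows(suspect_trace, profile):
    print("abnormal:", run)
```

A real product would have to decide what to do with each abnormal run (shut down, alert, quarantine) and cope with normal behaviour that it simply hasn’t seen yet; the sketch only covers the detection step.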

– We also saw a more mature company, Gilian, whose main product is the “G-server”. Their game is not so much in preventing hacks (they reason that, one way or another, there’s always a chance someone will get through – whether from the outside or, more likely, the inside), but in making sure that the hacker cannot change the website and post defamatory or misleading information. Clever stuff and, arguably, essential for any site, whether its owners think they have security nailed down or not. How much is your reputation worth as a company? The hacker boards are full of examples of sites that have been “updated” maliciously, so here’s a way of managing that risk.

That’s a quick sample of some of the stuff that I saw last week. I’ve got lots more on my mind following the trip and I’ll post that between now and the end of the year.
