I’ve run into a few people recently who’ve told me that the Y2K problem, aka the Millennium Bug, was a hoax. In some ways it was, but let’s get one thing straight: the bug was very real, and if we hadn’t done a hell of a lot of work to fix it, things would have gone catastrophically wrong.
What was the problem then? In the 1950s every tiny piece of computer storage was critical. Programmers were always looking for ways to store and process information more efficiently. They didn’t think for one moment that their code would ever have to deal with the year 2000, so they decided to lop the “19” off the front of the year and just store the last 2 digits. 1958 was actually stored as “58”. If the user needed to see the full year then many systems simply printed “19” before the 2 digit year.
This wouldn’t have been much of a problem if it hadn’t made it out of the 1950s. Unfortunately every new generation of the tech industry builds on previous generations. Not only did the 2 digit year become a kind of industry standard, it also got baked very deeply into the code that actually ran the computers themselves.
By the time the 1990s rolled around there was an awful lot of computer code about and people started to realise that a lot of it was going to have to deal with the year 2000.
Suddenly You Find You’re Not Insured…
Let’s look at an example. Let’s say you renew your car insurance. The new policy starts on January 2nd, 1999. You’ve been lucky: this computer program uses 4 digit years, so you correctly see your expiry date as January 1st, 2000.
Unfortunately the database that all the records are stored in only uses 2 digit years, so the system writes a start date of 02/01/99 and an expiry date of 01/01/00 into the database.
The problem is obvious: when that record is read back the system will correctly convert 02/01/99 to January 2nd, 1999, but it will wrongly convert 01/01/00 to January 1st, 1900. Congratulations, as far as that computer system is concerned you’re not insured.
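Here’s a minimal sketch of that round trip in Python. It isn’t taken from any real insurance system: the `store` and `load` functions, and the assumption that the reader always puts “19” in front of the year, are mine, purely to illustrate the point.

```python
from datetime import date


def store(d: date) -> str:
    """Write a date as DD/MM/YY, keeping only the last two digits of the year."""
    return d.strftime("%d/%m/%y")


def load(text: str) -> date:
    """Read DD/MM/YY back, assuming the century is always 19 (the bug)."""
    day, month, yy = (int(part) for part in text.split("/"))
    return date(1900 + yy, month, day)


start = date(1999, 1, 2)
expiry = date(2000, 1, 1)

stored_start = store(start)     # two-digit year on disk
stored_expiry = store(expiry)   # the "20" is lost here

loaded_start = load(stored_start)
loaded_expiry = load(stored_expiry)

# Check cover on a hypothetical day in the middle of the policy.
today = date(1999, 6, 1)
insured = loaded_start <= today <= loaded_expiry

print(stored_start, "->", loaded_start)
print(stored_expiry, "->", loaded_expiry)
print("insured:", insured)
```

Nothing goes wrong when the record is written. The damage only shows up when it’s read back, which is part of what made these bugs so hard to find.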
In that simple example you’d hope that, at some point, a human would see it and realise something had gone deeply wrong. The problem is that, even in 1999, there was an awful lot of processing going on, in financial systems and even in safety-critical systems, before the results ever got anywhere near a human.
The Ariane 5 rocket explosion was caused by a similar kind of problem: a value too big for the storage set aside for it. Part of the guidance software, carried over from Ariane 4, produced a much higher number than it had room to hold. This hadn’t been a problem on Ariane 4 because that rocket couldn’t do anything to cause such a number to be generated. Ariane 5, however, could, and 37 seconds after main engine ignition on June 4, 1996, it did, ultimately causing the rocket to self-destruct.
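The inquiry into the failure reported that a 64-bit floating point value was converted to a 16-bit signed integer that couldn’t hold it. The sketch below shows that general kind of failure in Python. It is not the flight code: the function names and the two sample readings are invented for illustration.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16(value: float) -> int:
    """Convert to a 16-bit signed integer, failing loudly if it will not fit."""
    whole = int(value)
    if not INT16_MIN <= whole <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in 16 bits")
    return whole


def try_convert(value: float) -> str:
    """Report whether a reading survives the conversion."""
    try:
        return f"ok: {to_int16(value)}"
    except OverflowError as err:
        return f"failed: {err}"


print(try_convert(20000.0))   # hypothetical reading, inside the 16-bit range
print(try_convert(40000.0))   # hypothetical reading, outside the 16-bit range
```

The conversion is perfectly safe for as long as the inputs stay small. It only fails once something upstream changes, and that’s the parallel with two digit years.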
That’s why we had to fix the Y2K bug, because pretty much everywhere there was a date in computer code there was potential for things to go badly wrong.
It Wasn’t Just Dates…
What’s more, it wasn’t just the obvious cases we had to worry about. There were more subtle implications of the bug. Consider the following output from a little example program I wrote. It gives you the expected arrival time of a plane and its current altitude both in feet and metres.
SIGN DATE TIME ALT(m) ALT(ft)
Y2K00 1990/11/01 00:00 5000 16384
Y2K01 1991/11/01 00:35 4900 15872
Y2K02 1992/11/01 01:10 4800 15616
Y2K03 1993/11/01 01:45 4700 15360
Y2K04 1994/11/01 02:20 4600 14848
Y2K05 1995/11/01 02:55 4500 14592
Y2K06 1996/11/01 03:30 4400 14336
Y2K07 1997/11/01 04:05 4300 14080
Y2K08 1998/11/01 04:40 4200 13568
Y2K09 1999/11/01 05:15 4100 13312
Y2K10 19100/11/01 05:50 4000 49
Y2K11 19101/11/01 06:25 3900 49
Y2K12 19102/11/01 07:00 3800 49
Y2K13 19103/11/01 07:35 3700 49
Y2K14 19104/11/01 08:10 3600 49
One thing you might expect is that when it got to the year 2000 it printed 19100. The program stores the year as 2 digits and simply prints “19” in front of them. That was a pretty typical Y2K bug: the 2 digit year ticks over from 99 to 100 and it gets printed as “19100”.
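That behaviour takes only a couple of lines to reproduce. This sketch assumes the year is kept as a count of years since 1900 (which is how C’s `tm_year` field works). The `format_year` function is a made-up name, not part of the program that produced the output above.

```python
def format_year(years_since_1900: int) -> str:
    """Print '19' followed by the stored counter, with no check on its width."""
    return "19" + str(years_since_1900)


for counter in (98, 99, 100, 101):
    print(format_year(counter))
```

The fix is as small as the bug: add 1900 to the counter instead of gluing “19” onto the front of it.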
What might be surprising is that after the year 2000 it completely screws up the calculation of how high the plane is in feet. The calculation before the year 2000 is (approximately) right. Afterwards it just prints “49” however high the plane is.
This is because, when I wrote the program, I only allocated enough storage for 2 figures in the year. After the year 2000, however, the program wrote 3 figures regardless. The extra “1” went into some storage that was being used for something else, in this case the height in feet. 49 is the ASCII code for the character “1”: it’s the value a computer would send to the screen if it wanted to print the digit 1.
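Python won’t let you scribble over neighbouring memory the way older languages will, so the sketch below fakes it with a fixed-size `bytearray`. Everything about the layout is an assumption of mine rather than how my example program works: a hypothetical 3 byte record holds a one byte flight level (altitude in hundreds of feet) followed by a two byte year field, and the year’s digits are filled in from right to left with no check on how many there are.

```python
# Hypothetical 3-byte record: [flight level][year tens][year units].
FLIGHT_LEVEL = 0   # one byte: altitude in hundreds of feet
YEAR_END = 2       # index of the last byte of the two-digit year field


def write_year(record: bytearray, years_since_1900: int) -> None:
    """Fill in the year's digits right to left, with no width check (the bug).

    Sketch only: it is meant for values of up to three digits.
    """
    pos = YEAR_END
    n = years_since_1900
    while True:
        record[pos] = ord("0") + n % 10   # store the digit as an ASCII character
        n //= 10
        if n == 0:
            break
        pos -= 1   # a third digit lands on the flight-level byte


def make_record(flight_level: int, years_since_1900: int) -> bytearray:
    """Build a record: the flight level is stored first, then the year."""
    record = bytearray(b"\x00" + b"00")
    record[FLIGHT_LEVEL] = flight_level
    write_year(record, years_since_1900)
    return record


before = make_record(133, 99)    # 13,300 ft, year counter 99
after = make_record(131, 100)    # 13,100 ft, year counter 100

print(before[FLIGHT_LEVEL], after[FLIGHT_LEVEL])
print(ord("1"))
```

With a counter of 99 the flight level survives. With 100 the leading “1” overwrites it, and the byte reads back as 49 whatever the altitude was.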
Again, in my little program this gets printed to the screen and you’d hope that someone would notice. What it highlights, however, is that the problem can surface somewhere else in the code and affect something other than the date. This corrupted value could be the radiation dose of a radiotherapy patient, and it might never get seen by a human before it’s delivered…
I hope that makes it abundantly clear that the Y2K bug was very much real and that the consequences could very definitely have been catastrophic. The idea that the bug could have caused planes to fall out of the sky is not and was not scaremongering. It was entirely possible. Indeed, if we had somehow sleep-walked through to the closing minutes of 1999 without realising there was a problem, it would have been a relatively likely consequence. We did, however, realise, and we did a hell of a lot of work to fix the problems.
Now of course it’s true that the press over-hyped the situation. “Renowned industry expert says that, thanks to years’ worth of effort, it’s now exceedingly unlikely that there will be any critical incident in the aviation sector” doesn’t make much of a headline. “Boffin says planes could fall from sky” is going to sell many more newspapers.
On the back of that hype there was also the predictable bunch of spivs and con-merchants offering to Y2K-proof your toaster. I’m sure you get my point; some people capitalised on the ignorance and panic by spreading more misinformation and making a pretty penny out of fixing things that didn’t need fixing.
That doesn’t, however, lessen the seriousness of the real underlying problem. It was, as they say, “a biggie”.
So It Definitely Wasn’t a Hoax… Or Was It?
There is however a certain thread of logic that says, even considering everything I’ve written, it was still a hoax. It’s a line of argument I actually quite like. For the tech industry it certainly wasn’t a hoax, it was very real indeed. For the government too – the government needed to make sure that adequate provisions were being made to fix it, to mitigate any remaining risk and deal with any problems arising.
As far as the general public were concerned however, they were never actually exposed to any significant level of risk. It was inevitable that we – the tech industry – would fix all of the serious issues well before they came into play. There was nothing that the people on the Clapham omnibus needed to worry about. In fact, being perfectly brutal about it, there wasn’t really any need for them to ever know about the problem at all.
Much as I like it, I don’t entirely subscribe to that school of reasoning. Even as midnight ticked over we couldn’t be sure that we’d fixed every critical bug. There was still a risk of things going badly wrong and the general public needed to be aware of that.
There’s also an argument that it was public awareness that actually made a lot of the tech industry sit up and take notice: that’s when the senior management of these businesses finally realised that what the technical people were saying was right.
Did we need people predicting that planes would fall from the sky and toasters would stop working though? No, we definitely didn’t. What we needed was common sense. What we got was the British Press.