On October 4, 2021, the social media giant Facebook, along with its services Instagram and WhatsApp, experienced a major outage that lasted several hours, sending both social media enthusiasts and, more importantly, businesses relying on social media into a state of panic. “Was it a malicious hacker attack? Is Facebook down for good?”
The truth turned out to be much less dramatic than the speculation. Those few precious hours, which cost thousands of businesses millions of dollars, were lost because of a minor human error and the fragility of the Internet as a whole. Although we tend to see it as powerful and unbreakable, the Internet is in fact very delicate.
To understand what happened, let us simplify a bit and compare the Internet to something you may find much easier to picture: the Maltese road system.
The Internet of Malta
The Internet could be compared to Malta’s towns. Every Internet service provider and every major enterprise, such as Facebook, is like a separate town. Each has its own internal network of roads that it knows well, plus bypasses and smaller roads that connect it to its neighbours. Luckily, we all have Google Maps, or even physical maps and our own memory, to help us navigate and, for example, easily find the best way to get from L-Imtarfa to Marsaxlokk.
However, imagine that there is no Google Maps and you can’t even get a physical map at a shop. There are no road signs that show directions. And imagine that for every trip, every driver starts fresh and remembers nothing about their previous routes. How would such drivers be able to find their way? If they tried doing it by pure chance, they would be stuck forever in endless loops and never get to their destination.
To make connections between towns and villages possible, imagine that every local council communicated with its direct neighbours daily (or even more often) and told them about any new roads built and any construction on existing roads, basically showing them the best ways to get through the town. Each road in the town would have a value assigned to it, representing how wide the road is, how good the tarmac is, and how congested it gets in rush hour. All this information would be freely available at any time to any driver, directly from the originating local council.
With such a system, someone making their way from L-Imtarfa to Marsaxlokk would go to the Mtarfa local council and pick up the current information about the best route to Marsaxlokk. This route would be based on information received from Rabat, which in turn Rabat received from Iż-Żebbuġ and other direct connections, etc.
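The council-to-council exchange described above can be sketched in a few lines of Python. This is only a toy model: the road costs are made up for illustration (they are not real distances), the function names are hypothetical, and real Internet routing between networks is considerably more involved than this simple "share your table with your neighbours" loop.

```python
# Hypothetical road "costs" between neighbouring councils (not real distances).
ROADS = {
    ("Mtarfa", "Rabat"): 1,
    ("Rabat", "Iż-Żebbuġ"): 3,
    ("Iż-Żebbuġ", "Żejtun"): 5,
    ("Żejtun", "Marsaxlokk"): 2,
    ("Rabat", "Żejtun"): 10,
}


def build_neighbours(roads):
    """Turn the road list into a per-town map of direct neighbours."""
    neighbours = {}
    for (a, b), cost in roads.items():
        neighbours.setdefault(a, {})[b] = cost
        neighbours.setdefault(b, {})[a] = cost
    return neighbours


def exchange_routes(roads):
    """Let councils swap tables with direct neighbours until nothing changes."""
    neighbours = build_neighbours(roads)
    # Each council starts out knowing only the way to itself.
    tables = {town: {town: (0, None)} for town in neighbours}
    changed = True
    while changed:
        changed = False
        for town, links in neighbours.items():
            for neighbour, link_cost in links.items():
                # The neighbour shares its table; keep any cheaper route.
                for dest, (cost, _) in list(tables[neighbour].items()):
                    new_cost = link_cost + cost
                    known = tables[town].get(dest)
                    if known is None or new_cost < known[0]:
                        tables[town][dest] = (new_cost, neighbour)
                        changed = True
    return tables


def route(tables, start, dest):
    """Follow each council's advice, one town at a time."""
    path = [start]
    while path[-1] != dest:
        next_town = tables[path[-1]][dest][1]
        path.append(next_town)
    return path


tables = exchange_routes(ROADS)
print(route(tables, "Mtarfa", "Marsaxlokk"))
```

Note that no council ever sees the whole map. Mtarfa only learns that Rabat is the best next step towards Marsaxlokk, and the driver repeats the question in every town along the way.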
The Village That Disappeared
Now, imagine that the Marsaxlokk local council made a mistake on October 4, 2021: during a routine update of the routes, it told its neighbours that, due to road construction, drivers heading to Marsaxlokk must turn around at the Triq Iż-Żejtun roundabout, and that the best route is through Triq Iż-Żejtun (not via Qajjenza). This would effectively send every single car going to Marsaxlokk from Żejtun back towards Żejtun in an endless loop.
This wrong information would then be propagated from Żejtun to Żabbar, from Żabbar to Raħal Ġdid, making its way almost instantaneously through all of Malta. From this moment on, Marsaxlokk would effectively disappear. Cars leaving Marsaxlokk for other towns would of course find their way out (because the other towns made no mistakes in their routing), but no incoming traffic would be possible because every car would be directed to Triq Iż-Żejtun and then turned around at the roundabout.
This would, of course, be noticed immediately by the Marsaxlokk local council. However, imagine that John, the person who had the key to the Marsaxlokk local council, went out to eat dinner in Birgu and left Marsaxlokk with the key, and that nobody else in Marsaxlokk had a key to the local council building. John would be unable to return to Marsaxlokk to correct his mistake until some kind of emergency “hack” was made: either someone would pick the lock, or John would walk back to Marsaxlokk instead of driving a car.
Small Error, Big Price
This is exactly what, supposedly, happened to Facebook on October 4, 2021. During a routine update of BGP (Border Gateway Protocol) information, which was then sent to all the neighbours of the Facebook internal network, someone made a mistake in the routing tables. As a result, no packets were able to reach the Facebook network from the outside. And the people with physical access to the network did not have the access rights to send a corrected update: the mistake effectively shut out those who would have been able to correct it.
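To make "no packets were able to reach the network" a little more concrete, here is a minimal sketch of a router's table of known destinations. It is an illustration only, not Facebook's actual configuration: the class and method names are hypothetical, the address ranges and network labels are reserved documentation values, and the sketch assumes the simplest failure case, where the route to a network is withdrawn from a neighbour's table.

```python
import ipaddress


class RoutingTable:
    """Toy table mapping address ranges (prefixes) to a next-hop label."""

    def __init__(self):
        self.routes = {}  # ip_network -> next-hop label

    def announce(self, prefix, next_hop):
        """A neighbour tells us how to reach this address range."""
        self.routes[ipaddress.ip_network(prefix)] = next_hop

    def withdraw(self, prefix):
        """A neighbour tells us this address range is no longer reachable."""
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        """Return the next hop for an address, or None if no route exists."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        # Prefer the most specific (longest) matching prefix.
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


table = RoutingTable()
# Hypothetical announcements, using documentation-only ranges and labels.
table.announce("203.0.113.0/24", "AS64500")
table.announce("198.51.100.0/24", "AS64501")

before = table.lookup("203.0.113.10")
table.withdraw("203.0.113.0/24")  # the faulty update removes the route
after = table.lookup("203.0.113.10")
print(before, after)
```

Once the route is gone, the servers behind it may be running perfectly well, but from the outside there is simply no known way to get to them, much like Marsaxlokk in the story above.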
Small errors like this are not that uncommon. Situations like this have happened before, and they keep reminding us that the Internet is, in reality, very fragile. One wrong number and you’re cut off. We hope this helps you appreciate all the hard work that Internet service providers and enterprise network administrators do every day, under a lot of pressure, to make sure that you can run your e-business or spend your leisure time online.