What went wrong during the UK’s crippling air traffic control failure?
Was the flight plan that triggered the IT shutdown for an Air France San Francisco-Paris departure?
Nine days after the UK’s air-traffic control system failed, the company responsible has explained the cause of the problem – a sequence of events that had simply not been foreseen.
Nats – the national air-traffic provider for the UK – has released a report about the failure, which led to the cancellation of more than 2,000 flights.
What happened?
At 8.32am on bank holiday Monday, 28 August, the Nats air-traffic control (ATC) system received a flight plan for a transatlantic jet that was due to overfly the UK.
The flight plan had been handed on routinely by Eurocontrol’s Integrated Initial Flight Plan Processing System. Eurocontrol, based in Brussels, is the pan-European organisation that coordinates air navigation.
Normally the Nats system extracts the relevant UK portion of the flight and presents it to air-traffic controllers (ATCOs). But data in the flight plan triggered the shutdown of the entire system and its back-up (technically, they were both put into “maintenance mode”).
The report says the system “was unable to establish a reasonable course of action that would preserve safety and so raised a critical exception”.
A “critical exception” is the planned last resort – the point at which the affected system cannot continue.
Without the automated system to assist them, controllers could handle far fewer flights – just 15 per cent of the normal flow.
While aircraft that were in flight were able to continue without diverting, most planes were kept on the ground to avoid adding to the controllers’ workload.
What was wrong with the ‘rogue’ flight plan?
The root of the problem was that it contained duplicate “waypoints”. These are specific locations on the surface of the Earth with five-letter names. For example, flying over the Isle of Wight towards the London area, pilots typically traverse KATHY, ABSAV and AVANT. There is a finite number of combinations, and some are duplicated.
Flight plans for aircraft overflying the UK, as this one was, must contain a waypoint where the pilots intend to enter British airspace. They need not contain a waypoint at the exit point from UK skies. The Nats system is programmed to search a database for the nearest waypoint beyond British control. It appears that this was a duplicate of another waypoint in the flight plan.
“Since flight data is safety critical information that is passed to ATCOs [air-traffic controllers] the system must be sure it is correct and could not do so in this case,” the report says. “It therefore stopped operating, avoiding any opportunity for incorrect data being passed to a controller.”
The report says both the main and the back-up systems that normally allow thousands of flights to fly to, from and over the UK stopped working. The systems saw something they didn’t like and went into “maintenance mode”. They had both, according to Nats, “failed safely” – the first time this had happened. The entire process from normal to failure “took less than 20 seconds”.
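To make the failure mode concrete, here is a minimal sketch in Python of how a name-based search for an exit waypoint can become ambiguous. It is an illustration only, not Nats’ software: the `Waypoint` fields, the `in_uk` flag, the `extract_uk_portion` function and the `CriticalException` class are all assumptions invented for this example, and real flight-plan processing is far more involved.

```python
from dataclasses import dataclass


class CriticalException(Exception):
    """Raised when no safe interpretation of the flight plan can be found."""


@dataclass(frozen=True)
class Waypoint:
    name: str    # five-letter identifier; names are not unique worldwide
    in_uk: bool  # hypothetical flag: does this point lie in UK airspace?


def extract_uk_portion(route: list[Waypoint]) -> list[Waypoint]:
    """Return the UK segment of a route plus the first waypoint beyond it.

    Simplification: the exit point is located by name, which is only
    safe when that name appears exactly once in the route.
    """
    uk_indices = [i for i, wp in enumerate(route) if wp.in_uk]
    if not uk_indices:
        raise CriticalException("no UK entry waypoint in flight plan")

    entry = uk_indices[0]
    last_uk = uk_indices[-1]
    if last_uk + 1 >= len(route):
        raise CriticalException("no waypoint beyond UK airspace")

    # Look the exit waypoint up by name across the whole route.
    exit_name = route[last_uk + 1].name
    matches = [i for i, wp in enumerate(route) if wp.name == exit_name]

    # Fail safe: refuse to guess rather than pass on doubtful data.
    if len(matches) != 1:
        raise CriticalException(
            f"waypoint name {exit_name!r} appears {len(matches)} times"
        )
    return route[entry : matches[0] + 1]
```

With a route such as ALPHA, KATHY, ABSAV, DELTA (the first and last being made-up names for points outside the UK), the function returns KATHY, ABSAV, DELTA. If the route instead begins with a second, distant DELTA, the lookup finds two matches and the function stops with a `CriticalException` instead of choosing one. A back-up that ran the same check on the same plan would stop in the same way, which is consistent with the report’s account of both systems going into “maintenance mode”.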
Which flight was responsible?
The report does not reveal the airline or route, merely saying: “The flight was planned to depart at around 4am [BST] on 28 August, and arrive at around 3pm.”
The service that most closely matches these timings, and passed over UK airspace, is Air France flight AF85 from San Francisco to Paris CDG. It is scheduled to depart daily at 4am British time and arrive in the French capital at 2.50pm BST. This is speculation by The Independent and has not been confirmed.
The Nats report says: “This specific flight plan, with its associated characteristics (including duplicate waypoint names), has never previously been filed.”
What was the effect?
Disruption ripples very quickly through aviation, especially at busy airports. London Heathrow and Gatwick, the two biggest UK airports, are particularly susceptible. Cancellations began immediately.
As a result of the system failure, almost 1,600 flights were cancelled on Monday – grounding around 250,000 travellers.
On Tuesday, around 300 departures were cancelled as airlines struggled with aircraft and crew being out of position.
Over the following days, the number of cancelled flights topped 2,000. Many other flights were heavily delayed. Ryanair said it suffered more than 1,500 flight delays on 28 and 29 August, affecting over 270,000 passengers.
Tens of thousands of passengers spent Monday night sleeping on airport floors; many more travellers saw their holidays abruptly cancelled. Cancellations continued for days as airlines struggled to recover their schedules.
Passengers faced extreme price rises for alternative transport and for hotel stays, which they can reclaim from the airlines.
Carriers face estimated losses of £100m, mainly comprising care costs and lost revenue.
Was safety compromised?
No. Passengers were never in danger. Time and again in the report, the point is made that the reason aviation slowed to a trickle was to ensure controllers could keep flights safe.
Could it happen again?
Not in the same circumstances. The problem, and a straightforward fix, are known.
I put it to Nats that there was a possibility that another chain of events could lead to a closedown of the automatic system, and executives confirmed that it could.
I heard it was a cyber attack?
The report says: “We can rule out any cyber-related contribution to this incident.”
What do the boffins say?
One IT specialist told The Independent: “The problem appears to have been wholly down to duff software written by a contractor to Nats. The algorithm was fundamentally poor.”
Nats has been invited to respond.
Another said that a well-designed testing regime should have identified the issue long before it caused a problem.
What happens next?
The Civil Aviation Authority (CAA), which oversees air-traffic control operations in the UK, is launching its own review into “the wider issues around the system failure and how Nats responded to the incident”.
Rob Bishton, joint-interim chief executive at the CAA, said: “The initial report by Nats raises several important questions and as the regulator we want to make sure these are answered for passengers and industry.
“If there is evidence to suggest Nats may have breached its statutory and licensing obligations we will consider whether any further action is necessary.”
What form could ‘further action’ take?
The aviation minister, Baroness Vere, made it clear in a House of Lords debate that airlines would not be able to claim from Nats.
She said: “There is no mechanism by which airlines can seek financial compensation directly from Nats in this circumstance.”
But Nats could pay for its failure later, the minister added. “There are incentives for Nats linked to its performance; failure to reach target levels may incur penalties and reduce the charges paid by airlines,” she said.
“There is also a mechanism to reduce charges in subsequent years to the airlines because of poor performance.”
Were charges to fall, in theory airlines might pass on part of the saving to passengers.
Lower charges are one thing – but what about the £100m the airlines lost?
The Nats report says: “It is not within Nats’ remit to address any wider questions arising from the incident such as cost reimbursement and compensation for the associated disruption.”
What do the airlines say?
Michael O’Leary, chief executive of Ryanair, said: “This whitewash report, which understates the number of flights cancellations and flight delays, fails to explain why one inaccurate flight plan brought down not just the Nats ATC system, but also the backup system.
“Nats should explain why its backup system failed, and what they are doing to introduce an effective backup system, rather than the rubbish they have at the moment.
“We do not accept Nats claim that it is ‘not within remit’ to provide cost reimbursement to customers. Ryanair pays Nats almost €100m [£86m] per annum for an ATC service that is repeatedly short staffed and on 28 August collapsed altogether.
“The least Nats could and should do is to reimburse its airline customers for the tens of millions of pounds they have spent reimbursing passengers for their hotel, meals and transport expenses.”
The chief executive of easyJet, Johan Lundgren, said: “It is vital lessons have been learned so passengers never see a repeat of an incident on this scale. A full independent and wide-ranging review of Nats is needed to ensure it is fit for purpose today and in the future and so we welcome the CAA’s planned review.”
Tim Alderslade, chief executive of Airlines UK – representing British carriers – said: “It is concerning that a small fault with the data could lead to such a dramatic impact on passengers and operations and that it took so long to rectify. Lessons must be learnt to ensure it doesn’t happen in the future.
“Airlines worked round the clock in response to the situation, providing accommodation to passengers and putting on more flights to bring them home as quickly as possible, at huge cost to all carriers impacted.
“Airlines cannot be the insurer of last resort though and there must be accountability from Nats when things go wrong.
“Airlines are seeking clarity on what options exist for Nats to cover our costs under the current legislation and will continue to engage with Government on all options for redress. We can’t have a situation whereby airlines carry the can every time we see disruption of this magnitude.”