No Man's Land
Their report is 291 pages long, and the language is very technical, in keeping with the extremely technical failures of the QF72 accident. These are powerful computers, processing data and controlling all the functions and systems on the aircraft. The findings are based on three years of analysis of the recorded data from the black boxes and extensive coordination with the two manufacturers involved in the investigation, Northrop Grumman and Airbus.
The findings are a revelation to me because the systems’ complex behaviour and the commanded manoeuvres by the flight-control computers are mostly transparent to the pilot. The barrage of failures that occurred over a seven-and-a-half-minute period is summarised in a report that took three years to complete.
*
I arrange a face-to-face discussion with an ATSB investigator, Sam, to review the report and to finally listen to the cockpit voice recorder. Over the past three years, the ATSB team have been extremely professional and cordial, always available to answer my questions and hear my concerns.
In the preamble of the draft report, the role of the ATSB in any investigation is described:
It’s not a function of the ATSB to apportion blame or determine liability. At the same time, an investigation report must include factual material of sufficient weight to support the analysis and findings. At all times the ATSB endeavours to balance the use of material that could imply adverse comment with the need to properly explain what happened, and why, in a fair and unbiased manner.
This is the starting point of my discussion with Sam. The ATSB aren’t there to blame, but to analyse and report objectively in the interests of safety.
Sam is an intelligent, softly spoken man with enormous patience. As he methodically steps me through the report and explains the complexities of the failures, my initial assessment of the extreme nature of the accident is validated: this highly automated, complex machine generated complex failures that were unforeseen and untested during its development.
Sam and I match the timeline of failures with the timeline of the flight. Remember that I returned to duty at 1239 Perth time. What caused the first pitch-down?
The first pitch-down occurred two minutes after I returned to the flight deck. A lot happened in those two minutes – the shit started to hit the fan as soon as Ross returned from the toilet.
Hidden from the pilots, corrupted data was being randomly generated from ADIRU 1 in both the air-data and inertial reference partitions of this computer processor. The report labels them as ‘data spikes’. These caused the autopilot to disengage.
The first of many aircraft systems started to register faults, and the failures started to appear on our centre displays. The corrupt data then satisfied the criterion to generate stall and overspeed warnings. My airspeed and altitude displays were also affected, making them jump up and down on my primary display. This caused confusion but we addressed some of the faults while I was forced into manual flight due to the unreliable speed situation.
In the two minutes before the first pitch-down, twenty-two master caution chimes, seven overspeed warnings and ten stall warnings were recorded.
I wasn’t counting them but the aircraft’s recorders were. You can do the maths: thirty-nine warnings and cautions in two minutes, with stall and overspeed warnings indicating a serious flight condition that requires pilot action to remedy. Are you kidding?! As mentioned previously, these warnings and cautions couldn’t be silenced, which generated a tsunami of confusion and sensory overload.
During these two minutes, I was ‘sitting on my hands’ and trying to analyse what was happening. I knew I wasn’t stalling or overspeeding because, seconds before, we’d been in a stabilised flight condition, and nothing had changed to upset that stability. I could also confirm that we weren’t at the stall angle of attack, despite the incessant stall warnings, because I could look outside and visually verify that our flight path was safe. I wasn’t overly concerned at this time because the Airbus technical manuals don’t mention that erroneous stall warnings can generate a violent computer response. Perhaps they should.
Sam and I discuss my concerns and actions with the unreliable speed problem. There was a rapid, memory-based procedure for this condition, and the first step was to disengage the autopilot and manually fly. I didn’t want to remove the assistance of the autopilot while I was trying to find the causes of our problems, but it was a requirement. The first pitch-down would have occurred whether the autopilot was engaged or not.
At the speed of light, myriad data transmissions, computations and computer commands were flowing through the wiring between the air-data computers, the flight-control computers (PRIMs) and the aircraft’s elevator control system. The flight-control computers were looking closely at the bad angle-of-attack data spikes being produced in ADIRU 1.
Because it was extreme, the PRIM activated a ‘testing’ algorithm within its software, to test that value for authenticity. The extreme rate of false data spikes streaming from ADIRU 1 satisfied this algorithm. The erroneous value of angle of attack passed the test and was confirmed as ‘valid’ by the flight-control computer.
Then the PRIM activated two modes to protect the aircraft from a false conclusion that it was about to stall. A nose-down command was transmitted to the aircraft’s elevator in the tail. It forced the nose down in an abrupt, accelerating, pitching motion, then abruptly it commanded the nose up.
Classified as a pitch-down, it was like a boxer’s one-two punch or one of those novelty drinking birds, but on steroids. The computer’s pitch commands were rapid and violent, generating destructive g-forces and altitude loss.
My sidestick was blocked and ineffective for 1.8 seconds. The pitch-down generated a loss of 150 feet in that time.
In total, I used 690 feet of altitude to recover and level off before climbing back to 37,000 feet.
One of the flight-control computers had faulted during the first pitch-down, PRIM 3. The stall warnings were silent during this faulted period until it was reset.
The data spike production from ADIRU 1 continued, the warnings recommenced and our confusion was amplified.
The second pitch-down wasn’t as violent as the first, but it still generated an altitude loss and a lifting g-force.
The investigation provides evidence that the PRIM algorithm again tested the data spikes and confirmed the value as valid. The computers sensed a threat to the aircraft, and the data spike behaviour forced a change in the level of protection available from the flight-control system.
As a result, only one protection was activated.
The aircraft’s data shows my sidestick was blocked for 2.8 seconds until the completion of the protection manoeuvre. I regained control in my sidestick and started climbing back to 37,000 feet. We lost 400 feet during the dive and recovery.
The flight-control computers had reconfigured themselves automatically during the pitch-downs after fault conditions were generated. PRIM 3 faulted again and was displayed to the pilots. The level of aircraft protection was changed to Alternate Law because of this transparent reconfiguration, but it wasn’t displayed to us at the time. No information was displayed to notify the pilots of protection mode activation, data spiking or the specific protections activated. (In the report, the explanation describing this reconfiguration is complex.)
The confusing stall and overspeed warnings continued.
I note that the automatic pitch trim wasn’t working after the second pitch-down, and that I had to operate it manually. Sam explains that the loss of the automatic pitch trim was due to the reconfiguration of the flight-control computers. Its failure wasn’t communicated to the pilots.
The master caution chime was activating continuously in response to multiple system failures and faults. The fault messages were displayed momentarily and then disappeared, each replaced by a new fault. We couldn’t action any of them. Sam explains this was due to the rejection of air data from ADIRU 1 by many of the aircraft’s automated systems. I comment that this was an enormous addition to our workload.
QF72 then landed safely at 0532.
‘How exactly were the data spikes produced?’ I ask Sam.
He explains the technical failures in the Airbus software (Section 2 of the report) and the Northrop Grumman air-data computer (Section 3). The discussion is very technical, and I won’t attempt to describe it. The most important thing to note here is that the investigation reached two conclusions: without one failure, the other would not have occurred. The air-data computer generated extreme data and sent it to the flight-control computers. The PRIMs tested it then used it to injure 119 people on QF72. This is a simple summary for a tangled mess of failures.
Airbus has since been able to identify these particular flaws in their software and modify their algorithms so that this specific failure scenario won’t happen again. The investigation couldn’t pinpoint the reason why the Northrop Grumman air-data unit took good data, corrupted it and hid its faulty operation from the pilots. That remains unanswered.
One thing is certain: if the faulty unit had identified itself and been turned off, the accident would have been averted. But there was no procedure in place to address this unforeseen failure sequence in the ADIRU and its effect on the flight computer’s software, because it had not been fully tested during its development. There is a procedure now, but it isn’t an intuitive action like pushing a button on the control stick or forward instrument panel.
In 2010 Airbus ended their partnership with Northrop Grumman, and switched to the Honeywell corporation for the supply of air-data computers. Likewise, starting in 2011, Qantas replaced all their Northrop Grumman air-data computers with those from Honeywell. It took a couple of years to complete the refit.
We survived a perfect storm of automation failure. I’ve already concluded there was nothing more I could have done to stop the injuries dished out by my confused Airbus A330, and this debrief with Sam has confirmed my assessment – but I don’t find comfort in that realisation. This accident damaged me and exposed me to the unforgettable images of my injured passengers and crew. The closure I’ve desperately sought is denied.
The discussion of the ADIRU operation and the complexity of the PRIM algorithms does not renew my confidence in the Airbus design. The achievement of the pilots working together to save QF72, while noted, is not redeeming. I already knew we had come back from the brink.
*
It’s time for me to listen to the cockpit voice recorder. It contains all the conversations, sounds and warnings in the cockpit and from our radio transmissions, and also from inside the cabin through the intercom system. The recording is a protected document in Australian aviation investigation and legislation, and I honour that element of privacy; in many countries, it isn’t protected.
As I sit there with Sam, I recall the emotional effect the recording had on Pete and Ross when they first listened to it. Sam warns me of its confronting content before he pushes ‘play’.
My recollection of the flight is different to what I’m hearing from the recorder. I’ve told many friends that I was calling this Airbus every bad word in the book as its malfunctions threatened everyone on board. But I don’t hear any of this from the recorder; in fact, if you were a casual observer unaware of what was happening in the cockpit or with the aeroplane, you’d be excused for thinking that Ross and I were sitting on a park bench, quietly feeding the pigeons. We sound calm, professional and focused, despite the massive doses of adrenalin coursing through our bodies. Ross’s ‘What the fuck was that?!’ is recorded truthfully and accurately, however.
The cacophony of warnings and alerts is captured. Each one required recognition and reaction, but which ones were real and required action, and which ones weren’t?
The most chilling sound is the crash I heard during the first dive. It sounds like a 40-foot shipping container full of plates and glasses smashing into a brick wall. I can clearly hear the passengers screaming as they were catapulted into the ceiling. I can hear the forward galley’s meal carts and storage boxes exploding onto the floor as the plane lurches down. The screams stop abruptly; the passengers had hit the ceiling, stunning them, before they were thrown onto the floor. Their screams are replaced by moans and sobs. It’s a morbid recording.
Sam is watching me closely. I take a deep breath and exhale. It was hell in the cabin, a terrifying hell. It was an ambush from an old western movie, and the settlers were being scalped alive.
But, despite the confronting nature of these sounds of pain and despair, I need to hear them.
‘Yeah, Sam. It was hell,’ is all I can muster in response to his inquisitive gaze.
He fast-forwards to a particular point that he has marked for my listening: Pete was talking on the satphone, Ross was on the intercom talking to the cabin crew and reorganising our flight libraries, and I was manually flying while talking to Melbourne Air Traffic Control. The ever-present dings and warnings are recorded too.
Sam stops the tape. ‘Your workload here is enormous,’ he remarks.
‘Yes, it was. We had to separate to complete these independent tasks but everyone still had to crosscheck everything I was doing while I was manual flying, and to try and keep track of the information being received from the maintenance unit and the cabin crew. All those tasks were required to aid our preparation for approach and landing.’
Unfortunately, this workload can’t be replicated in the flight simulator; if you can handle this scenario then you can handle anything.
Of course, I remembered almost everything that we did and said, which is recalled in previous chapters based on my memories on the day of the accident. The rest of the tape leaves me disappointed – I still can’t hear any of my foul-mouthed derision and scolding of the aircraft and its bad behaviour, but I certainly didn’t hold back.
My meeting with Sam is over, and I thank him and his team. Their report should be documented as an industry standard for its complexity, objectivity and attention to detail.
*
When I read the final ATSB report, I’m relieved to see this comment: ‘The flight crew’s responses to the warnings and cautions, the pitch-down events, and the consequences of the pitch-down events, demonstrated sound judgement and a professional approach.’ I didn’t discuss its inclusion with Sam.
There’s much more in this section. It justifies our decision to divert to Learmonth and our distrust in the failing operation of the many computer systems. There’s no criticism of the pilots, and I’m pleased to see a thumbs-up for the heroic actions of the cabin crew.
Case closed. Not for me. The images of the carnage and injuries remain imbedded in my memory.
Section 4 of the final report analyses the forces generated and summarises the injuries. This is the first time I’m aware of the serious nature and scope of the injuries experienced in the cabin.
It mentions that two passengers were seriously injured, even though their seatbelts were fastened: one was a child admitted to hospital with abdominal contusions; the other was an adult who experienced neck pains and was admitted to hospital three days later after having a stroke.
There was damage in some of the toilets, and I remember noticing this after we landed at Learmonth. One was completely unhinged during the pitch-downs while someone was using it. Imagine the injuries that this person must have endured after being slammed into the ceiling by the negative g, then being thrown to the floor with the dislodged toilet assembly under or on top of them.
Hidden at the bottom of one page is Footnote 199. It describes the serious injuries inflicted upon Fuzzy during the pitch-downs: ‘The flight attendant who was standing in the rear cabin impacted a handrail in the galley after hitting the ceiling, which exacerbated his injuries and also damaged the handrail.’ It’s an understatement to say, ‘and also damaged the handrail’. His body struck the stainless steel handrail with such force that it deformed the railing; this impact made him return to consciousness for an instant before his knees were slammed into the galley floor.
Due to the uncertainty of further manoeuvres occurring, I was forced to keep everyone in their seats with seatbelts fastened. The cabin crew instructed passengers to clear the aisles of debris to prepare for the likelihood of an emergency evacuation once we landed. This is incredible initiative:
Prior to and after the second event, the cabin crew provided instructions to passengers to be seated and to keep their seat belts fastened. Some passengers requested medical assistance but the cabin crew advised them that they were unable to leave their seats. Some passengers provided medical attention to other passengers seated close to them, and cabin crew provided advice from their seats to some passengers about medical treatment. Towards the end of the flight, cabin crew and passengers in some areas cleared the aisles of debris and bags within their reach and without leaving their seats.
The report summarises the physical injuries sustained by the people in the cabin, but it doesn’t mention the enormous psychological stress they experienced for the fifty minutes we were airborne and preparing to land after the first dive.
Our situation makes me think of movie scenes where a soldier or a kidnap victim is kept in a high state of uncertainty as their captors perform mock executions with unloaded weapons held to their hooded heads. I feel this accurately represents the stresses experienced by everyone on board during those fifty minutes.
I walked out, thankful for retaining my scalp, but I’m still burdened. We had so many factors working against us, and again I felt lucky to have survived.
I consider the QF72 accident to be one of the most catastrophic in modern aviation. All the automated systems, designed and installed for enhanced safety, failed. A complex system created a complex and destructive failure.