No Man's Land


by Kevin Sullivan


  Examination of the data recorded by the aircraft’s black boxes has identified abnormal data production and output from ADIRU 1. The investigators are now able to match the failures with a time marker. They can also identify the failure sequence caused by the erroneous data generated by the air-data computer and its domino effect on the computers and automated systems.

  Of significance is the data recorded from ADIRU 1 regarding angle of attack. This raw value is captured by each of the three metal probes on the sides of the aircraft below the cockpit. Simplistically, it’s a measure of the aircraft’s body angle relative to the air mass; more precisely, it’s the angle of the aircraft’s wing relative to the air mass. In cruise, an aircraft has an angle-of-attack value of +2 degrees, and this was the recorded value before the autopilot disconnected at 1240 (Perth time). The angle-of-attack values recorded after that were the ‘spikes’ mentioned previously, and their peak value was +50.6 degrees.

  Angle of attack is a critical value for a fly-by-wire aircraft. The computers responsible for keeping the aircraft safe need to know where the aircraft is inside the air mass it’s travelling through. The air-data values generated inside the three air-data computers provide this situational awareness to the flight-control computers. Each air-data computer produces and transmits its angle-of-attack value sixteen times per second to each of the flight-control computers, and that data is recorded each second by the aircraft’s black-box recorders. A lot can go wrong within those one-second recorded points.
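
  To picture how much can hide between those recorded points, here is a rough sketch of my own – illustrative only, not Airbus or ATSB code, and every number beyond the sixteen-per-second and once-per-second rates mentioned above is invented:

# Illustrative only: a transient spike seen by the flight-control computers
# sixteen times per second can fall between the once-per-second samples kept
# by the flight data recorder. All values here are invented.

TRANSMIT_HZ = 16   # the ADIRU sends angle of attack 16 times per second
RECORD_HZ = 1      # the black box keeps roughly one value per second

# Two seconds of a normal cruise angle of attack (about +2 degrees) ...
stream = [2.0] * (2 * TRANSMIT_HZ)
# ... with a single spurious spike injected between the recorded points.
stream[20] = 50.6

seen_by_computers = stream                              # what the computers receive
kept_by_recorder = stream[::TRANSMIT_HZ // RECORD_HZ]   # one sample per second

print(max(seen_by_computers))   # 50.6 -> the computers saw the spike
print(max(kept_by_recorder))    # 2.0  -> the recording shows nothing unusual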

  The recorded angle-of-attack spikes of +50.6 degrees are huge values. Even high-performance military aircraft don’t operate in these regimes, except under extreme manoeuvring, and there has to be something seriously wrong with ADIRU 1 for these values to be generated internally. A value of around +2 degrees would have been measured by the aircraft’s angle-of-attack probes and sent to the air-data computers. Somehow, ADIRU 1 took that routine value and magnified it to +50.6 degrees before distributing it to the three flight-control computers. This extreme value exceeds the wing’s stall angle: the angle where the air over the wing separates and the wing’s lifting force is lost.

  The ATSB has determined that these spikes in angle of attack generated the stall warnings – forty-two were recorded, but there could have been more because the black box captures data only once per second. There were more; I can confirm that.

  Commercial pilots aren’t able to monitor angle-of-attack values through any available instrumentation, so we’re oblivious to the values going into and coming out of the ADIRUs. But navy pilots are trained to use angle of attack to set various performance values. I’ve long held the opinion that angle of attack is a useful performance tool in commercial aviation; however, it’s only being embraced in newer-generation commercial transport designs.

  Had I been alerted, through appropriate instrumentation, that the aircraft was generating random angle-of-attack values of +50 degrees, perhaps I would have been able to address that scenario more effectively.

  The stall and overspeed warnings generated by corrupted data from ADIRU 1, though found to be spurious, were continuous and unable to be silenced. These warnings are serious and require immediate pilot action if they’re valid, so their activation was an enormous distraction to the flight crew while we attempted to control the aircraft, gather information and generate a landing strategy.

  The preliminary report also comments on the potential for electromagnetic interference, from either personal electronic devices used by the passengers or from the Naval Communication Centre Harold E. Holt. This is spooky stuff.

  It’s painfully obvious to me that these many failures are extremely complex and outside the realm of the Airbus training course or technical information provided at the time. How can an aircraft be certified as safe if its design allows this type of dangerous behaviour? How can the ADIRU corrupt benign values and produce extreme data, and how can the flight computers use that data so aggressively?

  The preliminary report confirms to me that this accident was a science-fiction scenario that became a reality: automation gone haywire. The humans, as the last line of defence, were ill-equipped to stop it. There is no disconnect, no fail-safe switch, installed on an Airbus to override this type of computer malfunction.

  I use the film 2001: A Space Odyssey to understand and accept what happened on the flight. I was Dave, negotiating with the sentient computer HAL, to ‘open the pod bay doors’. My commands to arrest the dives were overruled: ‘I’m sorry, Dave. I’m afraid I can’t do that.’ The aircraft wouldn’t let me stop it from protecting itself.

  23.

  Another Airbus is lost in the month following QF72.

  On 27 November 2008, an Airbus A320 crashes near Perpignan, France, in the course of completing a test flight. Air New Zealand had leased the aircraft to XL Airways Germany, and it was being returned. The pilots somehow lose control of the aircraft on approach. All seven crew members perish.

  A photo shows the vertical tail submerged in shallow water, emblazoned with the Air New Zealand logo, as news of the tragedy is reported. I wonder: Could this accident have similarities to QF72? How could the Airbus design, with its infallible protections, allow this type of loss-of-control event to occur? By now, I’m seriously questioning the safety of the Airbus design, and it troubles me greatly that an Airbus has gone down so quickly after my own event.

  The interim report, released in February 2009, points to pilot error as the cause of the crash. I examine the data and the graphs at the end of the report, noticing that both angle-of-attack sensors appeared to have failed early in the flight, but no mention of this is made in the interim report. Even when the aircraft had departed controlled flight, it showed a constant low angle-of-attack value. I’m surprised and baffled by this omission, but most reports from this European state’s investigative agency are quick to point the finger at pilots in any accident involving an Airbus.

  The following year, the final report pinpoints water intrusion into the angle-of-attack mechanisms after the aircraft was washed without the sensor area being covered correctly. This water froze early in the flight, at higher altitude, freezing the angle-of-attack reading at a constant value, which was then sent to the flight-control computers. Sadly, there was an element of pilot error involved, but the frozen angle of attack played a part.

  Media around the world are quick to question if this accident has similarities to QF72. The resounding answer from ‘aviation experts’ is ‘no’. But I believe they’re similar. Air data was ‘corrupted’ through the frozen angle-of-attack probes, and the aircraft used that corrupted data. The electronic flight-control computers didn’t alert the pilots to an anomaly with the angle-of-attack value. The management computers responsible for displaying airspeed and altitude to the pilots are also responsible for displaying the angle-of-attack limit on the pilot’s primary display. The frozen angle-of-attack value subtly modified the limit value displayed on the pilot’s airspeed indication.

  The pilots made a mistake; they rushed their protection check at low altitude, and they didn’t calculate the indicated speed at which this protection should activate in accordance with their checklist procedures. This combination of corrupted air data and pilot confusion generated a loss-of-control event. In the course of their attempts to recover the aircraft, the flight computers subtly reverted to the most basic control law, called Direct Law, in which the protections no longer apply. This allowed the pilots to stall the aircraft and presented a complicated recovery that they weren’t able to accomplish.

  Analysts have started to quote a new term for pilot error: ‘startle factor’. In my opinion, this factor is subliminally built into the Airbus design. We were certainly served a huge dose of it.

  The sensor probes are subjected to the extremes of the air mass in which they’re required to operate. They can be blocked, frozen or damaged, but they will still send a raw signal to the ADIRU, meaning a corrupted value of air data will be displayed to the pilots. It can be a complicated task for a pilot to effectively recognise unreliable speed, angle of attack or temperature, given that the aircraft might be experiencing bad weather, turbulence or systems failures.

  If the Airbus is so smart, why doesn’t the automation provide this recognition function? Why depend on the pilots to recognise this degradation in air data and potentially get it wrong? The QF72 crew was presented with a similar scenario, but we survived. It’s a scenario that becomes repetitive in the accidents and incidents on Airbus aircraft in the years to come.

  *

  By now, VH-QPA has been repaired and tested, and is ready to be reintroduced into normal operations.

  Then another automation event occurs.

  On 27 December 2008, a Qantas Airbus A330-303 aircraft, registered VH-QPG, is flying from Perth to Singapore as QF71. While the aircraft is in cruise at 36,000 feet, the autopilot disconnects and the crew receives an alert of a failure in ADIRU 1. This computer is the same model as that involved on QF72.

  In the wake of QF72, Airbus had produced a new procedure to address non-normal ADIRU operation – one that didn’t exist before QF72. It involves steps to isolate either the air-data or navigation portion of that computer and, under a given set of circumstances, to shut the computer down completely. The procedure was in its third iteration at this time and would change again as more testing and operational feedback occurred.

  The crew of QF71 are prepared, as most Airbus crews are at this time, with Version 3 of the procedure pre-positioned on the instrument glareshield, and they quickly execute the necessary steps before the aircraft’s computers can use the corrupted data. Their prompt actions prevent computer-generated manoeuvres, but they keep receiving warning and caution messages that constantly scroll on the display and can’t be silenced. The crew elects to return to Perth and an uneventful overweight landing is conducted.

  After the QF71 event, the operational procedure is further modified to Version 4 in January 2009.

  Sitting on the sidelines, I quietly pat myself on the back for taking time off from work while the manufacturer works through and publishes all these procedural changes. That could have been me operating as the captain of this flight. What happened on QF71 shows there are still issues with the Northrop Grumman ADIRU and the Airbus software.

  *

  In February 2009, the chief ATSB investigator notifies the QF72 pilots of the impending publication of an interim report. Prior to its release, we’re provided with a copy to comment on. The report builds on the foundations of the preliminary report, doing more to analyse the technical complexity of the failures of the Airbus electronic flight-control system and its operational partnership with the Northrop Grumman air-data computers.

  On 6 March 2009, the ATSB publishes the interim factual report. Fifty-three pages long, it features updates on the number and types of injuries, seatbelt and cabin safety issues, and ongoing ADIRU testing. Of significance, there are more answers as to why the aircraft’s computers behaved so aggressively. Through the partnership with Airbus, the data spikes have been re-created on their massive systems test rig, nicknamed the ‘Iron Bird’. According to Airbus, the Iron Bird

  is an engineering tool used to design, integrate, optimise and validate vital aircraft systems. Since all aircraft systems are controlled from the flight deck, the Iron Bird needs a cockpit for its control. Three Fixed Based Simulators (FBS) are used along with a mobile visual system. After certification and when the aircraft is in revenue service, the Iron Bird is used for further development of the aircraft systems as well as a test bench to trace anomalies that may show up with components or systems.

  Airbus successfully simulated the data spikes and found that the installed software in their flight-control computers, designed to filter these spikes, was flawed. The software algorithm allowed the flight computers to accept the extreme angle-of-attack values as real, and therefore ordered the two pitch-downs. This discovery identifies a rare and unforeseen scenario that wasn’t considered in its design or during its certification testing. This scenario lay dormant in the spaghetti of all the computer software until the perfect storm of QF72 generated the opportunity for it to fail.
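
  The interim report doesn’t reproduce the algorithm itself, so what follows is not the Airbus filter – just a rough sketch of my own showing the general class of flaw: a monitor that rides through a single transient but trusts whatever arrives once its masking window expires. The window length, the threshold and the names are all assumptions made purely for illustration.

# Illustrative only: NOT the Airbus flight-control algorithm. A spike filter
# that masks a single transient but resynchronises to the raw input, without
# re-checking it, the moment its masking window expires. All numbers assumed.

HOLD_SECONDS = 1.2          # assumed length of the masking window
SPIKE_THRESHOLD = 10.0      # assumed jump (degrees) treated as a transient

def filter_aoa(samples, dt):
    """samples: raw angle-of-attack values; dt: seconds between samples."""
    output, memorised = [], samples[0]
    masking, hold_until = False, 0.0
    for i, raw in enumerate(samples):
        t = i * dt
        if masking:
            if t < hold_until:
                output.append(memorised)   # ride through the assumed transient
                continue
            masking = False                # window expired: trust the input again
            memorised = raw                # ... without re-checking it (the flaw)
            output.append(raw)
            continue
        if abs(raw - memorised) > SPIKE_THRESHOLD:
            masking = True                 # jump detected: mask it for a while
            hold_until = t + HOLD_SECONDS
            output.append(memorised)
        else:
            memorised = raw
            output.append(raw)
    return output

dt = 1 / 16                                # sixteen samples per second
aoa = [2.0] * 64
aoa[10] = 50.6                             # this first, lone spike is masked ...
aoa[30] = 50.6                             # ... but one at window expiry gets through
print(max(filter_aoa(aoa, dt)))            # 50.6 reaches the output

  A filter like this behaves well against a single glitch, which is part of why a timing-dependent corner case can lie unnoticed through design and certification testing.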

  I begin to wonder how many more flawed algorithms are potentially lying in wait within the software.

  The report then describes what the computers commanded to cause the dives and injuries. The extreme angle-of-attack and other air-data values produced by ADIRU 1 fooled the flight computers into accepting that the aircraft was in danger. They simultaneously activated two protections to keep the aircraft safe.

  Unlike the pilot, the flight computers have direct control over the aircraft’s elevator control. This is the control surface on the tail that moves the aircraft up or down; the computers can move it instantaneously and precisely if required. A pilot must request this movement through either the sidestick or the autopilot, and extreme requests are filtered by the software to maintain the aircraft in its safety box of operation.

  The two protections ordered by the computers each have a value of tail movement, producing a downward flight path. The first protection, ‘High Angle-of-Attack Protection’, is worth four degrees of elevator movement. The second protection, ‘Anti Pitch-Up Compensation’, is worth six degrees of elevator movement. It’s simple arithmetic: 4 + 6 = 10. That’s close to the maximum downward tail movement of 14 degrees. On QF72, the computers combined the two protections to generate 10 degrees of tail movement. The aircraft’s attitude was +2 degrees above the horizon while in cruise. The computers’ command pushed the nose down to -8.4 degrees below the horizon – within only two seconds. At 37,000 feet and at the fast cruise speed we were flying, this pitch rate generated an enormous downward force.
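
  Laid out as a back-of-the-envelope check of my own (the figures come from the paragraph above; the arithmetic is not quoted from the report):

# Back-of-the-envelope check of the figures quoted above.
high_aoa_protection = 4.0          # degrees of elevator
anti_pitch_up_compensation = 6.0   # degrees of elevator
commanded = high_aoa_protection + anti_pitch_up_compensation
print(commanded)                   # 10.0, of a maximum of roughly 14 degrees

pitch_before = 2.0                 # degrees nose-up in cruise
pitch_after = -8.4                 # degrees nose-down after the command
seconds = 2.0
change = pitch_before - pitch_after
print(change, change / seconds)    # 10.4 degrees in two seconds, about 5.2 deg/s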

  The Airbus design preordained that the pilots, the last line of defence, would be powerless to stop those erroneous modes from activating: no instinctive control or override is built into the design. When protections are active, the pilot’s sidestick commands are ignored. The nose movement looks benign on the ATSB website animation, but the elevator was moved aggressively to near maximum deflection within those two seconds. A human pilot can’t physically perform this precise pitch-down manoeuvre as effectively and violently as the computers can.

  I still can’t understand how the software allowed two protections to be added together and generate the same movement at this high altitude as it would at a lower one.

  The report classifies what happened on QF72 as an ‘accident’. The a-word: one that airline operators dread and despise. Of course, Qantas is known for its impeccable safety record, globally – even in Hollywood. ‘Qantas never had a crash,’ as Dustin Hoffman’s character proclaims in Rain Man.

  The ATSB explains the reason for this accident classification, a result of the many serious injuries experienced on the flight.

  Qantas, like many other airlines, has experienced some serious events in its recent operating history. However, if there are no serious injuries and if the airline can repair the damage and return the aircraft to operation – as it did after the QF30 and QF32 events – the event is downgraded from an ‘accident’ to an ‘incident’. This preserves an airline’s ‘accident-free’ record.

  VH-QPA sustained substantial damage to the interior of the aircraft, but it was repaired. Sadly, however, there would be no quick fix for those seriously injured on QF72.

  This accident had nothing to do with the airline, and yet it would be stuck with an accident in its safety history.

  The ATSB interim report has identified the root causes of the pitch-downs and the errant generation of air data, but all the ‘whys’ are yet to be determined. After reading this description of the pitch-downs, and learning that the flawed software is dormant, I again question – as do my colleagues – why the Airbus fleet hasn’t been grounded.

  The investigation has a long way to go. It’s April 2009, and I have some big decisions to make.

  The name of a song by the Clash keeps going through my mind as I contemplate my future: ‘Should I stay or should I go?’

  24.

  In most airlines there’s a seniority system. On your first day at work, you’re assigned a number. This determines which aircraft type you fly, along with leave periods and the base city you’ll fly from. As you progress in an airline career, opportunities open up to advance and fly the long-range aircraft, with higher salary and overtime pay rates. Your position relative to your colleagues’ remains constant, determined by your seniority number.

  Being a captain flying the Airbus A330 is the highest position my number allows. The other non-Airbus fleets are either being deactivated (767) or aren’t accepting new positions (747-400). Nevertheless, in May 2009, I decide to return to work.

  I’ve thought seriously about stopping permanently, but I won’t let go of my lifelong passion without a fight. The fight, I’ve realised, is with myself. I don’t think anyone will think less of me if I don’t return to flying. Many of my colleagues appreciate the severity of my ordeal, along with the price my crew and I have paid to ‘save the day’. Whenever I’m asked about what happened, I struggle to relate the challenges we faced – my voice cracks, my face flushes, and my speech becomes louder and more animated as I summarise the flight and the frustrations afterwards.

  In fact, eighteen months pass before I can tell the whole story without my voice cracking. This confirms to me that I’ve sustained some permanent damage, but I won’t give in. I need to look at myself in the mirror and into my daughter’s eyes, and proclaim that I did not quit. I won’t hide my damage, nor will I let it stop me from trying to reclaim my life and my career.

  Vince Lombardi was a famous American football coach for the Green Bay Packers, widely recognised for his inspirational quotes. ‘It’s not whether you get knocked down, it’s whether you get up,’ he said.

 
