by James Reason
Management/Maintenance
Vent gas scrubber was in inactive mode.
Management
Iron pipelines were used for transporting MIC.
Management
A manual mechanism for switching off scrubber.
Design/Management
No regular cleaning of pipes and valves.
Maintenance/Management
No online monitor for MIC tanks.
Design
No indicator for monitoring position of valves in control room.
Design
Pressure monitor under-reading by 30 psig.
Design
Case Study No. 3: Challenger
Date
Actions and latent failures
1977
During test firings of the solid-rocket booster, Thiokol engineers discover that casing joints expanded (instead of tightening as designed). Thiokol persuades NASA that this is “not desirable but acceptable.” It was also discovered that one of the two O-ring joint seals frequently became unseated, thus failing to provide the back-up for which it was designed.
1981
NASA plans two lightweight versions of the boosters in order to increase payload. One is to be of steel, the other made of carbon filament. Hercules submits an improved design for the latter, incorporating a lip at the joint to prevent the O-ring from unseating (termed a “capture feature”). Thiokol continues to use unmodified joints for its steel boosters.
November 1981
Erosion (or “scorching”) was noticed on one of the six primary O-rings. This was the same joint that was later involved in the Challenger disaster.
December 1982
As a result, NASA upgrades the criticality ratings on the joints to 1, meaning that the failure of this component could cause loss of both crew and spacecraft.
April 1983
Some NASA engineers seek to adapt the Hercules “capture feature” into the new thinner boosters. The proposal is shelved and the old joints continue to fly.
February 1984
Just prior to the 10th shuttle launch, high-pressure air tests are carried out on the booster joints. On return, an inch-long “scorch” found on one of the primary O-rings. Despite the “critical-1” rating, Marshall Space Center reports that no remedial action is required. No connection noticed between high-pressure testing and “scorching”, although pinholes in the insulating putty were observed.
April 1984
On 11th flight, one of the primary O-rings is found to be breached altogether. This was still regarded as acceptable. No connection made between high-pressure air testing and scorching, even though the latter was found on 10 of the subsequent 14 shuttle flights.
January 1985
Breaches (“blowbys”) are found on four of the booster joints. Weather at launch coldest to date: 51°F with 53°F at the joints themselves. No connection noted.
April 1985
On the 17th shuttle mission, the primary O-ring in the nozzle joint fails to seal. Scorching found all the way round the joint.
July 1985
After another flight with three blowbys, NASA booster project manager places a launch constraint on the entire shuttle system. This means that no launch can take place if there are any worries about a Criticality-1 item. But waivers may be granted if it is thought that the problem will not occur in flight. Waivers are granted thereafter. Since top NASA management were unaware of the constraint, the waivers are not queried.
July 1985
Marshall and Thiokol engineers order 72 of the new steel casing segments with the capture features.
July 1985
Thiokol engineer writes memo warning of catastrophe if a blowout should occur in a field joint.
August 1985
Marshall and Thiokol engineers meet in Washington to discuss blowbys. Senior NASA manager misses meeting. Subsequently, 43 joint improvements ordered.
December 1985
Director of the solid rocket motor project at Thiokol urges “close out” on the O-ring problem (i.e., it should be ignored) on the grounds that new designs were on their way, and the difficulties were being worked on. But these solutions would not be ready for some time.
January 23 1986
Five days before the accident, the entry “Problem is considered closed” is placed in a NASA document called the Marshall Problem Reports.
January 27 1986
It is thought probable that, on the night before the launch, the temperature would fall into the twenties, some 15°F colder than the previous coldest launch a year earlier. (The actual launch temperature was 36°F, having risen from 24°F.) At this point, Allan McDonald, Thiokol’s chief engineer at the Kennedy Space Center (the “close out” man) experiences a change of heart and attempts to stop the launch.
January 28 1986
The Challenger shuttle is launched and explodes seconds after, killing all seven crew members. A blowout occurred on one of the primary booster O-rings.
Case Study No. 4: Chernobyl
Chain of events and active failures
Contributing conditions and latent failures
At 1300 on 25 April 1986, power reduction starts with the intention of achieving test conditions. The tests are to be carried out at 25 per cent full power (in the 700MW range). They are to be conducted in Unit 4, sharing common facilities with Unit 3.
The test was to see whether the “coast-down” capacity of a turbine generator would be sufficient, given an appropriate voltage generator, to power the Emergency Core Cooling System (ECCS) for a few minutes. This would fill the time it took to get the diesel standby generators into operation.
A voltage generator had been tested on two previous occasions, but had failed because of rapid voltage fall-off. The goal on this occasion was to carry out repeated testing just prior to the annual maintenance shutdown, scheduled to begin on the following Tuesday.
According to Russian sources, the quality of the test plan was “poor and the section on safety measures had been drafted in a purely formal way.” In addition, the test plan called for shutting off the ECCS for the entire test period (about 4 hours). Authority to proceed was given to station staff without the formal approval of the Safety Technical Group. In addition, there is some evidence that three other RBMK plants (at Leningrad, Kursk and Smolensk) had refused to carry out these tests on safety grounds.
The principal testers were electrical engineers from Moscow. The man in charge, an electrical engineer, “was not a specialist in reactor plants” (Russian report).
(Institutional and managerial errors and violations)
At 1400, the ECCS is disconnected from the primary circuit.
This was part of the test plan, but it stripped the plant of one of its main defences.
(Managerial failure)
At 1405, Kiev controller asks Unit 4 to continue supplying grid. The ECCS is not reconnected.
Although this failure to reconnect the ECCS did not contribute directly to the subsequent disaster, it was indicative of a lax attitude on the part of the operators toward the observance of safety procedures. Subsequent 9 hours of operating at around 50 per cent full power increased xenon poisoning, making plant more difficult to control at low power.
(Managerial and design failures)
0028: Having been released from the grid at 2310, operators continue power reduction. But operator omits entry of “hold power” order; this leads to very low power.
The design of the RBMK reactor renders it liable to positive void coefficient at power settings below 20 per cent full power. After a long struggle, reactor power was stabilised at 7 per cent full power.
At this point, the test should have been abandoned in view of the dangerously low power setting. Russian comment: “The staff was insufficiently familiar with the special features of the technological processes in a nuclear reactor.” They had also “lost any feeling for the hazards involved.”
(Managerial, design and operational failures)
Operators and engineers continue to improvise in an unfamiliar and increasingly unstable regime to protect test plan. Plant goes super prompt critical. Explosions occur at 0124.
To ensure the continuance of the test, the operators and engineers gradually strip the reactor of its remaining defences. By 0122, the core had only 6 to 8 control rods inserted. An attempt to ‘scram’ the reactor at 0124 fails. Prompt criticality is now irreversible.
(Managerial, design and operational failures)
Case Study No. 5: Herald of Free Enterprise
Chain of events and active failures
Contributing conditions and latent failures
Herald is docked at No. 12 berth in Zeebrugge’s inner harbour and is loading passengers and vehicles before making the crossing to Dover.
This berth is not capable of loading both car decks (E and G) at the same time, having only a single ramp. Due to high water spring tides, the ramp could not be elevated sufficiently to reach E deck. To achieve this, it was necessary to trim the ship nose down by filling trim ballast tanks Nos. 14 and 3. Normal practice was to start filling No. 14 tank 2 hours before arrival.
(System failure)
At 1805 on 6 March 1987, the Herald goes astern from the berth, turns to starboard, and proceeds to sea with both her inner and outer bow doors fully open.
The most immediate cause is that the assistant bosun (whose job it was to close the doors) was asleep in his cabin, having just been relieved from maintenance and cleaning duties.
(Supervisory failure and unsuitable rostering)
The bosun, his immediate superior, was the last man to leave G deck. He noticed that the bow doors were still open, but did not close them, since he did not see that as part of his duties.
(Management failure)
Chief officer checks that there are no passengers on G deck, and thinks he sees assistant bosun going to close doors (though testimony is confused on this point).
The chief officer, responsible for ensuring door closure, was also required (by company orders) to be on the bridge 15 minutes before sailing time.
(Management failure)
Because of delays at Dover, there was great pressure on crews to sail early. Memo from operations manager: “put pressure on your first officer if you don’t think he’s moving fast enough...sailing late out of Zeebrugge isn’t on. It’s 15 minutes early for us.”
(Management failure)
Company standing orders (ambiguously worded) appear to call for “negative reporting” only. If not told otherwise, the master should assume that all is well. Chief officer did not make a report, nor did the master ask him for one.
(Management failure)
On leaving harbour, master increases speed. Water enters open bow doors and floods into G deck. At around 1827, Herald capsizes to port.
Despite repeated requests from the masters to the management, no bow door indicators were available on the bridge, and the master was unaware that he had sailed with bow doors open. Estimated cost of indicators was £400-500.
(Management failure)
Ship had chronic list to port.
(Management and technical failure)
Scuppers inadequate to void water from flooded G deck.
(Design and maintenance failure)
Top-heavy design of the Herald and other “ro ro” ships in its class was inherently unsafe.
(Design failure)
Case Study No. 6: King’s Cross Underground fire
Chain of events and active failures
Contributing conditions and latent failures
At 1925 on 18 November 1987, discarded smoker’s material (probably) sets fire to grease and detritus in right hand running track of escalator 4 (up) Piccadilly Line.
Wooden escalator installed in 1939. Long recognised as being especially fire-prone. Water fog equipment installed in 1948. Could not be used nightly because of rust problems. Smoke detectors not installed: expense not justified.
Forty-five per cent of the 400 fires recorded on London Underground over previous 20 years had occurred on MH escalators.
Running tracks not regularly cleaned, partly due to organisational changes which blurred maintenance and cleaning responsibilities. Safety specialists scattered over three directorates focused on operational or occupational safety. Passenger safety neglected.
Railway Inspectorate took a blinkered view of their role. They did not pursue issues of fire protection. Judged as having “too cosy” a relationship with London Underground.
Smoking permitted on London Underground trains and premises.
(Hardware, organisational and regulatory failures)
At 1930, passenger alerts booking clerk to small fire on escalator 4. Booking clerk rings Relief Station Inspector (RSI), but does not specify precise location of fire.
Inadequate fire and emergency training given to staff. It was accepted by LU that the quality of staff training at its White City training centre had been inadequate. Only 4 of the 21 station staff on duty had had any training in evacuation or fire drills.
(Management failure)
At 1934, railway police evacuate passengers via Victoria Line escalator. They are unaware of the layout of the station.
No evacuation plan existed for King’s Cross underground station. No joint exercises between LU and the emergency services had been conducted.
(Management failure)
Between 1935 and 1938, RSI enters lower machine room, but fails to detect fire. He enters upper machine room and sees smoke and flames. Fetches fire extinguisher, but cannot get close enough to use it. He is too preoccupied to activate water fog equipment.
Inadequate training. RSI was not regularly based at King’s Cross, nor did he have any fire training. He had not so far informed either the station manager (located some distance away due to refurbishment of station) or the line controller of the fire. Trains were still arriving.
Location of water fog equipment not widely known.
(Management and communication failures)
At 1939, police in ticket hall decide to evacuate the area. At 1940, police officer asks for Piccadilly and Victoria Line trains to be ordered not to stop at King’s Cross. Trains continue to stop. At 1941, metal gates to ticket hall closed by police officers.
At 1942, first fire engines arrive. Two firemen examine fire on escalator.
At 1945, flashover occurs. Whole ticket hall engulfed in intense heat and flame. Thirty – one people are killed, many others are seriously injured.
No established evacuation plan.
Locked doors and metal barriers blocked escape routes.
LU control rooms last modernised in the 1960s. Outdated communications equipment.
Headquarters controller had no access to station public address system, which was not used during the emergency.
5 of the 8 TV monitors were either switched off or inoperable.
Most cameras were out of service.
Trains do not have a public address system.
No public telephones at King’s Cross tube station.
(Management, hardware, maintenance and communication failures)
Fires (“smoulderings”) regarded as inevitable occurrence on LU.
“They are part of the nature of the oldest, most extensive, most complex underground railway in the world. Anyone who believes that it is possible to act as though there are no fires ever is, I fear, misguided” (Dr Ridley, then Chairman of London Underground).
(Management, system and organisational failures)