by James Reason
In the United Kingdom, where the Central Electricity Generating Board (CEGB) is planning to introduce about six pressurized water reactors (PWRs) by the early years of the twenty-first century, the government and the utility have gone to considerable lengths to persuade the public of both the economic necessity and the safety of these new reactors. The public inquiry held to decide whether or not to give permission for the building of the first of these PWRs at Sizewell ran for well over 2 years (340 working days) and was the longest of its kind held in Britain. Over a year was devoted to safety issues. Despite the decision to go ahead with the Sizewell ‘B’ reactor and the length of these debates, the public inquiry into the CEGB’s proposal to build a further PWR at Hinkley Point has attracted formal opposition from 19 local authorities, 250 organisations and over 20,000 individuals.
The second reason for the close connection between reliability analysis and the nuclear industry has to do with the fact that, in order to obtain both public acceptance and an operating licence, utilities have to demonstrate in advance that their reactor designs will satisfy certain safety targets. In Britain, for example, these are expressed as order-of-magnitude probabilities: less than one in a million per reactor year for a large uncontrolled release of radioactivity into the environment, and a frequency of less than 1 in 10,000 per reactor year for a limited emission. As Rasmussen (1988) explains: “For hazardous large-scale installations, design cannot be based on experience gained from accidents, as it has been the case for accidents in minor separate systems. . . . The days of extensive pilot plant tests for demonstration of the feasibility of a design are over and the safety target has to be assessed by analytical means based on empirical data from incidents and near misses.” One of the consequences of this shift of focus has been the rapid development over the past 20 years of a branch of reliability engineering known as probabilistic risk assessment (PRA). Since many of the developments in human reliability analysis have been designed to contribute to these overall plant risk assessments, we need to look more closely at the nature and underlying assumptions of PRA.
1. Probabilistic risk assessment
At the heart of PRA are logical ‘tree’ models of the plant and its functions. These trees take two basic forms: (a) fault trees that address the question: How can a given plant failure occur (e.g., a serious release of radioactive material)? and (b) event trees that answer the question: What could happen if a given fault or event occurs (e.g., a steam generator tube rupture or small loss of coolant accident)? In the case of a fault tree the starting point is usually a gross system failure (the top event) and the causes are then traced back through a series of logical AND/OR gates to the possible initiating faults. An event tree begins with an initiating fault or event and works forward in time considering the probabilities of failure of each of the safety systems that stand between the initial malfunction and some unacceptable outcome (see Figure 8.1).
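The gate logic lends itself to a simple calculation when the basic events can be treated as independent: an AND gate multiplies the probabilities of its inputs, while an OR gate combines them as one minus the product of their complements. The short Python sketch below works through a small, invented fault tree on that assumption; the event names and probability values are purely illustrative and are not drawn from any actual plant study.

```python
# Minimal sketch of fault-tree evaluation. The event names and probabilities
# are invented for illustration and are not taken from any actual study.
# Basic events are assumed independent, so an AND gate multiplies the input
# probabilities and an OR gate combines them as 1 - (1 - p1)(1 - p2)...

def and_gate(*probs):
    p = 1.0
    for x in probs:
        p *= x
    return p

def or_gate(*probs):
    q = 1.0
    for x in probs:
        q *= (1.0 - x)
    return 1.0 - q

# Hypothetical basic-event failure probabilities (per demand)
pump_fails = 1e-3
valve_sticks = 5e-4
power_lost = 1e-4
backup_fails = 1e-2

# Top event: loss of emergency cooling =
#   (pump fails OR valve sticks) AND (offsite power lost OR backup fails)
injection_unavailable = or_gate(pump_fails, valve_sticks)
power_unavailable = or_gate(power_lost, backup_fails)
top_event = and_gate(injection_unavailable, power_unavailable)

print(f"P(top event) = {top_event:.2e}")   # roughly 1.5e-05
```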
PRA thus has two aims: first, to identify potential areas of significant risk and to indicate how improvements can be made; second, to quantify the overall risk from a potentially hazardous plant.
The general structure of a PRA was established in 1975 with the publication of the U.S. Reactor Safety Study, a 10-kilogram document known as WASH-1400 and formally titled: An Assessment of Accident Risks in U.S. Commercial Nuclear Power Plants. In outline, PRA involves the following procedural steps:
(a) Identify the sources of potential hazard. In the case of a nuclear power plant, the major hazard is the release of radioactivity from a degraded core.
(b) Identify the initiating events that could lead to this hazard.
(c) Establish the possible sequences that could follow from various initiating events using event trees.
(d) Quantify each event sequence. This involves data or judgement about two things: (i) the frequency of the initiating event, and (ii) the probability of failure on demand of the relevant safety systems.
(e) Determine the overall plant risk. This will be a function of the frequency of all possible accident sequences and their consequences.
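To make steps (c) to (e) concrete, the sketch below quantifies one hypothetical event-tree sequence and folds it into an overall risk figure. All of the initiating frequencies, failure-on-demand probabilities and consequence weights are invented for illustration; they are not data from any published PRA.

```python
# Illustrative quantification of steps (c)-(e): the initiating frequency,
# failure-on-demand probabilities and consequence weights are invented and
# are not data from any published PRA.

# Assumed frequency of the initiating event (small LOCA), per reactor-year
small_loca_freq = 1e-2

# Assumed failure-on-demand probabilities of the relevant safety systems
p_fail = {
    "reactor_trip": 1e-5,
    "emergency_injection": 1e-3,
    "long_term_cooling": 1e-3,
}

# Step (d): one sequence -- LOCA occurs, reactor trip succeeds, injection fails
seq_freq = (small_loca_freq
            * (1 - p_fail["reactor_trip"])
            * p_fail["emergency_injection"])
print(f"sequence frequency = {seq_freq:.1e} per reactor-year")  # about 1e-05

# Step (e): overall risk = sum over all sequences of frequency x consequence
sequences = [
    (seq_freq, 100.0),   # (frequency, relative consequence measure)
    (2e-7, 1000.0),      # a rarer but more severe hypothetical sequence
]
overall_risk = sum(f * c for f, c in sequences)
print(f"overall risk = {overall_risk:.1e} consequence units per reactor-year")
```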
It should be noted that even in its purely engineering application (focusing only on hardware failures), this technique has been criticised on a number of grounds. The logic of event trees demands that only conditional probabilities should be used, allowing for the preceding components of an accident sequence. In practice, however, this conditionality is rarely recognised, and independence of events has normally been assumed. In short, PRAs have neglected the possibility of common-mode failures, something that is considerably enhanced by the presence of human beings at various stages in the design, installation, management, maintenance and operation of the system. Other problems arise from the need to quantify events. In many cases, the data on previous faults do not exist and so have to be estimated. Reliability data relating to component failure rates do not necessarily ‘travel well’ from one type of installation to another.
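The practical consequence of wrongly assuming independence is easy to show numerically. In the illustrative sketch below, two redundant trains each fail on demand with probability 10^-3; treating them as independent gives 10^-6 for both failing together, but if a single common cause (say, the same maintenance error applied to both trains) can disable them with probability 10^-4, the true figure is roughly a hundred times larger and is dominated by the common-mode term.

```python
# Two redundant trains, illustrative figures only.
p_train = 1e-3    # failure-on-demand probability of each train
p_common = 1e-4   # probability that one common cause disables both trains

independent_estimate = p_train * p_train                     # 1.0e-06
with_common_mode = p_common + (1 - p_common) * p_train ** 2  # about 1.0e-04

print(f"assuming independence:    {independent_estimate:.1e}")
print(f"allowing for common mode: {with_common_mode:.1e}")
```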
Figure 8.1. Standard event tree for a small loss of coolant accident (LOCA) in a pressurised water reactor (PWR). Sequences 2 to 7 describe logical combinations of defence-in-depth failures.
Nonetheless, the development of a standardised PRA was a major step forward in reliability engineering. In particular, its underlying logic provides an important adjunct to the design process, identifying those areas where redundant or diverse safety systems need to be installed in order to prevent the propagation of unacceptably likely accident sequences. Its major failing, however, and one that came to be widely acknowledged after the Three Mile Island accident in 1979, was its inability to accommodate adequately the substantial contribution made by human failures (and particularly mistakes) to the accident risk. This problem has been the stimulus for numerous attempts to convert human error rates into the numerical currency demanded by PRA. We will consider some of the more notable of these methods below under the general heading of Human Reliability Analysis or HRA techniques.
2. Human reliability analysis (HRA) techniques
A large number of these techniques have emerged within the last decade. Schurman and Banks (1984) reviewed nine models for predicting human error probabilities; Hannaman, Spurgin and Lukic (1984) identified ten methods (including two of their own) that have been developed for use in PRA studies; Senders, Moray and Smiley (1985) examined eight such models; and Williams (1985) compared the performance of some nine techniques. Although there are common elements in each list, none is precisely the same. Here we will focus on those techniques that are either frequently cited or that involve a relatively distinct approach. The descriptions of the methods given below draw heavily upon the excellent reviews by Hannaman, Spurgin and Lukic (1984), Senders and coauthors (1985) and Embrey (1985, 1987), as well as upon original sources.
2.1. Technique for human error rate prediction (THERP)
THERP is a technique that has attracted superlatives of all kinds. It is probably the best known and most widely used means of providing human reliability data for PRA studies. It is also the most accessible from the practitioner’s point of view; its procedures and rationale are clearly described in a 300-plus page Handbook of Human Reliability Analysis with Emphasis on Nuclear Power Plant Applications (Swain & Guttmann, 1983), as well as in a number of other publications (Bell & Swain, 1983; Swain & Weston, 1988). It is also one of the oldest techniques: its origins go back to the early 1960s (Swain, 1963), and its present handbook form seeks to transfer the 30-year experience of its principal architect, Alan Swain, to subsequent generations of human reliability analysts. And, probably as the result of its extensive usage and the effectiveness of its dissemination, it has also been subject to more criticism than any other HRA method. Yet it is judged by some whose opinions are worth noting (Senders et al., 1985) as “probably the best of the techniques currently available.” These factors, together with its heroic aspirations, make it a technique worthy of close examination.
The basic assumption of THERP (as in most other decompositional approaches to HRA) is that the operator’s actions can be regarded in the same light as the success or failure of a given pump or valve. As such, the reliability of the operator can be assessed in essentially the same way as an equipment item. The operator’s activities are broken down into task elements and substituted for equipment outputs in a more-or-less conventional reliability assessment, with adjustments to allow for the greater variability and interdependence of human performance.
The object of THERP is “to predict human error probabilities and to evaluate the degradation of a man-machine system likely to be caused by human errors alone or in connection with equipment functioning, operational procedures and practices, or other system and human characteristics that influence system behaviour” (Swain & Guttmann, 1983). The procedural stages involved in applying the THERP technique accord very closely with those of a PRA, described in the previous section. There are four steps:
(a) Identify the system functions that may be influenced by human error.
(b) List and analyse the related human operations (i.e., perform a detailed task analysis).
(c) Estimate the relevant error probabilities using a combination of expert judgement and available data.
(d) Estimate the effects of human errors on the system failure events, a step that usually involves the integration of HRA with PRA. When used by designers, it has an additional iterative step that involves making changes to the system and then recalculating the probabilities in order to gauge the effects of these modifications.
The basic analytical tool is a form of event tree termed a probability tree diagram. In this, the limbs represent binary decision points in which correct or incorrect performance are the only available choices. Each limb represents a combination of human activities and the presumed influences upon these activities: the so-called performance shaping factors (PSFs). The event tree starts from some convenient point in the system and works forward in time. With the possible exception of the first branching, all the human task elements depicted by the tree limbs are conditional probabilities.
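A minimal sketch of such a probability tree is given below for just two task elements, with the second element’s error probability conditioned on the outcome of the first. The HEP values are invented for the purpose of illustration and are not taken from the Swain and Guttmann handbook.

```python
# Two-element probability tree in the THERP style: limbs are binary
# (correct/incorrect) and the second element's HEP is conditional on the
# outcome of the first. All HEPs are invented for illustration and are not
# values from the Swain & Guttmann handbook.

hep_a = 3e-3              # nominal HEP for task element A
hep_b_given_a_ok = 1e-3   # HEP for element B when A was performed correctly
hep_b_given_a_bad = 0.5   # HEP for element B when A was performed wrongly

# The four end points of the tree
p_ok_ok = (1 - hep_a) * (1 - hep_b_given_a_ok)
p_ok_bad = (1 - hep_a) * hep_b_given_a_ok
p_bad_ok = hep_a * (1 - hep_b_given_a_bad)
p_bad_bad = hep_a * hep_b_given_a_bad

# Here the task is assumed to succeed only if both elements are correct
p_task_failure = 1 - p_ok_ok
print(f"P(task failure) = {p_task_failure:.2e}")   # about 4.0e-03
```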
The performance shaping factors are, in effect, the major concession that THERP makes to the humanity of the operators. They are used to modify the nominal human error probabilities (HEPs) according to the analyst’s judgement of such factors as the work environment; the quality of the man-machine interface; the skills, experience, motivation and expectations of the individual operator; and the degree and type of the stresses likely to be present in various situations.
The core of THERP is contained in 27 tables of human error probabilities set out in Part IV of the 1983 handbook. The values given in the tables relate to nominal HEPs (the probability that when a given task element is performed, an error will occur). These numbers are generic values, based on expert opinion and data borrowed from activities analogous to those of NPP operators.
Each of these tables deals with particular errors associated with specific activities: for example, errors of commission in reading and recording quantitative information from unannunciated displays; selection errors in operating manual controls or locally-operated valves and so on. Each table is broken down into smaller task components and, for each component, two numerical values are usually given: the nominal HEP and either the error factor (the square root of the ratio of the upper to the lower uncertainty bounds), or the uncertainty bounds themselves (the upper and lower bounds of the given HEP, reflecting uncertainty of estimates). The upper and lower uncertainty bounds correspond to the ninety-fifth and fifth percentiles, respectively, on a lognormal scale of HEPs. As indicated earlier, the analyst is required to adjust the nominal error probability values according to his or her judgement of the effects of local performance shaping factors.
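The relationship between the nominal HEP, the error factor and the uncertainty bounds can be sketched as follows, on the assumption that the nominal HEP is treated as the median of the lognormal distribution; the values used, including the performance-shaping-factor multiplier, are illustrative rather than handbook entries.

```python
import math

# Nominal HEP, error factor (EF) and uncertainty bounds, assuming the nominal
# HEP is the median of a lognormal distribution. The numbers, including the
# performance-shaping-factor multiplier, are illustrative rather than
# handbook values.

nominal_hep = 3e-3    # hypothetical nominal HEP for some task element
error_factor = 3.0    # EF = sqrt(upper bound / lower bound)

lower_bound = nominal_hep / error_factor   # roughly the 5th percentile
upper_bound = nominal_hep * error_factor   # roughly the 95th percentile
assert math.isclose(error_factor, math.sqrt(upper_bound / lower_bound))

# A performance shaping factor applied as a simple multiplier on the nominal
# value, e.g. doubling the HEP under higher-than-usual stress
psf_multiplier = 2.0
adjusted_hep = min(1.0, nominal_hep * psf_multiplier)

print(f"bounds: [{lower_bound:.1e}, {upper_bound:.1e}], "
      f"adjusted HEP: {adjusted_hep:.1e}")
```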
Earlier versions of THERP were widely criticised for their exclusive focus on behavioural error forms and for their corresponding neglect of mistakes such as misdiagnosis or selecting an inappropriate remedial strategy—exactly the kind of errors, in fact, that contributed so extensively to the TMI accident. More recently, Swain and his collaborators have sought to revise the original technique (Swain, 1976) so that it can accommodate diagnostic errors and other higher level ‘cognitive’ mistakes (Swain & Guttmann, 1983), and this process of revision is still continuing some five years after the publication of the handbook (Swain & Weston, 1988).
These latest revisions are interesting for two reasons. First, they mark a departure from nominal error probabilities, derived from expert judgement, which has been shown to be highly variable (Comer, Seaver, Stillwell & Gaddy, 1984). In his latest work, Swain favours time-dependent error frequencies based on simulator data in which NPP crews are given different types of abnormal events to deal with. These data show the time taken for each correct diagnosis and the number of control teams who fail to achieve a correct diagnosis. Second, they provide a basis for estimating the probabilities of different kinds of postevent misdiagnoses. With the exception of the confusion matrix technique (Potash, Stewart, Dietz, Lewis & Dougherty, 1981), these had hitherto not been differentiated, being commonly consigned to an ‘incorrect diagnosis’ category.
It is too early to say whether these modifications to the basic THERP technique will be sufficient to ward off earlier criticisms relating to both the unreliability of its underlying human performance numbers and THERP’s limited focus upon external error forms (i.e., errors of omission and commission). Although Swain and his colleagues have shown considerable willingness to shift from their exclusively behaviourist stance to one that embraces more cognitive elements, it is unlikely that these moves will be sufficient to appease their mentalist critics who demand a more theoretically-driven, top-down mode of error analysis and who remain deeply sceptical about assigning meaningful probabilities to anything but highly situation-specific slips of action. For all its technical sophistication, THERP remains an art form—exceedingly powerful when employed by people as experienced as Alan Swain and his immediate collaborators, but of more doubtful validity in the hands of others.
2.2. Time-reliability techniques
This section deals with a closely related set of techniques that are concerned with quantifying postaccident errors on the basis of time-reliability curves. The first of these was the model termed operator action trees (OATS).
When OATS was first developed in the early 1980s (Hall, Fragola and Wreathall, 1982; Wreathall, 1982), THERP was the only technique that had been used to quantify human error risks contributing to possible NPP accidents. In its early form, THERP focused primarily upon procedural errors (e.g., leaving manual valves in the wrong position) that occur prior to the onset of a reactor trip and that may either cause the event or result in the unavailability of some safety system. The architects of the OATS technique saw this as neglecting other important kinds of human error: those that occur after an accident sequence has been initiated. These they called cognitive errors because, for the most part, they involve mistakes in higher-level cognitive processes, such as reasoning, diagnosis and strategy selection. Procedural and cognitive errors require very different analytical techniques for both modelling and quantification. OATS was devised specifically for dealing with operator errors during accident and abnormal conditions and is designed to provide error types and associated probabilities to be used in PRAs.
A detailed account of the OATS procedures is given in NUREG/CR-3010 (Hall et al., 1982) and by Wreathall (1982). In brief, the method employs a logic tree, the basic operator action-tree, that identifies the possible postaccident operator failure modes. Three types of cognitive error are identified:
(a) Failure to perceive that an event has occurred.
(b) Failure to diagnose the nature of the event and to identify the necessary remedial actions.
(c) Failure to implement those responses correctly and in a timely manner.
These errors are quantified by applying an analytical tool called the time-reliability curve, which describes the probability of failure as a function of the time interval from the moment at which the relevant warning signals become evident to the point by which action should be taken to achieve successful recovery. Simple modifications are made to this time-reliability curve when the analyst judges that operators would be reluctant to take certain actions. The probabilities derived from these time-reliability relationships represent the likelihood of successful action by a team of operators. The major input to the quantification curve is the time available for thinking: tT = tO − tI − tA, where tT is the thinking interval, tO is the overall time from the initiation of an accident sequence to the point by which actions have to be completed, tI is the time after initiation at which appropriate indications are given, and tA is the time taken to carry out the planned actions. The bases for these parameters are reported by Wreathall (1982). In the absence of suitable field data, they share with THERP the fundamental problem of being ‘best guesses’, derived either from experts or extrapolated from laboratory studies.
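The quantification step can be sketched as follows. The time values, and in particular the lognormal shape chosen for the time-reliability curve, are illustrative assumptions only; they are not the curves reported by Wreathall (1982). In the code, t_o, t_i and t_a correspond to tO, tI and tA above.

```python
import math

# The time values and the lognormal shape of the curve are illustrative
# assumptions, not the curves reported by Wreathall (1982).

t_o = 60.0   # minutes from initiation to the point by which actions must be complete
t_i = 5.0    # minutes after initiation at which the relevant indications appear
t_a = 15.0   # minutes needed to carry out the planned actions

t_t = t_o - t_i - t_a   # time available for thinking (here 40 minutes)

def p_non_response(thinking_time, median=10.0, sigma=1.0):
    """Probability that the crew has not yet responded correctly after the
    given thinking time (minutes), using an assumed lognormal curve."""
    if thinking_time <= 0:
        return 1.0
    z = (math.log(thinking_time) - math.log(median)) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))   # standard normal survival function

print(f"thinking time: {t_t:.0f} min, "
      f"P(failure to respond in time) = {p_non_response(t_t):.2e}")
```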
Hannaman and his coauthors (1984, p. A-4) list the following ‘plus’ points for OATS: “it provides a defined structure for assessing the operator failure modes which is independent of procedures, it is simple to use with defined dependencies, it has an application guide, and defined data.” Senders and coauthors (1985, p. 44) comment: “OATS has not been formally validated, but it has been related to empirical data in a particularly interesting way by use of generic ‘time-to-completion’ curves. As a result, OATS can predict the probability that no response will yet have been made to the annunciation of an incident as a function of the time since the incident occurred. The resulting curves seem to provide a reasonable estimate of ‘speed-accuracy trade-off’ in fault diagnosis when compared to (recently obtained) data.”