HRO 7: NAT and HRO
BLUF: Are accidents inevitable in complex technological systems? People seeking to understand high reliability should be familiar with Charles Perrow’s argument that accidents like Three Mile Island (TMI) are inevitable because of the nature of the technological system. The purpose of this post isn’t to refute Perrow’s ideas about Normal Accidents, but to identify them as the main counterpoint to HRO.
This post is a continuation of my thoughts on High Reliability Organizing. It picks up the thread from post 2.
The essence of High Reliability Organizing (HRO) theory is that organizations can be more reliable if they adopt mindsets oriented toward operations, personnel development, and active practices to proactively manage risk. As I noted in my previous post, HRO does not mean being “error free”; rather, it seeks to reduce the impact of errors on the overall system and to increase learning from them when they do occur.
* Weick, K. E., Sutcliffe, K. M., & Obstfeld, D. (1999). Organizing for high reliability: Processes of collective mindfulness. In R. I. Sutton & B. M. Staw (Eds.), Research in Organizational Behavior, Volume 21 (pp. 81–123). JAI Press.
* Weick, K. E., & Sutcliffe, K. M. (2007). Managing the unexpected: Resilient performance in an age of uncertainty (2nd ed.). Jossey-Bass.
Not all organizational researchers agree that people can manage complex, high-risk technology like nuclear power, aircraft carrier flight operations, and air traffic control to reduce the likelihood of system accidents. This is a serious issue for the social and political acceptability of certain kinds of technology. The principal alternative view is Normal Accident Theory (NAT) (Perrow, 1981, 1994).
* Perrow, C. (1981). Normal accident at Three Mile Island. Society, 18(5), 17-26.
* Perrow, C. (1994). The limits of safety: the enhancement of a theory of accidents. Journal of Contingencies and Crisis Management, 2(4), 212-220.
In this post, I want to focus on the main ideas of NAT as a counterpoint to theorizing about HRO. In describing Normal Accidents, Charles Perrow applied a sociological perspective to technological risk. NAT goes beyond “errors happen” to argue that it is the inevitability of errors in certain technological systems that makes system accidents “normal.” System accidents are normal and should be expected in the sense that they are going to happen no matter what the humans in the system do. There is much about NAT that is controversial (Hopkins, 1999), but I leave its detailed study to the interested reader.
* Hopkins, A. (1999). The limits of normal accident theory. Safety Science, 32(2), 93-102.
System accidents are undesirable events involving unanticipated interactions among multiple failures in systems (Perrow, 1984, p. 70). Accidents seldom result from just one thing going wrong (Hollnagel, 2016). System accidents are fundamentally distinct from component failure accidents, which involve the failure of one part of a system. The difference is a function of system design. In tightly coupled technologies (nuclear power, aircraft, chemical processing, etc.), a failure in one part of the system can “jump” to another part in a way unimaginable to both designers and operators. Perrow didn’t define tight coupling, but suggested it resulted from many interactions between subsystems.
* Perrow, C. (1984). Normal accidents: Living with high-risk technologies. Basic Books.
* Hollnagel, E. (2016). Barriers and accident prevention. Routledge.
Perrow argued that systems vary along two axes: the type of interactions among system components (linear vs. complex) and the coupling between components or subsystems (loose vs. tight). Linear interactions occur between a component and the components that immediately precede or follow it. There is little interaction among the subsystems. Such interactions are predictable from the design, like the pistons and spark plugs in an engine.
Complex interactions occur between components outside the linear flow of production, either by intention or by surprise. Complex interactions are often not immediately visible or comprehensible. The same kinds of complex interactions occur in modern computer programs, which is why troubleshooting an app so often starts with “delete and reinstall.” That option is unavailable to a nuclear power plant.
Systems with loose coupling evolve from one condition to another slowly, perceptibly, and predictably according to their design. There are few connections between subsystems. Tight coupling involves many connections between subsystems, many of them unpredictable, opaque, and unrelated to the design. Tight coupling means that problems in one subsystem can more easily jump to other subsystems.
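Perrow’s two axes define a simple two-by-two typology, and it is the quadrant combining complex interactions with tight coupling where NAT expects system accidents to be “normal.” Purely as an illustration of that typology (not anything Perrow formalized; the example placements below are drawn only from the systems named in this post), a minimal sketch in Python might look like this:

```python
from enum import Enum

class Interactions(Enum):
    LINEAR = "linear"    # predictable from the design (e.g., pistons and spark plugs)
    COMPLEX = "complex"  # unplanned interactions outside the production flow, often invisible

class Coupling(Enum):
    LOOSE = "loose"      # few connections; the system evolves slowly and predictably
    TIGHT = "tight"      # many connections; problems can jump between subsystems

def nat_expectation(interactions: Interactions, coupling: Coupling) -> str:
    """Restates Perrow's claim: only the complex + tight quadrant
    makes system accidents 'normal' (expected)."""
    if interactions is Interactions.COMPLEX and coupling is Coupling.TIGHT:
        return "system accidents expected (normal accidents)"
    return "component failures still possible, but system accidents not 'normal'"

# Illustrative placements only, taken from the examples in the text above
print(nat_expectation(Interactions.COMPLEX, Coupling.TIGHT))  # e.g., nuclear power
print(nat_expectation(Interactions.LINEAR, Coupling.LOOSE))   # e.g., a simple engine subsystem
```

The only point of the sketch is that NAT locates the danger in one quadrant of the typology, in the design of the system itself, rather than in operator error.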
The problem with tightly coupled, complex technological systems is not their operators or any particular design detail. The fatal flaw resides in the organizational system needed to control them. In Perrow’s conception, organizations engaged in high-risk, technologically complex work face an unresolvable organizational predicament. They need centralized authority and unquestioning compliance with procedures to manage tight coupling. At the same time, they need to be decentralized so operators can develop flexible responses to local conditions, because the complexity and volume of those decisions would overwhelm any central authority. According to Perrow, this organizational design dilemma centered on control cannot be resolved.
In contrast to High Reliability Organizing theory, which focuses on active management to reduce errors and their consequences, the core idea of NAT is that system accidents are inevitable in certain technologies because errors can’t be prevented. Perrow concluded that technologies like nuclear energy should be abandoned altogether because they can’t be managed to prevent potentially catastrophic disasters.
What makes NAT interesting to me is that it is based on the organizational design required by the technology. For Perrow, management practices were irrelevant. The deeper you seek to understand HRO, the more you come to understand that organizations practicing it focus relentlessly on overcoming the limitations that Perrow saw as fixed. Some argue that NAT is incomplete precisely because of its focus on organizational design and sociology rather than technical details. Perrow argued that nuclear power plants could only be managed with centralized control, but Bierly and Spender (1995) claimed that people in technological systems could combine centralized control for design and management practices with decentralized decision-making for operations.
* Bierly, P. E., & Spender, J. C. (1995). Culture and high reliability organizations: The case of the nuclear submarine. Journal of Management, 21(4), 639-656.
Organizational researchers and citizens in general can have sharply contrasting views about the value of NAT and its implications compared to HRO theorizing. The point of this post was not to claim the superiority of one theory over the other. My purpose was to introduce the core ideas of NAT because it is common to see arguments derived from it.
In my next post, I will move on from the possible inevitability of system accidents to discuss the concept of organizational roles and why they are important in HRO. Many organizations doing similar work have the same roles, but how those roles are performed, managed, and integrated to support the Big Three of HRO (learning, process discipline, and enabling the workers) makes an enormous difference.