Safety Management Systems for Aviation Practitioners: Real-World Lessons
Hollinger, Kent. Reston, Virginia, U.S.: American Institute of Aeronautics and Astronautics, 2013. 221 pp. Figures, tables, references, appendixes, index. Hardcover.
Kent Hollinger’s account of safety management systems (SMS) is presented in an unusual and possibly unique format. It approximates a classroom experience of the kind led by Hollinger on behalf of The MITRE Corp., a not-for-profit company that operates research and development centers funded by the U.S. government.
As in an actual interactive teaching situation, the text incorporates dialogue involving Hollinger and the students in one of MITRE’s five-day SMS classes.
“This book is specifically intended to avoid an academic approach,” the author says. It is written for “practitioners, those people on the front lines who will benefit from, and interact with, SMS every day. SMS principles are introduced to explain and give context to the concepts, but the emphasis is on actual usage and examples.”
The 12 students quoted, given fictional names, represent a cross section of aviation personnel. Among their functions are cabin crewmember, pilot, safety director, maintenance manager, operations manager, safety office manager and senior inspector. Hollinger leads the class but also encourages the students to discuss their own thoughts and experiences.
The book is arranged according to modules like those of the class: an introduction; the SMS “table” (a pictorial representation of the system elements and how they relate to one another); the business case for SMS; an SMS look at human error; positive safety culture; SMS requirements and standards; SMS policy; SMS management structure; safety risk management; safety assurance; safety promotion; and next steps. An appendix summarizes key points.
In setting the stage for the discussions that follow, Hollinger cites David Marx’s Whack-a-Mole: The Price We Pay for Expecting Perfection.1 Hollinger says, “Has anyone here not made an error yet today? No one? It is to be hoped that you recognized your mistake and corrected it before anything bad happened. Humans will always make errors, no matter how hard they try to do the right thing.
“So, if we have a system that relies on everyone doing everything perfectly every time or else it falls apart, that’s not a very good system, is it? … We need systems that are designed so that the chance for errors is reduced, those errors that do occur are captured before creating a bad result, and the systems are tolerant of those errors that are not captured.”
As an example of the discussion format, here is an exchange about hazard identification and tracking:
Kent (Hollinger): If we have a hazard that poses a low risk, why would we want to track that hazard?
Hans: It might trend upward in the future.
Kent: And in that case, it might present a high risk. We will discuss risk analysis in Module 10, but it involves looking at the severity and the likelihood of a consequence (or outcome) arising from a hazard. Severity means the degree of harm posed by the outcome, whereas the likelihood (or probability) is how often it would occur. Of those two dimensions of risk, studies have shown that people are good at estimating one and not so good at the other. Which one do you think we are good at — estimating the probability of something happening or estimating the severity if it did happen?
Derek: Severity. People can always envision what might happen, but they usually don’t have enough information to accurately predict the probability. That is why they play the lottery.
Kent: Exactly. There are 14 of us in this class. Let’s pretend we all work for the same company. What if something happened and each one of us knew about one different occasion of this thing happening in the past year? If someone doing a risk analysis asked us, “How often does this event happen at your company?” I would say, “Once a year.” You would say, “Once a year” and the rest of us would say, “Once a year,” but it really occurs 14 times a year. Is that a different risk exposure if it’s happening 14 times a year instead of once a year? This is another benefit of a centralized safety database, because if we had 14 different data storage locations, each one might know about it happening once and we would underestimate our exposure.
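The arithmetic behind Kent’s point can be sketched in a few lines. The numbers (14 observers, each aware of one occurrence per year) come from the dialogue; the code itself is only an illustration of why pooling reports matters:

```python
# Each of 14 employees knows about exactly one occurrence in the past year.
individual_reports = [1] * 14

# Asked separately, each person estimates the annual rate from what
# they alone have seen.
per_person_estimate = individual_reports[0]  # "once a year"

# A centralized safety database pools every report instead.
pooled_rate = sum(individual_reports)

print(per_person_estimate)  # 1
print(pooled_rate)          # 14 -- the true annual rate
```

With 14 separate data storage locations, each location would see an event rate one-fourteenth of the real one, and the company would underestimate its risk exposure accordingly.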
The following is another example (abbreviated) of the dialogue format in Hollinger’s classes.
Kent: If I asked you to describe the weather, what indicators would you use?
Ali: Wind speed.
Linda: Wind direction.
Kent: There are many indicators to describe the weather and there are many indicators to describe safety. Safety targets are the indicator values that we want to achieve. … The state might say, “In three years, we want to reduce runway incursions to a rate of not more than 0.5 per million operations.”
To achieve the target, the state could create an action plan to install surface movement radar systems at the three largest hub airports within the next 12 months, with a 98 percent availability rate. … If there were zero incursions at the three largest hub airports but the national rate fell only to 0.7 per million operations, perhaps the radar system should be installed at more airports.
If the state were able to achieve this target and reduce runway incursions, does that mean it has a safe airspace system?
Ali: It’s safer.
Kent: Yes, but is it safe? If there were zero runway incursions, would the aviation system be safe?
Derek: No, there might be a midair collision every day.
Kent: Exactly. The point here is that just one indicator is not sufficient. Just like in describing the weather, it takes numerous indicators, along with their targets, to know if we have a safe system or organization. The indicators can be very different across the organization.
Hollinger finds new ways to frame principles that may have become clichés that no longer register strongly. Take, for instance, safety theorist James Reason’s famous model of layered defenses against risk, each layer represented by a slice of Swiss cheese, with the holes representing gaps in each layer of the defense. The slices are constantly shifting, so that occasionally some holes line up, the defenses fail and an accident results.2
Hollinger found the idea of spinning cheese slices unrealistic, so he created a new model to illustrate Reason’s thesis. His version is called “Stuck 7s,” based on old-style gambling machines with five wheels that turn when the player pulls the lever. Depending on what symbols are visible on the centerline when the wheels stop, the result may be (but usually is not) a money prize. If all five wheels stop at 7, the gambler hits the jackpot.
Should one wheel be stuck showing a 7, that slightly increases the odds of five 7s lining up. The probability is still low, but not as low as with a correctly operating machine.
Kent: How does that relate to the aviation safety model? Well, when we’ve established multiple defenses, and then we negate one of them, basically we have given ourselves a Stuck 7.
Stuck 7s come when we do nonstandard procedures or workarounds, when we do a checklist by memory because, “Oh, I have that thing memorized. Why do I need to pull the card out?” … That’s how we get into trouble in aviation. We do this shortcut, this omission or this nonstandard practice and give ourselves a Stuck 7 and nothing bad happens. Everything’s fine. So we gain …
Kent: You get confident with this new shortcut or workaround. You keep using it for three weeks and now you’re really feeling good about it. Three months go by and you’re convinced that it is the right thing to do and there is no harm. … You may even go on to create a second Stuck 7. Then finally the odds catch up with you.
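The Stuck 7s odds can be made concrete with a quick calculation. The wheel and symbol counts below are illustrative assumptions, not figures from the book; the point is only that fixing one wheel at 7 multiplies the chance of the losing combination lining up:

```python
# Hypothetical machine: five independent wheels, 20 symbols per wheel,
# exactly one "7" on each wheel. These numbers are assumed for
# illustration only.
wheels = 5
symbols = 20

# Probability that all five wheels land on 7 with a working machine.
p_normal = 1 / symbols**wheels

# With one wheel stuck showing a 7, only four wheels still need to
# land on 7 by chance.
p_one_stuck = 1 / symbols**(wheels - 1)

print(p_normal)     # 3.125e-07
print(p_one_stuck)  # 6.25e-06 -- each stuck 7 multiplies the odds by 20
```

The jackpot is still unlikely with one stuck wheel, which is exactly why the shortcut feels safe for weeks or months; but every additional Stuck 7 multiplies the odds again, and eventually they catch up.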
An SMS is often described as more than an organizational structure or a set of procedures; it is an underlying attitude. Greg, one of the class members, describes how he and his wife were walking down an aisle in a grocery store when he saw a glass jar of pickles fall onto the floor, spreading broken glass and liquid. He found an employee and reported it.
But he also guarded the mess until it was cleaned up, which he says “drove my wife crazy.” She wanted to get on with shopping.
Greg: I explained that an elderly person might come around the corner, not see the spill, slip on it, fall down, break a hip, have to go to the hospital and get a hip replacement, all because I couldn’t spend 10 minutes on guard until the store cleaned up the spill. I couldn’t live with that.
Kent: You have no stake in the store and do not know the person who might slip, but you have an inner sense of responsibility for safety. …
So, what would people in your organization do if they saw a fuel spill, or some hydraulic fluid on the floor? Would they think, “Management had better clean that up before someone gets hurt”? Would they report the spill? Or would they also take action to make sure no one was injured until the spill was cleared? If they were too busy to stand by the spill, they could always place some cones or other objects around it. That is what is meant by a shared responsibility for safety, not just telling everyone to work safely.
By now, most people in safety-related positions in aviation are familiar with the basics of SMS, but even those who have been introduced to them through classwork will find Hollinger’s book a vivid and thought-provoking refresher.
1. Marx, David. Whack-a-Mole: The Price We Pay for Expecting Perfection. Plano, Texas, U.S.: By Your Side Studios, 2009. Discussed in ASW, 7/09, pp. 52–54.
2. Reason, James. Managing the Risks of Organizational Accidents. Farnham, Surrey, England, and Burlington, Vermont, U.S.: Ashgate, 1997.