Opinion Versus Evidence
A Systematic Review of the Effectiveness of Safety Management Systems
Thomas, Matthew J.W., and Westwood-Thomas Associates. Australian Transport Safety Bureau (ATSB). 46 pp. Figures, tables, references, appendix. November 2012, updated Dec. 10. Available at www.atsb.gov.au/publications/2012/xr-2011-002.aspx.
Safety management systems (SMS) have a vast amount of academic management theory behind them, and their principles seem logical. While there is some variation in views of the components of an SMS, they generally include identification of safety hazards; remedial action to reduce those hazards; continuous monitoring of safety performance; and continuous improvement of the SMS itself. SMS might be said to represent a fundamental conceptual change in risk management. The emphasis shifts from compliance with reactive, externally generated procedures and regulations “written in blood” — that is, based on costly lessons from accidents — toward internal analysis of hazards uncovered in normal operations. Accident causal factors can be anticipated and, as far as possible, mitigated before they do their worst.
It is an exciting prospect, with a touch of magic. We can take charge of the future rather than just waiting to see what it throws at us. SMS has been enthusiastically adopted by operators and regulators.
For example, this ATSB report cites an Australian Civil Aviation Safety Authority (CASA) document that it says “dedicates a whole appendix to ‘selling’ the benefits of an SMS.” Among the suggested benefits of an SMS are a reduction in incidents and accidents; reduced direct and indirect costs; safety confidence among the traveling public; reduced insurance premiums; and proof of diligence in the event of legal or regulatory safety investigations.
But science insists: Prove it.
That, it turns out for the authors of this ATSB report, is a tall order. Their report says, “Unfortunately, [the CASA document] appendix makes no reference to any scientific evidence to support these claims, nor legal evidence with respect to due diligence. Indeed, much of the regulatory effort with respect to the adoption of SMS as the primary regulatory platform has been characterised on uncritical acceptance, and based on expert opinion and face validity, rather than subjected to formal scientific validation.
“Previous published reviews of SMS research do not appear to provide strong empirical evidence to support the specific benefits of adopting an SMS. For instance, the summary of a 2006 review of evidence for the effectiveness of SMS across a wide cross-section of industries suggests that there has been a ‘less than expected’ reduction in accident occurrence since the implementation of SMS.” (References can be found in the original report.)
ATSB commissioned Matthew Thomas and Westwood-Thomas Associates to undertake a systematic review of SMS research. The authors began with a comprehensive search of the literature and found 2,009 articles, a promising start. However, the great majority of the sources washed out because of rigorous inclusion criteria. Among the requirements: only peer-reviewed articles published between 1980 and 2012 were accepted; studies “must have clearly defined a research question that related to the effectiveness of safety management systems, or specific components of a safety management system”; studies must have defined effectiveness in terms of safety-related outcomes, rather than other standards such as improved productivity; and studies must have reported quantitative measures. Each study also underwent a quality appraisal based on published guidelines for methodological soundness.
Ultimately, 37 papers were determined to be directly relevant to the objectives of the investigation. However, “only 14 [studies] involved an SMS designed to avoid low-probability/high-consequence (LP-HC) accidents” — one way of looking at aircraft accidents — “with the remaining 23 studies relating to work health and safety,” the report says. “In addition, very few of these studies were undertaken in transport domains, and many studies only measured subjective perceptions of safety rather than objective measures. The limited [amount of] quality empirical evidence available relates to the difficulty of measuring objective safety improvements in industries where the SMS is aimed at avoiding LP-HC accidents and the relative recency of the application of SMS.”
Even among the 37 papers accepted for analysis, the study’s authors were less than fully satisfied with the quality of evidence. Only a single study met the scientific “gold standard,” a randomized, controlled trial. Of the 37 articles included in the systematic review, 19 used objective measures of safety performance. And 15 of the 19 related to workplace health and safety, using such metrics as occupational injuries to workers. “Of these studies, the majority demonstrated significant positive effects with respect to dimensions of SMS,” the report says. “A number of studies found general relationships between SMS implementation and safety performance.”
Eighteen of the 37 articles analyzed in the systematic review used only subjective, self-reported measures of safety performance, most with a survey-based methodology in which both individual perceptions of effectiveness of SMS components and safety metrics were subjective.
The report notes, however, that across multiple studies, there was scant agreement about which components of an SMS individually caused change in safety performance.
The four studies of LP-HC industries, probably the most relevant to aviation, demonstrated “no consistent findings … with respect to performance on various dimensions of an SMS and poor safety outcomes. …
“Several studies explored the relationships between components of SMS and safety performance in the context of major hazard facilities. The first of these studies from an oil refinery environment established a relationship between self-reported safety performance and the two components of (1) management commitment and (2) safety communication. A second study, undertaken by the same authors, found no direct effect of management commitment, but rather (1) supervision, (2) safety reporting and (3) team collaboration as the immediate drivers of safe work practices.
“Slightly different findings were obtained in another study, whereby (1) management commitment and (2) safety rules and procedures were found to be directly associated with safe work practices in major hazard facilities in India.”
One study seemed to offer some evidence of what factors were effective in improving safety performance. “This study, within the maritime domain, found that safety behaviour was influenced by safety policy and perceived supervisor behaviour rather than other components of safety management systems,” the report says. The authors of that study concluded that “shipping companies should therefore invest large amounts of money in developing and implementing safety rules, procedures and training.”
The report says, “In perhaps one of the most important studies [published in 2008] in terms of relevance to high-risk transport industries (using a cross-section of industries), there was no real relationship established between everyday safety performance and LP-HC events. This finding from the U.S. highlights the lack of clarity in what might actually be driving ultra-safe performance, and in many respects, the question as to SMS effectiveness is unable to be adequately answered by even the most recent research.”
The report questions the validity of surveys and structural equation modeling — a statistical technique used to explore the relationship between a number of different factors, and their relationship to a particular outcome — in this research context. Using such a methodology, it says, “to tease out the inter-relationships between components of safety management systems, safety climate factors and safety performance might not assist in clarifying the complex set of factors influencing safety performance, and does not really assist in enhancing our understanding with respect to establishing the effectiveness of SMS.”
A particular problem with surveys and self-reporting is that they “fail to utilise a standard set of instruments, thus leaving the industry unsure of exactly what is being measured. Furthermore, there is a tendency to infer causality from the findings of these models, inasmuch as increased management commitment leads to reduced rates of safety occurrence. No such directional causality can be inferred through these study designs, and … each of these studies is limited from the perspective of common method variance.”
In other words, while an association may be found between an SMS “model” and better safety-related behavior, it is not clear whether a causal relationship exists. And if one does, it has been impossible to determine if the causal factor is one element of SMS, more than one element or the SMS in its entirety. Another way of looking at the data, as suggested by the maritime study, is that management commitment rather than SMS is the active ingredient.
Textbooks about worker behavior invariably discuss the “Hawthorne effect,” derived from a series of studies conducted at the Western Electric Hawthorne Works in Cicero, Illinois, U.S., beginning in the 1920s. Experimenters tested the effect of increasing or decreasing the lighting in the employees’ work environment, as well as other variables, on productivity. The researchers found that productivity improved with any change, even if it was only reversion to a previous condition. Their eventual conclusion was that the output improved either because the employees were aware that they were being studied or because managers seemed to care about the quality of their working environment.
Although, as with most scientific studies, the conclusions about the Hawthorne effect have since been questioned, it is generally accepted that observation affects behavior and doesn’t merely measure it. So perhaps part of the reason for safety improvement attributed to SMS — if there is objective improvement — is that the SMS is on everyone’s mind, more than any theoretical content of the system.
The report concludes with some thoughts about the “frameworks,” “models” and “strategies” that have been upgraded in status to SMS. It says:
“There is a well-known axiom that states, ‘there was never a randomised control trial for the effectiveness of the parachute.’ This is to say that there has never been a study in which one group jumps from an aeroplane with a parachute, and their survival is compared with a group that jumps in exactly the same conditions, but without a parachute.
“The argument here is simple: Some interventions just do not require large-scale experiments to establish their effectiveness. Many interventions are based on first principles, that are things that we already know to be true, and logic. Safety management systems contain many of these elements. For instance, logic simply dictates that if you are to prevent the reoccurrence of an event, you need to understand what caused the event, and put in place strategies such that those causes are prevented from occurring again. Hence, the need for accident investigation is a simple logical necessity that requires no empirical evidence to support its use within safety management processes.
“This review of the scientific literature suggests that this logical necessity, which many might call ‘common sense,’ has driven much of the development of safety management systems.”
If so, an SMS is a codification of principles learned through experience, an evolution rather than a revolution.
The report suggests another concern, which is that “it just might be the case that the ever-growing list of components of a safety management system may well result in dilution of effort across the spectrum of safety management activities. This dilution of effort may well result in poorer safety performance as the critical components receive less time and effort at the expense of yet another ‘good idea’ dressed up as a legitimate safety program. Given that, at present, there is no clear objective empirical evidence as to whether there are any critical elements, this is a real possibility.”
Scientists, however, have a saying that “absence of evidence is not evidence of absence.” Given the practical methodological difficulties of studying SMS, it is not surprising that its effectiveness has yet to be demonstrated empirically. It would be a rare flight operations department that would agree to deliver the presumed benefits of an SMS to half its operations while denying them to the other, “control” half, particularly because the system represents a continuing process, not a quick fix.
The report concludes, “Even within a vacuum of evidence, the precautionary principle states that we must not fail to take precautionary action. To this end, it is likely that the current regime of an aggregate set of components assembled into something, which we call a ‘safety management system,’ remains an important tool in the management of safety.”
From Strategy to Action
2012 European Strategy for Human Factors in Aviation
European Human Factors Advisory Group (EHFAG). First issue, Sept. 1, 2012. 8 pp.
A European Human Factors Strategy has been developed by the EHFAG in conjunction with the European Aviation Safety Agency (EASA). “The strategy sets out to achieve two principal functions,” this report says. “First, to foster consistency in the integration of human factors principles in the regulation, governance, system design, training, licensing, audit and assurance of aviation activities. Second, it outlines how the practical understanding and application of human factors can serve in enhancing safety performance across the aviation safety system. The strategy serves as a framework document to support the European Aviation Safety
The strategy encompasses Europe’s aviation system as a whole — “rule makers, authorities, investigators, researchers, service providers, industry and other stakeholders.”
The EHFAG has significant input to EASA rules and advisory materials, and in turn, the EASA rules affect operators around the world. For example, there are about 6,000 repair stations with U.S. Federal Aviation Regulations Part 145 certificates. About 1,300 of those, mainly the larger U.S. repair stations, are EASA-certificated. Many technicians working in the United States are following EASA rules.
The EHFAG describes its guiding principles:
“Providing appropriate governance and leadership — This would include consideration of appropriate HF [human factors] expertise within the regulatory organisations.”
“Developing a balanced regulatory structure — Human factors principles will be addressed in all the aviation regulations, whilst recognising the need for the regulation to be proportionate with an appropriate balance between rule, acceptable means of compliance (AMC) and guidance material.”
“Providing guidance and interpretive material — Adequate tools, guidance and AMC material to help industry apply human factors principles will be provided. [Material] will help regulators oversee the effective implementation of human factors by industry and in incident and accident investigations.”
“Promoting the importance of human factors — At a European level through the EASA website, regular newsletters and bulletins, and EASA conferences; at a national level through the cascading of EASA promotion and national conferences.”
“Coordinating activities — across organisations, including regulatory organisations, to avoid the transfer of risk from one domain to another. This coordination should be across both European and non-European aviation systems (e.g., with FAA [U.S. Federal Aviation Administration]). This should also include coordination with other safety organisations and initiatives such as the European Strategic Safety Initiative; Advisory Council for Aviation, Research, Innovation in Europe; Eurocontrol; and the implementation of safety management systems. EASA and the EHFAG should seek opportunities to influence and coordinate human factors with international bodies such as ICAO [International Civil Aviation Organization].”
Lessons should be learned and shared from many sources, including accident investigations, data analysis and operational experience, the report says.
The EHFAG’s next step will be to develop an action plan from the strategy, converting it into a detailed human factors program by the end of June 2013. “Priority of tasks and actions will be based on the impact to the overall improvement of safety performance,” the report says. An appendix lists specific components of the action plan under several headings, such as “training and competency” and “regulation and rulemaking.”