Every day, pilots, flight attendants, air traffic control officers (ATCOs), mechanics, and other frontline aviation professionals put their skills and judgment to the test to solve problems that cross their paths.
Be it a failure not covered by any checklists, noncompliant passengers, a recurrent glitch in the flight planning software, or unclear instructions in a maintenance manual, "sharp-end" workers are always searching for ways to overcome challenges and accomplish their duties.
As desirable a trait as this can be on most occasions, this problem-solving attitude can sometimes be detrimental to safety, as it may cause risks to become normalized and hazards to go unreported. When employees deal with operational issues by "fixing and forgetting" (as described by Canadian researchers Tanya Anne Hewitt and Samia Chreim)1 rather than searching for root causes and reporting difficulties, organizations are deprived of important opportunities to learn from and mitigate risks.
In this two-part article, we will explore different problem-solving behaviors and how they impact safety. In Part One, we will delve into the cultural, organizational, and personal factors that contribute to issues being resolved without yielding lessons on how to prevent similar problems in the future. In Part Two, we will build on those insights and devise strategies to help foster a healthier reporting culture and motivate people to approach operational hurdles with a more inquisitive mindset.
Unfortunately, the consequences of inadequate problem-solving are often the subject of investigations of accidents or serious incidents. The effects of fixing and forgetting may seem positive at first, but immediate, local success in dealing with an issue can generate a false sense of control and conceal an imminent failure. This was the case with a Paris-based business aviation operator, where those dynamics led to a January 2022 midair encounter between one of its jets and a narrow-body airliner.
Unresolved Issues
The Cessna Citation 525, based at Paris-Le Bourget Airport, had been scheduled to fly to Geneva with two pilots and two passengers on board. In its final report on the serious incident, the French Bureau d'Enquêtes et d'Analyses (BEA) said that, on their climb to Flight Level (FL) 270 (about 27,000 ft), the crew had engaged the autopilot (AP) in indicated airspeed (IAS) mode and maintained a speed of 200 kt. The climb had been uneventful until they crossed FL 185, when the pilots felt a sudden increase in the load factor. Scanning their instruments, they realized that their pitch had increased to an abnormal value and the airspeed indicators were presenting different readings: 250 kt on the captain's side and 150 kt on the first officer's side.2
The crew disengaged the AP, re-established a normal pitch attitude and re-engaged the autopilot, now in vertical speed mode.
A few minutes later, the first officer, acting as pilot monitoring (PM), glanced at his altimeter and noticed that they had overshot FL 270, their cleared flight level. He alerted the captain, who was pilot flying (PF), about the level bust. The captain checked his own altimeter and saw that it indicated an altitude below FL 270, making the criticality of the situation even more evident. Puzzled, the pilots leveled off and queried air traffic control (ATC) about their flight level. "FL 263," the controller replied. This was consistent with the PF's altimeter, yet the standby altimeter displayed a considerably different value, FL 280, which closely matched the indication on the PM's side. "We have a small problem with our altimeters," the captain said to the ATCO. After the captain described the issue, the controller, with a sense of urgency, indicated that they had traffic at 12 o'clock, 1,000 ft above them. Almost immediately, the Citation crew reported that an Embraer 170 had passed below them. The minimum separation between the jets was later calculated as 1.5 nm (2.8 km) horizontally and 665 ft vertically.
The subsequent BEA analysis concluded that an incorrectly installed hose in the Citation's pitot-static system had likely caused a failure in the air data system that resulted in the unreliable indications received by the crew. Because the business jet's transponder was receiving erroneous data from the faulty system, the decreasing proximity between the two aircraft did not cause the Embraer's airborne collision avoidance system (ACAS) to generate advisory messages or the ATCO to receive a short-term conflict alert (the Citation was not equipped with an ACAS).
The investigation also revealed that the problems that the pilots had experienced were not new; they had presented themselves in 2017, 2019, and again in late 2021, only a month before the incident. The BEA found that the business jet operator's culture of placing operational readiness over safety resulted in the fault not being detected despite several opportunities.
How could a problem in such a critical system be left unresolved for so many years?
According to the BEA report, the first time it occurred, the crew made an entry in the technical logbook and filed a safety report. The aircraft was then inspected by maintenance personnel, who detected debris in the left airspeed indicator system.
Approximately two years later, the issue recurred. This time, however, a miscommunication between the pilots resulted in no safety report being filed, and an overly concise failure description in the logbook prevented the technicians from detecting the fault. Another two years passed, and the problem presented itself once again. The pilots â one of whom had been part of the crew that experienced the fault the previous time â informed maintenance but did not submit a safety report, as a problem in the reporting software did not allow the first officer to complete the submission process.
Although the BEA investigation pointed to several reasons why the pitot-static system issue was not diagnosed and resolved sooner, the bureau noted that the operator had a deficient reporting culture that affected both the information that was transmitted to its safety team and that which was used for technical troubleshooting. Maintenance issues were often analyzed by the chief pilot himself, who decided whether and where to ground the airplanes. Sometimes, as was the case for three of the four pitot-static system-related events (including the incident flight), crews would consult the chief pilot or management for guidance and were instructed to delay reporting faults until the aircraft was back at Le Bourget. At times, pilots were even told not to report any issues in the logbook, causing some problems to be "magically" discovered during scheduled maintenance and others to be left unresolved. These, in turn, gradually became part of the DNA of many of the operator's aircraft. Jets had their own "quirks" (in the form of inoperative equipment and malfunctioning systems), and crews learned how to deal with them to keep operations running and to minimize risks as much as possible in those substandard conditions.
Adapting to, and Adopting, Lower Standards
Human beings have remarkable adaptive capacity, Ohio State University Professor David Woods explains. This allows them to respond to the most dynamic situations and solve problems successfully, especially when these problems fall outside what a system is designed to handle. However, adaptive capacity may be a double-edged sword. People do not solve problems in the same way, and, as the factors that led to the 2022 midair encounter demonstrate, their actions are not always conducive to safety. While adaptability is seen as a hallmark of resilient systems, sometimes the ways in which issues are dealt with by frontline staff may be counterproductive to the management of risks within an organizational context, a phenomenon Woods calls "maladaptation."3
A 2002 Harvard University study by researchers Anita Tucker, Amy Edmondson, and Steven Spear4 showed that individuals may adopt two different types of responses when facing a problem: one that hinders organizational learning and one that is conducive to it.
The first type of response is first-order problem-solving, or fixing and forgetting: actions that resolve an issue and allow work to continue but do nothing to prevent the issue from recurring. This type of response is commonly witnessed in aviation. In aircraft maintenance, for example, the lack of the right tools to conduct a certain task may cause technicians to resort to other, less adequate tools or to find alternative ways to accomplish the job. Although these "workarounds" may not necessarily be unsafe or represent a procedural violation, they become problematic when the risky situation that rendered them necessary in the first place is left unreported and, consequently, unaddressed. First-order problem-solving hampers the gathering of data that could justify the adoption of corrective actions by management. If issues are only fixed but never communicated, their underlying causes cannot be remedied.
The second type of response, called second-order problem-solving, or "fixing and reporting," involves investigating the underlying causes of an issue with the goal of preventing its recurrence. This is the gist of safety management systems (SMS). Taking the time to carefully examine the problems that frontline staff experience, as opposed to simply fixing them on the spot and resuming work, yields an important opportunity for organizational learning, one of the pillars of a safety culture, as emphasized by James Reason.
The Harvard researchers observed 22 nurses in eight different hospitals for a total of 197 hours and concluded that first-order problem-solving prevailed over second-order problem-solving. The vast majority of responses allowed the nurses to continue caring for their patients but ignored any possibility of investigating or addressing the causes of the problems. So, what factors contributed to the predominance of first-order problem-solving?
First, the researchers identified insufficient time to "fix and report." Overcoming a problem and then reporting it takes longer than only fixing it and moving on. Just as the nurses did not have enough time to engage in any activity other than caring for patients, frontline aviation professionals often lack the time to pause their work to write a report, as simple as that task might appear.
As a second contributing factor, the researchers found poor teamwork and a low degree of psychological safety. They explained that, when nurses shared their concerns with doctors or management, their findings would be dismissed, or they would be asked to "prove" that they had indeed encountered a problem. Physicians and managers would also refrain from transmitting to nurses important information about patients' treatment plans and other critical processes because of the nurses' perceived hierarchical level. "[W]e found that doctors, at times, treated nurses as low-status workers," the researchers noted. In aviation, despite the widely acknowledged benefits of crew resource management, deficient teamwork and low psychological safety persist, especially between highly experienced pilots and young first officers, or between flight crew and cabin crew.
Next, the study found that the nurses' sense of fulfillment and pride, stemming from challenges that they had successfully overcome in the past, motivated them to adopt first-order problem-solving behaviors when faced with hurdles. For author Daniel Pink, intrinsic motivation has three components: purpose, mastery, and autonomy.5 Repeated exposure to situations that required them to use their skills, wits, and adaptive capacity bolstered the nurses' sense of mastery, further reinforcing the "benefits" of fixing and forgetting. Unfortunately, crews can also fall victim to the same phenomenon. The portrayal of pilots who "save the day" as the epitome of airmanship may cause some crewmembers to believe that the best pilots are those who successfully circumvent adversities, rather than those who adopt careful risk management practices and diligently follow standard operating procedures.
While having nurses' decisions questioned by medical practitioners and managers inhibited second-order problem-solving, giving them autonomy without support was equally harmful. The authors observed that "empowering" nurses through the removal of direct managerial assistance, as seen in some hospitals, caused the deterioration of effective problem-solving when such initiatives did not include measures to bridge the gap between nurses and other members of the organization with whom they were required to interact. Additionally, they noted that any potential benefits in reducing managerial oversight were lost when professionals were already overburdened by their normal duties, leaving them no time to engage in the resolution of higher-level problems.
Lastly, the study revealed that resorting to local "quick fixes," as opposed to more thoroughly planned and longer-lasting adjustments, made nurses feel temporarily satisfied, a neurobehavioral process called temporal discounting. Also commonly referred to as "discounting the future," this phenomenon causes individuals to weigh immediate rewards more heavily than longer-term ones, a behavior that was also evident at the French business jet operator. The focus seemed to be on completing flights no matter what technical issues might arise. Dealing with failures on the spot was more "rewarding" than adopting a more strategic approach and aiming for long-term fleet availability, which would have been the most financially sound decision.
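To see how heavily the present can outweigh the future, consider the brief sketch below. It is purely illustrative and is not drawn from the Tucker study or the BEA report: it applies a standard hyperbolic discounting formula, and the payoff values and discount rate k are hypothetical, chosen only to show how a small reward felt immediately can seem just as attractive as a much larger reward that arrives months later.

```python
# Illustrative sketch only: hyperbolic "discounting the future."
# The payoffs and the discount rate k are hypothetical values chosen
# for demonstration; they are not taken from the studies cited above.

def discounted_value(payoff: float, delay_days: float, k: float = 0.05) -> float:
    """Subjective present value of a payoff received after a delay."""
    return payoff / (1.0 + k * delay_days)

# A quick fix pays off right away; a reported, root-cause fix pays off
# far more, but only months later.
quick_fix = discounted_value(payoff=10.0, delay_days=0)
systemic_fix = discounted_value(payoff=100.0, delay_days=180)

print(f"Quick fix feels like:    {quick_fix:.1f}")    # 10.0
print(f"Systemic fix feels like: {systemic_fix:.1f}")  # 10.0
# With k = 0.05, the ten-times-larger but delayed payoff is discounted
# to the same subjective value as the small immediate one.
```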
The Consequences
When first-order problem-solving defines how an organization and its members deal with issues, increasingly lower operating standards become accepted, a practice widely known in safety science as normalization of deviance. Tucker and her team concluded in their study that the prevalence of fixing and forgetting resulted in organizational systems becoming worse or staying the same over time. Nurses would feel proud of themselves for overcoming adversities and resolving issues, but without reporting and deeper analyses, they would also find themselves having to fix the same problems repeatedly. A similar sentiment likely also existed among the crews of the business jet operator.
By working as what the late Massachusetts Institute of Technology Professor Donella Meadows would call a "destructive reinforcing feedback loop,"6 first-order problem-solving and the "quick fixes" it brings about push second-order problem-solving even further out of reach for work groups and allow local successes at overcoming issues to turn into widespread organizational vulnerabilities.
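A minimal sketch of such a loop, assuming hypothetical variables and rates (none of which come from Meadows or the studies cited here), is shown below: every quick fix that goes unreported adds to the stock of unresolved issues, the accumulating issues raise workload, and the higher workload leaves even less time to report, so the loop feeds on itself.

```python
# Illustrative sketch only: a toy reinforcing feedback loop in the spirit
# of systems thinking. All variables, rates, and starting values are
# hypothetical and chosen purely for demonstration.

reporting_rate = 0.8       # fraction of new issues that get reported
unresolved_issues = 5.0    # latent problems already in the system
workload = 1.0             # relative pressure on frontline staff

for month in range(12):
    new_issues = 3
    reported = new_issues * reporting_rate
    quick_fixed = new_issues - reported        # fixed and forgotten
    unresolved_issues += quick_fixed           # latent risk accumulates
    workload = 1.0 + 0.05 * unresolved_issues  # recurring faults add work
    # Higher workload leaves less time to report, closing the loop.
    reporting_rate = max(0.1, reporting_rate - 0.02 * (workload - 1.0))
    print(f"month {month + 1:2d}: reporting rate {reporting_rate:.2f}, "
          f"unresolved issues {unresolved_issues:.1f}")
```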
Lucca Carrasco Filippo is a flight safety specialist and a former commercial helicopter pilot. He holds a degree in aeronautical sciences and multiple qualifications in safety management systems, human factors, and incident investigation.
Notes
- Hewitt, T.A.; Chreim, S. (2015). "Fix and Forget or Fix and Report: A Qualitative Study of Tensions at the Front Line of Incident Reporting." BMJ Quality & Safety Volume 24 (Issue 5): 303–310.
- Bureau d'Enquêtes et d'Analyses (BEA). (2023). "Serious Incident Between the Cessna Citation 525 CJ Registered F-HGPG Operated by Valljet and the Embraer 170 Registered F-HBXG Operated by HOP! on 12 January 2022 En Route South of Auxerre (Yonne)."
- Woods, D. "Essential Characteristics of Resilience." In Resilience Engineering (1st ed., pp. 21–33). Aldershot, Hampshire, U.K.: Ashgate, 2006.
- Tucker, A.L.; Edmondson, A.C.; Spear, S. (2002). "When Problem Solving Prevents Organizational Learning." Journal of Organizational Change Management Volume 15 (Issue 2): 122–137.
- Pink, D.H. "Drive: The Surprising Truth About What Motivates Us" (1st ed.). Riverhead Books, 2011.
- Meadows, D. "Thinking in Systems." White River Junction, Vermont, U.S.: Chelsea Green Publishing, 2008.