Toward Risk Prediction and Mitigation
FAA Is Taking Steps to Improve Data, but Challenges for Managing Safety Risks Remain
Gerald L. Dillingham, Ph.D., director, physical infrastructure issues, U.S. Government Accountability Office (GAO). Testimony before the U.S. House of Representatives. GAO-12-660T. April 25, 2012. 20 pp. www.gao.gov/products/GAO-12-660T.
Like the U.S. aviation industry itself, its regulator, the U.S. Federal Aviation Administration (FAA), is shifting its emphasis away from “backward-looking” data, such as analysis of accidents, and toward risk prediction and mitigation strategies. Since 1998, as part of that shift, the FAA has partnered with the airlines in the Commercial Aviation Safety Team to identify “sleeping” precursors to accidents and root them out before they cause mischief. Such an approach must be heavily data-driven because latent causal factors may become apparent only in huge numbers of observations.
Dillingham began by outlining the FAA’s processes designed to help ensure the availability of quality data. “For example, FAA established an agency-wide order on data management that specifies the roles and associated responsibilities for data management within the agency,” he said.
“This order applies to all sharable information from FAA and other sources used to perform the agency’s mission.”
The FAA’s Office of Aviation Safety created a four-step process for importing data from other FAA offices and outside sources:
- “Data acquisition — obtaining information from various data owners;
- “Data standardization — validating data by comparing a new data set with previous data sets to identify inconsistencies;
- “Data integration — translating data values into plain English and correcting data errors; [and,]
- “Data loading — importing data into the agency’s own systems.”
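The four steps can be pictured as a small pipeline. The Python sketch below is only an illustration of the sequence described in the testimony, not the FAA's software: the code table, the record fields, the `SafetyStore` class and every function name are assumptions made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical code table: coded values and their plain-English meanings.
CODE_LABELS = {"OE": "operational error", "RI": "runway incursion"}


@dataclass
class SafetyStore:
    """Hypothetical stand-in for the agency's own systems."""
    records: list = field(default_factory=list)


def acquire(owners):
    """Step 1, acquisition: gather records from each data owner."""
    return [dict(rec, owner=name) for name, recs in owners.items() for rec in recs]


def standardize(new_records, previous_records):
    """Step 2, standardization: compare the new set with earlier data.

    Returns (index, missing_fields) for each new record that lacks a
    field seen in the previous data sets.
    """
    known_fields = set()
    for rec in previous_records:
        known_fields.update(rec)
    issues = []
    for i, rec in enumerate(new_records):
        missing = known_fields - set(rec)
        if missing:
            issues.append((i, sorted(missing)))
    return issues


def integrate(records):
    """Step 3, integration: translate codes to plain English, fix simple errors."""
    cleaned = []
    for rec in records:
        code = rec.get("type", "").strip().upper()  # repair stray spaces and case
        cleaned.append({**rec, "type": CODE_LABELS.get(code, "unknown")})
    return cleaned


def load(store, records):
    """Step 4, loading: import the cleaned records into the target store."""
    store.records.extend(records)
    return len(records)


# Example run with made-up records from two hypothetical data owners.
owners = {
    "tower_a": [{"type": " oe ", "date": "2011-03-01"}],
    "tower_b": [{"type": "RI"}],
}
previous = [{"type": "OE", "date": "2010-01-01", "owner": "tower_a"}]

raw = acquire(owners)
issues = standardize(raw, previous)  # the tower_b record has no date
store = SafetyStore()
loaded = load(store, integrate(raw))
```

In this sketch the inconsistencies found in step 2 are only reported; whether a flagged record is held back or corrected before loading is a policy choice the testimony does not describe.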
Dillingham said that the FAA has developed training for users of data systems and has some controls in place to ensure that erroneous data are identified, reported and corrected. “However, several of the databases lacked an important control in that managers do not review the data prior to entry into the system,” he said.
Data limitations and lack of some data hinder the FAA’s ability to manage safety risks, Dillingham said. He cited examples of what the GAO considered the FAA’s data use problems.
Changes to reporting policies: Operational errors by air traffic controllers “have increased considerably in recent years, with the rate nearly doubling for errors in the terminal area from 2008 to 2011. Multiple changes to reporting policies and processes during this time make it difficult to know the extent to which the recent increases in operational errors are due to more accurate reporting, an increase in the occurrence of safety incidents or both.”
He noted that the FAA has adopted a policy of removing controllers’ names from the incident report database, a change the agency believes encouraged reporting and accounts for the apparent increase in operational errors. But he said the agency “has not yet conducted an analysis to validate the linkage.”
Multiple reporting systems and incomplete data: “FAA’s current process for analyzing data on losses of separation captured by [two different systems] only assesses those incidents that occur between two or more radar-tracked aircraft. By excluding incidents such as those that occur between the aircraft and terrain or aircraft and protected airspace, FAA is not considering the systemic risks associated with many other airborne incidents.” The FAA says it will include other kinds of incidents in risk assessment before the end
Lack of coordination among data systems: The FAA is rich in safety reporting systems. They include the Air Traffic Safety Action Program (ATSAP), through which individual controllers report; the Air Traffic Quality Assurance (ATQA) database, used by quality assurance staff; the Traffic Analysis and Review Program (TARP), which captures incidents automatically at some air traffic control facilities; and the Risk Analysis Process (RAP), to which ATQA and TARP feed data.
An appendix to the testimony notes that the FAA also operates the Aviation Safety Information Analysis and Sharing (ASIAS) system and the Air Transportation Oversight System (ATOS).
Dillingham said, “Though both ATSAP and RAP look at some of the same types of incidents (e.g., airborne losses of separation), they had not coordinated on a common set of contributing factors to describe and analyze the incidents. As a result, it is difficult to compare the data and conduct comprehensive analyses. According to FAA officials, they are currently developing a common set of contributing factors for ATSAP and RAP, as well as a translation capability that will allow for the inclusion of historical data on contributing factors in future analyses.”
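A common set of contributing factors, with a translation step for each program's existing labels, might work along the lines of the Python sketch below. It is an assumption-laden illustration: the labels, the shared vocabulary and the function names are invented for the example and are not taken from ATSAP or RAP.

```python
from collections import Counter

# Hypothetical mappings from each program's own labels to a shared vocabulary.
ATSAP_TO_COMMON = {"readback missed": "communication", "tired": "fatigue"}
RAP_TO_COMMON = {"COMM": "communication", "FATG": "fatigue"}


def to_common(label, mapping):
    """Translate one program-specific label; unknown labels are kept visible."""
    return mapping.get(label, "unmapped")


def combined_counts(atsap_labels, rap_labels):
    """Count contributing factors across both programs in the shared vocabulary."""
    counts = Counter(to_common(label, ATSAP_TO_COMMON) for label in atsap_labels)
    counts.update(to_common(label, RAP_TO_COMMON) for label in rap_labels)
    return counts
```

Labels with no mapping are counted as "unmapped" rather than dropped, so that gaps in the translation table show up in the totals when historical data are brought in.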
Limitations of pilot data: Since 1996, U.S. law has required airlines to conduct background checks before hiring pilots, and another law requires the FAA to develop a pilot records database fit for the purpose. “According to the Department of Transportation Inspector General (IG), FAA met the act’s initial milestone in developing a centralized electronic pilot records database that will include records previously maintained by air carriers,” Dillingham said. “However, the IG indicated that FAA needs to address the level of detail that should be captured from air carrier pilot training records — such as determining whether recurrent flight training will be included, determining how to transition from the current practices to the new database without disrupting information flow and deciding how to ensure the reliability of data.”
Lack of ramp incident data: “FAA still collects no comprehensive data on incidents in the ramp area, and the National Transportation Safety Board does not routinely collect data on ramp accidents unless they result in serious injury or substantial aircraft damage,” Dillingham said. “The lack of ramp incident data will pose a challenge as airports move to implement SMS [safety management systems].” Responding to an earlier GAO recommendation that it monitor ramp incidents, the FAA said that it does so indirectly through its oversight of airlines. The agency has also proposed requiring airports with air carrier operations to establish an SMS.
Not tracking runway excursions: “Runway excursions can be as dangerous as incursions; according to Flight Safety Foundation, excursions have resulted in more fatalities than incursions globally [ASW, 8/09, p. 12],” Dillingham said. “FAA does not have a process to track excursions, unlike [that] for runway incursions. We recommended in 2011 that FAA develop and implement plans to track and assess runway excursions. FAA agreed and will be developing a program to collect and analyze runway excursion data and is drafting an order to set out the definitions and risk assessment processes for categorizing and analyzing the data.
“However, according to our review of FAA’s plans, it will be several years before FAA has obtained enough detailed information about these incidents in order to assess risks.”
Difficulty ensuring safety standards for pilot schools and pilot examiners: The FAA is charged with oversight of the “gatekeepers” of initial pilot training, including U.S. Federal Aviation Regulations Part 141 pilot schools and pilot examiners. Dillingham said, “It was unclear from our analysis of FAA inspection data … whether FAA met its oversight requirements, because we could not determine the number of active entities that should have been inspected each year. FAA does not maintain a historical listing of pilot schools and examiners, and thus, we could not define the universe of active entities that was required to be inspected.
“Because of this data limitation, we could not determine the completion percentage of the inspections for either group. In November 2011, we recommended that FAA develop a comprehensive system for measuring its performance in meeting its inspection requirements for pilot schools and examiners. FAA acknowledged our recommendation and noted that (1) it needed to clarify its inspection requirements for pilot schools in the revision of its national oversight policy guidelines, and (2) its new designee management system, which would include oversight of pilot examiners, will provide more comprehensive data once it is developed.”
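The arithmetic behind that finding is simple: a completion percentage is the number of entities inspected divided by the number that should have been inspected, so without the second figure no percentage exists. A minimal Python sketch, in which the function name and the sample figures are illustrative assumptions rather than GAO or FAA data:

```python
def completion_percentage(inspected, active):
    """Return inspections completed as a percentage of active entities.

    Returns None when the number of active entities is unknown or not
    positive, because the ratio has no denominator in that case.
    """
    if active is None or active <= 0:
        return None
    return 100.0 * inspected / active


# Made-up figures: 45 inspections against 60 active schools, then the
# same 45 inspections with no historical listing of active schools.
known = completion_percentage(45, 60)
unknown = completion_percentage(45, None)
```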
Dillingham concluded, “Shifting to a data-driven, risk-based safety oversight approach means that FAA needs data that are appropriate, complete and accurate to be able to identify system-wide trends and manage emerging risks. Furthermore, when implementing changes in safety data reporting systems, or processes used to assess and analyze data to determine risk, FAA needs to take into account how such changes might impact trend analysis. … While FAA is working diligently to improve its data in some instances, more work remains to address limitations and to collect additional data where necessary.”
Causal Factor Library
Lessons Learned From Transport Airplane Accidents, lessonslearned.faa.gov
The website of the U.S. Federal Aviation Administration’s Aviation Safety Information Analysis and Sharing system features a link to the “Lessons Learned” library, which uses the Web’s linking capability to illustrate accident causal factors.
“Three different ‘perspectives’ are used to arrange the accidents in this library and illustrate the complex interrelationship of accident causes,” the site says. “Each accident also contains at least one high-level lesson related to a threat element and at least one lesson related to a theme element.”
While that explanation sounds like educational jargon, the site is logically arranged and easy to use. The three top-level perspectives are “Airplane Life Cycle,” “Accident Threat Categories” and “Accident Common Themes.” Each perspective is represented by four photographs; clicking any photograph leads deeper into the subject.
For example, when you move the cursor into the “Airplane Life Cycle” section of the main screen and click the lower right photo, you see photos representing “Design/Manufacturing,” “Operational” and “Maintenance/Repair/Alteration.” Let us say you click the last subcategory. A description appears:
“As the airplane continues to be operated, maintenance is performed which is intended to keep the airplane in an airworthy condition. Repair may be necessary in order to correct damage or other events that might have occurred. Alterations may also be desired which change the configuration of the original design.”
You are offered two options: “Return to Airplane Life Cycle descriptions” or “View related accidents.” If you select the second option, a page appears with descriptions of relevant accidents, with further links to study each accident in detail.
One accident in which maintenance was a factor was the uncontained engine failure involving a McDonnell Douglas MD-88 at Pensacola, Florida, U.S., on July 6, 1996. The description of the accident says:
“The National Transportation Safety Board determined that the probable cause of this accident was the fracture of the left engine’s front compressor fan hub, which resulted from the growth to failure of a fatigue crack. A causal analysis undertaken as part of the accident investigation revealed the following: The crack initiated from an area of altered microstructure that was created during a hole drilling process by Volvo for Pratt & Whitney. The anomaly went undetected by Volvo’s production inspection system.”
Similarly, back at the main screen, clicking “Accident Common Themes” leads to “Flawed Assumptions,” “Human Error,” “Organizational Lapses,” “Pre-existing Failures” and “Unintended Effects.” Selecting “Human Error” brings up this description:
“This is the most common of all accident themes and exists in one form or another on nearly all accidents. It involves humans that, in the course of doing their work, make errors that are later shown to have caused, or substantially contributed to the accident. These are human actions that, if done correctly, result in a safe outcome, but if done incorrectly, can result in an accident. It also represents one of the greatest opportunities for advancing safety by the application of targeted interventions which are intended to reduce the risks for human error.” Once again, you can choose to open a page with accidents whose causal factors are related to the theme.
The “Accident Threat Categories” perspective leads to a more elaborate link tree; 18 subcategories such as bird hazards and in-flight upsets are listed, each connected to a description and list of related accidents.