Fault Tree Analysis - Finding root causes of failures
When a complicated system fails, it is not always easy to understand what caused the failure. For example, if people are dissatisfied with an employee evaluation exercise, the cause is not immediately apparent. There can be many factors that have to occur simultaneously, or several individual factors operating independently, or both. If your car fails to start, there isn't always just one cause or even a fixed set of causes. Imagine how much more complex it must be to analyze the failure of an integrated circuit with its millions of components all networked together. To help evaluate such high-level faults, engineers developed Fault Tree Analysis (FTA).
How Fault Tree Analysis Works
Many people don't see the need for a formal way of analyzing errors. There are several reasons to have one, and documentation is among the most important. Even if a single person can completely comprehend a system and figure out what is causing a problem, that knowledge cannot easily be transferred to other people or become part of the organization's knowledge base without a structured way of presenting it.
A single fault tree analysis is centered around just one fault. If there are many faults, we will need many fault tree diagrams. Each diagram starts with the main problem at the top of the page. Experts are then consulted as to what could have caused it. If the system is a complex one, we will end up with many different causes - some of which must occur concurrently to generate the fault. Sometimes the fault will be triggered only when certain environmental conditions are met - such as humidity and temperature. In other situations, a single problem elsewhere is enough to cause the error.
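The structure described above can be sketched in code. The following is only an illustrative sketch, not a standard FTA tool: the class names `Event` and `Gate`, the function `occurs`, and the specific causes listed for the car example are all hypothetical. An OR gate models a single cause that is enough on its own, while an AND gate models causes that must occur concurrently.

```python
from dataclasses import dataclass, field
from typing import List, Set, Union


@dataclass
class Event:
    """A basic cause at the bottom of the tree (hypothetical helper)."""
    name: str


@dataclass
class Gate:
    """An intermediate or top-level fault combining its inputs with AND or OR."""
    name: str
    kind: str  # "AND": all inputs must occur; "OR": any one input is enough
    inputs: List[Union["Gate", Event]] = field(default_factory=list)


def occurs(node: Union[Gate, Event], active: Set[str]) -> bool:
    """Return True if the fault at `node` occurs, given the active basic events."""
    if isinstance(node, Event):
        return node.name in active
    results = [occurs(child, active) for child in node.inputs]
    if node.kind == "AND":
        return all(results)
    if node.kind == "OR":
        return any(results)
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical tree for the top-level fault "car fails to start".
# The causes are invented for illustration, not taken from a real analysis.
top = Gate("car fails to start", "OR", [
    Event("battery dead"),
    Event("fuel tank empty"),
    # Environmental conditions that only matter when they occur together.
    Gate("moisture in ignition", "AND", [
        Event("high humidity"),
        Event("low temperature"),
    ]),
])

print(occurs(top, {"battery dead"}))                      # single cause suffices
print(occurs(top, {"high humidity"}))                     # one condition alone is not enough
print(occurs(top, {"high humidity", "low temperature"}))  # both conditions together
```

Keeping one top-level fault per tree, as in this sketch, mirrors the rule that each diagram is centered around a single fault.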
Using specific symbols that were initially used for analyzing electronic systems, we can develop a "tree" that gives the viewer an exhaustive overview of the high-level fault. In addition, we can assign probabilities to individual events and, using the rules of probability theory, compute the chance that the high-level fault will occur.
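As a rough illustration of the probability calculation, the sketch below combines event probabilities through AND and OR gates. It assumes all basic events are statistically independent, which a real analysis would need to verify, and every number in it is made up for the example. The helper names `p_and` and `p_or` are hypothetical.

```python
from math import prod


def p_and(*probs: float) -> float:
    """Probability that all independent input events occur (AND gate)."""
    return prod(probs)


def p_or(*probs: float) -> float:
    """Probability that at least one independent input event occurs (OR gate)."""
    return 1.0 - prod(1.0 - p for p in probs)


# Made-up probabilities for a hypothetical "car fails to start" tree.
p_battery_dead = 0.02
p_fuel_empty = 0.01
p_high_humidity = 0.10
p_low_temperature = 0.20

# Intermediate fault: both environmental conditions must hold at once.
p_moisture = p_and(p_high_humidity, p_low_temperature)

# Top-level fault: any one of the three branches is enough.
p_top = p_or(p_battery_dead, p_fuel_empty, p_moisture)

print(f"P(moisture in ignition) = {p_moisture:.4f}")
print(f"P(car fails to start)   = {p_top:.4f}")  # about 0.049
```

If the same basic event appears in more than one branch, the branches are no longer independent and these simple formulas no longer apply directly.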
This ties in neatly with FMEA (Failure Mode and Effects Analysis), where we need such probabilistic information to determine the priority of a fault. The two techniques therefore go hand in hand, and together they provide a much better picture of the cause of the error. This information can then be used to make production systems more robust, even proactively, and to achieve a higher quality of product.