1 An Introduction to Failure Analysis
The people we’ve worked with started doing in-depth failure analysis on industrial equipment in the mid-1960s. Before that time there were few industrial failure analyses, and those that were done were concerned only with understanding the physical causes. The efforts of those early analysts were primarily linked with an interest in improving the reliability and capacity of production equipment in chemical plants. From their work, and from a manufacturing and processing viewpoint, it wasn’t until the early 1970s that people began to realize that the true sources of industrial problems were much more complex.
What is failure analysis? There are probably as many definitions as there are people to ask, but we prefer to think of it as “the process of interpreting the features of a deteriorated system or component to determine why it no longer performs the intended function”. Failure analysis entails using deductive logic to find the physical and human causes of the failure, then using inductive logic to find the latent causes. From an understanding of these “failure roots”, there should be a path to the changes needed to prevent the recurrence of the incident.
Some people in industry prefer not to use the term failure analysis, and more than once we have heard a statement such as “We don’t want our maintenance improvement (or some similar) program being driven by concentrating on failures”. It’s easy to understand their words but impossible to understand their logic. Most of us learn from our mistakes. In the same way that professional athletes study game videos or farmers analyze soils and crop yields, failure analysis allows us to look at our weaknesses and errors, gain knowledge from them, and try to do a better job the next time.
This book is an attempt at a manual that explains how and why mechanical machinery fails and how to solve those problems. Because no single text can address all failures, it tries to explain how the basic failure mechanisms occur, the things we all do to cause machinery problems, how to recognize those things, and what to do to prevent similar incidents in the future. Unfortunately, there is an almost infinite number of failure symptoms and appearances, and the book can’t address all of them. But it should allow the careful reader to analyze and solve the great majority of the mechanical failures that occur in typical paper mills, chemical plants, power plants, and manufacturing facilities.
THE CAUSES OF FAILURES
Why are there premature equipment failures? When the people closely involved with the failure are asked this question, they almost always say it is “the other guy’s fault”. If one were to ask a plant millwright or a maintenance mechanic, the most likely answer to that question would be “operator error”. But if the same question were asked of an operator who worked in the plant with that millwright, their answer might be “because it wasn’t properly repaired”. At times there is some validity to both of these answers, but the honest and complete answer is always much more complex.
It would be nice and neat if there were only one cause per failure, because eliminating the problem would be easy, but in reality there are multiple causes for every equipment problem. Unfortunately, many people believe that there is only one cause for a failure. However, look at the analysis of any well-studied major disaster and ask if there was only one cause. Was there a single cause for the BP oil well disaster? … Three Mile Island? … the Exxon Valdez mess? … Bhopal? … Chernobyl? … a major airplane crash? The analyses of these and other well-recognized and extensively studied failures show that they all have multiple causes. Why, then, would any intelligent person believe a typical pump or fan failure would be different? In the case of Three Mile Island, there were three huge studies, each commissioned by one of the responsible groups. All three of the studies said there were numerous causes but that it was “primarily the fault of the other two organizations”. In doing failure analyses, it is often amusing to listen to the management staff talk about how the workforce “messed up” without any recognition at all of how their own engineering and management practices were involved.
At an international conference on failure analysis, a presentation was made on the causes of aircraft equipment failures. The presentation data showed:
- 30% – Manufacturing Errors
- 26% – Design Errors
- 23% – Maintenance Errors
- 18% – Material Selection
- 3% – Operation
During the question-and-answer session after the presentation, a member of the audience asked the speaker why they had listed only one cause for each failure when there were usually multiple causes. The speaker agreed with the questioner’s point, but then said, “There was only one blank on the form”. This answer is a quote and an interesting testimony to the general public’s lack of perception.
When people discuss cases that have been carefully studied, such as those listed earlier, they almost always agree that there are multiple causes for each. Yet when directly involved with a failure, the ability to be objective seems to disappear and, ignoring reality, many people come up with conclusions such as those mentioned in the presentation above. They then take this data, draw an attractive pie chart or bar graph, and point to a nice neat single cause for every failure … when an honest analysis clearly shows that this is neither true nor logical.
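The effect of “only one blank on the form” can be sketched with a few lines of Python. The failure records below are entirely hypothetical, invented for illustration and not taken from the conference data, and the function names are our own. The sketch tallies the same records twice: once keeping only the first cause listed, as a one-blank form would, and once counting every contributing cause.

```python
from collections import Counter

# Hypothetical failure records, invented purely for illustration.
# Each record lists every contributing cause an analyst identified.
failures = [
    {"id": "pump-01", "causes": ["design", "maintenance", "operation"]},
    {"id": "fan-02", "causes": ["manufacturing", "maintenance"]},
    {"id": "gearbox-03", "causes": ["design", "material selection"]},
    {"id": "pump-04", "causes": ["maintenance", "operation"]},
]


def single_cause_tally(records):
    """Count only the first listed cause, as a one-blank form would."""
    return Counter(r["causes"][0] for r in records)


def multi_cause_tally(records):
    """Count every contributing cause for every failure."""
    return Counter(c for r in records for c in r["causes"])


single = single_cause_tally(failures)
multi = multi_cause_tally(failures)

print("One blank on the form:", dict(single))
print("All contributing causes:", dict(multi))
```

In this made-up data set, the single-cause tally never mentions operation at all and shows maintenance only once, even though maintenance contributed to three of the four failures. The pie chart drawn from the first tally would look tidy, and it would hide most of what the analysts actually found.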
Two comments:
- A. Later in the session, another group analyzed the same basic aircraft equipment failure data set but reached very different conclusions. They, too, sorted the data on the assumption of a single cause for each failure!
- B. One of the questions that I find interesting is, “Why don’t these people recognize that many failures ha...