Reliability, Maintainability and Risk
eBook - ePub

Reliability, Maintainability and Risk

Practical Methods for Engineers including Reliability Centred Maintenance and Safety-Related Systems

  1. 436 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android

About This Book

Reliability, Maintainability and Risk: Practical Methods for Engineers, Eighth Edition, discusses tools and techniques for reliable and safe engineering, and for optimizing maintenance strategies. It emphasizes the importance of using reliability techniques to identify and eliminate potential failures early in the design cycle. The focus is on techniques known as RAMS (reliability, availability, maintainability, and safety-integrity).

The book is organized into five parts. Part 1 on reliability parameters and costs traces the history of reliability and safety technology and presents a cost-effective approach to quality, reliability, and safety. Part 2 deals with the interpretation of failure rates, while Part 3 focuses on the prediction of reliability and risk. Part 4 discusses design and assurance techniques; review and testing techniques; reliability growth modeling; field data collection and feedback; predicting and demonstrating repair times; quantified reliability maintenance; and systematic failures. Part 5 deals with legal, management and safety issues, such as project management, product liability, and safety legislation.

  • 8th edition of this core reference for engineers who deal with the design or operation of any safety-critical systems, processes or operations
  • Answers the question: how can a defect that costs less than $1000 to identify at the process design stage be prevented from escalating to a $100,000 field defect, or a $1m+ catastrophe?
  • Revised throughout, with new examples and standards, including must-have material on the new edition of the global functional safety standard IEC 61508, which launched in 2010

Frequently asked questions

Yes, you can access Reliability, Maintainability and Risk by David J. Smith in PDF and/or ePUB format, as well as other popular books in Business & Insurance. We have over one million books available in our catalogue for you to explore.

Information

Year: 2011
ISBN: 9780080969039
Edition: 8
Subtopic: Insurance
Chapter 1. The History of Reliability and Safety Technology
Safety/Reliability engineering did not develop as a unified discipline, but grew out of the integration of a number of activities, previously the province of various branches of engineering.
Since no human activity can enjoy zero risk, and no equipment has a zero rate of failure, there has emerged a safety technology for optimizing risk. This attempts to balance the risk of a given activity against its benefits and seeks to assess the need for further risk reduction depending upon the cost.
Similarly, reliability engineering, beginning in the design phase, attempts to select the design compromise that balances the cost of reducing failure rates against the value of the enhanced performance.
The abbreviation RAMS is frequently used for ease of reference to reliability, availability, maintainability and safety-integrity.

1.1. Failure Data

Throughout the history of engineering, reliability improvement (also called reliability growth) has been a central feature of development, arising as a natural consequence of the analysis of failure. This ‘test and correct’ principle was practiced long before the development of formal procedures for data collection and analysis, because failure is usually self-evident and thus leads, inevitably, to design modifications.
The design of safety-related systems (for example, railway signaling) has evolved partly in response to the emergence of new technologies but largely as a result of lessons learnt from failures. The application of technology to hazardous areas requires the formal application of this feedback principle in order to maximize the rate of reliability improvement. Nevertheless, as mentioned above, all engineered products will exhibit some degree of reliability growth even without formal improvement programs.
Nineteenth- and early twentieth-century designs were less severely constrained by the cost and schedule pressures of today. Thus, in many cases, high levels of reliability were achieved as a result of over-design, and the need for quantified reliability assessment techniques during design and development was not identified. Failure rates of engineered components were therefore not required, as they are now, as inputs to prediction techniques, and consequently there was little incentive for the formal collection of failure data.
Another factor is that, until well into the twentieth century, component parts were individually fabricated in a ‘craft’ environment. Mass production, and the attendant need for component standardization, did not apply and the concept of a valid repeatable component failure rate could not exist. The reliability of each product was highly dependent on the craftsman/manufacturer and less determined by the ‘combination’ of component reliabilities.
Nevertheless, mass production of standard mechanical parts has been the case for over a hundred years. Under these circumstances defective items can be readily identified, by inspection and test, during the manufacturing process, and it is possible to control reliability by quality-control procedures.
The advent of the electronic age, accelerated by the Second World War, led to the need for more complex mass-produced component parts with a higher degree of variability in the parameters and dimensions involved. The experience of poor field reliability of military equipment throughout the 1940s and 1950s focused attention on the need for more formal methods of reliability engineering. This gave rise to the collection of failure information both from the field and from the interpretation of test data. Failure rate databanks were created in the mid-1960s as a result of work at organizations such as UKAEA (UK Atomic Energy Authority), RRE (Royal Radar Establishment, UK) and RADC (Rome Air Development Center, US).
The manipulation of the data was manual and involved the calculation of rates from the incident data, inventories of component types and the records of elapsed hours. This was stimulated by the advent of reliability prediction modeling techniques that require component failure rates as inputs to the prediction equations.
The availability and low cost of desktop personal computing (PC) facilities, together with versatile and powerful software packages, has permitted the listing and manipulation of incident data with an order of magnitude less effort. Fast automatic sorting of data encourages the analysis of failures into failure modes. This is no small factor in contributing to more effective reliability assessment, since raw failure rates permit only parts count reliability predictions. In order to address specific system failures it is necessary to input specific component failure modes into the fault tree or failure mode analyses.
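As a rough illustration (not taken from the book), the basic manipulation described above, turning incident counts, inventories and elapsed hours into component failure rates and summing them in a parts-count prediction, can be sketched in a few lines of Python. All component names and figures here are hypothetical:

```python
# Illustrative sketch: deriving constant failure rates from field records,
# then a simple parts-count system prediction. Hypothetical data throughout.

def failure_rate(failures, population, elapsed_hours):
    """Estimate a constant failure rate: failures per component-hour."""
    return failures / (population * elapsed_hours)

# Hypothetical field records: (observed failures, units in service, hours each)
field_data = {
    "relay":     (4, 200, 10_000),
    "capacitor": (1, 500, 10_000),
}

rates = {part: failure_rate(*rec) for part, rec in field_data.items()}

# Parts-count prediction: system rate = sum of (component rate x quantity),
# assuming independent failures and that any part failure fails the system.
bill_of_materials = {"relay": 2, "capacitor": 10}
system_rate = sum(rates[p] * qty for p, qty in bill_of_materials.items())

print(f"relay rate:  {rates['relay']:.2e} per hour")
print(f"system rate: {system_rate:.2e} per hour")
```

Note that this uses only raw (total) failure rates, which, as the text observes, supports no more than a parts-count prediction; addressing specific system failures would require splitting each rate into failure modes before feeding a fault tree or failure mode analysis.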
The requirement for field recording makes data collection labor intensive and this remains a major obstacle to complete and accurate information. Motivating staff to provide field reports with sufficient relevant detail is an ongoing challenge for management. The spread of PC facilities in this area will assist in that interactive software can be used to stimulate the required information input at the same time as other maintenance-logging activities.
With the rapid growth of built-in test and diagnostic features in equipment, a future trend ought to be the emergence of automated fault reporting.
Failure data have been published since the 1960s and each major document is described in Chapter 4.

1.2. Hazardous Failures

In the early 1970s the process industries became aware that, with larger plants involving higher inventories of hazardous material, the practice of learning by mistakes was no longer acceptable. Methods were developed for identifying hazards and for quantifying the consequences of failures. They were evolved largely to assist in the decision-making process when developing or modifying plants. External pressures to identify and quantify risk were to come later.
By the mid-1970s there was already concern over the lack of formal controls for regulating those activities which could lead to incidents having a major impact on the health and safety of the general public. The Flixborough incident in June 1974 resulted in 28 deaths and focused public and media attention on this area of technology. Successive events, from the tragedy at Seveso in Italy in 1976 through to the Piper Alpha offshore disaster and the more recent Paddington rail and Texaco oil refinery incidents, have kept that interest alive and resulted in guidance and legislation, which are addressed in Chapters 19 and 20.
The techniques for quantifying the predicted frequency of failures were originally applied to assessing plant availability, where the cost of equipment failure was the prime concern. Over the last twenty years these techniques have also been used for hazard assessment. Maximum tolerable risks of fatality have been established according to the nature of the risk and the potential number of fatalities. These are then assessed using reliability techniques. Chapter 10 deals with risk in more detail.

1.3. Reliability and Risk Prediction

System modeling, using failure mode analysis and fault tree analysis methods, has been developed over the last thirty years and now involves numerous software tools which enable predictions to be updated and refined throughout the design cycle. The criticality of the failure rates of specific component parts can be assessed and, by successive computer runs, adjustments to the design configuration (e.g. redundancy) and to the maintenance philosophy (e.g. proof test frequencies) can be made early in the design cycle in order to optimize reliability and availability. The need for failure rate data to support these predictions has therefore increased and Chapter 4 examines the range of data sources and addresses the problem of variability within and between them.
The value and accuracy of reliability prediction, based on the concept of validly repeatable component failure rates, has long been controversial.
First, the extremely wide variability of failure rates of allegedly identical components, under supposedly identical environmental and operating conditions, is now acknowledged. The apparent precision offered by reliability prediction models is thus not compatible with the accuracy of the failure rate parameter. It can therefore be argued that simple assessments of failure rates and the use of simple models suffice, and that pursuing more precise predictions can be both misleading and a waste of money.
The main benefit of reliability prediction of complex systems lies not in the absolute figure predicted but in the ability to repeat the assessment for different repair times, different redundancy arrangements in the design configuration and different values of component failure rate. This has been made feasible by the emergence of PC tools (e.g. fault tree analysis packages) that permit rapid reruns of the prediction. Thus, judgements can be made on the basis of relative predictions with more confidence than can be placed on the absolute values.
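This relative use of prediction can be sketched as follows (a minimal model with hypothetical figures, not the book's method): rerun the same crude steady-state unavailability calculation for different repair times and redundancy arrangements, then compare the results rather than trusting any absolute value:

```python
# Minimal sketch of relative reliability prediction: vary repair time and
# redundancy in a crude unavailability model and compare. Figures hypothetical.

def unavailability(failure_rate, mean_down_time, redundant_units=1):
    """Approximate steady-state unavailability, assuming independent,
    identical units in parallel where one working unit keeps the system up."""
    q = failure_rate * mean_down_time   # single-unit unavailability
    return q ** redundant_units

lam = 1e-4   # failures per hour (hypothetical)

for mdt in (8, 24, 168):                # mean down times in hours
    single = unavailability(lam, mdt, redundant_units=1)
    duplicated = unavailability(lam, mdt, redundant_units=2)
    print(f"MDT {mdt:>3} h: single {single:.1e}, duplicated {duplicated:.1e}")
```

Even if the assumed failure rate is wrong by a factor of several, the ranking of the candidate configurations and repair policies is far less sensitive to that error, which is the point being made above.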
Second, the complexity of modern engineering products and systems ensures that system failure is not always attributable to single component part failure. More subtle factors, such as the following, can often dominate the system failure rate:
ā€¢ failure resulting from software elements
ā€¢ failure due to human factors or operating documentation
ā€¢ failure due to environmental factors
ā€¢ failure whereby redundancy is defeated by factors common to the replicated units
ā€¢ failure due to ambiguity ...

Table of contents

  1. Cover image
  2. Table of Contents
  3. Front matter
  4. Copyright
  5. Preface
  6. Acknowledgements
  7. Chapter 1. The History of Reliability and Safety Technology
  8. Chapter 2. Understanding Terms and Jargon
  9. Chapter 3. A Cost-Effective Approach to Quality, Reliability and Safety
  10. Chapter 4. Realistic Failure Rates and Prediction Confidence
  11. Chapter 5. Interpreting Data and Demonstrating Reliability
  12. Chapter 6. Variable Failure Rates and Probability Plotting
  13. Chapter 7. Basic Reliability Prediction Theory
  14. Chapter 8. Methods of Modeling
  15. Chapter 9. Quantifying the Reliability Models
  16. Chapter 10. Risk Assessment (QRA)
  17. Chapter 11. Design and Assurance Techniques
  18. Chapter 12. Design Review, Test and Reliability Growth
  19. Chapter 13. Field Data Collection and Feedback
  20. Chapter 14. Factors Influencing Down Time
  21. Chapter 15. Predicting and Demonstrating Repair Times
  22. Chapter 16. Quantified Reliability Centered Maintenance
  23. Chapter 17. Systematic Failures, Especially Software
  24. Chapter 18. Project Management and Competence
  25. Chapter 19. Contract Clauses and Their Pitfalls
  26. Chapter 20. Product Liability and Safety Legislation
  27. Chapter 21. Major Incident Legislation
  28. Chapter 22. Integrity of Safety-Related Systems
  29. Chapter 23. A Case Study
  30. Chapter 24. A Case Study
  31. Chapter 25. A Case Study
  32. Appendix 1. Glossary
  33. Appendix 2. Percentage Points of the Chi-Square Distribution
  34. Appendix 3. Microelectronics Failure Rates
  35. Appendix 4. General Failure Rates
  36. Appendix 5. Failure Mode Percentages
  37. Appendix 6. Human Error Probabilities
  38. Appendix 7. Fatality Rates
  39. Appendix 8. Answers to Exercises
  40. Appendix 9. Bibliography
  41. Appendix 10. Scoring Criteria for BETAPLUS Common Cause Model
  42. Appendix 11. Example of HAZOP
  43. Appendix 12. HAZID Checklist
  44. Appendix 13. Markov Analysis of Redundant Systems
  45. Index