In recent blogs we have discussed failure in time (FIT) rates (Have you got your diagnostics covered?) and their use in functional safety activities. (Functional) safety is one element of a much wider subject area: reliability, availability, maintainability and safety (RAMS). As a company we operate mainly in two industries, automotive and medical, but we dabble in other sectors from time to time. The industry that arguably has the most encompassing approach to RAMS is the rail sector, so we will take a look at some of the key points in EN 50126 and its approach to the subject. The automotive and medical device sectors address safety fairly comprehensively, but what about the R, the A and the M?
Reliability is a key topic for any product development. Defining an expected service life is a requirement in most industry standards, but in some sectors it can be a critical factor. Just think of devices installed in remote locations or devices that cannot be serviced readily. We have worked on projects where devices may be installed in a desert or in the Arctic.
Reliability is defined as “ability to perform as required, without failure, for a given time interval, under given conditions”.
One key technique for assessing the reliability of a system is a reliability block diagram (RBD), which can be used to define the probability of the loss of the system function. Figure 1 illustrates a simple RBD:
FIT rates (denoted by lambda, λ) enable the reliability of a system to be calculated, either as mean time to failure (MTTF) for non-repairable systems or as mean time between failures (MTBF) for repairable systems. For a constant failure rate, MTTF ≈ 1/λ and MTBF ≈ 1/λ.
In Figure 1, on the basis that the FIT rate for TR1 is too high and the dominant failure mode of a transistor is short circuit, a redundant circuit (TR2 and R2) can be added, yielding a more reliable system.
However, one assumption often made is that using redundancy will automatically yield a more reliable system. The excellent reference by Alessandro Birolini, “Reliability Engineering”, suggests quite the contrary. Redundant elements have ever-diminishing returns in producing a reliable system (with ever-increasing cost). The key steps, in priority sequence, are as follows:
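As a rough illustration of the relationship above, the following sketch converts a FIT rate into a failure rate and an MTTF. It assumes a constant failure rate and uses the usual convention that 1 FIT is one failure per 10^9 device-hours. The function names and the example FIT value are hypothetical and chosen only for illustration.

```python
import math

HOURS_PER_BILLION = 1e9   # 1 FIT = 1 failure per 10^9 device-hours
HOURS_PER_YEAR = 8760.0   # 365 days x 24 h

def fit_to_lambda(fit: float) -> float:
    """Convert a FIT rate into a failure rate lambda in failures per hour."""
    return fit / HOURS_PER_BILLION

def mttf_hours(fit: float) -> float:
    """MTTF (or MTBF) in hours, assuming a constant failure rate: 1 / lambda."""
    return 1.0 / fit_to_lambda(fit)

def reliability(fit: float, hours: float) -> float:
    """Probability of surviving `hours` without failure: R(t) = exp(-lambda * t)."""
    return math.exp(-fit_to_lambda(fit) * hours)

if __name__ == "__main__":
    example_fit = 100.0  # hypothetical component FIT rate, not from any datasheet
    print(f"MTTF: {mttf_hours(example_fit) / HOURS_PER_YEAR:.0f} years")
    print(f"R(10 years): {reliability(example_fit, 10 * HOURS_PER_YEAR):.4f}")
```

Note that an MTTF of many hundreds of years does not mean an individual device lasts that long. It is a statistical figure that only holds while the constant failure rate assumption is valid, i.e. within the useful life of the component.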
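The effect of adding a redundant branch can be sketched with the two basic RBD rules: blocks in series multiply their reliabilities, and blocks in parallel fail only if every branch fails. The sketch below assumes independent failures and that either branch alone can deliver the function. The block reliabilities are hypothetical values, and the structure is an assumed reading of the transistor and resistor example rather than a reproduction of Figure 1.

```python
from math import prod

def series(*reliabilities: float) -> float:
    """All blocks must work: R = R1 * R2 * ... * Rn."""
    return prod(reliabilities)

def parallel(*reliabilities: float) -> float:
    """At least one branch must work: R = 1 - (1 - R1) * ... * (1 - Rn)."""
    return 1.0 - prod(1.0 - r for r in reliabilities)

# Hypothetical reliabilities over the mission time (illustrative only)
R_TRANSISTOR = 0.95
R_RESISTOR = 0.99

# One branch: transistor and resistor in series
single_branch = series(R_TRANSISTOR, R_RESISTOR)

# Two identical branches as redundant paths
redundant_system = parallel(single_branch, single_branch)

if __name__ == "__main__":
    print(f"Single branch:    {single_branch:.4f}")
    print(f"Redundant system: {redundant_system:.4f}")
```

Whether a real circuit can be modelled as a simple parallel structure depends on the failure modes involved, so the block diagram should follow the functional logic rather than the schematic layout.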
- Destressing components (thermal, voltage, mechanical)
- Ensuring interfaces are properly defined
- Ensuring the EMC & ESD design is appropriate
- Lastly, redundancy
In the case of a system with ‘n’ identical redundant elements, each with a constant failure rate λ and with no repair, the MTTF is calculated as follows:
MTTF = (1/λ) × (1 + 1/2 + … + 1/n)
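The diminishing returns of redundancy become visible when this formula is evaluated for increasing n. The sketch below is a minimal illustration that normalises λ to 1, so the result is the MTTF gain relative to a single element. The function name is hypothetical.

```python
def mttf_redundant(lam: float, n: int) -> float:
    """MTTF of n identical redundant elements: (1/lam) * (1 + 1/2 + ... + 1/n)."""
    if lam <= 0 or n < 1:
        raise ValueError("lam must be positive and n must be at least 1")
    return (1.0 / lam) * sum(1.0 / k for k in range(1, n + 1))

if __name__ == "__main__":
    lam = 1.0  # normalised failure rate, so the output is a gain factor
    previous = 0.0
    for n in range(1, 6):
        current = mttf_redundant(lam, n)
        # Each additional element contributes only 1/n of a single-element MTTF
        print(f"n={n}: MTTF gain {current:.3f} (added {current - previous:.3f})")
        previous = current
```

The second element adds half of a single-element MTTF, the third adds a third, and so on, while the cost of each extra element stays roughly the same.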
Availability is the ability of an item to be in a state to perform a required function under given conditions at a given instant of time or over a given time interval, assuming that the required external resources are provided.
There are four principal strategies for improving availability:
- Provision for operation in degraded mode, in the presence of a failure
- Improving the maintainability of the system, i.e. reducing the mean time to restore (MTTR)
- Provision of sufficient resources (staff, test equipment, spares)
- Provision for redundancy to prevent failures resulting in a loss of function (see point above about redundancy)
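The influence of these strategies can be sketched with the commonly used steady-state availability relationship A = MTBF / (MTBF + MTTR), where MTTR is the mean time to restore the system after a failure. This formula is not taken from the text above, and the figures used are hypothetical. It is only meant to show how reliability and maintainability trade off against each other.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: uptime / (uptime + downtime)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("mtbf_hours must be positive and mttr_hours non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)

if __name__ == "__main__":
    # Hypothetical baseline: a failure every 1000 h, 10 h to restore
    baseline = availability(1000.0, 10.0)
    # Better maintainability: restore time halved (spares and staff on site)
    faster_repair = availability(1000.0, 5.0)
    # Better reliability: time between failures doubled
    more_reliable = availability(2000.0, 10.0)
    print(f"Baseline:      {baseline:.5f}")
    print(f"Faster repair: {faster_repair:.5f}")
    print(f"More reliable: {more_reliable:.5f}")
```

Under this simple model, halving the restoration time has the same effect on availability as doubling the time between failures, which is why maintainability and resource provision appear in the list alongside redundancy.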
Maintainability is a combination of all technical and management actions intended to retain an item in, or restore it to, a state in which it can perform as required.
Key points for maintainability:
- Frequency and time for the performance of planned or unplanned maintenance
- Time for detection and identification of the faults
- Time for the restoration of the failed system (unplanned maintenance)
The role of human factors
One of the key strengths of the EN 50126 standards is building the bridge between technical safety concerns and human factors, a topic we have covered in previous blogs, in particular concerning autonomous driving (SOTIF – The Human Factor). EN 50126 gives an excellent summary of the aspects that can impact RAMS when considering human factors. It looks in particular at both systematic failures (lack of training or competence) and random failures (lapses) in human beings, and defines many different aspects that require consideration in the RAMS lifecycle.
The ability to specify RAMS in any industry is important, as we mentioned at the beginning of this blog. Many aspects of the medical device safety standard IEC 60601 concern the expected service life of the product. ISO 26262 in the automotive sector works with FIT rates for hardware metrics, but there is very little attention given to how the R, the A and the M are achieved. A good reference source when considering ‘state of the art’ is EN 50126 and its associated standards, as they approach the subject in a pragmatic and comprehensive manner.
By Alastair Walker, Consultant