Reliability prediction has become an important part of the decision-making process for companies selecting networking and telecommunications equipment. It allows one to predict the frequency of failures over time and to plan maintenance and repair costs.
Reliability prediction serves other purposes:
• To assess the effect of product reliability on maintenance activities and spares forecasting
• To provide input into system-level reliability modeling
• To compare predicted reliability against competing products
• To set goals for reliability testing or field performance
• To determine reliability design impact on differing design architectures
Resysnet will perform reliability predictions based on industry standards, supplier data, and customer-specific field performance measures. Base failure rates are derived from Telcordia's SR-332, Reliability Prediction Procedure for Electronic Equipment. The base failure rates in this industry standard were derived from the field-performance FIT rates of service providers such as SBC, AT&T, and Verizon.
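To make the FIT unit concrete, here is a minimal sketch of how component failure rates roll up into a board-level figure. One FIT is one failure per 10^9 device-hours. The component names and FIT values are hypothetical, and the sketch simply adds rates for a series design; it omits the adjustment factors that a full SR-332 prediction applies, so treat it as an illustration rather than the procedure itself.

```python
HOURS_PER_YEAR = 8760
FIT_HOURS = 1e9  # one FIT = one failure per 10^9 device-hours


def system_fit(component_fits):
    """Series roll-up: if any part failing fails the board, the rates add."""
    return sum(component_fits.values())


def mtbf_hours(fit):
    """Mean time between failures implied by a constant failure rate."""
    return FIT_HOURS / fit


def expected_failures_per_year(fit, units):
    """Expected failures per year across a population of deployed units."""
    return units * fit * HOURS_PER_YEAR / FIT_HOURS


# Hypothetical component FIT values, for illustration only.
board = {"cpu": 400.0, "memory": 250.0, "power_supply": 850.0, "optics": 500.0}

total = system_fit(board)
print(f"board failure rate: {total:.0f} FIT")
print(f"MTBF: {mtbf_hours(total):,.0f} hours")
print(f"expected failures per year, 1000 units: "
      f"{expected_failures_per_year(total, 1000):.2f}")
```

The last figure is the kind of number that feeds maintenance planning and spares forecasting.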
Nonetheless, not all engineering design teams are alike, and companies often find that Telcordia predictions tend to be conservative. Resysnet will therefore also analyze your installed base of product, where it has accumulated statistically significant operating hours, to determine a correlation factor relating predictions to field data.
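A minimal sketch of such a factor follows, assuming it is taken as the ratio of the predicted failure rate to the failure rate observed in the field. The function names and the sample figures are hypothetical. This is a point estimate only; a real analysis would also put confidence bounds on the field rate before trusting the ratio.

```python
def field_fit(failures, unit_hours):
    """Observed failure rate in FIT: failures per 10^9 accumulated unit-hours."""
    return failures / unit_hours * 1e9


def correlation_factor(predicted_fit, failures, unit_hours):
    """Predicted rate divided by observed rate (assumed definition).

    A value above 1 means the prediction was conservative: the product
    fails less often in the field than the prediction suggested.
    """
    return predicted_fit / field_fit(failures, unit_hours)


# Hypothetical data: 2000 FIT predicted, 40 confirmed failures
# over 50 million unit-hours of field operation.
observed = field_fit(failures=40, unit_hours=50e6)
factor = correlation_factor(2000.0, failures=40, unit_hours=50e6)
print(f"field rate: {observed:.0f} FIT, correlation factor: {factor:.2f}")
```

With these sample numbers the field rate is about 800 FIT, so the prediction overstates the failure rate by a factor of roughly 2.5.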
SYSTEM RELIABILITY ANALYSIS
Service providers require high levels of performance and availability due to consumer demand for "Always On" or "On Demand" service. Network outages reduce service availability, and the resulting downtime leads to customer dissatisfaction and an eroding customer base. Product resilience to system failures must be estimated to assess the uptime performance of the system. Is your system 6-nines, 5-nines, or 4-nines, and what are your customers expecting when deploying your product? These questions are answered by performing a System Reliability Analysis (SRA). The scope of an SRA includes:
• Analysis of system architecture and redundancies
• Analysis of single points of failure and critical failure modes
• Detailed review of the power distribution system, processor control, failover protocols, data paths, backplane design, and cooling system design
• Assessment of software fault detection capabilities and identification of potential issues impacting high availability (HA)
• MTTR (mean-time-to-repair) and recovery protocol analyses
• Downtime impact of software upgrades, maintenance activities, and repairs
• Development of RBDs (reliability block diagrams) and Markov models
• Estimation of downtime and availability for the Core System and each Individual Interface
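To put the "nines" in concrete terms, the short sketch below converts an availability target into allowed downtime per year, and shows the standard steady-state relationship between MTBF, MTTR, and availability. The function names and the sample MTBF and MTTR values are hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(nines):
    """Annual downtime allowed by an availability of N nines (5 -> 99.999%)."""
    unavailability = 10.0 ** (-nines)
    return unavailability * MINUTES_PER_YEAR


def steady_state_availability(mtbf_hours, mttr_hours):
    """Long-run fraction of time a repairable unit is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


for nines in (4, 5, 6):
    minutes = downtime_minutes_per_year(nines)
    print(f"{nines}-nines: {minutes:.3f} minutes of downtime per year")

# Hypothetical unit: 500,000-hour MTBF and a 4-hour mean time to repair.
a = steady_state_availability(500_000, 4)
print(f"availability: {a:.7f}")
```

Four nines allows roughly 52.6 minutes of downtime per year, five nines about 5.3 minutes, and six nines about half a minute, which is why MTTR and recovery protocols matter as much as the failure rate itself.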
Resysnet will perform this analysis in compliance with Telcordia's SR-1171, Methods and Procedures for System Reliability Analysis. This industry standard uses traditional RBDs, which are then solved with Markov models. These probabilistic models calculate quantitative attributes of your system that could otherwise be determined only empirically, and only if a large volume of system-specific data were available. We will work with your Hardware and Software Engineering Design Teams to perform this analysis. The end result is an in-depth report estimating the reliability and availability of your system.
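As a rough illustration of the two modeling styles, the sketch below combines block availabilities the way an RBD does (series and parallel) and solves a very small Markov model for a redundant pair. It is a simplified example under stated assumptions, not the SR-1171 method: the pair has two active units, one repair at a time, constant failure and repair rates, and perfect failover, which a real analysis would not take for granted. All names and numbers are hypothetical.

```python
from math import prod


def series(availabilities):
    """RBD series path: every block must be up."""
    return prod(availabilities)


def parallel(availabilities):
    """RBD parallel path: down only if every block is down."""
    return 1 - prod(1 - a for a in availabilities)


def redundant_pair_availability(lam, mu):
    """Steady state of a 3-state birth-death Markov chain.

    States count failed units (0, 1, 2). Failures occur at rate lam per
    working unit, repairs at rate mu, one repair at a time. The system
    is down only in state 2. Perfect failover is assumed.
    """
    w0 = 1.0                  # unnormalized weight of state 0
    w1 = w0 * (2 * lam) / mu  # balance between states 0 and 1
    w2 = w1 * lam / mu        # balance between states 1 and 2
    return (w0 + w1) / (w0 + w1 + w2)


# Hypothetical unit: 50,000-hour MTBF, 4-hour mean time to repair.
lam, mu = 1 / 50_000, 1 / 4
single = mu / (lam + mu)
pair = redundant_pair_availability(lam, mu)
print(f"single unit: {single:.6f}")
print(f"redundant pair: {pair:.10f}")
# Core system as a series of the redundant pair and a non-redundant backplane.
print(f"with 0.99999 backplane in series: {series([pair, 0.99999]):.8f}")
```

The last line shows why single points of failure dominate: once the pair is redundant, the non-redundant block in series sets the ceiling on system availability.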
FAILURE MODES EFFECTS ANALYSIS (FMEA)
Hardware designs have become increasingly complex. Without meaningful use of reliability data, the effects of failures on one's product can easily be overlooked. FMEA takes the output of one's reliability prediction and determines the criticality of each component failure mode to help assess risk. After a review with Engineering, analysis of these failure modes can lead to design changes that improve reliability, maintainability, and availability. Typically, this exercise yields the most critical failure modes in a design, and risk mitigation steps are then planned together with Engineering.
FMEA will satisfy the following objectives:
• A comprehensive identification and evaluation of SPOFs and critical component failure modes (CPU failing, memory structure, single bit errors, ECC, FPGA failure, power components, etc.)
• Determination of the priority for addressing each failure mode with respect to the system’s correct function or performance and the impact on the process concerned
• Classification of identified failure modes according to relevant characteristics, including their ease of detection, capability to be diagnosed, testability, compensating and operating provisions (repair, maintenance, logistics, etc.)
• Identification of system functional failures and estimation of measures of the severity and probability of failure
• Suggestions for design improvement to help mitigate the effects of failure
Cumulative failure rates used for criticality assessment and occurrence probabilities are based on Telcordia’s SR-332, Reliability Prediction Procedure for Electronic Equipment. This allows one consistent body of work to run from prediction, through system analysis, to FMEA.
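One way to picture how predicted failure rates feed a criticality ranking is sketched below. Each failure mode takes a share of its component's predicted FIT, and modes are ranked by that rate weighted by severity. The scoring rule, the 1 to 4 severity scale, the mode ratios, and all sample data are assumptions made for illustration; an actual FMEA would use the scales and weightings agreed with Engineering.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    component: str
    mode: str
    component_fit: float  # predicted failure rate of the component, in FIT
    mode_ratio: float     # fraction of the component's failures in this mode
    severity: int         # hypothetical scale: 1 (minor) to 4 (system outage)

    @property
    def mode_fit(self):
        """Failure rate attributed to this specific mode."""
        return self.component_fit * self.mode_ratio

    @property
    def criticality(self):
        """Assumed score: mode failure rate weighted by severity."""
        return self.mode_fit * self.severity


def rank(modes):
    """Most critical failure modes first."""
    return sorted(modes, key=lambda m: m.criticality, reverse=True)


# Hypothetical worksheet rows, for illustration only.
modes = [
    FailureMode("memory", "single-bit error", 250.0, 0.8, 1),
    FailureMode("power_supply", "loss of output", 850.0, 0.6, 4),
    FailureMode("fpga", "configuration corruption", 300.0, 0.3, 3),
    FailureMode("cpu", "processor hang", 400.0, 0.5, 4),
]

for m in rank(modes):
    print(f"{m.component:13s} {m.mode:26s} "
          f"{m.mode_fit:6.0f} FIT  score {m.criticality:7.0f}")
```

With this sample data the power supply and CPU modes come out on top, which is the shortlist that risk mitigation planning would start from.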
• Highly Accelerated Life Testing (HALT)
• Highly Accelerated Stress Screen (HASS)
• Reliability Demonstration Testing (RDT)
• Ongoing Reliability Testing (ORT)
• Failure Reporting and Corrective Action Systems (FRACAS) Development