# Reliability DOE for Life Tests

Reliability analysis is commonly thought of as an approach to model failures of existing products. The usual reliability analysis characterizes product failures using distributions such as the exponential, Weibull and lognormal. Based on the fitted distribution, failures are mitigated, warranty returns are predicted, or maintenance actions are planned. However, by adopting the methodology of Design for Reliability (DFR), reliability analysis can also be used as a powerful tool to design robust products that operate with minimal failures. In DFR, reliability analysis is carried out in conjunction with physics of failure and experiment design techniques. Under this approach, Design of Experiments (DOE) uses life data to "build" reliability into the products, not just quantify the existing reliability. Such an approach, if properly implemented, can result in significant cost savings, especially in terms of fewer warranty returns or repair and maintenance actions. Although DOE techniques can be used both to improve product reliability and to make this reliability robust to noise factors, the discussion in this chapter is focused on reliability improvement. The robust parameter design method discussed in Robust Parameter Design can be used to produce robust and reliable products.

# Reliability DOE Analysis

Reliability DOE (R-DOE) analysis is fairly similar to the analysis of other designed experiments, except that the response is the life of the product in the respective units (e.g., miles for an automobile component, cycles for a mechanical component, and months or years for a pharmaceutical product). However, two important differences make R-DOE analysis unique. The first is that life data of most products are typically well modeled by the lognormal, Weibull or exponential distribution, but usually do not follow the normal distribution. Traditional DOE techniques assume that response values at any treatment level follow the normal distribution and that the error terms, $\epsilon$, are therefore normally and independently distributed. This assumption may not be valid for the response data used in most R-DOE analyses. The second is that the life data obtained may be either complete or censored, and when censored observations are present the standard regression techniques applied to the response data in traditional DOEs can no longer be used.

Design parameters, manufacturing process settings, and use stresses affecting the life of the product can be investigated using R-DOE analysis. In this case, the primary purpose of any R-DOE analysis is to identify which of the inputs affect the life of the product (by investigating if change in the level of any input factors leads to a significant change in the life of the product). For example, once the important stresses affecting the life of the product have been identified, detailed analyses can be carried out using ReliaSoft's ALTA software. ALTA includes a number of life-stress relationships (LSRs) to model the relation between life and the stress affecting the life of the product.

# R-DOE Analysis of Lognormally Distributed Data

Assume that the life, $T$, for a certain product has been found to be lognormally distributed. The probability density function for the lognormal distribution is:

$$f(T)=\frac{1}{T\sigma'\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln(T)-\mu'}{\sigma'}\right)^{2}\right]$$

where $\mu'$ represents the mean of the natural logarithm of the times-to-failure and $\sigma'$ represents the standard deviation of the natural logarithms of the times-to-failure [Meeker and Escobar 1998, Wu 2000, ReliaSoft 2007b]. If the analyst wants to investigate a single two level factor that may affect the life, $T$, then the following model may be used:

$$T_i=\mu_i+\xi_i$$

where:

- $T_i$ represents the times-to-failure at the $i$th treatment level of the factor
- $\mu_i$ represents the mean value of $T_i$ for the $i$th treatment
- $\xi_i$ is the random error term
- The subscript $i$ represents the treatment level of the factor, with $i=1,2$ for a two level factor

The model of the equation shown above is analogous to the ANOVA model, $Y_i=\mu_i+\epsilon_i$, used in the One Factor Designs and General Full Factorial Designs chapters for traditional DOE analyses. Note, however, that the random error term, $\xi_i$, is not normally distributed here because the response, $T_i$, is lognormally distributed. It is known that the logarithmic value of a lognormally distributed random variable follows the normal distribution. Therefore, if the logarithmic transformation of $T_i$, $\ln(T_i)$, is used in the above equation, then the model will be identical to the ANOVA model, $Y_i=\mu_i+\epsilon_i$, used in the other chapters. Thus, using the logarithmic failure times, the model can be written as:

$$\ln(T_i)=\mu'_i+\epsilon_i$$

where:

- $\ln(T_i)$ represents the logarithmic times-to-failure at the $i$th treatment
- $\mu'_i$ represents the mean of the natural logarithm of the times-to-failure at the $i$th treatment
- $\sigma'$ represents the standard deviation of the natural logarithms of the times-to-failure (that is, the standard deviation of the error term, $\epsilon_i$)

The random error term, $\epsilon_i$, is normally distributed because the response, $\ln(T_i)$, is normally distributed. Since the model of the equation given above is identical to the ANOVA model used in traditional DOE analysis, regression techniques can be applied here and the R-DOE analysis can be carried out in the same way as traditional DOE analyses. Recall from Two Level Factorial Experiments that if the factor(s) affecting the response has only two levels, then the notation of the regression model can be applied to the ANOVA model. Therefore, the model of the above equation can be written using a single indicator variable, $x_1$, to represent the two level factor as:

$$\ln(T_i)=\beta_0+\beta_1x_{i1}+\epsilon_i$$

where $\beta_0$ is the intercept term and $\beta_1$ is the effect coefficient for the investigated factor. Setting the two equations above equal to each other returns:

$$\mu'_i=\beta_0+\beta_1x_{i1}$$

The mean of the natural logarithm of the times-to-failure at any factor level, $\mu'_i$, is referred to as the *life characteristic* because it represents a characteristic point of the underlying life distribution. The life characteristic used in the R-DOE analysis will change based on the underlying distribution assumed for the life data. If the analyst wants to investigate the effect of two factors (each at two levels) on the life of the product, then the life characteristic equation can be easily expanded as follows:

$$\mu'_i=\beta_0+\beta_1x_{i1}+\beta_2x_{i2}$$

where $\beta_2$ is the effect coefficient for the second factor and $x_2$ is the indicator variable representing the second factor. If the interaction effect is also to be investigated, then the following equation can be used:

$$\mu'_i=\beta_0+\beta_1x_{i1}+\beta_2x_{i2}+\beta_{12}x_{i1}x_{i2}$$

In general the model to investigate a given number of factors can be expressed as:

$$\mu'_i=\beta_0+\beta_1x_{i1}+\beta_2x_{i2}+\dots+\beta_{12}x_{i1}x_{i2}+\dots$$

Based on the model equations mentioned thus far, the analyst can conduct an R-DOE analysis for lognormally distributed life data using standard regression techniques, provided the data are complete. This is no longer true once the data also include censored observations. In the case of censored data, the analysis has to be carried out using maximum likelihood estimation (MLE) techniques.

### Maximum Likelihood Estimation for the Lognormal Distribution

The maximum likelihood estimation method can be used to estimate parameters in R-DOE analyses when censored data are present. The likelihood function is calculated for each observed time to failure, $t_i$, and the parameters of the model are obtained by maximizing the log-likelihood function. The likelihood function for complete data following the lognormal distribution is given as:

$$L_{failures}=\prod_{i=1}^{F_e}\frac{1}{\sigma' t_i}\,\phi\left[\frac{\ln(t_i)-\mu'_i}{\sigma'}\right]$$

where:

- $F_e$ is the total number of observed times-to-failure
- $\mu'_i$ is the life characteristic
- $t_i$ is the time of the $i$th failure
- $\phi$ is the standard normal probability density function

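For right censored data the likelihood function [Meeker and Escobar 1998, Wu 2000, ReliaSoft 2007b] is:

$$L_{suspensions}=\prod_{i=1}^{S_e}\left[1-\Phi\left(\frac{\ln(t_i)-\mu'_i}{\sigma'}\right)\right]$$

where:

- $S_e$ is the total number of observed suspensions
- $t_i$ is the time of the $i$th suspension
- $\Phi$ is the standard normal cumulative distribution function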

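For interval data the likelihood function [Meeker and Escobar 1998, Wu 2000, ReliaSoft 2007b] is:

$$L_{interval}=\prod_{i=1}^{F_I}\left[\Phi\left(\frac{\ln(t_i^2)-\mu'_i}{\sigma'}\right)-\Phi\left(\frac{\ln(t_i^1)-\mu'_i}{\sigma'}\right)\right]$$

where:

- $F_I$ is the total number of interval data
- $t_i^1$ is the beginning time of the $i$th interval
- $t_i^2$ is the end time of the $i$th interval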

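The complete likelihood function when all types of data (complete, right censored and interval) are present is:

$$L(\sigma',\beta_0,\beta_1,\dots)=L_{failures}\times L_{suspensions}\times L_{interval}$$

Then the log-likelihood function is:

$$\Lambda(\sigma',\beta_0,\beta_1,\dots)=\ln(L)$$

The MLE estimates are obtained by solving for parameters $(\sigma',\beta_0,\beta_1,\dots)$ so that:

$$\frac{\partial\Lambda}{\partial\sigma'}=0,\quad\frac{\partial\Lambda}{\partial\beta_0}=0,\quad\frac{\partial\Lambda}{\partial\beta_1}=0,\quad\dots$$

Once the estimates are obtained, the significance of any parameter, $\theta_i$, can be assessed using the likelihood ratio test.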
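The three likelihood contributions above can be combined into a single log-likelihood routine. The following is a minimal sketch, not the Weibull++ implementation: the function name `lognormal_loglik` and the tuple layout of its inputs are assumptions made for illustration, and in practice the value would be passed to a numerical optimizer to find the MLE estimates.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution (mean 0, std dev 1)


def lognormal_loglik(sigma, failures=(), suspensions=(), intervals=()):
    """Log-likelihood for lognormal life data with mixed censoring.

    failures:    iterable of (t, mu) exact failure times
    suspensions: iterable of (t, mu) right-censored times
    intervals:   iterable of (t_start, t_end, mu) interval observations
    mu is the life characteristic (mean of ln t) of the observation's treatment.
    """
    ll = 0.0
    for t, mu in failures:
        z = (math.log(t) - mu) / sigma
        ll += math.log(_STD.pdf(z) / (sigma * t))
    for t, mu in suspensions:
        z = (math.log(t) - mu) / sigma
        ll += math.log(1.0 - _STD.cdf(z))
    for t1, t2, mu in intervals:
        z1 = (math.log(t1) - mu) / sigma
        z2 = (math.log(t2) - mu) / sigma
        ll += math.log(_STD.cdf(z2) - _STD.cdf(z1))
    return ll
```

Here `mu` would be computed from the effect coefficients for each observation (for example, `b0 + b1 * x1 + b2 * x2`) before the call.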

### Hypothesis Tests

Hypothesis testing in R-DOE analyses is carried out using the likelihood ratio test. To test the significance of a factor, the corresponding effect coefficient(s), $\theta_i$, is tested. The following statements are used:

$$H_0:\theta_i=0\qquad H_1:\theta_i\neq0$$

The statistic used for the test is the likelihood ratio, $LR$. The likelihood ratio for the parameter $\theta_i$ is calculated as follows:

$$LR=-2\ln\frac{L(\hat\theta_{(-i)})}{L(\hat\theta)}$$

where:

- $\hat\theta$ is the vector of all parameter estimates obtained using MLE (i.e., $\hat\theta=[\hat\sigma'\ \hat\beta_0\ \hat\beta_1\ \dots]'$)
- $\hat\theta_{(-i)}$ is the vector of all parameter estimates excluding the estimate of $\theta_i$
- $L(\hat\theta)$ is the value of the likelihood function when all parameters are included in the model
- $L(\hat\theta_{(-i)})$ is the value of the likelihood function when all parameters except $\theta_i$ are included in the model

If the null hypothesis, $H_0$, is true then the ratio, $LR$, follows the chi-squared distribution with one degree of freedom. Therefore, $H_0$ is rejected at a significance level, $\alpha$, if $LR$ is greater than the critical value $\chi^2_{1,\alpha}$.

The likelihood ratio test can also be used to test the significance of a number of parameters, $r$, at the same time. In this case, $L(\hat\theta_{(-r)})$ represents the likelihood value for the reduced model that does not contain the $r$ parameters under test. If all $r$ parameters are insignificant, the ratio $LR$ follows the chi-squared distribution with $r$ degrees of freedom, which is the difference between the number of parameters in the full model, $k$, and the number of parameters in the reduced model, $k-r$. Thus, if $LR>\chi^2_{r,\alpha}$, the null hypothesis, $H_0$, is rejected and it can be concluded that at least one of the $r$ parameters is significant.
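The test statistic and its $p$ value are straightforward to compute once the log-likelihood values of the full and reduced models are known. The sketch below uses only the Python standard library, so it relies on the closed-form chi-squared tail probabilities for one and two degrees of freedom; the function names are illustrative assumptions, and a general implementation would call a statistics library instead.

```python
import math


def likelihood_ratio(loglik_full, loglik_reduced):
    """LR = -2 ln(L_reduced / L_full), computed from log-likelihood values."""
    return -2.0 * (loglik_reduced - loglik_full)


def chi2_p_value(lr, dof):
    """Upper-tail chi-squared probability (closed forms for dof 1 and 2 only)."""
    lr = max(lr, 0.0)  # guard against tiny negative values from rounding
    if dof == 1:
        return math.erfc(math.sqrt(lr / 2.0))
    if dof == 2:
        return math.exp(-lr / 2.0)
    raise ValueError("this sketch only handles 1 or 2 degrees of freedom")
```

A factor is declared significant at level `alpha` when `chi2_p_value(lr, dof) < alpha`, with `dof` equal to the number of parameters removed from the full model.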

#### Example

To illustrate the use of MLE in R-DOE analysis, consider the case where the life of a product is thought to be affected by two factors, $A$ and $B$. The failure of the product has been found to follow the lognormal distribution. The analyst decides to run an R-DOE analysis using a single replicate of the $2^2$ design. Previous studies indicate that the interaction between $A$ and $B$ does not affect the life of the product. The design for this experiment can be set up in a Weibull++ DOE folio as shown in the following figure.

The resulting experiment design and the corresponding times-to-failure data obtained are shown next. Note that, although the life data set contains *complete data* and regression techniques are applicable, calculations are shown using MLE. Weibull++ DOE folios use MLE for all R-DOE analysis calculations.

Because the purpose of the experiment is to study two factors without considering their interaction, the applicable model for the lognormally distributed response data is:

$$\mu'_i=\beta_0+\beta_1x_{i1}+\beta_2x_{i2}$$

where $\mu'_i$ is the mean of the natural logarithm of the times-to-failure at the $i$th treatment combination ($i=1,2,3,4$), $\beta_1$ is the effect coefficient for factor $A$ and $\beta_2$ is the effect coefficient for factor $B$. The analysis for this case is carried out in a DOE folio by excluding the interaction from the analysis.

The following hypotheses need to be tested in this example:

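1) $H_0:\beta_1=0$ and $H_1:\beta_1\neq0$

This test investigates the main effect of factor $A$. The statistic for this test is:

$$LR=-2\ln\frac{L(\hat\theta_{(-\beta_1)})}{L(\hat\theta)}$$

where $L(\hat\theta)$ represents the value of the likelihood function when all coefficients are included in the model and $L(\hat\theta_{(-\beta_1)})$ represents the value of the likelihood function when all coefficients except $\beta_1$ are included in the model.

2) $H_0:\beta_2=0$ and $H_1:\beta_2\neq0$

This test investigates the main effect of factor $B$. The statistic for this test is:

$$LR=-2\ln\frac{L(\hat\theta_{(-\beta_2)})}{L(\hat\theta)}$$

where $L(\hat\theta)$ represents the value of the likelihood function when all coefficients are included in the model and $L(\hat\theta_{(-\beta_2)})$ represents the value of the likelihood function when all coefficients except $\beta_2$ are included in the model.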

To calculate the test statistics, the maximum likelihood estimates of the parameters must be known. The estimates are obtained next.

### MLE Estimates

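Since the life data for the present experiment are complete and follow the lognormal distribution, the likelihood function can be written as:

$$L=\prod_{i=1}^{4}\frac{1}{\sigma' t_i}\,\phi\left[\frac{\ln(t_i)-\mu'_i}{\sigma'}\right]$$

Substituting $\mu'_i$ from the applicable model for the lognormally distributed response data, the likelihood function is:

$$L(\sigma',\beta_0,\beta_1,\beta_2)=\prod_{i=1}^{4}\frac{1}{\sigma' t_i}\,\phi\left[\frac{\ln(t_i)-(\beta_0+\beta_1x_{i1}+\beta_2x_{i2})}{\sigma'}\right]$$

Then the log-likelihood function is:

$$\ln L=\sum_{i=1}^{4}\ln\left[\frac{1}{\sigma' t_i\sqrt{2\pi}}\right]-\frac{1}{2}\sum_{i=1}^{4}\left[\frac{\ln(t_i)-(\beta_0+\beta_1x_{i1}+\beta_2x_{i2})}{\sigma'}\right]^{2}$$

To obtain the MLE estimates of the parameters, $\sigma'$, $\beta_0$, $\beta_1$ and $\beta_2$, the log-likelihood function must be differentiated with respect to these parameters:

$$\frac{\partial\ln L}{\partial\beta_0}=\frac{1}{\sigma'^{2}}\sum_{i=1}^{4}\left[\ln(t_i)-(\beta_0+\beta_1x_{i1}+\beta_2x_{i2})\right]$$

$$\frac{\partial\ln L}{\partial\beta_1}=\frac{1}{\sigma'^{2}}\sum_{i=1}^{4}x_{i1}\left[\ln(t_i)-(\beta_0+\beta_1x_{i1}+\beta_2x_{i2})\right]$$

$$\frac{\partial\ln L}{\partial\beta_2}=\frac{1}{\sigma'^{2}}\sum_{i=1}^{4}x_{i2}\left[\ln(t_i)-(\beta_0+\beta_1x_{i1}+\beta_2x_{i2})\right]$$

$$\frac{\partial\ln L}{\partial\sigma'}=-\frac{4}{\sigma'}+\frac{1}{\sigma'^{3}}\sum_{i=1}^{4}\left[\ln(t_i)-(\beta_0+\beta_1x_{i1}+\beta_2x_{i2})\right]^{2}$$

Equating these terms to zero returns the required estimates. The coefficients $\hat\beta_0$, $\hat\beta_1$ and $\hat\beta_2$ are obtained first as these are required to estimate $\hat\sigma'$. Setting $\partial\ln L/\partial\beta_0=0$:

$$\sum_{i=1}^{4}\left[\ln(t_i)-(\beta_0+\beta_1x_{i1}+\beta_2x_{i2})\right]=0$$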

Substituting the values of $t_i$, $x_{i1}$ and $x_{i2}$ from the example's experiment design and corresponding data and simplifying:

Thus:

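Setting $\partial\ln L/\partial\beta_1=0$:

$$\sum_{i=1}^{4}x_{i1}\left[\ln(t_i)-(\beta_0+\beta_1x_{i1}+\beta_2x_{i2})\right]=0$$

Thus:

Setting $\partial\ln L/\partial\beta_2=0$:

$$\sum_{i=1}^{4}x_{i2}\left[\ln(t_i)-(\beta_0+\beta_1x_{i1}+\beta_2x_{i2})\right]=0$$

Thus:

Knowing $\hat\beta_0$, $\hat\beta_1$ and $\hat\beta_2$, $\hat\sigma'$ can now be obtained. Setting $\partial\ln L/\partial\sigma'=0$:

$$-\frac{4}{\sigma'}+\frac{1}{\sigma'^{3}}\sum_{i=1}^{4}\left[\ln(t_i)-(\hat\beta_0+\hat\beta_1x_{i1}+\hat\beta_2x_{i2})\right]^{2}=0$$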

Thus:

Once the estimates have been calculated, the likelihood ratio test can be carried out for the two factors.
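For complete data from a balanced two level design, the score equations above have a closed-form solution, because the indicator columns are orthogonal. The sketch below assumes the factor levels are coded as -1 and +1 and uses made-up failure times purely for illustration; they are not the data of this example. Note that the MLE of the standard deviation divides by the number of observations rather than by the residual degrees of freedom.

```python
import math


def mle_lognormal_complete(design, times):
    """Closed-form MLE for ln(t) = b0 + b1*x1 + b2*x2 + error (complete data).

    Assumes a balanced, orthogonal two-factor design coded as -1/+1.
    """
    n = len(times)
    y = [math.log(t) for t in times]
    b0 = sum(y) / n
    b1 = sum(x1 * yi for (x1, _), yi in zip(design, y)) / n
    b2 = sum(x2 * yi for (_, x2), yi in zip(design, y)) / n
    resid = [yi - (b0 + b1 * x1 + b2 * x2) for (x1, x2), yi in zip(design, y)]
    sigma = math.sqrt(sum(r * r for r in resid) / n)  # divides by n, not n - p
    return b0, b1, b2, sigma


# Hypothetical single-replicate 2^2 design and failure times (hours).
design = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
times = [20.0, 25.0, 50.0, 60.0]
b0, b1, b2, sigma = mle_lognormal_complete(design, times)
```

With censored observations no such closed form exists, and the estimates have to be found by numerically maximizing the log-likelihood.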

### Likelihood Ratio Test

The likelihood ratio test for factor $A$ is conducted by using the likelihood value corresponding to the full model and the likelihood value when $A$ is not included in the model. The likelihood value corresponding to the full model (in this case $\hat\theta=[\hat\beta_0\ \hat\beta_1\ \hat\beta_2\ \hat\sigma']$) is:

The corresponding logarithmic value is .

The likelihood value for the reduced model that does not contain factor $A$ (in this case $\hat\theta_{(-\beta_1)}=[\hat\beta_0\ \hat\beta_2\ \hat\sigma']$) is:

The corresponding logarithmic value is .

Therefore, the likelihood ratio to test the significance of factor $A$ is:

$$LR=-2\ln\frac{L(\hat\theta_{(-\beta_1)})}{L(\hat\theta)}$$

The $p$ value corresponding to $LR$ is:

$$p\ value=1-P(\chi^2_1<LR)$$

Assuming that the desired significance level for the present experiment is 0.1, since $p\ value>0.1$, $H_0:\beta_1=0$ cannot be rejected and it can be concluded that factor $A$ does not affect the life of the product.

The likelihood ratio to test factor $B$ can be calculated in a similar way as shown next:

$$LR=-2\ln\frac{L(\hat\theta_{(-\beta_2)})}{L(\hat\theta)}$$

The $p$ value corresponding to $LR$ is:

$$p\ value=1-P(\chi^2_1<LR)$$

Since $p\ value<0.1$, $H_0:\beta_2=0$ is rejected and it is concluded that factor $B$ affects the life of the product. The previous calculation results are displayed as the Likelihood Ratio Test Table in the results obtained from the DOE folio as shown next.

## Fisher Matrix Bounds on Parameters

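In general, the MLE estimates of the parameters are asymptotically normal. This means that for large sample sizes the distribution of the estimates from the same population would be very close to the normal distribution [Meeker and Escobar 1998]. If $\hat\theta$ is the MLE estimate of any parameter, $\theta$, then the $100(1-\alpha)$% two-sided confidence bounds on the parameter are:

$$\hat\theta-z_{\alpha/2}\sqrt{Var(\hat\theta)}<\theta<\hat\theta+z_{\alpha/2}\sqrt{Var(\hat\theta)}$$

where $Var(\hat\theta)$ represents the variance of $\hat\theta$ and $z_{\alpha/2}$ is the critical value corresponding to a significance level of $\alpha/2$ on the standard normal distribution. The variance of the parameter, $Var(\hat\theta)$, is obtained using the Fisher information matrix. For $k$ parameters, the Fisher information matrix is obtained from the log-likelihood function, $\Lambda$, as follows:

$$F=\begin{bmatrix}-\dfrac{\partial^{2}\Lambda}{\partial\theta_1^{2}}&-\dfrac{\partial^{2}\Lambda}{\partial\theta_1\partial\theta_2}&\cdots&-\dfrac{\partial^{2}\Lambda}{\partial\theta_1\partial\theta_k}\\-\dfrac{\partial^{2}\Lambda}{\partial\theta_2\partial\theta_1}&-\dfrac{\partial^{2}\Lambda}{\partial\theta_2^{2}}&\cdots&-\dfrac{\partial^{2}\Lambda}{\partial\theta_2\partial\theta_k}\\\vdots&\vdots&\ddots&\vdots\\-\dfrac{\partial^{2}\Lambda}{\partial\theta_k\partial\theta_1}&-\dfrac{\partial^{2}\Lambda}{\partial\theta_k\partial\theta_2}&\cdots&-\dfrac{\partial^{2}\Lambda}{\partial\theta_k^{2}}\end{bmatrix}$$

The variance-covariance matrix is obtained by inverting the Fisher matrix $F$:

$$C=F^{-1}=\begin{bmatrix}Var(\hat\theta_1)&Cov(\hat\theta_1,\hat\theta_2)&\cdots&Cov(\hat\theta_1,\hat\theta_k)\\Cov(\hat\theta_2,\hat\theta_1)&Var(\hat\theta_2)&\cdots&Cov(\hat\theta_2,\hat\theta_k)\\\vdots&\vdots&\ddots&\vdots\\Cov(\hat\theta_k,\hat\theta_1)&Cov(\hat\theta_k,\hat\theta_2)&\cdots&Var(\hat\theta_k)\end{bmatrix}$$

Once the variance-covariance matrix is known, the variance of any parameter can be obtained from the diagonal elements of the matrix. Note that if a parameter, $\theta$, can take only positive values, it is assumed that $\ln(\hat\theta)$ follows the normal distribution [Meeker and Escobar 1998]. The bounds on the parameter in this case are:

$$\ln(\hat\theta)-z_{\alpha/2}\sqrt{Var(\ln\hat\theta)}<\ln(\theta)<\ln(\hat\theta)+z_{\alpha/2}\sqrt{Var(\ln\hat\theta)}$$

Using $Var(f(\hat\theta))=\left(\frac{\partial f}{\partial\theta}\right)^{2}Var(\hat\theta)$ we get $Var(\ln\hat\theta)=\frac{1}{\hat\theta^{2}}Var(\hat\theta)$. Substituting this value we have:

$$\hat\theta\cdot\exp\left[-\frac{z_{\alpha/2}\sqrt{Var(\hat\theta)}}{\hat\theta}\right]<\theta<\hat\theta\cdot\exp\left[\frac{z_{\alpha/2}\sqrt{Var(\hat\theta)}}{\hat\theta}\right]$$

Knowing $Var(\hat\theta)$ from the variance-covariance matrix, the confidence bounds on $\theta$ can then be determined.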
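The two kinds of bounds described above can be sketched as follows, given an estimate and its variance taken from the diagonal of the variance-covariance matrix. The function names are illustrative assumptions; `positive_bounds` applies the log transformation used for parameters that can only take positive values.

```python
import math
from statistics import NormalDist


def normal_bounds(estimate, variance, alpha=0.1):
    """Two-sided bounds for a parameter that may take any real value."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * math.sqrt(variance)
    return estimate - half_width, estimate + half_width


def positive_bounds(estimate, variance, alpha=0.1):
    """Two-sided bounds for a positive parameter, assuming ln(estimate) is normal."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    factor = math.exp(z * math.sqrt(variance) / estimate)
    return estimate / factor, estimate * factor
```

The log-based bounds are asymmetric around the estimate and can never fall below zero, which is why they are preferred for a standard deviation or a Weibull shape parameter.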

Continuing with the present example, the confidence bounds on the MLE estimates of the parameters $\beta_0$, $\beta_1$, $\beta_2$ and $\sigma'$ can now be obtained. The Fisher information matrix for the example is:

The variance-covariance matrix can be obtained by taking the inverse of the Fisher matrix $F$:

$$C=F^{-1}$$

Inverting $F$ returns the following matrix:

Therefore, the variances of the parameter estimates are:

Knowing the variance, the confidence bounds on the parameters can be calculated. For example, the 90% bounds () on can be calculated as shown next:

The 90% bounds on $\sigma'$ are (considering that $\sigma'$ can only take positive values):

The standard error for the parameters can be obtained by taking the positive square root of the variance. For example, the standard error for is:

The statistic for is:

The $p$ value corresponding to this statistic based on the standard normal distribution is:

The previous calculation results are displayed as MLE Information in the results obtained from the DOE folio as shown next.

In the figure, the Effect corresponding to each factor is simply twice the MLE estimate of the coefficient for that factor. Generally, the $p$ value corresponding to any coefficient in the MLE Information table should be close to the value obtained from the likelihood ratio test (displayed in the Likelihood Ratio Test table of the results). If the sample size is not large enough, as in the case of the present example, a difference may be seen between the two values. In such cases, the $p$ value from the likelihood ratio test should be given preference. For the present example, the $p$ value of 0.8318 for $\beta_1$, obtained from the likelihood ratio test, would be preferred to the $p$ value of 0.8313 displayed under MLE Information. For details see [Meeker and Escobar 1998].
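The quantities in the MLE Information table follow directly from an estimate and its standard error. The sketch below is an assumption about how such a table could be computed, not a description of the DOE folio's internals; it treats the $z$ statistic as the estimate divided by its standard error and reports a two-sided $p$ value.

```python
from statistics import NormalDist


def wald_test(estimate, std_error):
    """Return the z statistic and its two-sided p value on the standard normal."""
    z = estimate / std_error
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


def effect(coefficient):
    """Effect of a two level factor coded as -1/+1: twice its coefficient."""
    return 2.0 * coefficient
```

Because this $p$ value relies on the large-sample normal approximation, it can differ slightly from the likelihood ratio $p$ value when the sample is small.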

# R-DOE Analysis of Data Following the Weibull Distribution

The probability density function for the 2-parameter Weibull distribution is:

$$f(T)=\frac{\beta}{\eta}\left(\frac{T}{\eta}\right)^{\beta-1}\exp\left[-\left(\frac{T}{\eta}\right)^{\beta}\right]$$

where $\eta$ is the scale parameter of the Weibull distribution and $\beta$ is the shape parameter [Meeker and Escobar 1998, ReliaSoft 2007b]. To distinguish the Weibull shape parameter from the effect coefficients, the shape parameter is represented as $Beta$ instead of $\beta$ in the remainder of the chapter.

For data following the 2-parameter Weibull distribution, the life characteristic used in R-DOE analysis is the scale parameter, $\eta$ [ReliaSoft 2007a, Wu 2000]. Since $\eta$ represents life data that cannot take negative values, a logarithmic transformation is applied to it. The resulting model used in the R-DOE analysis for a two factor experiment with each factor at two levels can be written as follows:

$$\ln(\eta_i)=\beta_0+\beta_1x_{i1}+\beta_2x_{i2}+\beta_{12}x_{i1}x_{i2}$$

where:

- $\eta_i$ is the value of the scale parameter at the $i$th treatment combination of the two factors
- $x_1$ is the indicator variable representing the level of the first factor
- $x_2$ is the indicator variable representing the level of the second factor
- $\beta_0$ is the intercept term
- $\beta_1$ and $\beta_2$ are the effect coefficients for the two factors
- $\beta_{12}$ is the effect coefficient for the interaction of the two factors

The model can be easily expanded to include other factors and their interactions. Note that when any data follows the Weibull distribution, the logarithmic transformation of the data follows the extreme-value distribution, whose probability density function is given as follows:

$$f(\ln(T))=\frac{1}{\sigma}\exp\left[\frac{\ln(T)-\mu}{\sigma}-\exp\left(\frac{\ln(T)-\mu}{\sigma}\right)\right]$$

where the $T$s follow the Weibull distribution, $\mu$ is the location parameter of the extreme-value distribution and $\sigma$ is the scale parameter of the extreme-value distribution (with $\mu=\ln(\eta)$ and $\sigma$ equal to the reciprocal of the Weibull shape parameter). The two equations given above show that for R-DOE analysis of life data that follows the Weibull distribution, the random error terms, $\epsilon_i$, will follow the extreme-value distribution (and not the normal distribution). Hence, regression techniques are not applicable even if the data is complete. Therefore, maximum likelihood estimation has to be used.

### Maximum Likelihood Estimation for the Weibull Distribution

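The likelihood function for complete data in R-DOE analysis of Weibull distributed life data is:

$$L_{failures}=\prod_{i=1}^{F_e}\frac{Beta}{\eta_i}\left(\frac{t_i}{\eta_i}\right)^{Beta-1}\exp\left[-\left(\frac{t_i}{\eta_i}\right)^{Beta}\right]$$

where:

- $F_e$ is the total number of observed times-to-failure
- $\eta_i$ is the life characteristic at the $i$th treatment
- $t_i$ is the time of the $i$th failure

For right censored data, the likelihood function is:

$$L_{suspensions}=\prod_{i=1}^{S_e}\exp\left[-\left(\frac{t_i}{\eta_i}\right)^{Beta}\right]$$

where:

- $S_e$ is the total number of observed suspensions
- $t_i$ is the time of the $i$th suspension

For interval data, the likelihood function is:

$$L_{interval}=\prod_{i=1}^{F_I}\left\{\exp\left[-\left(\frac{t_i^1}{\eta_i}\right)^{Beta}\right]-\exp\left[-\left(\frac{t_i^2}{\eta_i}\right)^{Beta}\right]\right\}$$

where:

- $F_I$ is the total number of interval data
- $t_i^1$ is the beginning time of the $i$th interval
- $t_i^2$ is the end time of the $i$th interval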

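In each of the likelihood functions, $\eta_i$ is substituted based on the equation for $\ln(\eta_i)$ as:

$$\eta_i=\exp(\beta_0+\beta_1x_{i1}+\beta_2x_{i2}+\dots)$$

The complete likelihood function when all types of data (complete, right censored and interval) are present is:

$$L(Beta,\beta_0,\beta_1,\dots)=L_{failures}\times L_{suspensions}\times L_{interval}$$

Then the log-likelihood function is:

$$\Lambda(Beta,\beta_0,\beta_1,\dots)=\ln(L)$$

The MLE estimates are obtained by solving for parameters $(Beta,\beta_0,\beta_1,\dots)$ so that:

$$\frac{\partial\Lambda}{\partial Beta}=0,\quad\frac{\partial\Lambda}{\partial\beta_0}=0,\quad\frac{\partial\Lambda}{\partial\beta_1}=0,\quad\dots$$

Once the estimates are obtained, the significance of any parameter, $\theta_i$, can be assessed using the likelihood ratio test. Other results can also be obtained as discussed in Maximum Likelihood Estimation for the Lognormal Distribution and Fisher Matrix Bounds on Parameters.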
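As with the lognormal case, the Weibull likelihood pieces can be collected into one log-likelihood routine that a numerical optimizer would then maximize. This is a minimal sketch under assumed names and input layout (`weibull_loglik`, observations paired with their indicator values), with the life characteristic computed from the log-linear model.

```python
import math


def weibull_loglik(shape, coeffs, failures=(), suspensions=(), intervals=()):
    """Weibull log-likelihood with ln(eta) = coeffs[0] + sum(coeffs[j+1] * x[j]).

    failures / suspensions: iterables of (t, x)
    intervals:              iterables of (t_start, t_end, x)
    x is the tuple of indicator-variable values for that observation.
    """
    def eta(x):
        return math.exp(coeffs[0] + sum(b * xj for b, xj in zip(coeffs[1:], x)))

    ll = 0.0
    for t, x in failures:
        e = eta(x)
        ll += math.log(shape / e) + (shape - 1.0) * math.log(t / e) - (t / e) ** shape
    for t, x in suspensions:
        ll -= (t / eta(x)) ** shape
    for t1, t2, x in intervals:
        e = eta(x)
        ll += math.log(math.exp(-((t1 / e) ** shape)) - math.exp(-((t2 / e) ** shape)))
    return ll
```

Unlike the complete-data lognormal case, there is no closed-form solution here, so the estimates always come from an iterative search.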

# R-DOE Analysis of Data Following the Exponential Distribution

The exponential distribution is a special case of the Weibull distribution when the shape parameter is equal to 1. Substituting $Beta=1$ in the probability density function for the 2-parameter Weibull distribution gives:

$$f(T)=\frac{1}{\eta}\exp\left(-\frac{T}{\eta}\right)=\lambda\exp(-\lambda T)$$

where $1/\eta$ of the *pdf* has been replaced by $\lambda$. Parameter $\lambda$ is called the failure rate [ReliaSoft 2007a]. Hence, R-DOE analysis for exponentially distributed data can be carried out by substituting $Beta=1$ and replacing $1/\eta$ by $\lambda$ in the Weibull distribution.
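The substitution can be checked numerically. The short sketch below, with illustrative function names, evaluates both density functions and shows that they agree when the shape parameter is 1 and the failure rate is the reciprocal of the scale parameter.

```python
import math


def weibull_pdf(t, eta, shape):
    """2-parameter Weibull probability density."""
    return (shape / eta) * (t / eta) ** (shape - 1.0) * math.exp(-((t / eta) ** shape))


def exponential_pdf(t, failure_rate):
    """Exponential probability density with constant failure rate."""
    return failure_rate * math.exp(-failure_rate * t)
```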

# Model Diagnostics

Residual plots can be used to check if the model obtained, based on the MLE estimates, is a good fit to the data. Weibull++ DOE folios use standardized residuals for R-DOE analyses. If the data follows the lognormal distribution, then standardized residuals are calculated using the following equation:

$$\hat e_i=\frac{\ln(t_i)-\hat\mu'_i}{\hat\sigma'}$$

For the probability plot, the standardized residuals are displayed on a normal probability plot. This is because under the assumed model for the lognormal distribution, the standardized residuals should follow a normal distribution with a mean of 0 and a standard deviation of 1.

For data that follows the Weibull distribution, the standardized residuals are calculated as shown next:

$$\hat e_i=\widehat{Beta}\left[\ln(t_i)-\ln(\hat\eta_i)\right]$$

The probability plot, in this case, is used to check if the residuals follow the extreme-value distribution with a location parameter of 0 and a scale parameter of 1. Note that in all residual plots, when an observation, $t_i$, is censored the corresponding residual is also censored.
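Both kinds of standardized residual are simple transformations of the observed time and the fitted parameters. The following sketch uses assumed function names and expects the fitted values for the observation's treatment to be supplied by the caller.

```python
import math


def lognormal_std_residual(t, mu, sigma):
    """(ln t - mu) / sigma, with mu and sigma the fitted log-mean and log-std."""
    return (math.log(t) - mu) / sigma


def weibull_std_residual(t, eta, shape):
    """shape * (ln t - ln eta), with eta and shape the fitted Weibull parameters."""
    return shape * (math.log(t) - math.log(eta))
```

For a censored observation the same formula is applied to the censoring time, and the resulting residual is flagged as censored on the plot.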

# Application Examples

## Using R-DOE to Determine the Best Factor Settings

This example illustrates the use of R-DOE analysis to design reliability into a product by determining the optimal factor settings. An experiment was carried out to investigate the effect of five factors (each at two levels) on the reliability of fluorescent lights (Taguchi, 1987, p. 930). The factors, $A$ through $E$, were studied using a $2^{5-2}$ design (with the defining relations and ) under the assumption that all interaction effects, except , can be assumed to be inactive. For each treatment, two lights were tested (two replicates) with the readings taken every two days. The experiment was run for 20 days and, if a light had not failed by the 20th day, it was assumed to be a suspension. The experimental design and the corresponding failure times are shown next.

The short experiment duration and short failure times are probably due to the lights having been tested under conditions that produced higher stress than normal use. The failure of the lights was assumed to follow the lognormal distribution.

The analysis results from the Weibull++ DOE folio for this experiment are shown next.

The results are obtained by selecting the main effects of the five factors and the interaction . The results show that factors , , and are active at a significance level of 0.1. The MLE estimates of the effect coefficients corresponding to these factors are , , and , respectively. Based on these coefficients, the best settings for these effects to improve the reliability of the fluorescent lights (by maximizing the response, which in this case is the failure time) are:

- Factor should be set at the higher level of since its coefficient is positive
- Factor should be set at the lower level of since its coefficient is negative
- Factor should be set at the higher level of since its coefficient is positive
- Factor should be set at the lower level of since its coefficient is negative

Note that, since actual factor levels are not disclosed (presumably for proprietary reasons), predictions beyond the test conditions cannot be carried out in this case.

## Using R-DOE and ALTA to Estimate B10 Life

Consider a product whose reliability is thought to be affected by eight potential factors: $A$ (temperature), $B$ (humidity), $C$ (load), $D$ (fan-speed), $E$ (voltage), $F$ (material), $G$ (vibration) and $H$ (current). Assuming that all interaction effects are absent, a $2^{8-4}$ design is used to investigate the eight factors at two levels. The generators used to obtain the design are , , and . The design and the corresponding life data obtained are shown next.

Readings for the experiment are taken every 20 hours and the test is terminated at 200 hours. The life of the product is assumed to follow the Weibull distribution.

The results from Weibull++ for this experiment are shown next.

The results show that only factors $A$ and $D$ are active at a significance level of 0.1.

Assume that, in terms of the actual units, the $-1$ level of factor $A$ corresponds to a temperature of 333 K and the $+1$ level corresponds to a temperature of 383 K. Similarly, assume that the two levels of factor $D$ are 1000 rpm and 2000 rpm respectively. From the MLE estimates of the effect coefficients it can be noted that to improve reliability (by maximizing the response) factors $A$ and $D$ should be set as follows:

- Factor $A$ should be set at the lower level of 333 K since its coefficient is negative
- Factor $D$ should be set at the higher level of 2000 rpm since its coefficient is positive

Now assume that the use conditions for the product for the significant factors, $A$ and $D$, are a temperature of 298 K and a fan-speed of 3000 rpm respectively. The analysis can be taken a step further to obtain an estimate of the reliability of the product at the use conditions using ReliaSoft's ALTA software. The data is entered into ALTA as shown next.

ALTA allows for modeling of the nature of relationship between life and stress. It is assumed that the relation between life of the product and temperature follows the Arrhenius relation while the relation between life and fan-speed follows the inverse power law relation.[ReliaSoft 2007a] Using these relations, ALTA fits the following model for the data:

Based on this model, the B10 life of the product at the use conditions is obtained as shown next. The Weibull reliability equation is:

$$R(t)=\exp\left[-\left(\frac{t}{\eta}\right)^{Beta}\right]$$

Substituting the value of $\eta$ from the ALTA model and the value of the shape parameter, $Beta$, as obtained from ALTA, the reliability equation becomes:

Finally, substituting the use conditions (Temp $=298$ K, Fan-Speed $=3000$ rpm) and the desired reliability value of 90%, the B10 life is obtained:

Therefore, at the use conditions, the B10 life of the product is 225 hours. This result and other reliability metrics can be directly obtained from ALTA.
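The last step, solving the Weibull reliability equation for the time at which reliability equals 90%, can be sketched as follows. The function `weibull_b_life` follows directly from the reliability equation. The function `arrhenius_ipl_eta` is only an assumed illustration of how an Arrhenius term for temperature and an inverse power law term for fan-speed might be combined; its form and its parameters `b`, `c` and `n` are hypothetical and are not the fitted ALTA model of this example.

```python
import math


def weibull_b_life(eta, shape, unreliability=0.10):
    """Time by which the given fraction of units is expected to have failed."""
    return eta * (-math.log(1.0 - unreliability)) ** (1.0 / shape)


def arrhenius_ipl_eta(temp_k, speed, b, c, n):
    """Hypothetical combined life-stress form: eta = c * exp(b / temp_k) / speed**n."""
    return c * math.exp(b / temp_k) / speed ** n
```

With the scale parameter evaluated at the use temperature and fan-speed, `weibull_b_life(eta, shape)` returns the B10 life in the same time units as the test data.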

# Single Factor R-DOE Analyses

Weibull++ DOE folios also allow for the analysis of single factor R-DOE experiments. This analysis is similar to the analysis of single factor designed experiments mentioned in One Factor Designs. In single factor R-DOE analysis, the focus is on discovering whether a change in the level of a factor affects reliability and how each factor level differs from the other levels. The analysis models and calculations are similar to multi-factor R-DOE analysis.

## Example

To illustrate single factor R-DOE analysis, consider the data in the table shown next, where 10 life data readings for a product are taken at each of the three levels of a certain factor, $A$.

Factor $A$ could be a stress that is thought to affect life or three different designs of the same product, or it could be the same product manufactured by three different machines or operators, etc. The goal of the experiment is to see if there is a change in life due to change in the levels of the factor. The design for this experiment is shown next.

The life of the product is assumed to follow the Weibull distribution. Therefore, the life characteristic to be used in the R-DOE analysis is the scale parameter, $\eta$. Since factor $A$ has three levels, the model for the life characteristic, $\eta$, is:

$$\ln(\eta_i)=\beta_0+\beta_1x_{i1}+\beta_2x_{i2}$$

where $\beta_0$ is the intercept, $\beta_1$ is the effect coefficient for the first level of the factor ($\beta_1$ is represented as "A[1]" in Weibull++ DOE folios) and $\beta_2$ is the effect coefficient for the second level of the factor ($\beta_2$ is represented as "A[2]" in Weibull++ DOE folios). Two indicator variables, $x_1$ and $x_2$, are used to represent the three levels of factor $A$ such that:

- $x_1=1$, $x_2=0$ for level 1
- $x_1=0$, $x_2=1$ for level 2
- $x_1=-1$, $x_2=-1$ for level 3

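The following hypothesis test needs to be carried out in this example:

$$H_0:\beta_1=\beta_2=0\qquad H_1:\beta_1\neq0\ \text{or}\ \beta_2\neq0$$

That is, the two effect coefficients of factor $A$ are tested jointly. The statistic for this test is:

$$LR=-2\ln\frac{L(\hat\theta_{(-r)})}{L(\hat\theta)}$$

where $L(\hat\theta)$ is the value of the likelihood function corresponding to the full model, and $L(\hat\theta_{(-r)})$ is the likelihood value for the reduced model without the $r=2$ coefficients under test. To calculate the statistic for this test, the MLE estimates of the parameters must be obtained.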

### MLE Estimates

Following the procedure used in the analysis of multi-factor R-DOE experiments, MLE estimates of the parameters are obtained by differentiating the log-likelihood function :

Substituting from the model for the life characteristic and setting the partial derivatives to zero, the parameter estimates are obtained as , , and . These parameters are shown in the MLE Information table in the analysis results, shown next.

### Likelihood Ratio Test

Knowing the MLE estimates, the likelihood ratio test for the significance of factor $A$ can be carried out. The likelihood value for the full model, $L(\hat\theta)$, is the value of the likelihood function corresponding to the model $\ln(\eta_i)=\beta_0+\beta_1x_{i1}+\beta_2x_{i2}$:

The likelihood value for the reduced model, $L(\hat\theta_{(-r)})$, is the value of the likelihood function corresponding to the model $\ln(\eta_i)=\beta_0$:

Then the likelihood ratio is:

$$LR=-2\ln\frac{L(\hat\theta_{(-r)})}{L(\hat\theta)}$$

If the null hypothesis, $H_0$, is true then the likelihood ratio will follow the chi-squared distribution. The number of degrees of freedom for this distribution is equal to the difference in the number of parameters between the full and the reduced model. In this case, this difference is 2. The $p$ value corresponding to the likelihood ratio on the chi-squared distribution with two degrees of freedom is:

$$p\ value=1-P(\chi^2_2<LR)$$

Assuming that the desired significance is 0.1, since $p\ value<0.1$, $H_0$ is rejected and it is concluded that, at a significance of 0.1, at least one of the parameters, $\beta_1$ or $\beta_2$, is non-zero. Therefore, factor $A$ affects the life of the product. This result is shown in the Likelihood Ratio Test table in the analysis results.

Additional results for single factor R-DOE analysis obtained from the DOE folio include information on the life characteristic and comparison of life characteristics at different levels of the factor.

### Life Characteristic Summary Results

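Results in the Life Characteristic Summary table include information about the life characteristic corresponding to each treatment level of the factor. If $\ln(\eta)$ is represented as $y$, then the model for the life characteristic can be written as:

$$y=\beta_0+\beta_1x_1+\beta_2x_2$$

The respective equations for all three treatment levels for a single replicate of the experiment can be expressed in matrix notation as:

$$y=X\beta$$

where:

$$y=\begin{bmatrix}y_1\\y_2\\y_3\end{bmatrix},\quad X=\begin{bmatrix}1&1&0\\1&0&1\\1&-1&-1\end{bmatrix},\quad\beta=\begin{bmatrix}\beta_0\\\beta_1\\\beta_2\end{bmatrix}$$

Knowing $\hat\beta_0$, $\hat\beta_1$ and $\hat\beta_2$, the predicted value of the life characteristic at any level can be obtained. For example, for the second level:

$$\hat y_2=\hat\beta_0+\hat\beta_2$$

Thus:

$$\hat\eta_2=\exp(\hat y_2)$$

The variance for the predicted values of life characteristic can be calculated using the following equation:

$$Var(\hat y)=X\,Var(\hat\beta)\,X'$$

where $Var(\hat\beta)$ is the variance-covariance matrix for $\hat\beta_0$, $\hat\beta_1$ and $\hat\beta_2$. Substituting the required values:

From the previous matrix, $Var(\hat y_2)$ is the second diagonal element. Therefore, the 90% confidence interval ($\alpha=0.1$) on $y_2$ is:

$$\hat y_2\pm z_{0.05}\sqrt{Var(\hat y_2)}$$

Since $y_2=\ln(\eta_2)$, the 90% confidence interval on $\eta_2$ is:

$$\exp\left[\hat y_2\pm z_{0.05}\sqrt{Var(\hat y_2)}\right]$$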

Results for other levels can be calculated in a similar manner and are shown next.
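The prediction and its bounds at one factor level can be sketched as below. The coefficient estimates and variance-covariance matrix are hypothetical numbers chosen for illustration, and the design row `[1, 0, 1]` for the second level assumes the indicator coding in which level 2 has $x_1=0$ and $x_2=1$.

```python
import math
from statistics import NormalDist


def predict_log_life(x_row, beta, cov, alpha=0.1):
    """Return (y_hat, var, (eta_low, eta_high)) for one row of the design matrix."""
    k = len(beta)
    y_hat = sum(x_row[j] * beta[j] for j in range(k))
    # Quadratic form x' C x gives the variance of the predicted ln(eta)
    var = sum(x_row[i] * cov[i][j] * x_row[j] for i in range(k) for j in range(k))
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half = z * math.sqrt(var)
    return y_hat, var, (math.exp(y_hat - half), math.exp(y_hat + half))


# Hypothetical estimates and variance-covariance matrix, for illustration only.
beta = [6.0, -0.5, 0.2]
cov = [[0.010, 0.000, 0.000],
       [0.000, 0.020, -0.010],
       [0.000, -0.010, 0.020]]
y2, var2, eta2_bounds = predict_log_life([1, 0, 1], beta, cov)
```

The bounds are computed on the logarithmic scale and then exponentiated, so the interval on $\eta$ is asymmetric around the predicted value.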

### Life Comparisons Results

Results under Life Comparisons include information on how life is different at a level in comparison to any other level of the factor. For example, the difference between the predicted values of life at levels 1 and 2 is (in terms of the logarithmic transformation):

$$\hat y_1-\hat y_2$$

The pooled standard error for this difference can be obtained as:

$$\sqrt{Var(\hat y_1)+Var(\hat y_2)}$$

If the covariance between $\hat y_1$ and $\hat y_2$ is taken into account, then the pooled standard error is:

$$\sqrt{Var(\hat y_1)+Var(\hat y_2)-2\,Cov(\hat y_1,\hat y_2)}$$

This is the value displayed by the Weibull++ DOE folio. Knowing the pooled standard error the confidence interval on the difference can be calculated. The 90% confidence interval on the difference in (logarithmic) life between levels 1 and 2 of factor $A$ is:

$$(\hat y_1-\hat y_2)\pm z_{0.05}\times\text{pooled standard error}$$

Since the confidence interval does not include zero it can be concluded that the two levels are significantly different at $\alpha=0.1$. Another way to test for the significance of the difference in levels is to observe the $p$ value. The $z$ statistic corresponding to this difference is:

$$z_0=\frac{\hat y_1-\hat y_2}{\text{pooled standard error}}$$
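The comparison between two levels can be sketched as a single routine. The numbers below are hypothetical and the design rows assume the indicator coding in which level 1 is `[1, 1, 0]` and level 2 is `[1, 0, 1]`; the quadratic form on the difference of the two rows automatically includes the covariance term.

```python
import math
from statistics import NormalDist


def compare_levels(x_a, x_b, beta, cov, alpha=0.1):
    """Difference in predicted ln(life) between two design rows, with SE, CI, z and p."""
    k = len(beta)
    d = [xa - xb for xa, xb in zip(x_a, x_b)]
    diff = sum(d[j] * beta[j] for j in range(k))
    # d' C d equals Var(y_a) + Var(y_b) - 2 Cov(y_a, y_b)
    se = math.sqrt(sum(d[i] * cov[i][j] * d[j] for i in range(k) for j in range(k)))
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z = diff / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return diff, se, (diff - z_crit * se, diff + z_crit * se), z, p


# Hypothetical estimates and variance-covariance matrix, for illustration only.
beta = [6.0, -0.5, 0.2]
cov = [[0.010, 0.000, 0.000],
       [0.000, 0.020, -0.010],
       [0.000, -0.010, 0.020]]
diff, se, ci, z, p = compare_levels([1, 1, 0], [1, 0, 1], beta, cov)
```

Two levels are judged different at level `alpha` when the returned interval excludes zero or, equivalently, when `p < alpha`.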