# Appendix: Special Analysis Methods



## Grouped Data Analysis

The grouped data type in Weibull++ is used for tests where groups of units share the same time-to-failure, units are grouped together in intervals, or groups of units are suspended at the same time. However, you must be cautious when choosing a parameter estimation method, because different methods treat grouped data in different ways; Weibull++ supports these different treatments to maximize the options available to you.

### When Using Rank Regression (Least Squares)

When using grouped data, Weibull++ plots the data point corresponding to the highest rank position in each group. For example, given 3 groups of 10 units failing at 100, 200 and 300 hours respectively, the three plotted points will be the end point of each group: the 10th, the 20th and the 30th rank positions out of 30. This procedure is identical to standard procedures for using grouped data, as discussed in Kececioglu [19].

In cases where the grouped data are interval censored, it is assumed that the failures occurred at some time in the interval between the previous and current time-to-failure. In our example, this is equivalent to saying that 10 units failed in the interval between zero and 100 hours, another 10 units failed in the interval between 100 and 200 hours, and 10 more units failed in the interval from 200 to 300 hours. The rank regression analysis automatically takes this into account. If this assumption of interval failure is incorrect (i.e., 10 units failed at exactly 100 hours, 10 failed at exactly 200 hours and 10 failed at exactly 300 hours), then it is recommended that you enter the data as non-grouped when using rank regression, or select the Ungroup on Regression check box on the Analysis page of the folio's control panel.

### The Mathematics

Median ranks are used to obtain an estimate of the unreliability, $Q(T_{j})$, for each failure at a $50\%$ confidence level. In the case of grouped data, the ranks are estimated for each group of failures instead of for each individual failure. For example, consider a group of 10 failures at 100 hours, 10 at 200 hours and 10 at 300 hours. Weibull++ estimates the median ranks ($Z$ values) by solving the cumulative binomial equation with the appropriate values for the order number and the total number of test units. For the 10 failures at 100 hours, the median rank, $Z$, is estimated by using:

$0.50=\sum_{k=j}^{N}\binom{N}{k}Z^{k}\left( 1-Z \right)^{N-k}$

with:

$N=30,\text{ }j=10$

One $Z$ is obtained for the group, representing the probability of 10 failures occurring out of 30.

For the 10 failures at 200 hours (20 cumulative failures), $Z$ is estimated by using:

$0.50=\sum_{k=j}^{N}\binom{N}{k}Z^{k}\left( 1-Z \right)^{N-k}$

where:

$N=30,\text{ }j=20$

This represents the probability of 20 failures out of 30.

For the 10 failures at 300 hours (30 cumulative failures), $Z$ is estimated by using:

$0.50=\sum_{k=j}^{N}\binom{N}{k}Z^{k}\left( 1-Z \right)^{N-k}$

where:

$N=30,\text{ }j=30$

This represents the probability of 30 failures out of 30.
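The median rank computation above can be sketched numerically. The following is a minimal illustration (not Weibull++'s implementation), assuming SciPy is available: it solves the cumulative binomial equation for $Z$ at each group's rank position.

```python
# Median ranks for grouped data: solve the cumulative binomial equation
#   0.50 = sum_{k=j}^{N} C(N,k) Z^k (1-Z)^(N-k)
# for Z, i.e. find Z such that P(X >= j) = 0.5 with X ~ Binomial(N, Z).
from scipy.optimize import brentq
from scipy.stats import binom

def median_rank(j, N):
    """Median rank Z for order number j out of N units (50% level)."""
    # binom.sf(j - 1, N, z) = P(X >= j); it crosses 0.5 exactly once on (0, 1)
    return brentq(lambda z: binom.sf(j - 1, N, z) - 0.5, 1e-9, 1 - 1e-9)

# The three grouped failures in the example: j = 10, 20, 30 out of N = 30
for j in (10, 20, 30):
    print(f"j = {j}: Z = {median_rank(j, 30):.4f}")
```

The results agree closely with Benard's approximation $(j-0.3)/(N+0.4)$, which is often used as a shortcut for median ranks.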

### When Using Maximum Likelihood

When using maximum likelihood methods, each individual time is explicitly used in the calculation of the parameters. Theoretically, there is no difference between entering a group of 10 units failing at 100 hours and entering 10 individual failures at 100 hours; this is inherent in the standard MLE method. In other words, no matter how the data were entered (i.e., as grouped or non-grouped), the results will be identical. However, due to numerical precision during the computation, the grouped and ungrouped data may give slightly different results. When using maximum likelihood, we highly recommend entering redundant data in groups, as this significantly speeds up the calculations.
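The equivalence can be seen directly in the log-likelihood: a group of $n$ identical failure times contributes $n \ln f(t)$, exactly as $n$ separate entries of that time do. A minimal sketch (the Weibull parameter values here are arbitrary, for illustration only):

```python
# For MLE, a group of n identical failure times contributes n * ln f(t)
# to the log-likelihood -- the same as n separate entries of that time.
# The Weibull parameters below are arbitrary illustration values.
import math

def weibull_log_pdf(t, beta, eta):
    """Log of the 2-parameter Weibull pdf."""
    return math.log(beta / eta) + (beta - 1) * math.log(t / eta) - (t / eta) ** beta

beta, eta = 1.9, 44.0
grouped = 10 * weibull_log_pdf(100.0, beta, eta)                       # one entry, n = 10
ungrouped = sum(weibull_log_pdf(100.0, beta, eta) for _ in range(10))  # ten entries
print(abs(grouped - ungrouped))   # zero up to floating-point error
```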

## ReliaSoft Ranking Method

In probability plotting or rank regression analysis of interval or left censored data, difficulties arise when attempting to estimate the exact time within the interval when the failure actually occurred, especially when the intervals overlap. The standard ranking method (SRM) is not applicable to such interval data; thus, ReliaSoft has formulated a more sophisticated methodology that allows for more accurate probability plotting and regression analysis of data sets with interval or left censored data. This method utilizes the traditional rank regression method and iteratively improves upon the computed ranks by parametrically recomputing new ranks and the most probable failure time for interval data.

In the traditional method for plotting or rank regression analysis of right censored data, the effect of the exact censoring time is not considered. One example of this can be seen in the Parameter Estimation chapter. The ReliaSoft ranking method can also be used to overcome this shortfall of the standard ranking method.

The following step-by-step example illustrates the ReliaSoft ranking method (RRM), which is an iterative improvement on the standard ranking method (SRM). Although this method is illustrated by the use of the two-parameter Weibull distribution, it can be easily generalized for other models.

Consider the following test data:

**Table B.1 - The Test Data**

| Number of Items | Type | Last Inspection | Time |
|---|---|---|---|
| 1 | Exact Failure | | 10 |
| 1 | Right Censored | | 20 |
| 2 | Left Censored | 0 | 30 |
| 2 | Exact Failure | | 40 |
| 1 | Exact Failure | | 50 |
| 1 | Right Censored | | 60 |
| 1 | Left Censored | 0 | 70 |
| 2 | Interval Failure | 20 | 80 |
| 1 | Interval Failure | 10 | 85 |
| 1 | Left Censored | 0 | 100 |

### Initial Parameter Estimation

As a preliminary step, we need to provide a crude estimate of the Weibull parameters for this data. To begin, we will extract the exact times-to-failure (10, 40, and 50) and the midpoints of the interval failures. The midpoints are 50 (for the interval of 20 to 80) and 47.5 (for the interval of 10 to 85). Next, we group together the items that have the same failure times, as shown in Table B.2.

**Table B.2 - The Union of Exact Times-to-Failure with the "Midpoint" of the Interval Failures**

| Number of Items | Type | Time |
|---|---|---|
| 1 | Exact Failure | 10 |
| 2 | Exact Failure | 40 |
| 1 | Exact Failure | 47.5 |
| 3 | Exact Failure | 50 |

Using traditional rank regression, we obtain the initial estimates:

\begin{align} \hat{\beta}_{0} &= 1.91367089 \\ \hat{\eta}_{0} &= 43.91657736 \end{align}

Step 1

For all interval failures, we obtain a weighted midpoint over the interval from the last inspection ($LI$) to the failure time ($TF$) using:

\begin{align} \hat{t}_{m}\left( \hat{\beta },\hat{\eta } \right) &= \frac{\int_{LI}^{TF}t\,f(t;\hat{\beta },\hat{\eta })\,dt}{\int_{LI}^{TF}f(t;\hat{\beta },\hat{\eta })\,dt} \\ &= \frac{\int_{LI}^{TF}t\,\tfrac{\hat{\beta }}{\hat{\eta }}\left( \tfrac{t}{\hat{\eta }} \right)^{\hat{\beta }-1}e^{-\left( t/\hat{\eta } \right)^{\hat{\beta }}}\,dt}{\int_{LI}^{TF}\tfrac{\hat{\beta }}{\hat{\eta }}\left( \tfrac{t}{\hat{\eta }} \right)^{\hat{\beta }-1}e^{-\left( t/\hat{\eta } \right)^{\hat{\beta }}}\,dt} \end{align}

This transforms our data into the format in Table B.3.

**Table B.3 - The Union of Exact Times-to-Failure with the Weighted "Midpoint" of the Interval Failures, Based upon the Parameters $\beta$ and $\eta$**

| Number of Items | Type | Last Inspection | Time | Weighted "Midpoint" |
|---|---|---|---|---|
| 1 | Exact Failure | | 10 | |
| 2 | Exact Failure | | 40 | |
| 1 | Exact Failure | | 50 | |
| 2 | Interval Failure | 20 | 80 | 42.837 |
| 1 | Interval Failure | 10 | 85 | 39.169 |
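As a numerical check of the weighted midpoints (a sketch, not ReliaSoft's implementation), the two integrals can be evaluated with SciPy under the initial estimates obtained above:

```python
# Numerical check of the weighted "midpoints" in Table B.3:
#   t_m = (integral of t*f(t) dt) / (integral of f(t) dt) over (LI, TF),
# evaluated under the initial estimates beta_0, eta_0.
import math
from scipy.integrate import quad

def weibull_pdf(t, beta, eta):
    return (beta / eta) * (t / eta) ** (beta - 1) * math.exp(-((t / eta) ** beta))

def weighted_midpoint(li, tf, beta, eta):
    num, _ = quad(lambda t: t * weibull_pdf(t, beta, eta), li, tf)
    den, _ = quad(lambda t: weibull_pdf(t, beta, eta), li, tf)
    return num / den

beta0, eta0 = 1.91367089, 43.91657736
print(weighted_midpoint(20, 80, beta0, eta0))   # ~42.837
print(weighted_midpoint(10, 85, beta0, eta0))   # ~39.169
```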

Step 2

Now we arrange the data as in Table B.4.

**Table B.4 - The Union of Exact Times-to-Failure with the Weighted "Midpoint" of the Interval Failures, in Ascending Order**

| Number of Items | Time |
|---|---|
| 1 | 10 |
| 1 | 39.169 |
| 2 | 40 |
| 2 | 42.837 |
| 1 | 50 |

Step 3

We now consider the left and right censored data, as in Table B.5.

**Table B.5 - Computation of Increments in a Matrix Format for Computing a Revised Mean Order Number**

| Number of Items | Time of Failure | 2 Left Censored, $t=30$ | 1 Left Censored, $t=70$ | 1 Left Censored, $t=100$ | 1 Right Censored, $t=20$ | 1 Right Censored, $t=60$ |
|---|---|---|---|---|---|---|
| 1 | 10 | $2\frac{\int_0^{10} f_0(t)dt}{F_0(30)-F_0(0)}$ | $\frac{\int_0^{10} f_0(t)dt}{F_0(70)-F_0(0)}$ | $\frac{\int_0^{10} f_0(t)dt}{F_0(100)-F_0(0)}$ | 0 | 0 |
| 1 | 39.169 | $2\frac{\int_{10}^{30} f_0(t)dt}{F_0(30)-F_0(0)}$ | $\frac{\int_{10}^{39.169} f_0(t)dt}{F_0(70)-F_0(0)}$ | $\frac{\int_{10}^{39.169} f_0(t)dt}{F_0(100)-F_0(0)}$ | $\frac{\int_{20}^{39.169} f_0(t)dt}{F_0(\infty)-F_0(20)}$ | 0 |
| 2 | 40 | 0 | $\frac{\int_{39.169}^{40} f_0(t)dt}{F_0(70)-F_0(0)}$ | $\frac{\int_{39.169}^{40} f_0(t)dt}{F_0(100)-F_0(0)}$ | $\frac{\int_{39.169}^{40} f_0(t)dt}{F_0(\infty)-F_0(20)}$ | 0 |
| 2 | 42.837 | 0 | $\frac{\int_{40}^{42.837} f_0(t)dt}{F_0(70)-F_0(0)}$ | $\frac{\int_{40}^{42.837} f_0(t)dt}{F_0(100)-F_0(0)}$ | $\frac{\int_{40}^{42.837} f_0(t)dt}{F_0(\infty)-F_0(20)}$ | 0 |
| 1 | 50 | 0 | $\frac{\int_{42.837}^{50} f_0(t)dt}{F_0(70)-F_0(0)}$ | $\frac{\int_{42.837}^{50} f_0(t)dt}{F_0(100)-F_0(0)}$ | $\frac{\int_{42.837}^{50} f_0(t)dt}{F_0(\infty)-F_0(20)}$ | 0 |

In general, for left censored data:

- The increment term for $n$ left censored items at time $t_{0}$, with a time-to-failure of $t_{i}$, is zero when $t_{0}\le t_{i-1}$.
- When $t_{0}>t_{i-1}$, the contribution is:

$\frac{n}{F_{0}(t_{0})-F_{0}(0)}\int_{t_{i-1}}^{MIN(t_{i},t_{0})}f_{0}\left( t \right)dt$

or:

$n\frac{F_{0}(MIN(t_{i},t_{0}))-F_{0}(t_{i-1})}{F_{0}(t_{0})-F_{0}(0)}$

where $t_{i-1}$ is the time-to-failure previous to the time-to-failure $t_{i}$, and $n$ is the number of units associated with that time-to-failure (or units in the group).

In general, for right censored data:

- The increment term for $n$ right censored items at time $t_{0}$, with a time-to-failure of $t_{i}$, is zero when $t_{0}\ge t_{i}$.
- When $t_{0}<t_{i}$, the contribution is:

$\frac{n}{F_{0}(\infty )-F_{0}(t_{0})}\int_{MAX(t_{0},t_{i-1})}^{t_{i}}f_{0}\left( t \right)dt$

or:

$n\frac{F_{0}(t_{i})-F_{0}(MAX(t_{0},t_{i-1}))}{F_{0}(\infty )-F_{0}(t_{0})}$

where $t_{i-1}$ is the time-to-failure previous to the time-to-failure $t_{i}$, and $n$ is the number of units associated with that time-to-failure (or units in the group).
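The increment formulas above can be sketched in code. Assuming $F_0$ is the Weibull CDF under the initial estimates (with $F_0(\infty)=1$), the first row of the increment matrix can be reproduced (a sketch, not ReliaSoft's implementation):

```python
# Increment contributions of censored items, per the formulas above.
# F0 is the Weibull CDF under the initial estimates (F0(inf) = 1).
import math

def F0(t, beta=1.91367089, eta=43.91657736):
    return 1.0 - math.exp(-((t / eta) ** beta)) if t > 0 else 0.0

def left_increment(n, t0, ti, ti_prev):
    """n items left censored at t0, against failure time ti (previous ti_prev)."""
    if t0 <= ti_prev:
        return 0.0
    return n * (F0(min(ti, t0)) - F0(ti_prev)) / (F0(t0) - F0(0))

def right_increment(n, t0, ti, ti_prev):
    """n items right censored at t0, against failure time ti (previous ti_prev)."""
    if t0 >= ti:
        return 0.0
    return n * (F0(ti) - F0(max(t0, ti_prev))) / (1.0 - F0(t0))

# First row of Table B.5: failure time ti = 10, previous time 0
row1 = (left_increment(2, 30, 10, 0) + left_increment(1, 70, 10, 0)
        + left_increment(1, 100, 10, 0) + right_increment(1, 20, 10, 0)
        + right_increment(1, 60, 10, 0))
print(row1)   # ~0.419411, the first row sum in Table B.6
```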

Step 4

Sum up the increments (horizontally in rows), as in Table B.6.

**Table B.6 - Increments Solved Numerically, Along with the Sum of Each Row**

| Number of Items | Time of Failure | 2 Left Censored, $t=30$ | 1 Left Censored, $t=70$ | 1 Left Censored, $t=100$ | 1 Right Censored, $t=20$ | 1 Right Censored, $t=60$ | Sum of Row (Increment) |
|---|---|---|---|---|---|---|---|
| 1 | 10 | 0.299065 | 0.062673 | 0.057673 | 0 | 0 | 0.419411 |
| 1 | 39.169 | 1.700935 | 0.542213 | 0.498959 | 0.440887 | 0 | 3.182994 |
| 2 | 40 | 0 | 0.015892 | 0.014625 | 0.018113 | 0 | 0.048630 |
| 2 | 42.837 | 0 | 0.052486 | 0.048299 | 0.059821 | 0 | 0.160606 |
| 1 | 50 | 0 | 0.118151 | 0.108726 | 0.134663 | 0 | 0.361540 |

Step 5

Compute new mean order numbers (MON), as shown in Table B.7, utilizing the increments obtained in Table B.6: each MON is the number of items plus the previous MON plus the current increment.

**Table B.7 - Mean Order Numbers (MON)**

| Number of Items | Time of Failure | Sum of Row (Increment) | Mean Order Number |
|---|---|---|---|
| 1 | 10 | 0.419411 | 1.419411 |
| 1 | 39.169 | 3.182994 | 5.602405 |
| 2 | 40 | 0.048630 | 7.651035 |
| 2 | 42.837 | 0.160606 | 9.811641 |
| 1 | 50 | 0.361540 | 11.173181 |
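The recursion in this step is simply $MON_i = n_i + MON_{i-1} + \text{increment}_i$; a minimal sketch using the item counts and row sums from Table B.6:

```python
# Mean order numbers: MON_i = n_i + MON_{i-1} + increment_i,
# using the number of items and row sums from Table B.6.
items      = [1, 1, 2, 2, 1]
increments = [0.419411, 3.182994, 0.048630, 0.160606, 0.361540]

mon, prev = [], 0.0
for n, inc in zip(items, increments):
    prev = n + prev + inc
    mon.append(prev)
print([round(m, 6) for m in mon])   # matches the MON column of Table B.7
```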

Step 6

Compute the median ranks based on these new MONs as shown in Table B.8.

**Table B.8 - Mean Order Numbers with Their Ranks for a Sample Size of 13 Units**

| Time | MON | Rank |
|---|---|---|
| 10 | 1.419411 | 0.0825889 |
| 39.169 | 5.602405 | 0.3952894 |
| 40 | 7.651035 | 0.5487781 |
| 42.837 | 9.811641 | 0.7106217 |
| 50 | 11.173181 | 0.8124983 |
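For non-integer order numbers, the cumulative binomial equation is generalized through the regularized incomplete beta function, so the median rank for order number $j$ out of $N$ is the median of a Beta($j$, $N-j+1$) distribution. A sketch using SciPy (assuming this standard generalization, which closely reproduces the ranks in Table B.8):

```python
# Median ranks for non-integer order numbers: the cumulative binomial
# generalizes via the regularized incomplete beta function, so the median
# rank for order number j out of N is the median of Beta(j, N - j + 1).
from scipy.stats import beta as beta_dist

N = 13
for mon in (1.419411, 5.602405, 7.651035, 9.811641, 11.173181):
    rank = beta_dist.ppf(0.5, mon, N - mon + 1)
    print(f"MON {mon:9.6f}: rank = {rank:.7f}")
```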

Step 7

Compute new $\beta$ and $\eta$ using standard rank regression, based upon the data shown in Table B.9.

**Table B.9 - Times and Ranks Used for the Rank Regression**

| Time | Rank |
|---|---|
| 10 | 0.0825889 |
| 39.169 | 0.3952894 |
| 40 | 0.5487781 |
| 42.837 | 0.7106217 |
| 50 | 0.8124983 |
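This step can be sketched as an ordinary least squares fit in the transformed coordinates. For the two-parameter Weibull, $\ln t = \ln\eta + (1/\beta)\ln(-\ln(1-F))$, so rank regression on X regresses $\ln t$ on $\ln(-\ln(1-\text{rank}))$; then $\beta = 1/b$ and $\eta = e^{a}$. Using the times and ranks above (a sketch, not Weibull++'s implementation):

```python
# Rank regression on X for the 2-parameter Weibull: since
#   ln t = ln(eta) + (1/beta) * ln(-ln(1 - F)),
# regress x = ln(t) on y = ln(-ln(1 - rank)); then beta = 1/b, eta = exp(a).
import math

times = [10, 39.169, 40, 42.837, 50]
ranks = [0.0825889, 0.3952894, 0.5487781, 0.7106217, 0.8124983]

xs = [math.log(t) for t in times]
ys = [math.log(-math.log(1.0 - r)) for r in ranks]
n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
b = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
     / sum((y - ybar) ** 2 for y in ys))
a = xbar - b * ybar

beta_hat, eta_hat = 1.0 / b, math.exp(a)
print(beta_hat, eta_hat)   # ~1.8456, ~42.576 (iteration 1 in the Results below)
```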

Step 8

Return and repeat the process from Step 1 until acceptable convergence is reached on the parameters (i.e., the parameter values stabilize).

### Results

The results of the first five iterations are shown in Table B.10.

**Table B.10 - The Parameters After the First Five Iterations**

| Iteration | $\beta$ | $\eta$ |
|---|---|---|
| 1 | 1.845638 | 42.576422 |
| 2 | 1.830621 | 42.039743 |
| 3 | 1.828010 | 41.830615 |
| 4 | 1.828030 | 41.749708 |
| 5 | 1.828383 | 41.717990 |

Upon convergence, Weibull++ with rank regression on X yields:

$\hat{\beta}_{RRX}=1.82890,\text{ }\hat{\eta}_{RRX}=41.69774$

The direct MLE solution yields:

${{\widehat{\beta }}_{MLE}}=2.10432,\text{ }{{\widehat{\eta }}_{MLE}}=42.31535\,\!$