The Weibull distribution has a simple definition, is mathematically tractable and is a versatile model. It is widely used in life data analysis, particularly in reliability engineering. In addition to the analysis of fatigue data, the Weibull distribution can be applied to other engineering problems, e.g. for modeling the so-called weakest link model. This post gives an introduction to the Weibull distribution.
Defining the Weibull Distribution
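A random variable $Y$ is said to follow a Weibull distribution if $Y$ has the following density function

$$f(y) = \frac{\tau}{\lambda} \left( \frac{y}{\lambda} \right)^{\tau-1} \exp\left[ -\left( \frac{y}{\lambda} \right)^{\tau} \right], \quad y > 0$$

where $\tau > 0$ and $\lambda > 0$ are some fixed constants. The notation $\exp[x]$ refers to the exponential function $e^x$. As defined here, the Weibull distribution is a two-parameter distribution with $\tau$ being the shape parameter and $\lambda$ being the scale parameter. The following is the cumulative distribution function (CDF).

$$F(y) = 1 - \exp\left[ -\left( \frac{y}{\lambda} \right)^{\tau} \right], \quad y > 0$$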
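The density and CDF can be sketched in a few lines of Python. The function names and the parameter names `shape` (for $\tau$) and `scale` (for $\lambda$) are choices made for this illustration. The numerical integration is only a rough sanity check that the density integrates to the CDF.

```python
import math

def weibull_pdf(y, shape, scale):
    """Density (shape/scale) * (y/scale)**(shape-1) * exp(-(y/scale)**shape) for y > 0."""
    if y <= 0:
        return 0.0
    z = y / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-z ** shape)

def weibull_cdf(y, shape, scale):
    """CDF 1 - exp(-(y/scale)**shape) for y > 0."""
    if y <= 0:
        return 0.0
    return 1.0 - math.exp(-(y / scale) ** shape)

def integrate_pdf(upper, shape, scale, steps=20_000):
    """Midpoint-rule integral of the density over [0, upper]."""
    width = upper / steps
    total = sum(weibull_pdf((i + 0.5) * width, shape, scale) for i in range(steps))
    return total * width

# Illustrative parameter values (not taken from the post).
shape, scale = 2.0, 1.5
print(integrate_pdf(3.0, shape, scale), weibull_cdf(3.0, shape, scale))
```

The two printed numbers should agree to several decimal places. The check uses a shape parameter above 1 because the density is unbounded near zero when the shape parameter is below 1, which makes a simple midpoint rule inaccurate.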
Connection with the Exponential Distribution
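The Weibull distribution arises naturally from an exponential random variable, and it is often best viewed through the lens of the exponential distribution. Taking an observation from an exponential distribution and raising it to a positive power results in a Weibull observation. Specifically, the random variable $Y = X^{1/\tau}$ has the Weibull CDF $F(y) = 1 - \exp[-(y/\lambda)^{\tau}]$ if $X$ is an exponential random variable with mean $\lambda^{\tau}$. To see this, consider the following:

$$P(Y \le y) = P(X^{1/\tau} \le y) = P(X \le y^{\tau}) = 1 - \exp\left[ -\frac{y^{\tau}}{\lambda^{\tau}} \right] = 1 - \exp\left[ -\left( \frac{y}{\lambda} \right)^{\tau} \right]$$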
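The power transformation can be checked by simulation. The following is a minimal sketch using only the Python standard library. The parameter values, the seed and the helper names are arbitrary choices for illustration. Note that `random.expovariate` takes the rate (the reciprocal of the mean) as its argument.

```python
import math
import random

def sample_weibull_via_exponential(shape, scale, rng):
    """Draw X ~ exponential with mean scale**shape, then return X**(1/shape)."""
    x = rng.expovariate(1.0 / scale ** shape)  # argument is the rate = 1 / mean
    return x ** (1.0 / shape)

def empirical_cdf_at(y, samples):
    """Fraction of the samples that are at most y."""
    return sum(1 for s in samples if s <= y) / len(samples)

rng = random.Random(12345)
shape, scale = 2.0, 3.0  # illustrative values
samples = [sample_weibull_via_exponential(shape, scale, rng) for _ in range(200_000)]

for y in (1.0, 3.0, 5.0):
    theoretical = 1.0 - math.exp(-(y / scale) ** shape)
    print(y, round(empirical_cdf_at(y, samples), 4), round(theoretical, 4))
```

The empirical and theoretical CDF values should agree up to simulation noise.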
The idea of the Weibull distribution as a power of an exponential distribution simplifies certain calculations on the Weibull distribution. For example, a raw moment of the Weibull distribution is simply another raw moment of the exponential distribution. For an exponential random variable $X$ with mean $\theta$, the raw moments are:

$$E(X^k) = \Gamma(1+k) \, \theta^k, \quad k > -1$$
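For the Weibull random variable $Y$ with shape parameter $\tau$ and scale parameter $\lambda$, i.e. $Y = X^{1/\tau}$ where $X$ is the exponential random variable with mean $\lambda^{\tau}$, the following shows the mean and higher moments.

$$E(Y) = E(X^{1/\tau}) = \Gamma\left(1+\frac{1}{\tau}\right) (\lambda^{\tau})^{1/\tau} = \lambda \, \Gamma\left(1+\frac{1}{\tau}\right)$$

$$E(Y^k) = E(X^{k/\tau}) = \lambda^k \, \Gamma\left(1+\frac{k}{\tau}\right), \quad k > -\tau$$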
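With the moments established, several other distributional quantities that are based on moments can also be established. The following shows the variance, skewness and kurtosis.

$$Var(Y) = \sigma^2 = \lambda^2 \left[ \Gamma\left(1+\frac{2}{\tau}\right) - \Gamma\left(1+\frac{1}{\tau}\right)^2 \right]$$

$$\gamma_1 = \frac{E[(Y-\mu)^3]}{\sigma^3} = \frac{E(Y^3) - 3 \mu \sigma^2 - \mu^3}{\sigma^3}$$

$$\beta_2 = \frac{E[(Y-\mu)^4]}{\sigma^4} = \frac{E(Y^4) - 4 \mu E(Y^3) + 6 \mu^2 E(Y^2) - 3 \mu^4}{\sigma^4}$$

The notation $\gamma_1$ here is the skewness. The notation $\beta_2$ is the kurtosis. The excess kurtosis is $\beta_2 - 3$. In some sources, the notation $\gamma_2$ is used to denote excess kurtosis. Here $\mu$ and $\sigma^2$ are the mean and variance, respectively, and each raw moment is $E(Y^k) = \lambda^k \, \Gamma(1 + k/\tau)$.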
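The moment-based quantities are easy to evaluate numerically because `math.gamma` is available in the Python standard library. The sketch below computes the mean, variance, skewness and kurtosis from the raw moments $\lambda^k \Gamma(1 + k/\tau)$. The function names and the dictionary layout are assumptions made for this illustration.

```python
import math

def weibull_raw_moment(k, shape, scale):
    """E(Y**k) = scale**k * Gamma(1 + k/shape)."""
    return scale ** k * math.gamma(1.0 + k / shape)

def weibull_summary(shape, scale):
    """Mean, variance, skewness and kurtosis computed from the first four raw moments."""
    m1, m2, m3, m4 = (weibull_raw_moment(k, shape, scale) for k in (1, 2, 3, 4))
    variance = m2 - m1 ** 2
    sd = math.sqrt(variance)
    skewness = (m3 - 3 * m1 * variance - m1 ** 3) / sd ** 3
    kurtosis = (m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4) / variance ** 2
    return {"mean": m1, "variance": variance, "skewness": skewness, "kurtosis": kurtosis}

# Shape values 0.5, 1 and 2 with scale 1, chosen for illustration.
for shape in (0.5, 1.0, 2.0):
    print(shape, weibull_summary(shape, 1.0))
```

With the shape parameter equal to 1 the output reproduces the familiar exponential values (skewness 2 and kurtosis 9), which is a convenient sanity check.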
Another calculation that is easily accessible for the Weibull distribution is that of the percentiles. It is easy to solve for $y$ in the CDF $F(y) = 1 - \exp[-(y/\lambda)^{\tau}]$. For example, to find the median, set the CDF equal to 0.5 and solve for $y$, producing the following.

$$y = \lambda \, (\ln 2)^{1/\tau}$$
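The same algebra gives any percentile, not just the median: setting the CDF equal to $p$ and solving gives $y = \lambda \, [-\ln(1-p)]^{1/\tau}$. A minimal sketch follows. The function name and the parameter values are illustrative assumptions.

```python
import math

def weibull_quantile(p, shape, scale):
    """Solve 1 - exp(-(y/scale)**shape) = p for y, where 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)

shape, scale = 2.0, 3.0  # illustrative values
median = weibull_quantile(0.5, shape, scale)
print(median, scale * math.log(2.0) ** (1.0 / shape))

# Round trip: plugging the quantile back into the CDF recovers p.
p = 0.9
y = weibull_quantile(p, shape, scale)
print(1.0 - math.exp(-(y / scale) ** shape))
```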
Another basic and important property to examine is the failure rate. The failure rate of a distribution is the ratio of the density function to its survival function. The following is the failure rate of the Weibull distribution with shape parameter $\tau$ and scale parameter $\lambda$, where $S(t) = 1 - F(t) = \exp[-(t/\lambda)^{\tau}]$ is the survival function.

$$h(t) = \frac{f(t)}{S(t)} = \frac{\tau}{\lambda} \left( \frac{t}{\lambda} \right)^{\tau-1}, \quad t > 0$$

Suppose that the distribution in question is a lifetime distribution (time until termination or death). Then the failure rate $h(t)$ can be interpreted as the rate of failure at the next instant given that the life has survived to time $t$.
When the Parameters Vary
The discussion in the previous section might give the impression that all Weibull distributions behave in the same way regardless of the parameters. We now look at examples showing that as $\tau$ (shape parameter) and/or $\lambda$ (scale parameter) vary, the distribution exhibits markedly different behavior. Note that when $\tau = 1$, the Weibull distribution reduces to the exponential distribution.
The following diagram shows the PDFs of the Weibull distribution with $\tau = 0.5$, $\tau = 1$ and $\tau = 2$, where $\lambda = 1$ in all three cases.
Figure 1 shows the effect of the shape parameter taking on different values while keeping the scale parameter fixed. The effect is very pronounced on the skewness. All three density curves are right skewed. The PDF with $\tau = 0.5$ (the blue curve) has a very strong right skew. The PDF with $\tau = 1$ (the red curve) is exponential and has, by comparison, a much smaller skewness. The PDF with $\tau = 2$ (the green curve) looks almost symmetric, though there is a clear but small right skew. This observation is borne out by the calculation. The skewness coefficients are approximately $\gamma_1 = 6.62$ (blue curve), $\gamma_1 = 2$ (red curve) and $\gamma_1 = 0.63$ (green curve).
Another clear effect of the shape parameter is the thickness of the tail (in this case the right tail). Figure 1 suggests that the PDF with $\tau = 0.5$ (the blue curve) is higher than the other two density curves for large values of $y$ (from about $y = 3$ onward). As a result, the blue curve has more probability mass in the right tail. Thus the blue curve has a thicker tail compared to the other two PDFs. For a numerical confirmation, the following table compares the probabilities in the right tail.
It is clear from the above table that the Weibull distribution for the blue curve assigns more probability to the higher values. The mean of the distribution for the blue curve is 2. The right tail $y > 10$ (over 5 times the mean) contains 4.23% of the probability mass (a small probability for sure but not negligible). The right tail $y > 25$ (over 12.5 times the mean) still has a small probability of about 0.0067 that cannot be totally ignored. On the other hand, the Weibull distribution for the green curve has a light tail. The mean of the distribution for the green curve is about 0.89. At $y = 4$ (over 4.5 times its mean), the tail probability is already negligible at about $1.125 \times 10^{-7}$. At $y = 10$ (over 11 times its mean), the tail probability is about $3.72 \times 10^{-44}$, practically zero.
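The tail probabilities quoted above come straight from the survival function $\exp[-(y/\lambda)^{\tau}]$ with $\lambda = 1$. The sketch below prints them for the three shape parameters at the thresholds 4, 10 and 25 mentioned in the discussion. The layout of the printed comparison is an assumption and is not meant to reproduce the original table.

```python
import math

def weibull_tail(y, shape, scale=1.0):
    """Survival function P(Y > y) = exp(-(y/scale)**shape)."""
    return math.exp(-(y / scale) ** shape)

shapes = (0.5, 1.0, 2.0)        # blue, red and green curves
thresholds = (4.0, 10.0, 25.0)

print("threshold   " + "   ".join(f"shape={s:g}" for s in shapes))
for y in thresholds:
    row = "   ".join(f"{weibull_tail(y, s):.3e}" for s in shapes)
    print(f"P(Y > {y:g})   {row}")
```

For very large exponents the value may underflow to zero in floating point, which is consistent with the "practically zero" reading of the light-tailed case.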
Let’s compare the density curves in Figure 1 with their failure rates. The following figure shows the failure rates for these three Weibull distributions.
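According to the definition of the failure rate as the ratio of the density function to the survival function, the following shows the failure rate functions for the three Weibull distributions (all with $\lambda = 1$).

$$h(t) = 0.5 \, t^{-0.5} \quad (\tau = 0.5)$$

$$h(t) = 1 \quad (\tau = 1)$$

$$h(t) = 2 \, t \quad (\tau = 2)$$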
The blue curve in Figure 1 ($\tau = 0.5$) has a decreasing failure rate as shown in Figure 2. The failure rate function is constant for the case of $\tau = 1$ (the exponential case). It is an increasing function for the case of $\tau = 2$. This comparison shows that the Weibull distribution is particularly useful for engineers and researchers who study the reliability of machines and devices. If the engineers believe that the failure rate is constant, then an exponential model is appropriate. If they believe that the failure rate increases with time or age, then a Weibull distribution with shape parameter $\tau > 1$ is more appropriate. If the engineers believe that the failure rate decreases with time, then a Weibull distribution with shape parameter $0 < \tau < 1$ is more appropriate.
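The three behaviors can be seen numerically. The following sketch evaluates the closed-form failure rate $(\tau/\lambda)(t/\lambda)^{\tau-1}$ at a few time points and also computes it as the ratio of the density to the survival function. The function names and the time points are arbitrary illustrative choices.

```python
import math

def weibull_failure_rate(t, shape, scale=1.0):
    """Closed form (shape/scale) * (t/scale)**(shape - 1) for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def failure_rate_from_ratio(t, shape, scale=1.0):
    """The same quantity computed as density divided by survival function."""
    z = t / scale
    density = (shape / scale) * z ** (shape - 1) * math.exp(-z ** shape)
    survival = math.exp(-z ** shape)
    return density / survival

times = (0.5, 1.0, 2.0, 4.0)
for shape in (0.5, 1.0, 2.0):
    rates = [weibull_failure_rate(t, shape) for t in times]
    print(f"shape={shape:g}: " + ", ".join(f"{r:.4f}" for r in rates))
```

The printed rows decrease, stay constant and increase, respectively, matching the three cases described above.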
We now compare Weibull distributions with various values for $\lambda$ (scale parameter) while keeping the shape parameter $\tau$ fixed. The following shows the density curves for Weibull distributions with several values of $\lambda$ and a common value of $\tau$.
The effect of the scale parameter is to compress or stretch out the standard Weibull density curve, i.e. the Weibull distribution with $\lambda = 1$. For example, the density function for $\lambda = 2$ is obtained by stretching out the density curve for the one with $\lambda = 1$. The same overall shape is maintained while the density curve is being stretched or compressed. According to the formula $E(Y) = \lambda \, \Gamma(1 + 1/\tau)$, the mean of the transformed distribution is increased (stretching) or decreased (compressing). For example, as $\lambda$ is increased from 1 to 2, the mean has a two-fold increase. As the density curve is stretched, the resulting distribution is more spread out and the peak of the density curve decreases. The overall effect of changing the scale parameter is essentially a change in the scale in the x-axis.
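The stretching can be made concrete: the density with scale $\lambda$ at the point $y$ equals $1/\lambda$ times the standard density (scale 1) at $y/\lambda$. The sketch below checks this identity and the doubling of the mean. The shape value 2 and the evaluation points are assumptions chosen only for illustration.

```python
import math

def weibull_pdf(y, shape, scale):
    """Weibull density for y > 0."""
    z = y / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-z ** shape)

def weibull_mean(shape, scale):
    """Mean scale * Gamma(1 + 1/shape)."""
    return scale * math.gamma(1.0 + 1.0 / shape)

shape = 2.0  # illustrative fixed shape parameter
for y in (0.5, 1.0, 3.0):
    stretched = weibull_pdf(y, shape, 2.0)
    rescaled_standard = weibull_pdf(y / 2.0, shape, 1.0) / 2.0
    print(y, stretched, rescaled_standard)

# Doubling the scale parameter doubles the mean.
print(weibull_mean(shape, 2.0) / weibull_mean(shape, 1.0))
```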
The next example is a computational exercise.
The time until failure (in hours) of a semiconductor device has a Weibull distribution with shape parameter $\tau$ and scale parameter $\lambda$.
- Give the density function and the survival function.
- Determine the probability that the device will last at least 500 hours.
- Determine the probability that the device will last at least 600 hours given that it has been running for over 500 hours.
- Find the mean and standard deviation of the time until failure.
- Determine the failure rate function of the Weibull time until failure.
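To obtain the density function, the survival function and the failure rate, use the Weibull density function, the CDF and the failure rate formula given earlier.

$$f(t) = \frac{\tau}{\lambda} \left( \frac{t}{\lambda} \right)^{\tau-1} \exp\left[ -\left( \frac{t}{\lambda} \right)^{\tau} \right]$$

$$S(t) = 1 - F(t) = \exp\left[ -\left( \frac{t}{\lambda} \right)^{\tau} \right]$$

$$h(t) = \frac{f(t)}{S(t)} = \frac{\tau}{\lambda} \left( \frac{t}{\lambda} \right)^{\tau-1}$$

Note that the Weibull failure rate is the ratio of the density function to the survival function. In this case, the failure rate is an increasing function of $t$. Since $t$ is a time scale, this is a model for machines that wear out over time (see the next section).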
The probability that the device will last over 500 hours is $S(500) = \exp[-(500/\lambda)^{\tau}]$. The unconditional probability that the device will last over 600 hours is $S(600) = \exp[-(600/\lambda)^{\tau}]$. The conditional probability that the device will last more than 600 hours given that it has lasted more than 500 hours is the ratio $S(600)/S(500)$.
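To find the mean and variance, we need to evaluate the gamma function. Using Excel, we obtain the two values of the gamma function that are needed, $\Gamma(1 + 1/\tau)$ and $\Gamma(1 + 2/\tau)$.
The mean and standard deviation of the time until failure are:

$$E(Y) = \lambda \, \Gamma\left(1+\frac{1}{\tau}\right)$$

$$\sigma = \lambda \sqrt{ \Gamma\left(1+\frac{2}{\tau}\right) - \Gamma\left(1+\frac{1}{\tau}\right)^2 }$$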
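The steps of this exercise can be sketched in Python. The parameter values below (shape 2 and scale 400 hours) are hypothetical placeholders chosen only to make the sketch runnable. They are not the values of the example and should be replaced with the actual parameters. The standard library function `math.gamma` plays the role of the spreadsheet gamma function.

```python
import math

# HYPOTHETICAL parameter values, used only for illustration.
SHAPE = 2.0     # assumed shape parameter (greater than 1, so the failure rate increases)
SCALE = 400.0   # assumed scale parameter, in hours

def survival(t, shape=SHAPE, scale=SCALE):
    """P(lifetime > t) = exp(-(t/scale)**shape)."""
    return math.exp(-(t / scale) ** shape)

p_500 = survival(500.0)                 # lasts at least 500 hours
p_600 = survival(600.0)                 # lasts at least 600 hours
p_600_given_500 = p_600 / p_500         # conditional survival probability

gamma_1 = math.gamma(1.0 + 1.0 / SHAPE)
gamma_2 = math.gamma(1.0 + 2.0 / SHAPE)
mean = SCALE * gamma_1
std_dev = SCALE * math.sqrt(gamma_2 - gamma_1 ** 2)

print(f"P(T > 500) = {p_500:.4f}")
print(f"P(T > 600 | T > 500) = {p_600_given_500:.4f}")
print(f"mean = {mean:.2f} hours, standard deviation = {std_dev:.2f} hours")
```

Because the assumed shape parameter is greater than 1, the conditional probability of surviving from 500 to 600 hours is smaller than the unconditional probability of surviving the first 100 hours, which reflects wear-out.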
The Weibull Failure Rates
Looking at the failure rate function $h(t) = (\tau/\lambda)(t/\lambda)^{\tau-1}$ and at Figure 2, it is clear that when the shape parameter $0 < \tau < 1$, the failure rate decreases with time (if the distribution is a model for the time until death of a life). When the shape parameter $\tau = 1$, the failure rate is constant. When the shape parameter $\tau > 1$, the failure rate increases with time. As a result, the Weibull family of distributions has a great deal of flexibility for modeling the failures of objects (machines, devices).
When the shape parameter $0 < \tau < 1$, the failure rate decreases with time. Such a Weibull distribution is a model for infant mortality, or early-life failures. When the shape parameter $\tau = 1$ (or near 1), the failure rate is constant or near constant. The resulting Weibull distribution (an exponential model) is a model for random failures (failures that are independent of age). When the shape parameter $\tau > 1$, the failure rate increases with time. The resulting Weibull distribution is a model for wear-out failures.
In some applications, it may be necessary to model each phase of a lifetime separately, e.g. the early phase with a Weibull distribution with $\tau < 1$, the useful phase with a Weibull distribution with $\tau$ close to 1 and the wear-out phase with a Weibull distribution with $\tau > 1$. The resulting failure rate curve resembles a bathtub curve. The following is an idealized bathtub curve.
The blue part of the bathtub curve is the early phase of the lifetime, which is characterized by a decreasing failure rate. This is the early-life period in which the defective products die off and are taken out of the study. The next period is the useful-life period, the red part of the curve, in which failures are random and independent of age. In this phase, the failure rate is constant or near constant. The green part of the bathtub curve is characterized by increasing failure rates, which is the wear-out phase of the lifetime being studied.
Weakest Link Model
Another attraction of the Weibull model is that it can be used for the so-called weakest link model. Consider a machine or device that has multiple components. Suppose that the device dies or fails when any one of the components fails. The lifetime of such a machine or device is the time to the first failure. Such a lifetime model is called the weakest link model. It can be shown that under these conditions a Weibull distribution is a good model for the distribution of the lifetime of such a machine or device.
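If the times until failure of the individual components are independent and identically distributed Weibull random variables, then the minimum of $n$ of these Weibull random variables is also a Weibull random variable. To see this, let $Y_1, Y_2, \ldots, Y_n$ be independent and identically distributed Weibull random variables. Let $\tau$ and $\lambda$ be the shape and scale parameters of the common Weibull distribution. Let $Y = \min(Y_1, Y_2, \ldots, Y_n)$. The following gives the survival function of $Y$.

$$P(Y > y) = P(Y_1 > y, \ldots, Y_n > y) = \left( \exp\left[ -\left( \frac{y}{\lambda} \right)^{\tau} \right] \right)^n = \exp\left[ -n \left( \frac{y}{\lambda} \right)^{\tau} \right] = \exp\left[ -\left( \frac{y}{\lambda^*} \right)^{\tau} \right]$$

where $y > 0$ and $\lambda^* = \lambda / n^{1/\tau}$. This shows that $Y$ has a Weibull distribution with shape parameter $\tau$ (as before) and scale parameter $\lambda^*$. Under the condition that the times to failure for the multiple components are identically Weibull distributed, the lifetime of the device is also a Weibull model.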
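The weakest link result can be checked by simulation. The sketch below uses `random.weibullvariate(alpha, beta)` from the Python standard library, where `alpha` is the scale parameter and `beta` is the shape parameter. The parameter values, the number of components and the seed are illustrative assumptions.

```python
import math
import random

def simulate_system_lifetimes(n_components, shape, scale, n_systems, rng):
    """Each system fails at the first component failure (the minimum lifetime)."""
    return [
        min(rng.weibullvariate(scale, shape) for _ in range(n_components))
        for _ in range(n_systems)
    ]

rng = random.Random(2024)
shape, scale, n_components = 1.5, 2.0, 5  # illustrative values
lifetimes = simulate_system_lifetimes(n_components, shape, scale, 50_000, rng)

# The minimum should be Weibull with the same shape and scale / n**(1/shape).
scale_of_minimum = scale / n_components ** (1.0 / shape)
theoretical_mean = scale_of_minimum * math.gamma(1.0 + 1.0 / shape)
simulated_mean = sum(lifetimes) / len(lifetimes)
print(simulated_mean, theoretical_mean)

# Compare the survival probability at one point as well.
y = 0.5
simulated_tail = sum(1 for v in lifetimes if v > y) / len(lifetimes)
theoretical_tail = math.exp(-(y / scale_of_minimum) ** shape)
print(simulated_tail, theoretical_tail)
```

The simulated and theoretical values should agree up to simulation noise.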
The Moment Generating Function
The post concludes with a comment on the moment generating function for the Weibull distribution. Note that the moment formula $E(Y^k) = \lambda^k \, \Gamma(1 + k/\tau)$ indicates that all positive moments of the Weibull distribution exist. A natural question is whether the moment generating function (MGF) exists. It turns out that the MGF does not always exist for the Weibull distribution. The MGF exists whenever the shape parameter $\tau \ge 1$. However, the MGF cannot be expressed in terms of any familiar functions. Instead, the Weibull MGF can be expressed as a power series.
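To see this, start with the power series for $e^{tY}$.

$$e^{tY} = \sum_{n=0}^{\infty} \frac{(tY)^n}{n!}$$

Since the positive moments exist for the Weibull distribution, the moments $E(Y^n) = \lambda^n \, \Gamma(1 + n/\tau)$ are plugged into the power series.

$$M(t) = E(e^{tY}) = \sum_{n=0}^{\infty} \frac{t^n}{n!} E(Y^n) = \sum_{n=0}^{\infty} \frac{(t \lambda)^n}{n!} \, \Gamma\left(1+\frac{n}{\tau}\right)$$

It can be shown that the last series converges when $\tau \ge 1$ (for all $t$ when $\tau > 1$, and for $\lvert t \rvert < 1/\lambda$ when $\tau = 1$).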
When $0 < \tau < 1$, the power series does not converge for any $t > 0$. For a specific example, let $\tau = 0.5$. Then the $n$th term in the series simplifies to $(t \lambda)^n \, (2n)! / n!$, which goes to infinity as $n \rightarrow \infty$ for any $t > 0$. Thus the Weibull distribution with shape parameter $\tau = 0.5$ is an example of a distribution where all the positive moments exist but the MGF does not exist.
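The contrasting behavior of the series terms can be seen numerically. The sketch below works with the logarithm of the $n$th term, $(t\lambda)^n \, \Gamma(1 + n/\tau) / n!$, using `math.lgamma` to avoid overflow. The function name and the values of $t$ and $\lambda$ are assumptions made for illustration.

```python
import math

def log_mgf_series_term(n, t, shape, scale):
    """Natural log of (t*scale)**n * Gamma(1 + n/shape) / n!, for t > 0."""
    return n * math.log(t * scale) + math.lgamma(1.0 + n / shape) - math.lgamma(n + 1.0)

t, scale = 0.1, 1.0  # illustrative values
for shape in (0.5, 2.0):
    logs = [log_mgf_series_term(n, t, shape, scale) for n in (10, 50, 100)]
    print(f"shape={shape:g}: " + ", ".join(f"{v:.2f}" for v in logs))

# With shape 0.5 the n-th term equals (t*scale)**n * (2n)! / n!.
n = 3
direct = (t * scale) ** n * math.factorial(2 * n) / math.factorial(n)
print(math.exp(log_mgf_series_term(n, t, 0.5, scale)), direct)
```

With shape 0.5 the logged terms grow without bound even for a small value of $t$, so the series cannot converge. With shape 2 they decrease rapidly.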