The risk measures of value-at-risk and tail-value-at-risk are discussed in the preceding post. This post extends the preceding post with an algorithm for evaluating the tail-value-at-risk of a mixture distribution with discrete mixing weights.
The preceding post introduces the notions of value-at-risk (VaR) and tail-value-at-risk (TVaR). These are two particular examples of risk measures that are useful for insurance companies and other enterprises in a risk management context. For $0<p<1$, VaR at the security level $p$ gives the threshold such that the probability of a loss more adverse than the threshold is at most $1-p$. Thus in our context VaR is a percentile of the loss distribution. TVaR is a conditional expected value. At the security level $p$, TVaR is the expected value of the losses given that the losses exceed the threshold VaR.
The preceding post gives several representations of TVaR. It also gives the formula for TVaR for several distributions – exponential, Pareto, normal and lognormal. We now discuss TVaR of a mixture distribution. If a distribution is the mixture of two distributions and if each of the individual distributions has a clear formulation of TVaR, can we mix the two TVaR's? The answer is that we can provided some adjustments are made. The following gives the formula.
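Suppose that the loss $X$ is a mixture of two distributions represented by the random variables $X_1$ and $X_2$, with weights $w$ and $1-w$, respectively. Let $\pi_p$ be the $100p$th percentile of the loss $X$, i.e. $P(X \le \pi_p)=p$. Then the tail-value-at-risk at the $100p$ percent security level is:

$$\text{TVaR}_p(X) = \pi_p + \frac{1}{1-p} \Big[ w \, P(X_1 > \pi_p) \, e_{X_1}(\pi_p) + (1-w) \, P(X_2 > \pi_p) \, e_{X_2}(\pi_p) \Big] \qquad (a)$$

Here $e_{X_i}(\pi_p) = E(X_i - \pi_p \mid X_i > \pi_p)$ is the mean excess loss function of $X_i$ evaluated at $\pi_p$.

The comparison is with formula (3) in the preceding post. That formula shows that TVaR is $\pi_p + e_X(\pi_p)$. In other words, TVaR is VaR plus the mean excess loss function evaluated at $\pi_p$. The content within the square brackets in Formula (a) is a weighted average of the two individual mean excess loss functions, with each term adjusted by multiplying with the probabilities $P(X_1 > \pi_p)$ and $P(X_2 > \pi_p)$. This formula is useful if $\pi_p$ (VaR) can be calculated and if the mean excess loss functions are accessible.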
We give an example and then show the derivation of Formula (a).
The mean excess loss function of an exponential distribution is constant. Let’s consider the mixture of two exponential distributions. Suppose that losses follow a mixture of two exponential distributions where one distribution has mean 5 (75% weight) and the other has mean 10 (25% weight). Determine the VaR and TVaR at the security level 99%.
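First, calculate the 99th percentile of the mixture, which is the solution $x$ to the following equation.

$$0.75 \, e^{-x/5} + 0.25 \, e^{-x/10} = 0.01$$

By letting $y = e^{-x/10}$, we solve the following equation.

$$0.75 \, y^2 + 0.25 \, y - 0.01 = 0$$

Use the quadratic formula to solve for $y$. Then solve for $x$. The following is the 99th percentile of the loss $X$.

$$y = \frac{-0.25 + \sqrt{0.0925}}{1.5} \approx 0.036092, \qquad \pi_{0.99} = -10 \ln y \approx 33.2168$$

The following gives the TVaR.

$$\text{TVaR}_{0.99}(X) = \pi_{0.99} + \frac{1}{0.01} \Big[ 0.75 \, e^{-\pi_{0.99}/5} \, (5) + 0.25 \, e^{-\pi_{0.99}/10} \, (10) \Big] \approx 33.2168 + 9.5115 = 42.7283$$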
Note that the mean excess loss for the first exponential distribution is 5 and for the second one is 10 (the unconditional means). The survival functions $e^{-x/5}$ and $e^{-x/10}$ are also easy to evaluate. Once the percentile of the mixture is calculated, the formula is straightforward to apply. In this example, the two exponential parameters are set so that the calculation of percentiles uses the quadratic formula. If the parameters are set differently, then we can use software to evaluate the required percentile.
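The calculation in this example can be checked with a short script. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not from the original post.

```python
import math

# Mixture from the example: weight 0.75 on an exponential with mean 5,
# weight 0.25 on an exponential with mean 10.
w1, m1 = 0.75, 5.0
w2, m2 = 0.25, 10.0
p = 0.99

# With y = exp(-x/10), exp(-x/5) = y**2, so the survival equation
# w1*y**2 + w2*y = 1 - p is a quadratic in y.
a, b, c = w1, w2, -(1.0 - p)
y = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)  # positive root
var_p = -m2 * math.log(y)  # 99th percentile (VaR)

# Survival probabilities of the two components at the VaR.
s1 = math.exp(-var_p / m1)
s2 = math.exp(-var_p / m2)

# The mean excess loss of an exponential is its mean, so the bracket is
# w1*s1*m1 + w2*s2*m2.
tvar_p = var_p + (w1 * s1 * m1 + w2 * s2 * m2) / (1.0 - p)

print(round(var_p, 4), round(tvar_p, 4))
```

The script reproduces the two steps in the text: the quadratic for the percentile, then the weighted mean excess losses for TVaR.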
Deriving the formula
Suppose that $X$ is the mixture of $X_1$, with weight $w$, and $X_2$, with weight $1-w$. The density function for $X_1$ is $f_1(x)$ and the density function for $X_2$ is $f_2(x)$. The density function of $X$ is then $f(x) = w f_1(x) + (1-w) f_2(x)$. We derive from the basic definition of TVaR. Let $\pi_p$ be the $100p$th percentile of $X$.
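The derivation can be sketched as follows. This is a reconstruction under assumed notation: $X$ is the mixture with density $f = w f_1 + (1-w) f_2$, $\pi_p$ is its $100p$th percentile (so that $P(X > \pi_p) = 1-p$), and $e_{X_i}$ denotes the mean excess loss function of the component $X_i$.

```latex
\begin{aligned}
\text{TVaR}_p(X)
  &= \frac{1}{1-p} \int_{\pi_p}^\infty x \, f(x) \, dx \\
  &= \pi_p + \frac{1}{1-p} \int_{\pi_p}^\infty (x - \pi_p) \, f(x) \, dx \\
  &= \pi_p + \frac{1}{1-p} \Big[ w \int_{\pi_p}^\infty (x - \pi_p) \, f_1(x) \, dx
       + (1-w) \int_{\pi_p}^\infty (x - \pi_p) \, f_2(x) \, dx \Big] \\
  &= \pi_p + \frac{1}{1-p} \Big[ w \, P(X_1 > \pi_p) \, e_{X_1}(\pi_p)
       + (1-w) \, P(X_2 > \pi_p) \, e_{X_2}(\pi_p) \Big]
\end{aligned}
```

The second line uses $\int_{\pi_p}^\infty \pi_p \, f(x) \, dx = \pi_p (1-p)$. The last line uses $\int_{\pi_p}^\infty (x - \pi_p) f_i(x) \, dx = P(X_i > \pi_p) \, E(X_i - \pi_p \mid X_i > \pi_p)$.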
The formula derived here is for mixtures of two distributions. It is straightforward to extend it to mixtures of any finite number of distributions.
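The extension can be illustrated with a small helper. The sketch below handles a finite mixture of exponential distributions, finding the percentile by bisection and then adding the weighted mean excess losses. The function name and its arguments are hypothetical, and restricting to exponential components is an assumption made so that each mean excess loss equals the component mean.

```python
import math

def mixture_exp_var_tvar(weights, means, p, tol=1e-12):
    """VaR and TVaR at level p for a finite mixture of exponentials.

    weights: mixing probabilities (assumed to sum to 1)
    means:   means of the exponential components
    """
    def survival(x):
        return sum(w * math.exp(-x / m) for w, m in zip(weights, means))

    # Bracket the percentile, then bisect on the decreasing survival function.
    lo, hi = 0.0, max(means)
    while survival(hi) > 1.0 - p:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if survival(mid) > 1.0 - p:
            lo = mid
        else:
            hi = mid
    var_p = 0.5 * (lo + hi)

    # Each term is weight * survival probability * mean excess loss,
    # where the mean excess loss of an exponential is its mean.
    excess = sum(w * math.exp(-var_p / m) * m for w, m in zip(weights, means))
    return var_p, var_p + excess / (1.0 - p)

print(mixture_exp_var_tvar([0.75, 0.25], [5.0, 10.0], 0.99))
```

For other component distributions, the same structure applies with the component survival functions and mean excess loss functions swapped in.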
Practice problems are available in the companion blog to reinforce the concepts of value-at-risk and tail-value-at-risk. Practice Problems 10-G and 10-H in that link are for TVaR of mixtures.
The notion of mixtures is discussed in this previous post. Many probability distributions useful for actuarial modeling are mixture distributions. The previous post touches on some examples – negative binomial distribution (a Poisson-Gamma mixture), Pareto distribution (an exponential-gamma mixture) and the normal-normal mixture. In this post we present additional examples. We discuss the following examples.
- Poisson-Gamma mixture = Negative Binomial.
- Normal-Normal mixture = Normal.
- Exponential-Gamma mixture = Pareto.
- Exponential-Inverse Gamma mixture = Pareto.
- Gamma-Gamma mixture = Generalized Pareto.
- Weibull-Exponential mixture = Loglogistic.
- Gamma-Geometric mixture = Exponential.
- Normal-Gamma mixture = Student t.
The first three examples are discussed in the previous post. We discuss the remaining examples in this post.
The Pareto Family
Examples 3 and 4 show that Pareto distributions are mixtures of exponential distributions with either gamma or inverse gamma mixing weights. In Example 3, $X \mid \Theta$ is an exponential distribution with $\Theta$ being a rate parameter. When $\Theta$ follows a gamma distribution, the resulting mixture is a (Type II Lomax) Pareto distribution. In Example 4, $X \mid \Theta$ is an exponential distribution with $\Theta$ being a scale parameter. When $\Theta$ follows an inverse gamma distribution, the resulting mixture is also a (Type II Lomax) Pareto distribution.
As a mixture, Example 5 is like Example 3, except that it is a gamma-gamma mixture resulting in a generalized Pareto distribution. Example 3 has been discussed in the previous post. We now discuss Example 4 and Example 5.
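The following gives the cumulative distribution function (CDF) and survival function of the conditional random variable $X \mid \Theta$.

$$F(x \mid \theta) = 1 - e^{-x/\theta}, \qquad S(x \mid \theta) = e^{-x/\theta}, \qquad x > 0$$

The random parameter $\Theta$ follows an inverse gamma distribution with parameters $\alpha$ and $\beta$. The following is the pdf of $\Theta$:

$$g(\theta) = \frac{\beta^\alpha}{\Gamma(\alpha)} \, \theta^{-(\alpha+1)} \, e^{-\beta/\theta}, \qquad \theta > 0$$

We show that the unconditional survival function for $X$ is the survival function for the Pareto distribution with parameters $\alpha$ (shape parameter) and $\beta$ (scale parameter).

$$\begin{aligned} S(x) &= \int_0^\infty e^{-x/\theta} \, \frac{\beta^\alpha}{\Gamma(\alpha)} \, \theta^{-(\alpha+1)} \, e^{-\beta/\theta} \, d\theta \\ &= \frac{\beta^\alpha}{(x+\beta)^\alpha} \int_0^\infty \frac{(x+\beta)^\alpha}{\Gamma(\alpha)} \, \theta^{-(\alpha+1)} \, e^{-(x+\beta)/\theta} \, d\theta \\ &= \left( \frac{\beta}{x+\beta} \right)^\alpha \end{aligned}$$

Note that the integrand in the last integral is a density function for an inverse gamma distribution. Thus the integral is 1 and can be eliminated. The result that remains is the survival function for a Pareto distribution with parameters $\alpha$ and $\beta$. The following gives the CDF and density function of this Pareto distribution.

$$F(x) = 1 - \left( \frac{\beta}{x+\beta} \right)^\alpha, \qquad f(x) = \frac{\alpha \, \beta^\alpha}{(x+\beta)^{\alpha+1}}, \qquad x > 0$$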
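The mixing integral can also be checked numerically. The sketch below assumes the inverse gamma mixing weight has shape `alpha` and scale `beta` (illustrative names), integrates the conditional exponential survival function against it with a midpoint rule, and compares the result with the Pareto survival function $(\beta/(x+\beta))^\alpha$.

```python
import math

def mixture_survival(x, alpha, beta, n_steps=200_000, upper=60.0):
    """Approximate the integral of exp(-x/theta) * g(theta) over theta > 0.

    The substitution u = 1/theta turns the inverse gamma density g into a
    gamma density in u (shape alpha, rate beta), which decays quickly, so
    the integral is truncated at u = upper.
    """
    h = upper / n_steps
    const = beta ** alpha / math.gamma(alpha)
    total = 0.0
    for i in range(n_steps):
        u = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += math.exp(-x * u) * const * u ** (alpha - 1.0) * math.exp(-beta * u)
    return total * h

def pareto_survival(x, alpha, beta):
    """Survival function of the Pareto (Type II Lomax) distribution."""
    return (beta / (x + beta)) ** alpha

print(mixture_survival(1.5, 3.0, 2.0), pareto_survival(1.5, 3.0, 2.0))
```

The two printed values should agree to several decimal places for parameter values where the truncation at `upper` is harmless.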
See here for further information on the Pareto Type II Lomax distribution.
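Conditional on $\Lambda = \lambda$, the following is the density function of $X$, a gamma density with shape parameter $\tau$ and rate parameter $\lambda$.

$$f(x \mid \lambda) = \frac{\lambda^\tau}{\Gamma(\tau)} \, x^{\tau-1} \, e^{-\lambda x}, \qquad x > 0$$

The following is the density function of the random parameter $\Lambda$, a gamma density with shape parameter $\alpha$ and rate parameter $\theta$.

$$g(\lambda) = \frac{\theta^\alpha}{\Gamma(\alpha)} \, \lambda^{\alpha-1} \, e^{-\theta \lambda}, \qquad \lambda > 0$$

The following gives the unconditional density function for $X$.

$$\begin{aligned} f(x) &= \int_0^\infty \frac{\lambda^\tau}{\Gamma(\tau)} \, x^{\tau-1} \, e^{-\lambda x} \, \frac{\theta^\alpha}{\Gamma(\alpha)} \, \lambda^{\alpha-1} \, e^{-\theta \lambda} \, d\lambda \\ &= \frac{\theta^\alpha \, x^{\tau-1}}{\Gamma(\alpha) \, \Gamma(\tau)} \int_0^\infty \lambda^{\alpha+\tau-1} \, e^{-(x+\theta) \lambda} \, d\lambda \\ &= \frac{\Gamma(\alpha+\tau)}{\Gamma(\alpha) \, \Gamma(\tau)} \, \frac{\theta^\alpha \, x^{\tau-1}}{(x+\theta)^{\alpha+\tau}}, \qquad x > 0 \end{aligned}$$

Any distribution that has a density function described above is said to be a generalized Pareto distribution with the parameters $\alpha$, $\theta$ and $\tau$. Its CDF cannot be written in closed form but can be expressed using the incomplete beta function.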
The moments can be easily derived for the generalized Pareto distribution but on a limited basis. Since it is a mixture distribution, the unconditional mean is the weighted average of the conditional means.
Note that has a simple expression when .
When the shape parameter of the conditional gamma distribution is 1, the conditional distribution for $X$ is an exponential distribution. Then the situation reverts back to Example 3, leading to a Pareto distribution. Thus the Pareto distribution is a special case of the generalized Pareto distribution. Both the Pareto distribution and the generalized Pareto distribution have thicker and longer tails than the original conditional gamma distribution.
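It turns out that the F distribution is also a special case of the generalized Pareto distribution. Write the generalized Pareto density as $\frac{\Gamma(\alpha+\tau)}{\Gamma(\alpha) \Gamma(\tau)} \frac{\theta^\alpha x^{\tau-1}}{(x+\theta)^{\alpha+\tau}}$. The F distribution with $m$ and $n$ degrees of freedom is the generalized Pareto distribution with parameters $\alpha = n/2$, $\theta = n/m$ and $\tau = m/2$. As a result, the following is the density function.

$$f(x) = \frac{\Gamma\left(\frac{m+n}{2}\right)}{\Gamma\left(\frac{m}{2}\right) \Gamma\left(\frac{n}{2}\right)} \left( \frac{m}{n} \right)^{m/2} \frac{x^{m/2-1}}{\left( 1 + \frac{m}{n} x \right)^{(m+n)/2}}, \qquad x > 0$$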
Another way to generate the F distribution is from taking a ratio of two chi-squared distributions (see Theorem 9 in this previous post). Of course, there is no need to use the explicit form of the density function of the F distribution. In a statistical application, the F distribution is accessed using tables or software.
The Loglogistic Distribution
The loglogistic distribution can be derived as a mixture of Weibull distributions with exponential mixing weights.
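The following gives the conditional survival function for $X \mid \Lambda$ and the exponential mixing weight.

$$S(x \mid \lambda) = e^{-\lambda x^\gamma}, \quad x > 0, \qquad g(\lambda) = \theta^\gamma \, e^{-\theta^\gamma \lambda}, \quad \lambda > 0$$

The following gives the unconditional survival function and CDF of $X$ as well as the PDF.

$$S(x) = \int_0^\infty e^{-\lambda x^\gamma} \, \theta^\gamma \, e^{-\theta^\gamma \lambda} \, d\lambda = \frac{\theta^\gamma}{x^\gamma + \theta^\gamma} = \frac{1}{1 + (x/\theta)^\gamma}$$

$$F(x) = \frac{(x/\theta)^\gamma}{1 + (x/\theta)^\gamma}, \qquad f(x) = \frac{\gamma \, (x/\theta)^\gamma}{x \, [1 + (x/\theta)^\gamma]^2}, \qquad x > 0$$

Any distribution that has any one of the above three distributional quantities is said to be a loglogistic distribution with shape parameter $\gamma$ and scale parameter $\theta$.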
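One interesting point about the loglogistic distribution is that an inverse loglogistic distribution is another loglogistic distribution. Suppose that $X$ has a loglogistic distribution with shape parameter $\gamma$ and scale parameter $\theta$, so that its CDF is $F_X(x) = \frac{(x/\theta)^\gamma}{1 + (x/\theta)^\gamma}$. Let $Y = 1/X$. Then $Y$ has a loglogistic distribution with shape parameter $\gamma$ and scale parameter $1/\theta$.

$$S_Y(y) = P\left( \frac{1}{X} > y \right) = P\left( X < \frac{1}{y} \right) = \frac{(1/(\theta y))^\gamma}{1 + (1/(\theta y))^\gamma} = \frac{1}{1 + (\theta y)^\gamma} = \frac{1}{1 + \left( \frac{y}{1/\theta} \right)^\gamma}$$

The above is a survival function for the loglogistic distribution with the desired parameters. Thus there is no need to specially call out the inverse loglogistic distribution.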
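In order to find the mean and higher moments of the loglogistic distribution, we take the approach of identifying the conditional Weibull moments and then weighting these moments by the exponential mixing weights. Note that the parameter $\lambda$ in the conditional survival function $e^{-\lambda x^\gamma}$ is not a scale parameter. The Weibull distribution in this conditional form is equivalent to a Weibull distribution with shape parameter $\gamma$ and scale parameter $(1/\lambda)^{1/\gamma}$. According to formula (4) in this previous post, the $k$th moment of this Weibull distribution is

$$E(X^k \mid \lambda) = \lambda^{-k/\gamma} \, \Gamma\left( 1 + \frac{k}{\gamma} \right)$$

The following gives the unconditional $k$th moment of the Weibull-exponential mixture, where the mixing weight is exponential with density $\theta^\gamma e^{-\theta^\gamma \lambda}$.

$$E(X^k) = \Gamma\left( 1 + \frac{k}{\gamma} \right) \int_0^\infty \lambda^{-k/\gamma} \, \theta^\gamma \, e^{-\theta^\gamma \lambda} \, d\lambda = \theta^k \, \Gamma\left( 1 + \frac{k}{\gamma} \right) \Gamma\left( 1 - \frac{k}{\gamma} \right), \qquad -\gamma < k < \gamma$$

The range $-\gamma < k < \gamma$ follows from the fact that the arguments of the gamma function must be positive. Thus the $k$th moments of the loglogistic distribution are limited by its shape parameter $\gamma$. If $\gamma \le 1$, then $E(X)$ does not exist. For a larger $\gamma$, more moments exist, but always only a finite number of moments. This is an indication that the loglogistic distribution has a thick (right) tail. This is not surprising since mixture distributions (loglogistic in this case) tend to have thicker tails than the conditional distributions (Weibull in this case). The thicker tail is a result of the uncertainty in the random parameter in the conditional distribution (the Weibull in this case).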
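The moment restriction can be made concrete with a small function. The sketch below assumes the loglogistic moment formula $\theta^k \, \Gamma(1 + k/\gamma) \, \Gamma(1 - k/\gamma)$ for $-\gamma < k < \gamma$, with `gamma_shape` and `theta` as illustrative names for the shape and scale parameters, and refuses to return a value when the moment does not exist.

```python
import math

def loglogistic_moment(k, gamma_shape, theta):
    """k-th raw moment of a loglogistic distribution (shape, scale).

    Assumes the formula theta**k * Gamma(1 + k/shape) * Gamma(1 - k/shape),
    which is valid only for -shape < k < shape.
    """
    if not -gamma_shape < k < gamma_shape:
        raise ValueError("moment exists only for -shape < k < shape")
    z = k / gamma_shape
    return theta ** k * math.gamma(1.0 + z) * math.gamma(1.0 - z)

# Mean exists when the shape parameter exceeds 1.
print(loglogistic_moment(1, 3.0, 2.0))

# With shape 1 the mean does not exist, so the call raises ValueError.
try:
    loglogistic_moment(1, 1.0, 2.0)
except ValueError as err:
    print("no mean:", err)
```

Raising an error rather than returning infinity is a design choice here; it keeps a nonexistent moment from silently flowing into later calculations such as a variance.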
Another Way to Obtain Exponential Distribution
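We now consider Example 7. The following is a precise statement of the gamma-geometric mixture. Conditional on $N = n$, the random variable $X$ has a gamma distribution with shape parameter $n$ and rate parameter $\beta$, and $N$ has a geometric distribution on the positive integers with parameter $p$.

The conditional gamma distribution has an uncertain shape parameter $N$ that can take on positive integers. The parameter $N$ follows a geometric distribution. Here are the ingredients that go into the mixture.

$$f(x \mid n) = \frac{\beta^n}{(n-1)!} \, x^{n-1} \, e^{-\beta x}, \quad x > 0, \qquad P(N = n) = p \, (1-p)^{n-1}, \quad n = 1, 2, 3, \dots$$

The following is the unconditional probability density function of $X$.

$$\begin{aligned} f(x) &= \sum_{n=1}^\infty \frac{\beta^n}{(n-1)!} \, x^{n-1} \, e^{-\beta x} \, p \, (1-p)^{n-1} \\ &= \beta p \, e^{-\beta x} \sum_{n=1}^\infty \frac{[(1-p) \beta x]^{n-1}}{(n-1)!} \\ &= \beta p \, e^{-\beta x} \, e^{(1-p) \beta x} = \beta p \, e^{-\beta p x} \end{aligned}$$

The above density function is that of an exponential distribution with rate parameter $\beta p$.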
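The collapse of the series can be checked numerically. The sketch below assumes a conditional gamma density with integer shape `n` and rate `beta`, and a geometric weight `p * (1 - p) ** (n - 1)` on `n = 1, 2, ...` (names are illustrative). It sums the first terms of the mixture and compares the result with an exponential density with rate `beta * p`.

```python
import math

def gamma_geometric_density(x, beta, p, n_terms=200):
    """Partial sum of the gamma-geometric mixture density at x > 0."""
    total = 0.0
    for n in range(1, n_terms + 1):
        # Work on the log scale to avoid overflow in x**(n-1) and (n-1)!.
        log_term = (
            math.log(p) + (n - 1) * math.log(1.0 - p)   # geometric weight
            + n * math.log(beta) + (n - 1) * math.log(x)
            - beta * x - math.lgamma(n)                 # gamma density
        )
        total += math.exp(log_term)
    return total

x, beta, p = 3.0, 0.5, 0.4
print(gamma_geometric_density(x, beta, p), beta * p * math.exp(-beta * p * x))
```

The two printed numbers should agree closely because the omitted tail of the series is negligible for moderate values of `(1 - p) * beta * x`.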
Student t Distribution
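Example 2 (discussed in the previous post) involves a normal distribution with a random mean. Example 8 involves a normal distribution with mean 0 and an uncertain variance $1/\Lambda$, where $\Lambda$ follows a gamma distribution such that the two gamma parameters are related to a common parameter $n$, which will be the degrees of freedom of the student t distribution. The following is a precise description of the normal-gamma mixture. Conditional on $\Lambda = \lambda$, the random variable $X$ is normal with mean 0 and variance $1/\lambda$, and $\Lambda$ has a gamma distribution with shape parameter $\alpha$ and rate parameter $\beta$, where $\alpha = \beta = n/2$.

The following gives the ingredients of the normal-gamma mixture. The first item is the conditional density function of $X$ given $\Lambda = \lambda$. The second is the density function of the mixing weight $\Lambda$.

$$f(x \mid \lambda) = \sqrt{\frac{\lambda}{2 \pi}} \, e^{-\lambda x^2/2}, \quad -\infty < x < \infty, \qquad g(\lambda) = \frac{\beta^\alpha}{\Gamma(\alpha)} \, \lambda^{\alpha-1} \, e^{-\beta \lambda}, \quad \lambda > 0$$

The following calculation derives the unconditional density function of $X$.

$$\begin{aligned} f(x) &= \int_0^\infty \sqrt{\frac{\lambda}{2 \pi}} \, e^{-\lambda x^2/2} \, \frac{\beta^\alpha}{\Gamma(\alpha)} \, \lambda^{\alpha-1} \, e^{-\beta \lambda} \, d\lambda \\ &= \frac{\beta^\alpha}{\sqrt{2 \pi} \, \Gamma(\alpha)} \int_0^\infty \lambda^{\alpha + \frac{1}{2} - 1} \, e^{-(\beta + x^2/2) \lambda} \, d\lambda \\ &= \frac{\beta^\alpha \, \Gamma(\alpha + \frac{1}{2})}{\sqrt{2 \pi} \, \Gamma(\alpha)} \left( \beta + \frac{x^2}{2} \right)^{-(\alpha + \frac{1}{2})} \end{aligned}$$

The above density function is in terms of the two parameters $\alpha$ and $\beta$. In the assumptions, the two parameters are related to a common parameter $n$ such that $\alpha = n/2$ and $\beta = n/2$. The following derivation converts to the common $n$.

$$f(x) = \frac{(n/2)^{n/2} \, \Gamma\left( \frac{n+1}{2} \right)}{\sqrt{2 \pi} \, \Gamma\left( \frac{n}{2} \right)} \left( \frac{n}{2} + \frac{x^2}{2} \right)^{-\frac{n+1}{2}} = \frac{\Gamma\left( \frac{n+1}{2} \right)}{\sqrt{n \pi} \, \Gamma\left( \frac{n}{2} \right)} \left( 1 + \frac{x^2}{n} \right)^{-\frac{n+1}{2}}$$

The above density function is that of a student t distribution with $n$ degrees of freedom. Of course, in performing a test of significance, the t distribution is accessed by using tables or software. A usual textbook definition of the student t distribution is the ratio of a standard normal random variable to the square root of an independent chi-squared random variable divided by its degrees of freedom (see Theorem 6 in this previous post).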
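A numerical check of the normal-gamma mixture is sketched below. It assumes the conditional normal has mean 0 and variance `1/lam`, and that `lam` has a gamma distribution with shape and rate both equal to `n/2`; these names and this parametrization are assumptions for illustration. The mixture integral is approximated with a midpoint rule and compared with the student t density.

```python
import math

def normal_gamma_mixture_pdf(x, n, n_steps=200_000, upper=60.0):
    """Midpoint-rule approximation of the normal-gamma mixture density."""
    alpha = beta = n / 2.0
    const = beta ** alpha / math.gamma(alpha)
    h = upper / n_steps
    total = 0.0
    for i in range(n_steps):
        lam = (i + 0.5) * h
        normal_pdf = math.sqrt(lam / (2.0 * math.pi)) * math.exp(-lam * x * x / 2.0)
        gamma_pdf = const * lam ** (alpha - 1.0) * math.exp(-beta * lam)
        total += normal_pdf * gamma_pdf
    return total * h

def student_t_pdf(x, n):
    """Density of the student t distribution with n degrees of freedom."""
    coef = math.gamma((n + 1) / 2.0) / (math.sqrt(n * math.pi) * math.gamma(n / 2.0))
    return coef * (1.0 + x * x / n) ** (-(n + 1) / 2.0)

print(normal_gamma_mixture_pdf(1.2, 5), student_t_pdf(1.2, 5))
```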
This post discusses another way to generate new distributions from old, that of mixing distributions. The resulting distributions are called mixture distributions.
What is a Mixture?
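First, let’s start with continuous mixture. Suppose that $X$ is a continuous random variable with probability density function (pdf) $f_{X \mid \Theta}(x \mid \theta)$ where $\theta$ is a parameter in the pdf. There may be other parameters in the distribution but they are not relevant at the moment (e.g. these other parameters may be known constants). Suppose that the parameter $\theta$ is an uncertain quantity and is a value of a random variable $\Theta$ with pdf $h_\Theta(\theta)$ (if $\Theta$ is a continuous random variable) or with probability function $P(\Theta = \theta)$ (if $\Theta$ is a discrete random variable). Then taking the weighted average of $f_{X \mid \Theta}(x \mid \theta)$ with $h_\Theta(\theta)$ or $P(\Theta = \theta)$ as weight produces a mixture distribution. The following would be the pdf of the resulting mixture distribution.

$$f_X(x) = \int_{-\infty}^\infty f_{X \mid \Theta}(x \mid \theta) \, h_\Theta(\theta) \, d\theta$$

$$f_X(x) = \sum_\theta f_{X \mid \Theta}(x \mid \theta) \, P(\Theta = \theta)$$

Thus a continuous random variable $X$ is said to be a mixture (or has a mixture distribution) if its probability density function $f_X(x)$ is a weighted average of a family of pdfs $f_{X \mid \Theta}(x \mid \theta)$ where the weight is the density function or probability function of the random parameter $\Theta$. The random variable $\Theta$ is said to be the mixing random variable and its pdf or probability function is called the mixing weight.

Another definition of mixture distribution is that the cumulative distribution function (cdf) of the random variable $X$ is the weighted average of a family of cumulative distribution functions indexed by the mixing random variable $\Theta$.

$$F_X(x) = \int_{-\infty}^\infty F_{X \mid \Theta}(x \mid \theta) \, h_\Theta(\theta) \, d\theta$$

$$F_X(x) = \sum_\theta F_{X \mid \Theta}(x \mid \theta) \, P(\Theta = \theta)$$

The idea of discrete mixture is similar. A discrete random variable $X$ is said to be a mixture if its probability function or cumulative distribution function is a weighted average of a family of probability functions or cumulative distributions indexed by the mixing random variable $\Theta$. The mixing weight can be discrete or continuous. The following shows the probability function and the cdf of a discrete mixture distribution, written for a continuous mixing weight (replace the integral with a summation for a discrete mixing weight).

$$P(X = x) = \int_{-\infty}^\infty P(X = x \mid \Theta = \theta) \, h_\Theta(\theta) \, d\theta$$

$$F_X(x) = \int_{-\infty}^\infty F_{X \mid \Theta}(x \mid \theta) \, h_\Theta(\theta) \, d\theta$$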
When the mixture distribution is a weighted average of finitely many distributions, it is called a $k$-point mixture where $k$ is the number of distributions. Suppose that there are $k$ distributions with pdfs $f_1(x), f_2(x), \dots, f_k(x)$ or probability functions $P(X_1 = x), P(X_2 = x), \dots, P(X_k = x)$, with mixing probabilities $p_1, p_2, \dots, p_k$ where the sum of the $p_i$ is 1. Then the following gives the pdf or the probability function of the mixture distribution.

$$f_X(x) = \sum_{i=1}^k p_i \, f_i(x)$$

$$P(X = x) = \sum_{i=1}^k p_i \, P(X_i = x)$$

The cdf for the $k$-point mixture is similarly obtained by weighting the respective conditional cdfs as in (4b).
Once the pdf (or probability function) or cdf of a mixture is established, the other distributional quantities can be derived from the pdf or cdf. Some of the distributional quantities can be obtained by taking weighted average of the corresponding conditional counterparts. For example, the following gives the survival function and moments of a mixture distribution. We assume that the mixing weight is continuous with pdf $h_\Theta(\theta)$. For discrete mixing weight, simply replace the integral with summation.

$$S_X(x) = \int_{-\infty}^\infty S_{X \mid \Theta}(x \mid \theta) \, h_\Theta(\theta) \, d\theta$$

$$E(X^k) = \int_{-\infty}^\infty E(X^k \mid \theta) \, h_\Theta(\theta) \, d\theta$$

Once the moments are obtained, all distributional quantities that are based on moments, such as variance, skewness and kurtosis, can be evaluated. Note that these quantities are not the weighted average of the conditional quantities. For example, the variance of a mixture is not the weighted average of the variances of the conditional distributions. In fact, the variance of a mixture has two components.

$$Var(X) = E[Var(X \mid \Theta)] + Var[E(X \mid \Theta)] \qquad (7)$$

The relationship in (7) is called the law of total variance, which is the proper way of computing the unconditional variance $Var(X)$. The first component $E[Var(X \mid \Theta)]$ is called the expected value of conditional variances, which is the weighted average of the conditional variances. The second component $Var[E(X \mid \Theta)]$ is called the variance of the conditional means, which represents the additional variance as a result of the uncertainty in the parameter $\Theta$. If there is a great deal of variation among the conditional means $E(X \mid \Theta)$, the variation will be reflected in $Var(X)$ through the second component $Var[E(X \mid \Theta)]$. This will be further illustrated in the examples below.
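For a finite mixture, the two components of the law of total variance can be computed directly. The following is a minimal sketch with a hypothetical helper name. It uses the three-point exponential mixture from Example 1 in this post (weights 0.75, 0.15 and 0.10, means 5, 8 and 10, and conditional variances 25, 64 and 100).

```python
def mixture_variance(weights, cond_means, cond_vars):
    """Return (E[Var(X|Theta)], Var[E(X|Theta)], Var(X)) for a finite mixture."""
    mean = sum(w * m for w, m in zip(weights, cond_means))
    expected_cond_var = sum(w * v for w, v in zip(weights, cond_vars))
    var_cond_means = sum(w * (m - mean) ** 2 for w, m in zip(weights, cond_means))
    return expected_cond_var, var_cond_means, expected_cond_var + var_cond_means

ev, vm, total = mixture_variance([0.75, 0.15, 0.10], [5.0, 8.0, 10.0], [25.0, 64.0, 100.0])
print(ev, vm, total)
```

The first component is the weighted average of the conditional variances; the second is the extra variance caused by not knowing which component a claim comes from.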
Some of the examples discussed below have gamma distribution as mixing weights. See here for basic information on gamma distribution.
A natural interpretation of a mixture is that the uncertain parameter in the conditional random variable describes an individual in a large population. For example, the parameter describes a certain characteristic that varies across the units in a population. In this section, we describe the idea of mixture in an insurance setting. The example is to mix Poisson distributions with a gamma distribution as mixing weight. We will see that the resulting mixture is a negative binomial distribution, which is more dispersed than the conditional Poisson distributions.
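Consider a large group of insured drivers for auto collision coverage. Suppose that the claim frequency $N$ in a year for an insured driver has a Poisson distribution with mean $\lambda$. The conditional probability function for the number of claims in a year for an insured driver is:

$$P(N = n \mid \Lambda = \lambda) = \frac{e^{-\lambda} \, \lambda^n}{n!}, \qquad n = 0, 1, 2, \dots$$

The mean number of claims in a year for an insured driver is $\lambda$. The parameter $\lambda$ reflects the risk characteristics of an insured driver. Since the population of insured drivers is large, there is uncertainty in the parameter $\lambda$. Thus it is more appropriate to regard $\lambda$ as a value of a random variable $\Lambda$ in order to capture the wide range of risk characteristics across the individuals in the population. As a result, the above probability function is not unconditional, but, rather, a conditional probability function of $N$.

What about the marginal (unconditional) probability function of $N$? Suppose that $\Lambda$ has a gamma distribution with the following pdf:

$$g(\lambda) = \frac{\beta^\alpha}{\Gamma(\alpha)} \, \lambda^{\alpha-1} \, e^{-\beta \lambda}, \qquad \lambda > 0$$

where $\alpha$ (shape) and $\beta$ (rate) are known parameters of the gamma distribution. Then the unconditional probability function of $N$ is the weighted average of the conditional Poisson probability functions.

$$\begin{aligned} P(N = n) &= \int_0^\infty P(N = n \mid \Lambda = \lambda) \, g(\lambda) \, d\lambda \\ &= \int_0^\infty \frac{e^{-\lambda} \, \lambda^n}{n!} \, \frac{\beta^\alpha}{\Gamma(\alpha)} \, \lambda^{\alpha-1} \, e^{-\beta \lambda} \, d\lambda \\ &= \frac{\beta^\alpha}{n! \, \Gamma(\alpha)} \int_0^\infty \lambda^{n+\alpha-1} \, e^{-(\beta+1) \lambda} \, d\lambda \\ &= \frac{\beta^\alpha}{n! \, \Gamma(\alpha)} \, \frac{\Gamma(n+\alpha)}{(\beta+1)^{n+\alpha}} \int_0^\infty \frac{(\beta+1)^{n+\alpha}}{\Gamma(n+\alpha)} \, \lambda^{n+\alpha-1} \, e^{-(\beta+1) \lambda} \, d\lambda \\ &= \frac{\Gamma(n+\alpha)}{n! \, \Gamma(\alpha)} \left( \frac{\beta}{\beta+1} \right)^\alpha \left( \frac{1}{\beta+1} \right)^n, \qquad n = 0, 1, 2, \dots \end{aligned}$$

Note that the integral in the 4th step is 1 since the integrand is a gamma density function. The probability function at the last step is that of a negative binomial distribution. If the parameter $\alpha$ is a positive integer, then the following gives the probability function of $N$ after simplifying the expression with gamma function.

$$P(N = n) = \binom{n+\alpha-1}{n} \left( \frac{\beta}{\beta+1} \right)^\alpha \left( \frac{1}{\beta+1} \right)^n$$

This probability function can be further simplified as the following:

$$P(N = n) = \binom{n+\alpha-1}{n} \, p^\alpha \, (1-p)^n$$

where $p = \beta/(1+\beta)$. This is one form of a negative binomial distribution. The mean is $E(N) = \alpha (1-p)/p = \alpha/\beta$ and the variance is $Var(N) = \alpha (1-p)/p^2 = \alpha (1+\beta)/\beta^2$. The variance of the negative binomial distribution is greater than the mean. In a Poisson distribution, the mean equals the variance. Thus the unconditional claim frequency $N$ is more dispersed than its conditional distributions. This is a characteristic of mixture distributions. The uncertainty in the parameter variable $\Lambda$ has the effect of increasing the unconditional variance of the mixture distribution of $N$. Recall that the variance of a mixture distribution has two components, the weighted average of the conditional variances and the variance of the conditional means. The second component represents the additional variance introduced by the uncertainty in the parameter $\Lambda$.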
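The Poisson-gamma result can be checked numerically. The sketch below assumes the gamma mixing weight is parametrized by a shape `alpha` and a rate `beta` (an assumption about the parametrization; the names are illustrative). It integrates the Poisson probability against the gamma density with a midpoint rule and compares the result with the negative binomial probability using `p = beta / (1 + beta)`.

```python
import math

def poisson_gamma_pmf(n, alpha, beta, n_steps=200_000, upper=60.0):
    """Midpoint-rule approximation of the mixed Poisson probability P(N = n)."""
    h = upper / n_steps
    const = beta ** alpha / math.gamma(alpha)
    total = 0.0
    for i in range(n_steps):
        lam = (i + 0.5) * h
        poisson = math.exp(-lam) * lam ** n / math.factorial(n)
        gamma_pdf = const * lam ** (alpha - 1.0) * math.exp(-beta * lam)
        total += poisson * gamma_pdf
    return total * h

def negative_binomial_pmf(n, alpha, beta):
    """Negative binomial probability with p = beta / (1 + beta)."""
    p = beta / (1.0 + beta)
    log_coef = math.lgamma(alpha + n) - math.lgamma(alpha) - math.lgamma(n + 1)
    return math.exp(log_coef) * p ** alpha * (1.0 - p) ** n

print(poisson_gamma_pmf(3, 2.5, 1.5), negative_binomial_pmf(3, 2.5, 1.5))
```

Using `lgamma` for the coefficient lets the same function handle a non-integer `alpha`, where the binomial coefficient form does not apply directly.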
We now further illustrate the notion of mixture with a few more examples. Many familiar distributions are mixture distributions. The negative binomial distribution is a mixture of Poisson distributions with gamma mixing weight as discussed above. The Pareto distribution, more specifically Pareto Type II Lomax, is a mixture of exponential distributions with gamma mixing weight (see Example 2 below). Example 3 discusses the normal-normal mixture. Example 1 demonstrates numerical calculation involving a finite mixture.
Suppose that the size $X$ of an auto collision claim from a large group of insured drivers is a mixture of three exponential distributions with means 5, 8 and 10 (with respective weights 0.75, 0.15 and 0.10). Discuss the mixture distribution.

The pdf and cdf are the weighted averages of the respective exponential quantities.

$$f_X(x) = 0.75 \left( \tfrac{1}{5} e^{-x/5} \right) + 0.15 \left( \tfrac{1}{8} e^{-x/8} \right) + 0.10 \left( \tfrac{1}{10} e^{-x/10} \right)$$

$$F_X(x) = 0.75 \left( 1 - e^{-x/5} \right) + 0.15 \left( 1 - e^{-x/8} \right) + 0.10 \left( 1 - e^{-x/10} \right)$$

For a randomly selected claim from this population of insured drivers, what is the probability that it exceeds 10? The answer is $P(X > 10) = 0.75 e^{-2} + 0.15 e^{-1.25} + 0.10 e^{-1} \approx 0.1813$. The pdf and cdf of the mixture allow us to derive other distributional quantities such as moments, and then to use the moments to derive skewness and kurtosis. The moments of an exponential distribution have a closed form. The moments of the mixture distribution are then simply the weighted averages of the exponential moments.

$$E(X^k) = 0.75 \, (k! \, 5^k) + 0.15 \, (k! \, 8^k) + 0.10 \, (k! \, 10^k)$$

where $k$ is a positive integer. The following evaluates the first four moments.

$$E(X) = 5.95, \qquad E(X^2) = 76.7, \qquad E(X^3) = 1623.3, \qquad E(X^4) = 49995.6$$

The variance of $X$ is $76.7 - 5.95^2 = 41.2975$. The three conditional exponential variances are 25, 64 and 100. The weighted average of these would be 38.35. Because of the uncertainty resulting from not knowing which exponential distribution the claim is from, the unconditional variance is larger than 38.35.

The skewness of a distribution is the third central moment divided by $\sigma^3$, and the kurtosis is the fourth central moment divided by $\sigma^4$. Each of them can be expressed in terms of the raw moments up to the third or fourth raw moment.

$$\gamma_1 = \frac{E[(X-\mu)^3]}{\sigma^3} = \frac{E(X^3) - 3 \mu E(X^2) + 2 \mu^3}{\sigma^3}$$

$$\gamma_2 = \frac{E[(X-\mu)^4]}{\sigma^4} = \frac{E(X^4) - 4 \mu E(X^3) + 6 \mu^2 E(X^2) - 3 \mu^4}{\sigma^4}$$

Note that $\mu = E(X)$ and $\sigma^2 = E(X^2) - \mu^2$. The expressions on the right hand side are in terms of the raw moments up to $E(X^4)$. Plugging in the raw moments produces the skewness $\gamma_1 \approx 2.5453$ and kurtosis $\gamma_2 \approx 14.0097$. The excess kurtosis is then 11.0097 (subtracting 3 from the kurtosis).

The skewness and excess kurtosis of an exponential distribution are always 2 and 6, respectively. One takeaway is that the skewness and kurtosis of a mixture are not the weighted averages of the conditional counterparts. In this particular case, the mixture is more skewed than the individual exponential distributions. Kurtosis is a measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution (the kurtosis of a normal distribution is 3). Since the excess kurtosis of this mixture (11.0097) exceeds even the excess kurtosis of 6 for exponential distributions, this mixture distribution is considered to be heavy tailed and to have a higher likelihood of outliers.
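The numerical work in this example is easy to reproduce. The following is a minimal sketch using only the Python standard library (variable names are illustrative). It computes the tail probability, the raw moments, the variance, the skewness and the kurtosis of the three-point exponential mixture.

```python
import math

weights = [0.75, 0.15, 0.10]
means = [5.0, 8.0, 10.0]

def raw_moment(k):
    """k-th raw moment of the mixture; an exponential with mean m has E[X^k] = k! * m**k."""
    return sum(w * math.factorial(k) * m ** k for w, m in zip(weights, means))

# Probability that a randomly selected claim exceeds 10.
tail_prob = sum(w * math.exp(-10.0 / m) for w, m in zip(weights, means))

mu1, mu2, mu3, mu4 = (raw_moment(k) for k in range(1, 5))
variance = mu2 - mu1 ** 2

# Standardized third and fourth central moments written with raw moments.
skewness = (mu3 - 3 * mu1 * mu2 + 2 * mu1 ** 3) / variance ** 1.5
kurtosis = (mu4 - 4 * mu1 * mu3 + 6 * mu1 ** 2 * mu2 - 3 * mu1 ** 4) / variance ** 2

print(tail_prob, variance, skewness, kurtosis - 3.0)
```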
Example 2 (Exponential-Gamma Mixture)
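The Pareto distribution (Type II Lomax) is a mixture of exponential distributions with gamma mixing weight. Suppose $X \mid \Lambda$ has the exponential pdf $f_{X \mid \Lambda}(x \mid \lambda) = \lambda e^{-\lambda x}$, where $x > 0$, conditional on the parameter $\Lambda$. Suppose that $\Lambda$ has a gamma distribution with the following pdf:

$$g(\lambda) = \frac{\beta^\alpha}{\Gamma(\alpha)} \, \lambda^{\alpha-1} \, e^{-\beta \lambda}, \qquad \lambda > 0$$

Then the following gives the unconditional pdf of the random variable $X$.

$$\begin{aligned} f_X(x) &= \int_0^\infty \lambda e^{-\lambda x} \, \frac{\beta^\alpha}{\Gamma(\alpha)} \, \lambda^{\alpha-1} \, e^{-\beta \lambda} \, d\lambda \\ &= \frac{\beta^\alpha}{\Gamma(\alpha)} \int_0^\infty \lambda^{\alpha} \, e^{-(x+\beta) \lambda} \, d\lambda \\ &= \frac{\beta^\alpha}{\Gamma(\alpha)} \, \frac{\Gamma(\alpha+1)}{(x+\beta)^{\alpha+1}} = \frac{\alpha \, \beta^\alpha}{(x+\beta)^{\alpha+1}}, \qquad x > 0 \end{aligned}$$

The last expression is the pdf of a Pareto (Type II Lomax) distribution with shape parameter $\alpha$ and scale parameter $\beta$.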
Example 3 (Normal-Normal Mixture)
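Conditional on $\Theta = \theta$, consider a normal random variable $X$ with mean $\theta$ and variance $v$ where $v$ is known. The following is the conditional density function of $X$.

$$f_{X \mid \Theta}(x \mid \theta) = \frac{1}{\sqrt{2 \pi v}} \exp\left( -\frac{(x-\theta)^2}{2v} \right), \qquad -\infty < x < \infty$$

Suppose that the parameter $\Theta$ is normally distributed with mean $\mu$ and variance $a$ (both known parameters). The following is the density function of $\Theta$.

$$h_\Theta(\theta) = \frac{1}{\sqrt{2 \pi a}} \exp\left( -\frac{(\theta-\mu)^2}{2a} \right), \qquad -\infty < \theta < \infty$$

Determine the unconditional pdf of $X$.

$$f_X(x) = \int_{-\infty}^\infty \frac{1}{2 \pi \sqrt{va}} \exp\left( -\frac{1}{2} \left[ \frac{(x-\theta)^2}{v} + \frac{(\theta-\mu)^2}{a} \right] \right) d\theta$$

The expression in the exponent has the following equivalent expression.

$$\frac{(x-\theta)^2}{v} + \frac{(\theta-\mu)^2}{a} = \frac{a+v}{va} \left( \theta - \frac{ax + v\mu}{a+v} \right)^2 + \frac{(x-\mu)^2}{a+v}$$

Continuing the derivation:

$$\begin{aligned} f_X(x) &= \int_{-\infty}^\infty \frac{1}{2 \pi \sqrt{va}} \exp\left( -\frac{a+v}{2va} \left( \theta - \frac{ax + v\mu}{a+v} \right)^2 \right) \exp\left( -\frac{(x-\mu)^2}{2(a+v)} \right) d\theta \\ &= \frac{1}{2 \pi \sqrt{va}} \exp\left( -\frac{(x-\mu)^2}{2(a+v)} \right) \int_{-\infty}^\infty \exp\left( -\frac{a+v}{2va} \left( \theta - \frac{ax + v\mu}{a+v} \right)^2 \right) d\theta \\ &= \frac{1}{\sqrt{2 \pi (a+v)}} \exp\left( -\frac{(x-\mu)^2}{2(a+v)} \right) \int_{-\infty}^\infty \frac{1}{\sqrt{2 \pi \frac{va}{a+v}}} \exp\left( -\frac{\left( \theta - \frac{ax + v\mu}{a+v} \right)^2}{2 \frac{va}{a+v}} \right) d\theta \\ &= \frac{1}{\sqrt{2 \pi (a+v)}} \exp\left( -\frac{(x-\mu)^2}{2(a+v)} \right) \end{aligned}$$

Note that the integrand in the integral in the third line is the density function of a normal distribution with mean $\frac{ax + v\mu}{a+v}$ and variance $\frac{va}{a+v}$. Hence the integral is 1. The last expression is the unconditional pdf of $X$, repeated as follows.

$$f_X(x) = \frac{1}{\sqrt{2 \pi (a+v)}} \exp\left( -\frac{(x-\mu)^2}{2(a+v)} \right), \qquad -\infty < x < \infty$$

The above is the pdf of a normal distribution with mean $\mu$ and variance $a+v$. Thus mixing normal distributions with mean $\theta$ and variance $v$, with the mixing weight being normally distributed with mean $\mu$ and variance $a$, produces a normal distribution with mean $\mu$ (same mean as the mixing weight) and variance $a+v$ (sum of the conditional variance and the mixing variance).

The mean of the conditional normal distribution is uncertain. When the mean $\Theta$ follows a normal distribution with mean $\mu$, the mixture is a normal distribution that centers around $\mu$, but with the increased variance $a+v$. The increased variance of the unconditional distribution reflects the uncertainty of the parameter $\Theta$.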
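The normal-normal result can be illustrated by simulation. The sketch below is a rough check rather than a proof. It assumes the conditional variance is `v`, the mixing distribution is normal with mean `mu` and variance `a` (illustrative names), and uses a fixed seed so that the run is repeatable.

```python
import random

def simulate_normal_normal(mu, a, v, n_samples=200_000, seed=0):
    """Draw theta ~ N(mu, a), then X ~ N(theta, v); return sample mean and variance of X."""
    rng = random.Random(seed)
    xs = []
    for _ in range(n_samples):
        theta = rng.gauss(mu, a ** 0.5)       # gauss takes a standard deviation
        xs.append(rng.gauss(theta, v ** 0.5))
    mean = sum(xs) / n_samples
    variance = sum((x - mean) ** 2 for x in xs) / n_samples
    return mean, variance

sample_mean, sample_var = simulate_normal_normal(mu=10.0, a=4.0, v=9.0)
print(sample_mean, sample_var)  # should be close to 10 and 4 + 9 = 13
```

The sample variance lands near the sum of the two variances rather than near the conditional variance alone, which is the extra spread contributed by the uncertain mean.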
Mixture distributions can be used to model a statistical population with subpopulations, where the conditional density functions are the densities on the subpopulations, and the mixing weights are the proportions of each subpopulation in the overall population. If the population can be divided into a finite number of homogeneous subpopulations, then the model would be a finite mixture as in Example 1. In certain situations, continuous mixing weights may be more appropriate (e.g. Poisson-Gamma mixture).
Many other familiar distributions are mixture distributions and are discussed in the next post.