The gamma distribution is a probability distribution that is useful in actuarial modeling. Its mathematical properties give considerable flexibility in the modeling process. For example, since it has two parameters (a scale parameter and a shape parameter), the gamma distribution is capable of representing a variety of distribution shapes and dispersion patterns. This post gives an account of how the distribution arises mathematically and discusses some of its mathematical properties. The next post discusses how the gamma distribution can arise naturally as the waiting time between two events in a Poisson process.
The Gamma Function
From a mathematical point of view, the place to start in defining the gamma distribution is the gamma function. For any real number $\alpha > 0$, define:
$$\Gamma(\alpha) = \int_0^\infty t^{\alpha-1} e^{-t} \, dt$$
The above improper integral converges for every positive $\alpha$. The proof that the improper integral converges and other basic facts can be found here.
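When the integral defining the gamma function has “incomplete” limits, the resulting functions are called incomplete gamma functions. The following are called the upper incomplete gamma function and the lower incomplete gamma function, respectively.
$$\Gamma(\alpha, x) = \int_x^\infty t^{\alpha-1} e^{-t} \, dt$$
$$\gamma(\alpha, x) = \int_0^x t^{\alpha-1} e^{-t} \, dt$$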
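As a rough numerical check of these definitions, the following Python sketch evaluates the gamma function with the standard library and approximates the two incomplete gamma functions. The function names and the power-series method are choices made here for illustration; they are not part of the original discussion, and the series is only intended for moderate values of x.

```python
import math

def lower_incomplete_gamma(alpha, x, tol=1e-15, max_terms=10_000):
    """Approximate the integral of t^(alpha-1) e^(-t) from 0 to x.

    Uses the series x^alpha e^(-x) * sum_{n>=0} x^n / (alpha (alpha+1) ... (alpha+n)).
    """
    if x <= 0:
        return 0.0
    term = 1.0 / alpha          # n = 0 term of the series
    total = term
    for n in range(1, max_terms):
        term *= x / (alpha + n)  # ratio of consecutive series terms
        total += term
        if term < tol * total:
            break
    return math.exp(alpha * math.log(x) - x) * total

def upper_incomplete_gamma(alpha, x):
    """Approximate the integral of t^(alpha-1) e^(-t) from x to infinity."""
    return math.gamma(alpha) - lower_incomplete_gamma(alpha, x)

# For a positive integer n, Gamma(n) equals (n - 1)!.
print(math.gamma(5), math.factorial(4))

# With alpha = 1 the lower incomplete gamma function is 1 - e^(-x).
print(lower_incomplete_gamma(1.0, 2.0), 1 - math.exp(-2.0))
```

The two incomplete functions always add up to the complete gamma function, which is why the upper one is obtained here by subtraction.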
The Gamma Probability Density Function
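Notice that the integrand $t^{\alpha-1} e^{-t}$ in the integral defining $\Gamma(\alpha)$ is positive for every $t > 0$. Thus the integrand is a density function if it is normalized by $\Gamma(\alpha)$.
$$f_X(x) = \frac{1}{\Gamma(\alpha)} x^{\alpha-1} e^{-x}, \quad x > 0$$
The function $f_X$ is a probability density function since it integrates to one over the interval $(0, \infty)$. For convenience, let $X$ be the random variable having this density function. This density has only one parameter, $\alpha$ (the shape parameter). To add the second parameter, transform the random variable by multiplying by a constant. This can be done in two ways. The following are the probability density functions of the random variables $Y = \theta X$ and $Y = X / \beta$, respectively, where $\theta > 0$ and $\beta > 0$.
$$f_Y(y) = \frac{1}{\Gamma(\alpha) \, \theta^\alpha} y^{\alpha-1} e^{-y/\theta}, \quad y > 0$$
$$f_Y(y) = \frac{\beta^\alpha}{\Gamma(\alpha)} y^{\alpha-1} e^{-\beta y}, \quad y > 0$$
A random variable $Y$ is said to follow the gamma distribution with shape parameter $\alpha$ and scale parameter $\theta$ if the first of these two functions is its probability density function (pdf). A random variable $Y$ is said to follow the gamma distribution with shape parameter $\alpha$ and rate parameter $\beta$ if the second is its pdf.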
The parameter $\theta$ is called the scale parameter because the larger its value, the more spread out the distribution. The parameter $\beta$ is the rate parameter of the gamma family. The rate parameter is defined as the reciprocal of the scale parameter, i.e. $\beta = 1/\theta$.
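The two parametrizations can be compared numerically. The Python sketch below writes the density once with a scale parameter and once with a rate parameter, checks that the two agree when the rate is the reciprocal of the scale, and checks that the density integrates to roughly one. The function names, the test values (shape 2, scale 2) and the simple midpoint rule are assumptions made for illustration.

```python
import math

def gamma_pdf_scale(y, alpha, theta):
    """Gamma density with shape alpha and scale theta."""
    if y <= 0:
        return 0.0
    log_pdf = ((alpha - 1) * math.log(y) - y / theta
               - math.lgamma(alpha) - alpha * math.log(theta))
    return math.exp(log_pdf)

def gamma_pdf_rate(y, alpha, beta):
    """Gamma density with shape alpha and rate beta."""
    if y <= 0:
        return 0.0
    log_pdf = (alpha * math.log(beta) + (alpha - 1) * math.log(y)
               - beta * y - math.lgamma(alpha))
    return math.exp(log_pdf)

def midpoint_integral(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

alpha, theta = 2.0, 2.0
beta = 1.0 / theta  # rate is the reciprocal of scale

# The density should integrate to about 1 (the tail beyond 100 is negligible here).
total = midpoint_integral(lambda y: gamma_pdf_scale(y, alpha, theta), 0.0, 100.0)
print(total)

# Both parametrizations describe the same density.
print(gamma_pdf_scale(3.0, alpha, theta), gamma_pdf_rate(3.0, alpha, beta))
```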
The following figure (Figure 1) demonstrates the role of the shape parameter $\alpha$. With the scale parameter $\theta$ kept at 2, the gamma distribution becomes less skewed as $\alpha$ increases.
The following figure (Figure 2) demonstrates the role of the scale parameter $\theta$. With the shape parameter $\alpha$ kept at 2, all the gamma distributions have the same skewness. However, the gamma distributions become more spread out as $\theta$ increases.
There are several important subclasses of the gamma distribution. When the shape parameter $\alpha = 1$, the gamma distribution becomes the exponential distribution with mean $\theta$ or $1/\beta$, depending on the parametrization. When the shape parameter $\alpha$ is any positive integer, the resulting subclass of gamma distribution is called the Erlang distribution. A chi-square distribution is a gamma distribution with shape parameter $\alpha = k/2$ and scale parameter $\theta = 2$, where $k$ is a positive integer (the degrees of freedom). The gamma density curves in Figure 1 are chi-square distributions. Their degrees of freedom are 2, 4, 6, 10 and 20. The chi-square distribution with 2 degrees of freedom is an exponential distribution (with mean 2).
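These special cases can be checked by comparing density formulas directly. The following Python sketch, with function names chosen here for illustration, compares the gamma density against the usual exponential and chi-square density formulas at a few arbitrary points.

```python
import math

def gamma_pdf(y, alpha, theta):
    """Gamma density with shape alpha and scale theta, for y > 0."""
    log_pdf = ((alpha - 1) * math.log(y) - y / theta
               - math.lgamma(alpha) - alpha * math.log(theta))
    return math.exp(log_pdf)

def exponential_pdf(y, mean):
    """Exponential density with the given mean, for y > 0."""
    return math.exp(-y / mean) / mean

def chi_square_pdf(y, k):
    """Chi-square density with k degrees of freedom, for y > 0."""
    return y ** (k / 2 - 1) * math.exp(-y / 2) / (2 ** (k / 2) * math.gamma(k / 2))

points = (0.5, 1.0, 4.0, 9.0)

# Shape 1 gives the exponential distribution whose mean is the scale parameter.
exp_gap = max(abs(gamma_pdf(y, 1.0, 2.0) - exponential_pdf(y, 2.0)) for y in points)

# Shape k/2 with scale 2 gives the chi-square distribution with k degrees of freedom.
# The degrees of freedom 2, 4, 6, 10, 20 correspond to shapes 1, 2, 3, 5, 10.
chi_gap = max(
    abs(gamma_pdf(y, k / 2, 2.0) - chi_square_pdf(y, k))
    for k in (2, 4, 6, 10, 20)
    for y in points
)

print(exp_gap, chi_gap)  # both gaps should be at the level of rounding error
```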
Between the two parametrizations presented here, the version with the scale parameter $\theta$ is the more appropriate model in settings where a parameter is needed to describe the magnitude of the mean and the spread. The parametrization with $\alpha$ and $\beta$ is sometimes easier to work with. For example, it is more common in Bayesian analysis, where the gamma distribution can be used as a conjugate prior distribution for a parameter that is a rate (e.g. the rate parameter of a Poisson distribution).
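To illustrate why the rate parametrization is convenient in the Bayesian setting, here is a minimal Python sketch of the standard gamma-Poisson update: with a gamma prior (shape, rate) on a Poisson rate, the posterior is again gamma, with the shape increased by the total count and the rate increased by the number of observations. The prior values and the claim counts below are hypothetical and chosen only for illustration.

```python
def gamma_poisson_posterior(alpha, beta, counts):
    """Posterior (shape, rate) of a Poisson rate under a gamma(shape=alpha, rate=beta) prior."""
    return alpha + sum(counts), beta + len(counts)

prior_alpha, prior_beta = 2.0, 1.0   # hypothetical prior with mean alpha / beta = 2
counts = [3, 1, 4, 2, 0]             # hypothetical observed counts, one per period

post_alpha, post_beta = gamma_poisson_posterior(prior_alpha, prior_beta, counts)
posterior_mean = post_alpha / post_beta

print(post_alpha, post_beta, posterior_mean)
```

In the rate parametrization the update is a pair of additions, which is the main reason this form is preferred for such calculations.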
Some Distributional Quantities
The gamma distribution is a two-parameter family of distributions. Here are some of the basic distributional quantities that are of interest.
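In the following, $Y$ follows the gamma distribution with shape parameter $\alpha$ and scale parameter $\theta$ (equivalently, rate parameter $\beta = 1/\theta$), and $\gamma(\alpha, x) = \int_0^x t^{\alpha-1} e^{-t} \, dt$ is the lower incomplete gamma function.
Probability Density Function (PDF)
$$f_Y(y) = \frac{1}{\Gamma(\alpha) \, \theta^\alpha} y^{\alpha-1} e^{-y/\theta} = \frac{\beta^\alpha}{\Gamma(\alpha)} y^{\alpha-1} e^{-\beta y}, \quad y > 0$$
Cumulative Distribution Function (CDF)
$$F_Y(y) = \frac{\gamma(\alpha, y/\theta)}{\Gamma(\alpha)} = \frac{\gamma(\alpha, \beta y)}{\Gamma(\alpha)}, \quad y > 0$$
Mean and Variance
$$E(Y) = \alpha \theta = \frac{\alpha}{\beta}, \quad Var(Y) = \alpha \theta^2 = \frac{\alpha}{\beta^2}$$
Moment Generating Function
$$M_Y(t) = \left( \frac{1}{1 - \theta t} \right)^\alpha \text{ for } t < \frac{1}{\theta}, \quad M_Y(t) = \left( \frac{\beta}{\beta - t} \right)^\alpha \text{ for } t < \beta$$
Coefficient of Variation
$$CV = \frac{1}{\sqrt{\alpha}}$$
Coefficient of Skewness
$$\gamma_1 = \frac{2}{\sqrt{\alpha}}$$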
There is no simple closed form for the cumulative distribution function, except for the case of $\alpha = 1$ (i.e. the exponential distribution) and the case of $\alpha$ being a positive integer (see next post). As a result, the distributional quantities that require solving for $y$ in the CDF, e.g. the median and other percentiles, have no closed form. As stated above, the CDF can be expressed using the incomplete gamma function, which can be estimated numerically. For the distributional quantities with no closed form, either use numerical estimation or use software.
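As one possible way to carry out such a numerical estimation, the Python sketch below evaluates the CDF through a power series for the lower incomplete gamma function and then finds percentiles by bisection. The function names, the series method and the bisection search are assumptions made for illustration; the series is only meant for moderate values of the argument, and dedicated statistical software will generally be more robust.

```python
import math

def gamma_cdf(y, alpha, theta):
    """CDF of the gamma distribution with shape alpha and scale theta (series method)."""
    if y <= 0:
        return 0.0
    x = y / theta
    term = 1.0 / alpha
    total = term
    n = 1
    while term > 1e-16 * total and n < 10_000:
        term *= x / (alpha + n)
        total += term
        n += 1
    # Regularized lower incomplete gamma function, computed on the log scale.
    return min(1.0, math.exp(alpha * math.log(x) - x - math.lgamma(alpha)) * total)

def gamma_quantile(p, alpha, theta, tol=1e-10):
    """Solve gamma_cdf(y) = p for y by bisection, for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    lo, hi = 0.0, alpha * theta
    while gamma_cdf(hi, alpha, theta) < p:   # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, alpha, theta) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Exponential case (shape 1, scale 2): the median has the closed form 2 ln 2.
print(gamma_quantile(0.5, 1.0, 2.0), 2 * math.log(2))

# Shape 2, scale 2: no closed form for the median, so it is found numerically.
median = gamma_quantile(0.5, 2.0, 2.0)
print(median, gamma_cdf(median, 2.0, 2.0))
```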
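The calculation for some of the distributional quantities is quite straightforward. For example, to calculate any higher moment $E(Y^k)$ of a gamma random variable $Y$ with shape parameter $\alpha$ and scale parameter $\theta$, simply adjust the integrand to be an appropriate gamma density function. The result is then whatever is moved outside the integral, as shown in the following.
$$E(Y^k) = \int_0^\infty y^k \, \frac{1}{\Gamma(\alpha) \, \theta^\alpha} y^{\alpha-1} e^{-y/\theta} \, dy = \frac{\theta^k \, \Gamma(\alpha+k)}{\Gamma(\alpha)} \int_0^\infty \frac{1}{\Gamma(\alpha+k) \, \theta^{\alpha+k}} y^{\alpha+k-1} e^{-y/\theta} \, dy = \frac{\theta^k \, \Gamma(\alpha+k)}{\Gamma(\alpha)}$$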
Once the higher moments are known, some of the other calculations follow. For example, the coefficient of variation is defined as the ratio of the standard deviation to the mean. This ratio is a standardized measure of dispersion of a probability distribution. The coefficient of skewness is the ratio of the third central moment to the third power of the standard deviation, i.e. $E[(Y-\mu)^3] / \sigma^3$. See here for a discussion on skewness. The kurtosis is the ratio of the fourth central moment to the fourth power of the standard deviation, i.e. $E[(Y-\mu)^4] / \sigma^4$. The excess kurtosis is obtained by subtracting 3 from the kurtosis.
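The chain from raw moments to these standardized quantities can be sketched in Python as follows. The raw moments use the standard gamma result that the k-th moment is the k-th power of the scale times Gamma(shape + k) / Gamma(shape); the function names and the example values (shape 2, scale 2) are chosen here for illustration.

```python
import math

def gamma_raw_moment(k, alpha, theta):
    """E(Y^k) = theta^k * Gamma(alpha + k) / Gamma(alpha), computed on the log scale."""
    return math.exp(k * math.log(theta) + math.lgamma(alpha + k) - math.lgamma(alpha))

def gamma_summary(alpha, theta):
    """Mean, variance, CV, skewness and excess kurtosis derived from raw moments."""
    m1, m2, m3, m4 = (gamma_raw_moment(k, alpha, theta) for k in (1, 2, 3, 4))
    variance = m2 - m1 ** 2
    sd = math.sqrt(variance)
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3                       # third central moment
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4    # fourth central moment
    return {
        "mean": m1,
        "variance": variance,
        "cv": sd / m1,
        "skewness": mu3 / sd ** 3,
        "excess_kurtosis": mu4 / sd ** 4 - 3,
    }

summary = gamma_summary(alpha=2.0, theta=2.0)
for name, value in summary.items():
    print(name, value)
```

For the gamma distribution these quantities simplify to the closed forms shape times scale (mean), shape times scale squared (variance), 1/sqrt(shape) (coefficient of variation), 2/sqrt(shape) (skewness) and 6/shape (excess kurtosis), so the printed values can be compared against them.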
The product of two gamma moment generating functions with the same scale parameter $\theta$ (or rate parameter $\beta$) is also an MGF for a gamma distribution. This points to the fact that the sum of two independent gamma random variables (with the same scale parameter or rate parameter) has a gamma distribution. Specifically, if $X_1$ follows a gamma distribution with shape parameter $\alpha_1$, $X_2$ follows a gamma distribution with shape parameter $\alpha_2$, both have the same scale parameter, and they are independent, then the sum $X_1 + X_2$ has a gamma distribution with shape parameter $\alpha_1 + \alpha_2$ and the same scale parameter.
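A quick simulation can make this additivity property concrete. The Python sketch below draws independent gamma variates with a common scale using `random.gammavariate` (whose second argument is the scale parameter), adds them, and compares the sample mean and variance of the sum with those of the gamma distribution whose shape is the sum of the two shapes. The seed, sample size and parameter values are arbitrary choices for illustration, and matching two moments is only a sanity check, not a proof.

```python
import random

random.seed(12345)

theta = 2.0                  # common scale parameter
alpha_1, alpha_2 = 2.0, 3.0  # the two shape parameters
n = 100_000

# Independent sum of a gamma(alpha_1, theta) and a gamma(alpha_2, theta) variate.
sums = [
    random.gammavariate(alpha_1, theta) + random.gammavariate(alpha_2, theta)
    for _ in range(n)
]

sample_mean = sum(sums) / n
sample_var = sum((s - sample_mean) ** 2 for s in sums) / (n - 1)

# Mean and variance of the gamma distribution with shape alpha_1 + alpha_2 and scale theta.
expected_mean = (alpha_1 + alpha_2) * theta
expected_var = (alpha_1 + alpha_2) * theta ** 2

print(sample_mean, expected_mean)
print(sample_var, expected_var)
```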
We now revisit Figure 1 and Figure 2. The skewness of a gamma distribution is driven only by the shape parameter $\alpha$. The gamma skewness is $2 / \sqrt{\alpha}$. The higher the $\alpha$, the less skewed the gamma distribution is (or the more symmetric it looks). This is borne out by Figure 1. There is another angle, via the central limit theorem, that is also borne out by Figure 1. A gamma distribution with a larger value of $\alpha$ can be thought of as the independent sum of many gamma distributions with smaller values of $\alpha$. For example, the gamma distribution with $\alpha = 10$ and $\theta = 2$ can be regarded as the independent sum of 10 exponential distributions each with mean 2. By the central limit theorem, any gamma distribution with a large value of $\alpha$ will tend to look symmetric.
In Figure 2, all gamma densities have the same $\alpha = 2$. Thus they all have the same skewness ($2 / \sqrt{2} \approx 1.414$). It is clear that as the scale parameter $\theta$ increases, the densities become more spread out while remaining skewed density curves.
The support of the gamma distribution is the interval $(0, \infty)$. Thus it is a plausible model for random quantities that take on positive values, e.g. insurance losses or insurance claim amounts. With the gamma density curves being positively skewed (skewed to the right), the gamma distribution is a good candidate for random quantities that are concentrated more on the lower end of the interval $(0, \infty)$.
Though the gamma distribution is positively skewed, it is considered to have a light (right) tail. The notion of having a light tail or heavy tail is a relative concept. The gamma distribution has a light right tail as compared to the Pareto distribution. The Pareto distribution puts significantly more probability on larger values (a gamma distribution with the same mean and variance will put significantly less probability on the larger values). In terms of modeling insurance losses, the gamma distribution is a more suitable model for losses that are not catastrophic in nature.
One telltale sign of a distribution with a light tail is that all positive moments exist. For the gamma distribution, $E(Y^k)$ exists for all positive integers $k$. In fact, the gamma distribution has a property stronger than the mere fact that all moments exist, i.e. it has a moment generating function. So even though the gamma distribution has a right tail that is infinitely long (it extends out to infinity), the amount of probability is almost negligible after some limit (as compared to the Pareto distribution for example). See here for a more detailed discussion on tail weights and the Pareto distribution.
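The difference in tail weight can be seen in a small numerical comparison. The Python sketch below assumes the Pareto Type II (Lomax) form with survival function (lambda / (y + lambda))^a, which may differ from the Pareto form used elsewhere in this blog. It matches a gamma distribution with shape 1/2 and scale 4 against a Pareto with a = 4 and lambda = 6, so that both have mean 2 and variance 8. The shape 1/2 is chosen because a Pareto of this form with finite variance has a coefficient of variation above 1, and because the gamma survival function then reduces to the complementary error function. All names and parameter values are illustrative assumptions.

```python
import math

def gamma_half_sf(y, theta):
    """P(Y > y) for a gamma variable with shape 1/2 and scale theta.

    With shape 1/2, 2Y/theta is chi-square with 1 degree of freedom,
    so the survival function is erfc(sqrt(y / theta)).
    """
    return math.erfc(math.sqrt(y / theta))

def pareto2_sf(y, a, lam):
    """P(Y > y) for a Pareto Type II (Lomax) variable with shape a and scale lam."""
    return (lam / (y + lam)) ** a

theta = 4.0          # gamma: mean 0.5 * 4 = 2, variance 0.5 * 16 = 8
a, lam = 4.0, 6.0    # Pareto: mean 6 / 3 = 2, variance 36 * 4 / (9 * 2) = 8

for y in (10.0, 20.0, 50.0, 100.0):
    g = gamma_half_sf(y, theta)
    p = pareto2_sf(y, a, lam)
    print(y, g, p, p / g)
```

The ratio in the last column grows rapidly with y, which is the light tail versus heavy tail contrast in numerical form. Note also that this Pareto has no finite fourth moment, while every moment of the gamma distribution exists.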
The next post discusses how the gamma distribution can arise naturally as the waiting time between two events in a Poisson process.
Evaluating the Gamma Function
In many calculations discussed in this blog, it is necessary at times to evaluate the gamma function. Of course, if the argument is a positive integer $n$, the gamma function reduces to a factorial: $\Gamma(n) = (n-1)!$. Some special values of the gamma function are:
$$\Gamma(1) = 1, \quad \Gamma(1/2) = \sqrt{\pi}, \quad \Gamma(3/2) = \frac{\sqrt{\pi}}{2}$$
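If special values are not known, it is possible to use software to evaluate the gamma function. We demonstrate how it is done using Excel. Older versions of Excel have no dedicated function for evaluating the gamma function (Excel 2013 and later provide GAMMA, and EXP(GAMMALN(...)) is another route). However, Excel has a function for the PDF of a gamma distribution. To evaluate $\Gamma(\alpha)$, consider the gamma density function with shape parameter $\alpha$ and scale parameter $\theta = 1$, evaluated at $x = 1$. We have the following:
$$f(1) = \frac{1}{\Gamma(\alpha)} \, 1^{\alpha-1} e^{-1} = \frac{e^{-1}}{\Gamma(\alpha)}, \quad \text{so that} \quad \Gamma(\alpha) = \frac{e^{-1}}{f(1)}$$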
The key to evaluating $\Gamma(\alpha)$ in Excel is to evaluate the gamma PDF with shape parameter $\alpha$ and scale parameter $\theta = 1$ at the x-value of 1. The following shows the formula for evaluating the PDF, where alpha stands for the numeric value of $\alpha$.
=GAMMADIST(1, alpha, 1, FALSE)
Then the gamma function can be evaluated by evaluating the following formula in Excel:
=EXP(-1) / GAMMADIST(1, alpha, 1, FALSE)
For example, $\Gamma(3.6) \approx 3.717$, which is obtained by the following formula in Excel:
=EXP(-1) / GAMMADIST(1, 3.6, 1, FALSE)
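For readers working outside Excel, the same identity can be reproduced in Python. The sketch below mirrors the spreadsheet formula by dividing e^(-1) by the gamma density (shape alpha, scale 1) at x = 1, and compares the result with the standard library's `math.gamma`. The function name `gamma_via_pdf` is hypothetical, and the density is computed from `math.lgamma`, so this only illustrates the identity rather than providing an independent evaluation.

```python
import math

def gamma_via_pdf(alpha):
    """Recover Gamma(alpha) from the gamma density (shape alpha, scale 1) at x = 1."""
    # At x = 1 the density is 1^(alpha-1) * e^(-1) / Gamma(alpha).
    pdf_at_1 = math.exp(-1.0 - math.lgamma(alpha))
    return math.exp(-1.0) / pdf_at_1

print(gamma_via_pdf(3.6), math.gamma(3.6))

# Two special values: Gamma(1/2) = sqrt(pi) and Gamma(5) = 4!.
print(math.gamma(0.5), math.sqrt(math.pi))
print(math.gamma(5), math.factorial(4))
```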