# Introducing the gamma distribution

The gamma distribution is a probability distribution that is useful in actuarial modeling. Due to its mathematical properties, there is considerable flexibility in the modeling process. For example, since it has two parameters (a scale parameter and a shape parameter), the gamma distribution is capable of representing a variety of distribution shapes and dispersion patterns. This post gives an account of how the distribution arises mathematically and discusses some of its mathematical properties. The next post discusses how the gamma distribution can arise naturally as the waiting time between two events in a Poisson process.

_______________________________________________________________________________________________

The Gamma Function

From a mathematical point of view, the place to start in defining the gamma distribution is the gamma function. For any real number $\alpha>0$, define:

$\displaystyle \Gamma(\alpha)=\int_0^\infty t^{\alpha-1} \ e^{-t} \ dt \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (0)$

The above improper integral converges for every positive $\alpha$. The proof that the improper integral converges and other basic facts can be found here.
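As a quick numerical sanity check, the integral in $(0)$ can be approximated directly. The sketch below (in Python, with an arbitrary truncation of the upper limit at 60) checks the approximation against the known value $\Gamma(5)=4!=24$:

```python
import math

# Approximate the gamma function by truncating the improper integral at a
# large upper limit and applying the midpoint rule.
def gamma_numeric(alpha, upper=60.0, steps=200_000):
    dt = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt  # midpoint of each subinterval
        total += t ** (alpha - 1) * math.exp(-t) * dt
    return total

# Gamma(5) should equal 4! = 24; math.gamma gives the library value.
print(gamma_numeric(5))  # close to 24
print(math.gamma(5))     # 24.0
```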

When the integral in $(0)$ has “incomplete” limits, the resulting functions are called incomplete gamma functions. The following are called the upper incomplete gamma function and lower incomplete gamma function, respectively.

$\displaystyle \Gamma(\alpha, x)=\int_x^\infty t^{\alpha-1} \ e^{-t} \ dt \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (1)$

$\displaystyle \gamma(\alpha, x)=\int_0^x t^{\alpha-1} \ e^{-t} \ dt \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (2)$
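The incomplete gamma functions are available in scientific libraries. A small sketch, assuming SciPy is installed; note that SciPy's `gammainc` and `gammaincc` return the regularized versions, i.e. already divided by $\Gamma(\alpha)$:

```python
from scipy.special import gamma, gammainc, gammaincc  # assumes SciPy is installed

a, x = 2.5, 3.0
# gammainc/gammaincc are regularized, so multiply by Gamma(a) to recover
# the unregularized incomplete gamma functions in (1) and (2).
lower = gammainc(a, x) * gamma(a)    # lower incomplete gamma, equation (2)
upper = gammaincc(a, x) * gamma(a)   # upper incomplete gamma, equation (1)
print(lower + upper)   # the two pieces add back to Gamma(2.5)
print(gamma(a))
```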

_______________________________________________________________________________________________

The Gamma Probability Density Function

Notice that the integrand in $(0)$ is positive for every $t>0$. Thus the integrand $t^{\alpha-1} \ e^{-t}$ becomes a density function once it is normalized by $\Gamma(\alpha)$.

$\displaystyle f_T(t)=\frac{1}{\Gamma(\alpha)} \ t^{\alpha-1} \ e^{-t} \ \ \ \ \ \ \ t>0 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (3)$

The function in $(3)$ is a probability density function since its integral over the interval $(0,\infty)$ is one. For convenience, let $T$ be the random variable having this density function. The density function in $(3)$ has only one parameter, the shape parameter $\alpha$. To add a second parameter, transform the random variable $T$ by multiplying it by a constant. This can be done in two ways. The following are the probability density functions for the random variables $X=\theta T$ and $Y=\frac{1}{\beta} \ T$, respectively.

$\displaystyle f_X(x)=\frac{1}{\Gamma(\alpha)} \ \biggl(\frac{1}{\theta}\biggr)^\alpha \ x^{\alpha-1} \ e^{-\frac{x}{\theta}} \ \ \ \ \ \ \ x>0 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (4a)$

$\displaystyle f_Y(y)=\frac{1}{\Gamma(\alpha)} \ \beta^\alpha \ y^{\alpha-1} \ e^{-\beta y} \ \ \ \ \ \ \ \ \ \ \ \ y>0 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (4b)$

A random variable $X$ is said to follow the gamma distribution with shape parameter $\alpha$ and scale parameter $\theta$ if $(4a)$ is its probability density function (pdf). A random variable $Y$ is said to follow the gamma distribution with shape parameter $\alpha$ and rate parameter $\beta$ if $(4b)$ is its pdf.
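To see the parametrization in action, the density $(4a)$ can be written out directly and checked against a library implementation. A sketch assuming SciPy is installed (SciPy uses the shape/scale parametrization; the parameter values are illustrative):

```python
import math
from scipy.stats import gamma as gamma_dist  # assumes SciPy is installed

alpha, theta = 2.0, 3.0
x = 4.0

# Density (4a) written out term by term, compared against SciPy's
# gamma.pdf with shape alpha and scale theta.
pdf_manual = (1 / math.gamma(alpha)) * (1 / theta) ** alpha \
             * x ** (alpha - 1) * math.exp(-x / theta)
pdf_scipy = gamma_dist.pdf(x, alpha, scale=theta)
print(pdf_manual, pdf_scipy)  # same value
```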

The parameter $\theta$ is called the scale parameter since the larger its value, the more spread out the distribution. The parameter $\beta$ is called the rate parameter and is defined as the reciprocal of the scale parameter.

The following figure (Figure 1) demonstrates the role of the shape parameter $\alpha$. With the scale parameter $\theta$ kept at 2, the gamma distribution becomes less skewed as $\alpha$ increases.

Figure 1

The following figure (Figure 2) demonstrates the role of the scale parameter $\theta$. With the shape parameter $\alpha$ kept at 2, all the gamma distributions have the same skewness. However, the gamma distributions become more spread out as $\theta$ increases.

Figure 2

There are several important subclasses of the gamma distribution. When the shape parameter $\alpha=1$, the gamma distribution becomes the exponential distribution with mean $\theta$ or $\frac{1}{\beta}$ depending on the parametrization. When the shape parameter $\alpha$ is a positive integer, the resulting subclass is called the Erlang distribution. A chi-square distribution is a gamma distribution with shape parameter $\alpha=\frac{k}{2}$ and scale parameter $\theta=2$, where $k$ is a positive integer (the degrees of freedom). The gamma density curves in Figure 1 are chi-square distributions; their degrees of freedom are 2, 4, 6, 10 and 20. A chi-square distribution with 2 degrees of freedom is an exponential distribution with mean 2.
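The chi-square connection can be verified numerically. A sketch assuming SciPy is installed (the degrees of freedom and the evaluation point are illustrative):

```python
from scipy.stats import chi2, gamma  # assumes SciPy is installed

k = 6     # degrees of freedom
x = 5.0
# A chi-square distribution with k degrees of freedom is a gamma
# distribution with shape k/2 and scale 2.
p_chi2 = chi2.pdf(x, k)
p_gamma = gamma.pdf(x, k / 2, scale=2)
print(p_chi2, p_gamma)  # same value
```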

Between the two parametrizations presented here, the version with the scale parameter is the more appropriate model in settings where a parameter is needed to describe the magnitude of the mean and the spread. The parametrization with $\alpha$ and $\beta$ is sometimes easier to work with. For example, it is more common in Bayesian analysis, where the gamma distribution can be used as a conjugate prior distribution for a parameter that is a rate (e.g. the rate parameter of a Poisson distribution).

_______________________________________________________________________________________________

Some Distributional Quantities

The gamma distribution is a two-parameter family of distributions. Here are some of the basic distributional quantities that are of interest.

_________________________________________
$X$ is a random variable with the gamma distribution with shape parameter $\alpha$ and scale parameter $\theta$.

$Y$ is a random variable with the gamma distribution with shape parameter $\alpha$ and rate parameter $\beta$.

_________________________________________
Probability Density Function (PDF)

$\displaystyle \begin{array}{ll} \displaystyle f_X(x)=\frac{1}{\Gamma(\alpha)} \ \biggl(\frac{1}{\theta}\biggr)^\alpha \ x^{\alpha-1} \ e^{-\frac{x}{\theta}} & \ \ \ \ \ \ \ \ \ \ x>0 \\ \text{ } & \text{ } \\ \displaystyle f_Y(y)=\frac{1}{\Gamma(\alpha)} \ \beta^\alpha \ y^{\alpha-1} \ e^{-\beta y} & \ \ \ \ \ \ \ \ \ \ \displaystyle y>0 \end{array}$

_________________________________________
Cumulative Distribution Function (CDF)

$\displaystyle \begin{aligned} F_X(x)&=\int_0^x \ \frac{1}{\Gamma(\alpha)} \ \biggl(\frac{1}{\theta}\biggr)^\alpha \ t^{\alpha-1} \ e^{-\frac{t}{\theta}} \ dt \\&\text{ } \\&=\frac{\gamma(\alpha, \frac{x}{\theta})}{\Gamma(\alpha)} \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (\text{Alternative formulation}) \end{aligned}$

$\displaystyle \begin{aligned} F_Y(y)&=\int_0^y \ \frac{1}{\Gamma(\alpha)} \ \beta^\alpha \ t^{\alpha-1} \ e^{-\beta t} \ dt \\&\text{ } \\&=\frac{\gamma(\alpha, \beta y)}{\Gamma(\alpha)} \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (\text{Alternative formulation}) \end{aligned}$

_________________________________________
Mean and Variance

$\displaystyle \begin{array}{ll} \displaystyle E(X)=\alpha \ \theta & \ \ \ \ \ \ \ \ \ \ Var(X)=\alpha \ \theta^2 \\ \text{ } & \text{ } \\ \displaystyle E(Y)=\frac{\alpha}{\beta} & \ \ \ \ \ \ \ \ \ \ \displaystyle Var(Y)=\frac{\alpha}{\beta^2} \end{array}$

_________________________________________
Higher Moments

$\displaystyle E(X^k) = \left\{ \begin{array}{ll} \displaystyle \frac{\theta^k \ \Gamma(\alpha+k)}{\Gamma(\alpha)} &\ \ \ \ \ \ k>-\alpha \\ \text{ } & \text{ } \\ \theta^k \ \alpha (\alpha+1) \cdots (\alpha+k-1) &\ \ \ \ \ \ k \text{ is a positive integer} \end{array} \right.$

$\displaystyle E(Y^k) = \left\{ \begin{array}{ll} \displaystyle \frac{\Gamma(\alpha+k)}{\beta^k \ \Gamma(\alpha)} &\ \ \ \ \ \ k>-\alpha \\ \text{ } & \text{ } \\ \displaystyle \frac{\alpha (\alpha+1) \cdots (\alpha+k-1)}{\beta^k} &\ \ \ \ \ \ k \text{ is a positive integer} \end{array} \right.$

_________________________________________
Moment Generating Function

$\displaystyle \begin{array}{ll} \displaystyle M_X(t)=\biggl(\frac{1}{1-\theta t} \biggr)^\alpha & \ \ \ \ \ \displaystyle t<\frac{1}{\theta} \\ \text{ } & \text{ } \\ \displaystyle M_Y(t)=\biggl(\frac{\beta}{\beta- t} \biggr)^\alpha & \ \ \ \ \ \displaystyle t<\beta \end{array}$

_________________________________________
Mode

$\displaystyle \begin{array}{ll} \displaystyle \theta \ (\alpha-1) & \ \ \ \ \ \alpha>1 \ \text{else } 0 \\ \text{ } & \text{ } \\ \displaystyle \frac{\alpha-1}{\beta} & \ \ \ \ \ \alpha>1 \ \text{else } 0 \end{array}$

_________________________________________
Coefficient of Variation

$\displaystyle CV = \frac{1}{\sqrt{\alpha}}$

_________________________________________
Coefficient of Skewness

$\displaystyle \gamma = \frac{2}{\sqrt{\alpha}}$

_________________________________________
Kurtosis

$\displaystyle \begin{array}{ll} \displaystyle 3+\frac{6}{\alpha} & \ \ \ \ \ \text{Kurtosis} \\ \text{ } & \text{ } \\ \displaystyle \frac{6}{\alpha} & \ \ \ \ \ \text{Excess Kurtosis} \end{array}$

There is no simple closed form for the cumulative distribution function, except for the case of $\alpha=1$ (i.e. the exponential distribution) and the case of $\alpha$ being a positive integer (see next post). As a result, the distributional quantities that require solving for $x$ in the CDF, e.g. the median and other percentiles, have no closed form either. As stated above, the CDF can be expressed using the incomplete gamma function, which can be evaluated numerically. For the distributional quantities with no closed form, use numerical estimation or software.
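For example, assuming SciPy is installed, the incomplete-gamma expression for the CDF and the numerical inversion for the median can be checked as follows (the parameter values are illustrative):

```python
from scipy.special import gammainc   # regularized lower incomplete gamma
from scipy.stats import gamma        # assumes SciPy is installed

alpha, theta = 2.5, 2.0
x = 3.0
# The CDF equals the regularized lower incomplete gamma at x/theta.
print(gamma.cdf(x, alpha, scale=theta))
print(gammainc(alpha, x / theta))    # same value

# Percentiles have no closed form; invert the CDF numerically with ppf.
median = gamma.ppf(0.5, alpha, scale=theta)
print(gamma.cdf(median, alpha, scale=theta))  # 0.5 by construction
```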

The calculation of some of the distributional quantities is quite straightforward. For example, to calculate any higher moment $E(X^k)$, simply adjust the integrand to be an appropriate gamma density function. The factors moved outside the integral then give the result, as shown in the following.

$\displaystyle \begin{aligned}E(X^k)&=\int_0^\infty x^k \cdot \frac{1}{\Gamma(\alpha)} \ \biggl(\frac{1}{\theta}\biggr)^\alpha \ x^{\alpha-1} \ e^{-\frac{x}{\theta}} \ dx=\int_0^\infty \frac{1}{\Gamma(\alpha)} \ \biggl(\frac{1}{\theta}\biggr)^\alpha \ x^{\alpha+k-1} \ e^{-\frac{x}{\theta}} \ dx \\&=\frac{\theta^k \Gamma(\alpha+k)}{\Gamma(\alpha)} \int_0^\infty \frac{1}{\Gamma(\alpha+k)} \ \biggl(\frac{1}{\theta}\biggr)^{\alpha+k} \ x^{\alpha+k-1} \ e^{-\frac{x}{\theta}} \ dx \\&=\frac{\theta^k \Gamma(\alpha+k)}{\Gamma(\alpha)} \\&=\theta^k \ \alpha (\alpha+1) \cdots (\alpha+k-1) \end{aligned}$
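The closed form for the higher moments can be checked against a numerical computation. A sketch assuming SciPy is installed (the parameter values are illustrative):

```python
from scipy.stats import gamma  # assumes SciPy is installed

alpha, theta, k = 2.0, 3.0, 3
# Closed form from the derivation: theta^k * alpha*(alpha+1)*...*(alpha+k-1)
closed_form = theta ** k
for j in range(k):
    closed_form *= alpha + j
print(closed_form)                          # 27 * (2*3*4) = 648
print(gamma.moment(k, alpha, scale=theta))  # same value
```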

Once the higher moments are known, some of the other calculations follow. For example, the coefficient of variation is defined as the ratio of the standard deviation to the mean; this ratio is a standardized measure of dispersion of a probability distribution. The coefficient of skewness is the ratio of the third central moment to the third power of the standard deviation, i.e. $E[(X-\mu)^3]/\sigma^3$. See here for a discussion on skewness. The kurtosis is the ratio of the fourth central moment to the fourth power of the standard deviation, i.e. $E[(X-\mu)^4]/\sigma^4$. The excess kurtosis is obtained by subtracting 3 from the kurtosis.
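These ratios can be checked against library values. A sketch assuming SciPy is installed (note that SciPy reports excess kurtosis; the parameter values are illustrative):

```python
from scipy.stats import gamma  # assumes SciPy is installed

alpha, theta = 4.0, 2.0
mean, var, skew, ex_kurt = gamma.stats(alpha, scale=theta, moments='mvsk')
# Closed forms: CV = 1/sqrt(alpha), skewness = 2/sqrt(alpha),
# excess kurtosis = 6/alpha.
print(float(var) ** 0.5 / float(mean))  # 1/sqrt(4) = 0.5
print(float(skew))                      # 2/sqrt(4) = 1.0
print(float(ex_kurt))                   # 6/4 = 1.5
```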

_______________________________________________________________________________________________

Discussion

The product of two gamma moment generating functions with the same scale parameter $\theta$ (or rate parameter $\beta$) is also the MGF of a gamma distribution. This points to the fact that the independent sum of two gamma distributions (with the same scale parameter or rate parameter) is a gamma distribution. Specifically, if $X_1$ follows a gamma distribution with shape parameter $\alpha_1$ and $X_2$ follows a gamma distribution with shape parameter $\alpha_2$, both with the same scale parameter, and they are independent, then the sum $X_1+X_2$ has a gamma distribution with shape parameter $\alpha_1+\alpha_2$ and the same scale parameter.
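A quick simulation illustrates this additivity using only Python's standard library (the parameter values and sample size are illustrative):

```python
import random

random.seed(0)
alpha1, alpha2, theta = 2.0, 3.0, 1.5
n = 200_000
# Simulate X1 + X2 with a common scale theta; the sample mean should match
# the mean of a gamma distribution with shape alpha1 + alpha2 and scale
# theta, namely (alpha1 + alpha2) * theta = 7.5.
sums = [random.gammavariate(alpha1, theta) + random.gammavariate(alpha2, theta)
        for _ in range(n)]
sample_mean = sum(sums) / n
print(sample_mean)  # close to 7.5
```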

We now revisit Figure 1 and Figure 2. The skewness of a gamma distribution is driven only by the shape parameter $\alpha$. The gamma skewness is $\frac{2}{\sqrt{\alpha}}$. The higher the $\alpha$, the less skewed the gamma distribution is (or the more symmetric it looks). This is borne out by Figure 1. There is another angle, via the central limit theorem, that is also borne out by Figure 1. A gamma density with a larger value of $\alpha$ can be thought of as the independent sum of many gamma distributions with smaller $\alpha$ values. For example, the gamma distribution with $\alpha=10$ and $\theta=2$ can be regarded as the independent sum of 10 exponential distributions, each with mean 2. By the central limit theorem, any gamma distribution with a large value of $\alpha$ will tend to look symmetric.

In Figure 2, all gamma densities have the same $\alpha=2$. Thus they all have the same skewness ($\frac{2}{\sqrt{2}} \approx 1.414$). It is clear that as the scale parameter $\theta$ increases, the densities become more spread out while remaining skewed.

The support of the gamma distribution is the interval $(0,\infty)$. Thus it is a plausible model for random quantities that take on positive values, e.g. insurance losses or insurance claim amounts. With the gamma density curves being positively skewed (skewed to the right), the gamma distribution is a good candidate for random quantities that are concentrated on the lower end of the interval $(0,\infty)$.

Though the gamma distribution is positively skewed, it is considered to have a light (right) tail. The notion of having a light tail or a heavy tail is a relative concept. The gamma distribution has a light right tail as compared to the Pareto distribution. The Pareto distribution puts significantly more probability on larger values (a gamma distribution with the same mean and variance puts significantly less probability on the larger values). In terms of modeling insurance losses, the gamma distribution is a more suitable model for losses that are not catastrophic in nature.

One telltale sign of a distribution with a light tail is that all positive moments exist. For the gamma distribution, $E(X^k)$ exists for every positive integer $k$. In fact, the gamma distribution has a property stronger than the mere existence of all moments: it has a moment generating function. So even though the gamma distribution has a right tail that is infinitely long (it extends out to infinity), the probability beyond some point becomes almost negligible (as compared to the Pareto distribution, for example). See here for a more detailed discussion on tail weights and the Pareto distribution.
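The contrast in tail weight can be made concrete with survival functions. The sketch below compares an exponential distribution (gamma with $\alpha=1$, $\theta=2$, mean 2) against a Pareto Type II (Lomax) distribution chosen to have the same mean of 2; the specific Lomax parameters are illustrative:

```python
import math

# Compare tail probabilities P(X > 20): exponential (gamma with alpha = 1,
# theta = 2) versus Lomax with shape 3 and scale 4. Both have mean 2, but
# the Pareto-type tail is far heavier.
x = 20.0
gamma_tail = math.exp(-x / 2)        # exponential survival function
pareto_tail = (1 + x / 4) ** (-3)    # Lomax survival function
print(gamma_tail)    # about 4.5e-5
print(pareto_tail)   # about 4.6e-3, roughly 100 times larger
```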

The next post discusses how the gamma distribution can arise naturally as the waiting time between two events in a Poisson process.

_______________________________________________________________________________________________

Evaluating the Gamma Function

In many calculations discussed in this blog, it is at times necessary to evaluate the gamma function. If the argument is a positive integer $n$, the gamma function reduces to the factorial: $\Gamma(n)=(n-1)!$. Some special values of the gamma function are:

$\Gamma(\frac{1}{2})=\sqrt{\pi}$

$\Gamma(\frac{3}{2})=\frac{1}{2} \sqrt{\pi}$

$\Gamma(\frac{5}{2})=\frac{3}{4} \sqrt{\pi}$

$\Gamma(\frac{7}{2})=\frac{15}{8} \sqrt{\pi}$
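These special values follow from $\Gamma(\frac{1}{2})=\sqrt{\pi}$ and the recursion $\Gamma(\alpha+1)=\alpha \ \Gamma(\alpha)$, and can be verified with Python's standard library:

```python
import math

# Verify the half-integer values against math.gamma, which evaluates the
# gamma function directly.
print(math.gamma(0.5), math.sqrt(math.pi))           # equal
print(math.gamma(1.5), 0.5 * math.sqrt(math.pi))     # equal
print(math.gamma(3.5), (15 / 8) * math.sqrt(math.pi))  # equal
```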

If special values are not known, it is possible to use software to evaluate the gamma function. We demonstrate how it is done using Excel. While Excel has no dedicated function that returns the gamma function directly, it does provide GAMMALN, the natural logarithm of the gamma function, so that $\Gamma(\alpha)$ can be computed as EXP(GAMMALN($\alpha$)). Another route, demonstrated here, uses Excel's function for the PDF of a gamma distribution. To evaluate $\Gamma(\alpha)$, consider the density function in $(4b)$ with parameters $\alpha$ and $\beta=1$, evaluated at $y=1$. We have the following:

$\displaystyle f_Y(1)=\frac{1}{\Gamma(\alpha)} \ 1^\alpha \ 1^{\alpha-1} \ e^{- 1}=\frac{e^{-1}}{\Gamma(\alpha)}$

$\displaystyle \Gamma(\alpha)=\frac{e^{-1}}{f_Y(1)}$

The key to evaluating $\Gamma(\alpha)$ in Excel is to evaluate the gamma PDF with shape $\alpha$ and rate $\beta=1$ at the x-value of 1. The following formula evaluates the PDF.

=GAMMADIST(1, $\alpha$, 1, FALSE)

Then the gamma function can be evaluated by evaluating the following formula in Excel:

=EXP(-1) / GAMMADIST(1, $\alpha$, 1, FALSE)

For example, $\Gamma(3.6)=3.717023853$ which is obtained by the following formula in Excel:

=EXP(-1) / GAMMADIST(1, 3.6, 1, FALSE)
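The same trick can be replicated outside of Excel as a sanity check. A sketch assuming SciPy is installed, compared against `math.gamma`:

```python
import math
from scipy.stats import gamma  # assumes SciPy is installed

# Replicate the Excel trick: evaluate the gamma PDF with rate 1 at y = 1,
# then Gamma(alpha) = exp(-1) / pdf. Checked against math.gamma.
alpha = 3.6
pdf_at_1 = gamma.pdf(1.0, alpha, scale=1.0)
print(math.exp(-1) / pdf_at_1)  # about 3.717023853
print(math.gamma(alpha))        # same value
```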

_______________________________________________________________________________________________
$\copyright \ 2016 - \text{Dan Ma}$