# The big 3 claim frequency models

We now turn our attention to discrete distributions, in particular counting distributions. These are the discrete distributions that place positive probability only on the non-negative integers 0, 1, 2, 3, … One important application is finding suitable counting distributions for modeling the number of losses or claims to an insurer, or more generally the number of other random events of interest in actuarial applications. From a claims perspective, these counting distributions are models for claim frequency. Combining frequency models with models for claim severity provides a more complete picture of the insurer's risk exposure than claim severity alone. This post and several subsequent posts are preparation for the discussion on modeling aggregate losses and claims. Another upcoming topic is the effect of insurance coverage modifications (e.g. deductibles) on claim frequency and claim severity.

This post focuses on the three commonly used counting distributions – Poisson, binomial and negative binomial (the big 3). These three distributions are the basis for defining a large class of other counting distributions.

## Probability Generating Function

Let $Y$ be a random variable with positive probabilities only on the non-negative integers, i.e. $P(Y=k)$ is positive only for $k=0,1,2,\cdots$. The function $P(Y=k)$ is the probability of the occurrence of the event $Y=k$, i.e. the observed value of the random variable $Y$ is $k$. It is called the probability function of the random variable $Y$ (also called probability mass function). From the probability function, many other distributional quantities can be derived, e.g. mean, variance and higher moments.

We can also elicit information about $Y$ from its generating function. The generating function (or probability generating function) of $Y$ is defined by:

$\displaystyle P_Y(z)=p_0+p_1 z +p_2 z^2 + \cdots=\sum \limits_{j=0}^\infty p_j z^j$

where each $p_j=P(Y=j)$. The generating function $P_Y(z)$ is defined wherever the infinite sum converges. At minimum, $P_Y(z)$ converges for $\lvert z \rvert \le 1$. For some distributions, $P_Y(z)$ converges for all real $z$, e.g. when $Y$ has a Poisson distribution (see below).

One reason for paying attention to the generating function is that the moments of $Y$ can be generated from $P_Y(z)$. Taking the $n$th derivative of $P_Y(z)$ and evaluating it at $z=1$ yields the following expectation.

$\displaystyle E[Y (Y-1) (Y-2) \cdots (Y-(n-1))]=P_Y^{(n)}(1)$

The above expectation is called the $n$th factorial moment. It follows that $E(Y)=P_{Y}^{(1)}(1)$. Since $E[Y (Y-1)]=P_Y^{(2)}(1)$, the second moment is $E(Y^2)=P_{Y}^{(1)}(1)+P_Y^{(2)}(1)$. In general, the $n$th moment $E(Y^n)$ can be expressed in terms of $P_Y^{(k)}(1)$ for $k=1,2,\cdots,n$.
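As a quick numeric illustration (with made-up probabilities), note that for a distribution with finite support the generating function is a polynomial, so its derivatives at $z=1$ are exact finite sums:

```python
from math import isclose

# A small illustrative distribution on {0, 1, 2}.
p = [0.2, 0.5, 0.3]

# The generating function is the polynomial P(z) = 0.2 + 0.5 z + 0.3 z^2,
# so its derivatives at z = 1 are exact finite sums.
P1 = sum(j * p[j] for j in range(3))            # P'(1)  = E(Y)
P2 = sum(j * (j - 1) * p[j] for j in range(3))  # P''(1) = E[Y(Y-1)]

EY = P1            # first moment E(Y)
EY2 = P1 + P2      # second moment E(Y^2) = P'(1) + P''(1)

# Compare against the direct sums k * p_k and k^2 * p_k.
assert isclose(EY, 0.5 * 1 + 0.3 * 2)
assert isclose(EY2, 0.5 * 1 + 0.3 * 4)
```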

Another application of the generating function is that $P_Y(z)$ encodes the probability function $P(Y=k)$, which is obtained by taking the derivatives of $P_Y(z)$ and evaluating them at $z=0$.

$\displaystyle P(Y=n)=\frac{P_{Y}^{(n)}(0)}{n!}$

where $n=0,1,2,3,\cdots$. Another useful property of the generating function is that the probability distribution of a random variable is uniquely determined by its generating function. This fundamental property is useful in determining the distribution of an independent sum. The generating function of an independent sum of random variables is simply the product of the individual generating functions. If the product is the generating function of a certain distribution, then the independent sum must have that distribution.

For a more detailed discussion on probability generating function, see this blog post in a companion blog.

## Poisson Distribution

We now describe the three counting distributions indicated at the beginning of the post. We start with the Poisson distribution. Consider a random variable $X$ that only takes on the non-negative integers. For each $k=0,1,2,\cdots$, let $p_k=P(X=k)$.

The random variable $X$ has a Poisson distribution if its probability function is:

$\displaystyle p_k=\frac{e^{-\lambda} \lambda^k}{k!} \ \ \ \ \ \ k=0,1,2,\cdots$

for some positive constant $\lambda$. This constant $\lambda$ is the parameter of the Poisson distribution in question. It is also the mean and variance of the Poisson distribution. The following is the probability generating function of the Poisson distribution.

$\displaystyle P_X(z)=e^{\lambda \ (z-1)}$

The Poisson generating function is defined for all real numbers $z$. The mean, variance and higher moments can be computed using the generating function.

$E(X)=P_X^{(1)}(1)=\lambda$

$E[X (X-1)]=P_X^{(2)}(1)=\lambda^2$

$E[X^2]=P_X^{(1)}(1)+P_X^{(2)}(1)=\lambda+\lambda^2$

$Var(X)=E(X^2)-E(X)^2=\lambda$
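These identities can be verified numerically. The sketch below (illustrative parameter $\lambda=2.5$) computes the mean and variance from the Poisson probability function and checks that both equal $\lambda$:

```python
from math import exp, isclose

lam = 2.5  # illustrative Poisson parameter

# P(X = k) computed recursively: P(X = 0) = e^(-lam) and
# P(X = k) = P(X = k - 1) * lam / k.  Truncating the support at
# k = 99 leaves a negligible tail for this lam.
p = exp(-lam)
mean = 0.0
second = 0.0
for k in range(1, 100):
    p *= lam / k
    mean += k * p
    second += k * k * p
var = second - mean ** 2

assert isclose(mean, lam, abs_tol=1e-9)   # E(X) = lam
assert isclose(var, lam, abs_tol=1e-9)    # Var(X) = lam
```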

One interesting characteristic of the Poisson distribution is that its mean is the same as its variance. From a mathematical standpoint, the Poisson distribution arises from the Poisson process (see a more detailed discussion here). Another discussion of the Poisson distribution is found here.

One useful characteristic of the Poisson distribution is that combining independent Poisson distributions results in another Poisson distribution. Suppose that $X_1,X_2,\cdots,X_n$ are independent Poisson random variables with means $\lambda_1,\lambda_2,\cdots,\lambda_n$, respectively. Then the probability generating function of the sum $X=X_1+X_2+\cdots +X_n$ is simply the product of the individual probability generating functions.

$\displaystyle P_X(z)=\prod \limits_{j=1}^n e^{\lambda_j (z-1)}=e^{\lambda \ (z-1)}$

where $\lambda=\lambda_1+\lambda_2+\cdots +\lambda_n$. The probability generating function of the sum $X$ is the generating function of a Poisson distribution. Thus an independent sum of Poisson distributions is a Poisson distribution with parameter equal to the sum of the individual Poisson parameters.
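The same fact can be checked directly from the probability functions: convolving two Poisson probability functions (illustrative parameters 1.2 and 0.8) reproduces the Poisson probability function with parameter 2.0:

```python
from math import exp, isclose

def poisson_pmf(lam, n):
    """First n probabilities P(X = 0), ..., P(X = n - 1) of a Poisson(lam)."""
    out = [exp(-lam)]
    for k in range(1, n):
        out.append(out[-1] * lam / k)
    return out

n = 40
p1 = poisson_pmf(1.2, n)  # illustrative parameters
p2 = poisson_pmf(0.8, n)

# Convolution: P(X1 + X2 = k) = sum over j of P(X1 = j) P(X2 = k - j)
conv = [sum(p1[j] * p2[k - j] for j in range(k + 1)) for k in range(n)]
target = poisson_pmf(2.0, n)  # Poisson with lam = 1.2 + 0.8

assert all(isclose(a, b, abs_tol=1e-12) for a, b in zip(conv, target))
```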

Another useful property is that of splitting a Poisson distribution. For example, suppose that the number of claims $N$ in a given year follows a Poisson distribution with mean $\lambda$ per year. Also suppose that the claims can be classified into $W$ distinct types such that the probability of a claim being of type $i$ is $p_i$, $i=1,2,\cdots, W$ and such that $p_1+\cdots+p_W=1$. If we are interested in studying the number $N_i$ of claims in a year that are of type $i$, $i=1,2,\cdots,W$, then $N_1,N_2,\cdots,N_W$ are independent Poisson random variables with means $\lambda p_1, \lambda p_2,\cdots,\lambda p_W$, respectively. For a mathematical discussion of this Poisson splitting phenomenon, see this blog post in a companion blog.
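The splitting result can be verified by conditioning on the total claim count $N$. The sketch below (illustrative values $\lambda = 3$ and probability $0.25$ for one claim type) checks that the resulting probabilities match a Poisson distribution with mean $\lambda \cdot 0.25$:

```python
from math import comb, exp, isclose

lam, p = 3.0, 0.25   # overall claim rate; probability a claim is of type 1

def poisson(lam, k):
    """P(X = k) for a Poisson(lam), computed recursively."""
    pk = exp(-lam)
    for j in range(1, k + 1):
        pk *= lam / j
    return pk

# P(N1 = k) by conditioning on the total claim count N:
# P(N1 = k) = sum over n of P(N = n) * C(n, k) p^k (1 - p)^(n - k).
# Truncating at 200 terms leaves a negligible tail.
def split_pmf(k, terms=200):
    return sum(poisson(lam, n) * comb(n, k) * p ** k * (1 - p) ** (n - k)
               for n in range(k, terms))

# The type-1 count is Poisson with mean lam * p = 0.75.
for k in range(10):
    assert isclose(split_pmf(k), poisson(lam * p, k), abs_tol=1e-10)
```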

## Binomial Distribution

Consider a series of independent trials, each of which results in one of two distinct outcomes (one called Success and the other Failure), in such a way that the probability of observing a Success is constant across all trials (these are called Bernoulli trials). For a binomial distribution, we observe $n$ such trials and count the number of Successes in these $n$ trials.

More specifically, let $p$ be the probability of observing a Success in a Bernoulli trial. Let $X$ be the number of Successes observed in $n$ independent trials. Then the random variable $X$ is said to have a binomial distribution with parameters $n$ and $p$.

Note that the random variable $X$ is the independent sum of $X_1,X_2,\cdots,X_n$ where $X_i$ is the number of Successes in the $i$th Bernoulli trial. Thus $X_i$ is 1 with probability $p$ and 0 with probability $1-p$. Its probability generating function would be:

$\displaystyle g(z)=(1-p) z^0 +p z^1=1-p+p z$

As a result, the probability generating function for $X$ is $g(z)$ raised to the $n$th power.

$\displaystyle P_X(z)=(1-p+p z)^n$

The generating function $P_X(z)$ is defined for all real values $z$. Differentiating $P_X(z)$ twice and evaluating at $z=1$ produces the mean and variance.

$E(X)=n p$

$E[X (X-1)]=n (n-1) p^2$

$Var(X)=n p (1-p)$

By differentiating $P_X(z)$ and evaluating at $z=0$, we obtain the probability function.

$\displaystyle P(X=k)=\frac{P_X^{(k)}(0)}{k!}=\binom{n}{k} p^k (1-p)^{n-k}$

where $k=0,1,2,\cdots,n$.
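A quick numeric check of these formulas (with illustrative parameters $n=10$, $p=0.3$):

```python
from math import comb, isclose

n, p = 10, 0.3  # illustrative parameters

# The binomial probability function P(X = k) = C(n, k) p^k (1-p)^(n-k).
pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum(k * k * pk for k, pk in enumerate(pmf)) - mean ** 2

assert isclose(sum(pmf), 1.0)            # probabilities sum to 1
assert isclose(mean, n * p)              # E(X) = np
assert isclose(var, n * p * (1 - p))     # Var(X) = np(1-p)
```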

By taking the product of probability generating functions, it follows that the independent sum $Y=Y_1+Y_2+\cdots+Y_m$, where each $Y_i$ has a binomial distribution with parameters $n_i$ and $p$, has a binomial distribution with parameters $n=n_1+\cdots+n_m$ and $p$. In other words, as long as the probability of success $p$ is identical in the binomial distributions, the independent sum is always a binomial distribution.

Note that the variance of the binomial distribution is less than the mean. Thus the binomial distribution is a suitable candidate for modeling claim frequency in situations where the sample variance is smaller than the sample mean.

## Negative Binomial Distribution

As mentioned above, the Poisson distribution requires that the mean and the variance are equal. The binomial distribution requires that the variance is smaller than the mean. Thus these two counting distributions are not appropriate in all cases. The negative binomial distribution is an excellent alternative to the Poisson and binomial distributions, especially in cases where the observed variance is greater than the observed mean.

The negative binomial distribution naturally arises from the same probability experiment that generates the binomial distribution. Consider a series of independent Bernoulli trials, each of which results in one of two distinct outcomes (called success and failure) in such a way that the probability of success $p$ is constant across the trials. Instead of observing the outcomes in a fixed number of trials, we now observe the trials until $r$ successes have occurred.

As we observe the Bernoulli trials, let $Y$ be the number of failures until the $r$th success has occurred. The random variable $Y$ has a negative binomial distribution with parameters $r$ and $p$. The parameter $r$ is necessarily a positive integer and the parameter $p$ is a real number between 0 and 1. The following is the probability function for the random variable $Y$.

$\displaystyle \begin{aligned} P(Y=k)&=\binom{r+k-1}{k} \ p^r (1-p)^k \\&=\frac{(r+k-1)!}{k! \ (r-1)!} \ p^r (1-p)^k \ \ \ \ \ \ k=0,1,2,\cdots \end{aligned}$

In the above probability function, the parameter $r$ must be a positive integer. The binomial coefficient $\binom{r+k-1}{k}$ is computed by its usual definition. The probability function can be relaxed so that $r$ can be any positive real number. The key to the relaxation is a reformulation of the binomial coefficient.

$\displaystyle \binom{n}{j}=\left\{ \begin{array}{ll} \displaystyle \frac{n (n-1) (n-2) \cdots (n-(j-1))}{j!} &\ n>j-1, j=1,2,3,\cdots \\ \text{ } & \text{ } \\ \displaystyle 1 &\ j=0 \end{array} \right.$

Note that in the above formulation, the $n$ in $\binom{n}{j}$ does not have to be an integer. If $n$ were to be a positive integer, the usual definition $\binom{n}{j}=\frac{n!}{j! (n-j)!}$ would lead to the same calculation. The reformulation is a generalization of the usual binomial coefficient definition.
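The reformulated coefficient is straightforward to compute. A minimal sketch (the function name `gen_binom` is just an illustration):

```python
from math import comb, isclose

def gen_binom(n, j):
    """Generalized binomial coefficient; n may be any real number."""
    if j == 0:
        return 1.0
    out = 1.0
    for i in range(j):
        out *= (n - i) / (i + 1)   # builds n(n-1)...(n-(j-1)) / j!
    return out

# Agrees with the usual definition when n is a positive integer ...
assert all(isclose(gen_binom(7, j), comb(7, j)) for j in range(8))

# ... and is also defined for non-integer n, e.g. C(r + k - 1, k)
# with r = 2.5 and k = 4, i.e. 5.5 * 4.5 * 3.5 * 2.5 / 4!.
assert isclose(gen_binom(2.5 + 4 - 1, 4), 5.5 * 4.5 * 3.5 * 2.5 / 24)
```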

With the new definition of binomial coefficient, the following is the probability function of the negative binomial distribution in the general case.

$\displaystyle P(Y=k)=\binom{r+k-1}{k} \ p^r (1-p)^k \ \ \ \ \ \ k=0,1,2,\cdots$

The following is the same probability function with the binomial coefficient explicitly written out.

$\displaystyle P(Y=k)=\left\{ \begin{array}{ll} \displaystyle p^r &\ k=0 \\ \text{ } & \text{ } \\ \displaystyle \frac{(k-1+r) \cdots (1+r) r}{k!} \ p^r \ (1-p)^k &\ k=1,2,\cdots \end{array} \right.$

For either of the above versions, the mean and variance are:

$\displaystyle E(Y)=\frac{r (1-p)}{p}$

$\displaystyle Var(Y)=\frac{r (1-p)}{p^2}$
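These formulas hold even when $r$ is not an integer. The sketch below (illustrative values $r=2.5$, $p=0.4$) accumulates the probability function recursively, using the ratio $P(Y=k)/P(Y=k-1)=(r+k-1)(1-p)/k$, and checks the mean and variance:

```python
from math import isclose

r, p = 2.5, 0.4  # r need not be an integer in the general version

# P(Y = 0) = p^r; thereafter P(Y = k) = P(Y = k-1) * (r + k - 1)(1 - p)/k.
# Truncating at k = 399 leaves a negligible tail for these parameters.
pk = p ** r
mean = 0.0
second = 0.0
for k in range(1, 400):
    pk *= (r + k - 1) * (1 - p) / k
    mean += k * pk
    second += k * k * pk

assert isclose(mean, r * (1 - p) / p, abs_tol=1e-9)             # E(Y)
assert isclose(second - mean ** 2, r * (1 - p) / p ** 2, abs_tol=1e-9)  # Var(Y)
```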

Another formulation of the negative binomial distribution is that it is a Poisson-gamma mixture. The following is the probability function.

$\displaystyle P(Y=k)=\left\{ \begin{array}{ll} \displaystyle \biggl(\frac{1}{\beta+1} \biggr)^r &\ k=0 \\ \text{ } & \text{ } \\ \displaystyle \frac{(k-1+r) \cdots (1+r) r}{k!} \ \biggl(\frac{1}{\beta+1} \biggr)^r \biggl(\frac{\beta}{\beta+1} \biggr)^k &\ k=1,2,\cdots \end{array} \right.$

It is still a 2-parameter discrete distribution. The parameters $r$ and $\beta$ originate from the parameters of the gamma distribution in the Poisson-gamma mixture. The mean and variance are:

$\displaystyle E(Y) = r \ \beta$

$\displaystyle Var(Y)=r \ \beta \ (1+\beta)$
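The two parameterizations are connected by $p = 1/(1+\beta)$, equivalently $\beta = (1-p)/p$; substituting this into the earlier formulas gives $E(Y)=r \beta$ and $Var(Y)=r \beta (1+\beta)$. A quick check with illustrative values:

```python
from math import isclose

# The (r, p) and (r, beta) versions are linked by p = 1/(1 + beta).
r, beta = 2.5, 1.5       # illustrative values
p = 1 / (1 + beta)       # = 0.4

# Means and variances agree across the two parameterizations.
assert isclose(r * (1 - p) / p, r * beta)                    # E(Y)
assert isclose(r * (1 - p) / p ** 2, r * beta * (1 + beta))  # Var(Y)

# The probability functions agree term by term as well, since
# 1/(beta + 1) = p and beta/(beta + 1) = 1 - p.
pk_p = p ** r                   # P(Y = 0) in the (r, p) version
pk_b = (1 / (beta + 1)) ** r    # P(Y = 0) in the (r, beta) version
for k in range(1, 20):
    pk_p *= (r + k - 1) * (1 - p) / k
    pk_b *= (r + k - 1) / k * beta / (beta + 1)
    assert isclose(pk_p, pk_b)
```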

The negative binomial distribution has been discussed at length in blog posts in several companion blogs. For the natural interpretation of negative binomial distribution based on counting the number of failures until the $r$th success, see this blog post. This is an excellent introduction.

For the general version of the negative binomial distribution where the parameter $r$ can be any positive real number, see this blog post.

For the version of negative binomial distribution from a Poisson-Gamma mixture point of view, see this blog post.

This blog post has additional facts about the negative binomial distribution. This blog post summarizes the various versions as well as focusing on the calculation of probabilities.

## More Counting Distributions

The three counting distributions – Poisson, binomial and negative binomial – provide a versatile tool kit for modeling the number of random events such as losses to the insured or claims to the insurer. The tool kit can be greatly expanded by modifying these three distributions to generate additional distributions. The new distributions belong to the (a,b,0) and (a,b,1) classes. This topic is discussed in subsequent posts.


$\copyright$ 2018 – Dan Ma