The (a,b,0) class

This post introduces the class of discrete distributions called the (a,b,0) class.

A counting distribution is the distribution of a discrete random variable that takes on the non-negative integer values 0,1,2,\cdots. Examples include the Poisson distribution, the binomial distribution and the negative binomial distribution. These distributions are potential models for the number of occurrences of some random event of interest, e.g. the number of losses in actuarial applications. The discussion below shows that the notion of the (a,b,0) class is another way to describe the big three counting distributions: Poisson, binomial and negative binomial. The (a,b,1) class is a generalization of the (a,b,0) class and is defined in a subsequent post.

The (a,b,0) Class

The (a,b,0) class is at heart a recursive algorithm for generating probabilities. Let's fix some notation. Let N be a counting random variable. For each k=0,1,2,3,\cdots, let P_k=P(N=k). The counting random variable N is said to be a member of the (a,b,0) class of distributions if, for some constants a and b, the following recursive relation holds:

    \displaystyle (1) \ \ \ \ \ \frac{P_k}{P_{k-1}}=a + \frac{b}{k} \ \ \ \ \ \ \ \ \ \ \ \ \ k=1,2,3,\cdots

Note that the recursive relation (1) generates all the probabilities P_k for all integers k starting at 1. The relation (1) does not account for P_0. Does that mean that the initial probability P_0 can be any arbitrary probability value? Note that the recursive relation (1) means that each P_k is ultimately expressed in terms of P_0.

    P_0=P_0

    \displaystyle P_1=(a+b) P_0

    \displaystyle P_2=\biggl(a+\frac{b}{2} \biggr) (a+b) P_0

    \cdots

    \displaystyle P_k=\biggl(a+\frac{b}{k} \biggr) \biggl(a+\frac{b}{k-1} \biggr) \cdots \biggl(a+\frac{b}{2} \biggr) (a+b) P_0

    \cdots

When a and b are fixed, the value of P_0 is also fixed since the probabilities must sum to 1. Summing the above expressions for P_0, P_1, P_2, \cdots gives the following.

    \displaystyle P_0 \biggl[ 1+(a+b)+\biggl(a+\frac{b}{2} \biggr) (a+b)+\cdots+\biggl(a+\frac{b}{k} \biggr) \cdots \biggl(a+\frac{b}{2} \biggr) (a+b)+\cdots \biggr]=1

    \displaystyle (2) \ \ \ \ \ P_0 =\biggl[ 1+(a+b)+\biggl(a+\frac{b}{2} \biggr) (a+b)+\cdots+\biggl(a+\frac{b}{k} \biggr) \cdots \biggl(a+\frac{b}{2} \biggr) (a+b)+\cdots \biggr]^{-1}

Thus a member of the (a,b,0) class has two parameters, namely a and b, which completely determine the distribution.
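The recursion and the normalization of P_0 can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name ab0_pmf and the truncation point kmax are assumptions of this sketch, and it presumes that the probability mass beyond kmax is negligible.

```python
def ab0_pmf(a, b, kmax=200):
    """Return [P_0, ..., P_kmax] generated by P_k / P_{k-1} = a + b/k.

    Hypothetical helper: the infinite sum that fixes P_0 is truncated
    at kmax, which assumes the tail beyond kmax is negligible.
    """
    weights = [1.0]  # unnormalized weight for k = 0, i.e. P_0 / P_0
    for k in range(1, kmax + 1):
        # each weight is the previous one times (a + b/k), as in relation (1)
        weights.append(weights[-1] * (a + b / k))
    total = sum(weights)  # P_0 is the reciprocal of this sum
    return [w / total for w in weights]


# Illustrative parameter values chosen for this sketch
probs = ab0_pmf(0.7, 1.4)
print(probs[0], probs[1], sum(probs))
```

The only inputs are a and b, which matches the statement that the two parameters determine the whole distribution.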

Example 1
As an example, let a=0 and b=\lambda where \lambda>0 is a fixed positive constant. Using (1), we see that

    \displaystyle P_1=\lambda \ P_0
    \displaystyle P_2=\frac{1}{2!} \ \lambda^2 P_0
    \displaystyle P_3=\frac{1}{3!} \ \lambda^3 P_0
    \cdots
    \displaystyle P_n=\frac{1}{n!} \lambda^n P_0
    \cdots

According to (2), P_0=e^{-\lambda}, since the sum being inverted is the power series of e^{\lambda}:

    \displaystyle \begin{aligned} P_0&=\biggl(1+ \lambda+\frac{1}{2!} \ \lambda^2 +\cdots+\frac{1}{n!} \lambda^n +\cdots \biggr)^{-1} \\&=(e^{\lambda})^{-1}\\&=e^{-\lambda}  \end{aligned}

With P_0=e^{-\lambda}, the probabilities P_n are from a Poisson distribution. Thus, when the parameter a is 0, and the parameter b is a positive constant, the corresponding distribution from the (a,b,0) class is a Poisson distribution.

Only Three Members in the (a,b,0) Class

In essence, the (a,b,0) class has only three members, namely the big 3 discrete distributions – the Poisson distribution, the binomial distribution and the negative binomial distribution, with each distribution represented by a different sign of the parameter a. Using the recursive relation (1), it can be shown that each of the big three distributions belongs to the (a,b,0) class. The following table shows the parameters a and b in the three cases.

Table 1

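    \displaystyle \begin{array}{llll}
    \text{Distribution} & \text{Usual Parameters} & \text{Parameter } a & \text{Parameter } b \\
    \hline
    \text{Poisson} & \lambda & 0 & \lambda \\
    \text{Binomial} & n \text{ and } p & \displaystyle -\frac{p}{1-p} & \displaystyle (n+1) \ \frac{p}{1-p} \\
    \text{Negative binomial} & r \text{ and } p & 1-p & (r-1) \ (1-p) \\
    \text{Negative binomial} & r \text{ and } \theta & \displaystyle \frac{\theta}{1+\theta} & \displaystyle (r-1) \ \frac{\theta}{1+\theta} \\
    \text{Geometric} & p & 1-p & 0 \\
    \text{Geometric} & \theta & \displaystyle \frac{\theta}{1+\theta} & 0
    \end{array}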

Table 1 shows how to parametrize the three distributions. For example, for the binomial distribution with parameters n (the number of trials) and p (the probability of success), the (a,b,0) parameters are a=-p/(1-p) and b=-(n+1) a. The two rows for negative binomial reflect two different parametrizations. Of course, the geometric distribution is simply a negative binomial distribution when the parameter r=1. Essentially Table 1 consists of three different distributions.
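The translation in Table 1 can be written as small Python helpers. This is a sketch only; the function names are hypothetical and are introduced here purely for illustration.

```python
def poisson_ab(lam):
    """(a, b) for a Poisson distribution with mean lam."""
    return 0.0, lam


def binomial_ab(n, p):
    """(a, b) for a binomial distribution with n trials and success probability p."""
    a = -p / (1 - p)
    return a, -(n + 1) * a


def negbin_ab(r, p):
    """(a, b) for a negative binomial distribution in the (r, p) parametrization."""
    a = 1 - p
    return a, (r - 1) * a


def negbin_theta_ab(r, theta):
    """(a, b) for a negative binomial distribution in the (r, theta) parametrization."""
    a = theta / (1 + theta)
    return a, (r - 1) * a


# The geometric rows of Table 1 are the negative binomial rows with r = 1
print(binomial_ab(8, 0.2))
print(negbin_ab(1, 0.3))
```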

Table 1 works in the opposite direction as well. Any set of (a,b,0) parameters a and b must fit into one of the distributions listed in Table 1. In other words, the recursive relation (1) produces no new counting distribution. Any counting distribution satisfying (1) must be one of the big 3 counting distributions listed in Table 1.

Note that under the recursive relation (1), not all combinations of a and b produce a probability distribution. For example, when both a and b are negative, every factor a+b/k is negative, so the resulting P_k are negative for odd k. More generally, when a+b<0, P_1=(a+b) P_0 is already negative. When a+b=0, every P_k with k \ge 1 is zero and P_0=1, i.e. the distribution is a point mass at 0. So we restrict attention to the case a+b>0.

To echo the point made previously, it is the case that when a+b>0 and when the recursive relation (1) produces a viable probability distribution, the resulting distribution must be one of the three distributions listed in Table 1. This point is not entirely obvious. Any interested reader can see chapter 6 of [1].

Table 1 indicates that the sign of the parameter a determines the form of the (a,b,0) distribution. If a=0, it is a Poisson distribution. If a is negative, it is a binomial distribution. If a is positive (necessarily with a<1, since a=1-p in Table 1), it is a negative binomial distribution.
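Reading Table 1 in the opposite direction can be sketched as follows. The function name classify_ab0 and the tolerance tol are assumptions of this sketch. The binomial branch inverts a=-p/(1-p) and b=-(n+1) a and additionally requires the implied n to be a positive integer, so that the recursion reaches zero exactly at k=n+1.

```python
def classify_ab0(a, b, tol=1e-9):
    """Map (a, b) with a + b > 0 back to a distribution in Table 1.

    Hypothetical helper; raises ValueError when (a, b) does not
    correspond to a Poisson, binomial or negative binomial distribution.
    """
    if a + b <= 0:
        raise ValueError("need a + b > 0")
    if abs(a) < tol:
        return "Poisson", {"lambda": b}
    if a < 0:
        p = a / (a - 1)   # inverts a = -p / (1 - p)
        n = -b / a - 1    # inverts b = -(n + 1) a
        if round(n) < 1 or abs(n - round(n)) > tol:
            raise ValueError("implied n is not a positive integer")
        return "binomial", {"n": round(n), "p": p}
    if a >= 1:
        raise ValueError("need a < 1")
    # inverts a = 1 - p and b = (r - 1)(1 - p)
    return "negative binomial", {"r": b / a + 1, "p": 1 - a}


print(classify_ab0(-0.25, 2.25))
print(classify_ab0(0.7, 1.4))
```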

Examples

We now present a few more examples illustrating the working of the (a,b,0) recursive relation.

Example 2
This example illustrates that knowing three consecutive probabilities of a member of the (a,b,0) class determines the entire distribution. For example, suppose we know that

    P_1=0.0567
    P_2=0.07938
    P_3=0.09261

These three consecutive probabilities produce the following two linear equations of a and b.

    \displaystyle \frac{P_2}{P_1}=\frac{0.07938}{0.0567}=a+\frac{b}{2}
    \displaystyle \frac{P_3}{P_2}=\frac{0.09261}{0.07938}=a+\frac{b}{3}

Solving these two linear equations produces a=0.7 and b=1.4. Since a is positive, this is a negative binomial distribution. The corresponding negative binomial parameters are r=3 and p=0.3. With this information, the (a,b,0) distribution in question is completely determined. The following are several distributional quantities.

    P_0=0.3^3=0.027
    P_4=(0.7+\frac{1.4}{4}) \ P_3=0.0972405
    \displaystyle E(N)=r \frac{1-p}{p}=3 \frac{0.7}{0.3}=7
    \displaystyle Var(N)=r \frac{1-p}{p^2}=3 \frac{0.7}{0.3^2}=\frac{7}{0.3}=23.3333
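The two linear equations in Example 2 can be solved in closed form. Subtracting the equation for P_{k+1}/P_k from the one for P_k/P_{k-1} leaves b/(k(k+1)), which gives b and then a. The sketch below assumes three consecutive probabilities P_{k-1}, P_k, P_{k+1}; the function name is hypothetical.

```python
def ab_from_consecutive(k, p_prev, p_k, p_next):
    """Solve for (a, b) from P_{k-1}, P_k and P_{k+1} using relation (1)."""
    r1 = p_k / p_prev   # equals a + b/k
    r2 = p_next / p_k   # equals a + b/(k+1)
    b = (r1 - r2) * k * (k + 1)  # since r1 - r2 = b / (k (k+1))
    a = r1 - b / k
    return a, b


# Probabilities P_1, P_2, P_3 from Example 2 (so k = 2)
a, b = ab_from_consecutive(2, 0.0567, 0.07938, 0.09261)
print(a, b)

# Negative binomial parameters via Table 1: a = 1 - p, b = (r - 1)(1 - p)
p = 1 - a
r = b / a + 1
print(r, p, p ** r)  # the last value is P_0 = p^r
```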

Example 3
Actually any three given probabilities determine the entire (a,b,0) distribution. They do not have to be consecutive. Suppose we are given the following probabilities.

    P_1=0.33554432
    P_2=0.29360128
    P_4=0.0458752

Applying the recursive relation (1) produces the following equations.

    \displaystyle \frac{P_2}{P_1}=\frac{0.29360128}{0.33554432}=a+\frac{b}{2}
    \displaystyle \frac{P_3}{P_2}=\frac{P_3}{0.29360128}=a+\frac{b}{3}
    \displaystyle \frac{P_4}{P_3}=\frac{0.0458752}{P_3}=a+\frac{b}{4}

The above 3 equations lead to the following two equations.

    \displaystyle \frac{P_2}{P_1}=\frac{0.29360128}{0.33554432}=a+\frac{b}{2}
    \displaystyle P_4=0.0458752=\biggl(a+\frac{b}{4} \biggr) \biggl(a+\frac{b}{3} \biggr) \ 0.29360128

Of the above two equations, one is a linear equation and one is a quadratic equation. Solving these two equations produces a=-0.25 and b=2.25. The quadratic has a second root, a=-2.375 and b=6.5, but it makes a+b/3 negative and hence P_3 negative, so it is rejected. Since a is negative, this is a binomial distribution. Using the translation in Table 1 gives the following equations.

    \displaystyle -\frac{p}{1-p}=-0.25 \ \ \ \ \ \ \ \ \ \ \ (n+1) \ \frac{p}{1-p}=2.25

Solving these equations gives n=8 and p=0.2. The (a,b,0) distribution in question is then completely determined.
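The algebra in Example 3 can be checked numerically. Writing r=P_2/P_1 and q=P_4/P_2 (both symbols are introduced only for this sketch), substituting a=r-b/2 into (a+b/4)(a+b/3)=q gives the quadratic b^2-10rb+24(r^2-q)=0, whose roots are b=5r \pm \sqrt{r^2+24q}. The hypothetical helper below keeps only the roots for which the factors a+b/k are positive for k=1,2,3,4.

```python
import math


def ab_from_p1_p2_p4(p1, p2, p4):
    """Return the (a, b) pairs consistent with given P_1, P_2 and P_4."""
    r = p2 / p1  # equals a + b/2
    q = p4 / p2  # equals (a + b/4)(a + b/3)
    root = math.sqrt(r * r + 24 * q)
    solutions = []
    for b in (5 * r - root, 5 * r + root):
        a = r - b / 2
        # discard roots that would make P_1, ..., P_4 non-positive
        if all(a + b / k > 0 for k in (1, 2, 3, 4)):
            solutions.append((a, b))
    return solutions


# Probabilities P_1, P_2, P_4 from Example 3
solutions = ab_from_p1_p2_p4(0.33554432, 0.29360128, 0.0458752)
a, b = solutions[0]
n = -b / a - 1   # from b = -(n + 1) a in Table 1
p = a / (a - 1)  # from a = -p / (1 - p) in Table 1
print(solutions, n, p)
```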

Factorial Moments

Another distributional quantity that can give insight into the (a,b,0) class is the factorial moment. For any random variable X, its nth factorial moment is

    (3) \ \ \ \ \ \mu_{(n)}=E[X (X-1) (X-2) \cdots (X-(n-1))]

For example, the first three factorial moments are:

    \mu_{(1)}=E[X]

    \mu_{(2)}=E[X (X-1)]

    \mu_{(3)}=E[X (X-1) (X-2)]

For any member of the (a,b,0) class with parameters a and b, the first factorial moment is:

    \displaystyle (4) \ \ \ \ \ \mu_{(1)}=\frac{a+b}{1-a}

The higher (a,b,0) factorial moments can be obtained recursively as follows:

    \displaystyle (5) \ \ \ \ \ \mu_{(j)}=\frac{a j +b}{1-a} \ \mu_{(j-1)} \ \ \ \ \ \ \ j \ge 2

The recursive formula (5) is a good way to determine the raw moments of a member of the (a,b,0) class. For example, the following calculation gives the second raw moment and the variance of a random variable N that is a member of the (a,b,0) class with parameters a and b.

    \displaystyle \mu_{(1)}=E[N]=\frac{a+b}{1-a}

    \displaystyle \mu_{(2)}=\frac{2 a +b}{1-a} \ \frac{a+b}{1-a}=\frac{(2a+b) (a+b)}{(1-a)^2}=E[N (N-1)]=E[N^2]-E[N]

    \displaystyle E[N^2]=\frac{(2a+b) (a+b)}{(1-a)^2}+\frac{a+b}{1-a}=\frac{(a+b) (a+b+1)}{(1-a)^2}

    \displaystyle Var(N)=E[N^2]-E[N]^2=\frac{a+b}{(1-a)^2}
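Formulas (4) and (5) translate directly into a short loop. The sketch below uses hypothetical function names and assumes a<1 so that the denominators are positive. It reproduces the mean and variance of Example 2 (a=0.7, b=1.4).

```python
def factorial_moments(a, b, jmax):
    """Return [mu_(1), ..., mu_(jmax)] using formulas (4) and (5)."""
    mus = [(a + b) / (1 - a)]                        # formula (4)
    for j in range(2, jmax + 1):
        mus.append((a * j + b) / (1 - a) * mus[-1])  # formula (5)
    return mus


def mean_and_variance(a, b):
    """Mean and variance obtained from the first two factorial moments."""
    mu1, mu2 = factorial_moments(a, b, 2)
    second_raw = mu2 + mu1  # E[N^2] = E[N(N-1)] + E[N]
    return mu1, second_raw - mu1 ** 2


mean, var = mean_and_variance(0.7, 1.4)
print(mean, var)
```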

One interesting characteristic of the (a,b,0) class is that knowing limited distributional information determines the distribution. Example 2 and Example 3 show that knowing three point masses completely determines the (a,b,0) distribution. The above derivation shows that knowing the mean and the variance also completely determines the (a,b,0) distribution.
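To see why the mean and the variance are enough, note that dividing Var(N)=(a+b)/(1-a)^2 by E[N]=(a+b)/(1-a) gives 1/(1-a). Hence a=1-E[N]/Var(N), and then b=E[N] (1-a)-a. A minimal sketch, with a hypothetical function name:

```python
def ab_from_mean_variance(mean, variance):
    """Recover (a, b) from the mean and variance of an (a,b,0) member."""
    a = 1 - mean / variance   # since Var(N) / E[N] = 1 / (1 - a)
    b = mean * (1 - a) - a    # since E[N] = (a + b) / (1 - a)
    return a, b


# Mean and variance from Example 2
a, b = ab_from_mean_variance(7.0, 7.0 / 0.3)
print(a, b)
```

A variance equal to the mean gives a=0 (Poisson), a variance below the mean gives a<0 (binomial), and a variance above the mean gives a>0 (negative binomial).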

Fitting (a,b,0) Distributions

If the (a,b,0) recursive formula in (1) generates no new distributions, why study the (a,b,0) class rather than just focus on the Poisson, binomial and negative binomial distributions individually? One reason for studying the recursive (a,b,0) formula is that it gives a graphical way to choose an appropriate member of the (a,b,0) class. To see this, rewrite (1) as follows:

    \displaystyle (6) \ \ \ \ \ k \ \frac{P_k}{P_{k-1}}=a k+ b \ \ \ \ \ \ \ \ \ \ \ \ \ k=1,2,3,\cdots

Note that the quantity on the right side of (6) is a linear function of the integers k. If we plot the left hand side quantity of (6) with k on the x-axis, the plot should be a linear one with the slope being the parameter a and the y-intercept being the parameter b (of course assuming it is an (a,b,0) distribution).

The relation (6) gives a quick way to check whether a given sample plausibly comes from a member of the (a,b,0) class. To do this, multiply the ratio of the observed frequencies of two consecutive data categories by k. In other words, compute the following quantity for each value of k:

    \displaystyle (7) \ \ \ \ \ k \ \frac{\hat{P}_k}{\hat{P}_{k-1}}=k \ \frac{n_k}{n_{k-1}} \ \ \ \ \ \ \ \ \ \ \ \ \ k=1,2,3,\cdots

where n_k is the observed frequency for the category k. The ratio of n_k to n_{k-1} multiplied by k is a stand-in for the left hand side of (6). Then plot these values against k. A linear trend that is observed in the graph is evidence that the data in the sample is taken from an (a,b,0) distribution.

The slope of the plotted line indicates which (a,b,0) member to use. If the plot is approximately horizontal, the Poisson model is appropriate. If the plot is approximately a line with negative slope, the binomial model is more appropriate. If the plot is approximately a line with positive slope, use the negative binomial model. For this approach to work well, a large data set is preferred.
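The diagnostic in (7) can be sketched without any plotting library by computing the points (k, k n_k/n_{k-1}) and fitting a straight line through them. The unweighted least-squares fit is an assumption of this sketch, not a method prescribed here, and the counts below are made-up numbers chosen only to illustrate the calculation.

```python
def ratio_points(freqs):
    """freqs[k] is the observed count n_k; return the pairs (k, k * n_k / n_{k-1})."""
    return [(k, k * freqs[k] / freqs[k - 1])
            for k in range(1, len(freqs)) if freqs[k - 1] > 0]


def least_squares_line(points):
    """Ordinary least-squares slope and intercept through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical counts for k = 0, 1, 2, 3, 4 (illustration only, not real data)
counts = [81, 162, 162, 108, 54]
slope, intercept = least_squares_line(ratio_points(counts))
print(slope, intercept)  # slope estimates a, intercept estimates b
```

With real data the points will scatter around a line, and the categories with small counts are the least reliable, so the fitted slope is best read as a rough guide to the sign of a.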

The (a,b,1) Class

It is possible that the (a,b,0) distributions do not adequately describe a random counting phenomenon being observed. For example, the sample data may indicate that the probability at zero may be larger than is indicated by the distributions in the (a,b,0) class. One alternative is to assign a larger value for P_0 and recursively generate the subsequent probabilities P_k for k=2,3,\cdots. The class of the distributions defined by this recursive scheme is called the (a,b,1) class, which is discussed in the next post.

Practice Problems

Practice problems on (a,b,0) class

Practice problems on (a,b,1) class

Reference

  1. Panjer H. H., Willmot G. E., Insurance Risk Models, Society of Actuaries, Chicago, 1992.


\copyright 2018 – Dan Ma
