# The gamma distribution from the point of view of a Poisson process

In the previous post, the gamma distribution is defined from the gamma function. This post shows that the gamma distribution can arise from a Poisson process.

_______________________________________________________________________________________________

The Poisson Process

Consider an experiment in which events that are of interest occur at random in a time interval. The goal here is to derive two families of random variables, one continuous and one discrete. Starting at time 0, record the time of the occurrence of the first event. Then record the time at which the second random event occurs and so on (these are the continuous random variables). Out of these measurements, we can derive discrete random variables by counting the number of random events in a fixed time interval.

The recording of times of the occurrences of the random events is like placing markings on a time line to denote the arrivals of the random events. We are interested in counting the number of markings in a fixed interval. We are also interested in measuring the length from the starting point to the first marking and to the second marking and so on. Because of this interpretation, the random process discussed here can also describe random events occurring along a spatial interval, i.e. intervals in terms of distance or volume or other spatial measurements.

A Poisson process is a random process of the kind described above in which several criteria are satisfied. We show that in a Poisson process, the number of occurrences of random events in a fixed time interval follows a Poisson distribution and the time until the $n$th random event follows a gamma distribution.

A good example of a Poisson process is the well-known experiment in radioactivity conducted by Rutherford and Geiger in 1910. In this experiment, $\alpha$-particles were emitted from a polonium source and the number of $\alpha$-particles was counted during an interval of 7.5 seconds (2,608 such time intervals were observed). In these 2,608 intervals, a total of 10,097 particles were observed. Thus the mean count per period of 7.5 seconds is 10097 / 2608 = 3.87.

In the Rutherford and Geiger experiment in 1910, a random event is the observation of an $\alpha$-particle. The random events occur at an average of 3.87 per unit time interval (7.5 seconds).

One of the criteria in a Poisson process is that in a very short time interval, the chance of having more than one random event is essentially zero. So either one random event will occur or none will occur in a very short time interval. Considering the occurrence of a random event as a success, there is either a success or a failure in a very short time interval. So a very short time interval in a Poisson process can be regarded as a Bernoulli trial.

The second criterion is that the experiment remains constant over time. Specifically this means that the probability of a random event occurring in a given subinterval is proportional to the length of that subinterval and does not depend on where the subinterval is located in the original interval. Any counting process that satisfies this criterion is said to possess stationary increments. For example, in the 1910 radioactivity study, $\alpha$-particles were emitted at the rate of $\lambda=$ 3.87 per 7.5 seconds. So the average number of $\alpha$-particles emitted from the radioactive source in a one-second interval is 3.87/7.5 = 0.516. In a half-second interval, the average is 0.516/2 = 0.258. For a quarter-second interval, it is 0.258/2 = 0.129. As the subinterval gets shorter, this average count becomes a better and better approximation of the probability of observing a random event in the subinterval. So if we observe a short subinterval for half as long, it will be about half as likely to observe the occurrence of a random event. On the other hand, it does not matter where the quarter-second subinterval is, whether at the beginning or toward the end of the original interval of 7.5 seconds.

The third criterion is that non-overlapping subintervals are mutually independent in the sense that what happens in one subinterval (i.e. the occurrence or non-occurrence of a random event) will have no influence on the occurrence of a random event in another subinterval. Any counting process that satisfies this criterion is said to possess independent increments. In the Rutherford and Geiger experiment, the observation of one particle in one half-second period does not imply that a particle will necessarily be observed in the next half-second.

To summarize, the following are the three criteria of a Poisson process:

Suppose that on average $\lambda$ random events occur in a time interval of length 1.

1. The probability of having more than one random event occurring in a very short time interval is essentially zero.
2. For a very short subinterval of length $\frac{1}{n}$ where $n$ is a sufficiently large integer, the probability of a random event occurring in this subinterval is $\frac{\lambda}{n}$.
3. The numbers of random events occurring in non-overlapping time intervals are independent.

_______________________________________________________________________________________________

The Poisson Distribution

We are now ready to derive the Poisson distribution from a Poisson random process.

Consider random events generated in a Poisson process and let $Y$ be the number of random events observed in a unit time interval. Break up the unit time interval into $n$ non-overlapping subintervals of equal size where $n$ is a large integer. Each subinterval can have one or no random event. The probability of one random event in a subinterval is $\lambda/n$. The subintervals are independent. In other words, the three criteria of a Poisson process described above ensure that the $n$ subintervals are independent Bernoulli trials. As a result, the number of events occurring in these $n$ subintervals follows a binomial distribution with $n$ trials and probability of success $\lambda/n$. This binomial distribution approximates the distribution of the random variable $Y$. As $n$ increases, the binomial approximation becomes more and more granular. The resulting limit is a Poisson distribution, which coincides with the distribution of $Y$. The fact that the Poisson distribution is the limiting case of the binomial distribution is discussed here and here.
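The limiting argument can be checked numerically. The following is a minimal sketch, using the rate $\lambda=3.87$ from the 1910 study; the function names `binomial_pmf`, `poisson_pmf` and `max_gap` are illustrative choices, not part of the original discussion. It compares the binomial probabilities with $n$ trials and success probability $\lambda/n$ against the Poisson probabilities for several values of $n$.

```python
import math

def binomial_pmf(y, n, p):
    """P(exactly y successes in n independent Bernoulli(p) trials)."""
    return math.comb(n, y) * p**y * (1 - p)**(n - y)

def poisson_pmf(y, lam):
    """P(Y = y) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam**y / math.factorial(y)

def max_gap(n, lam, y_max=10):
    """Largest absolute difference between the two pmfs over y = 0..y_max."""
    return max(
        abs(binomial_pmf(y, n, lam / n) - poisson_pmf(y, lam))
        for y in range(y_max + 1)
    )

lam = 3.87  # mean count per 7.5-second interval in the 1910 study
for n in (10, 100, 1000, 10000):
    print(f"n = {n:>5}: max |binomial - Poisson| = {max_gap(n, lam):.6f}")
```

The printed gap should shrink as $n$ grows, which is the sense in which the binomial distribution becomes more granular and approaches the Poisson distribution.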

It follows that $Y$ follows the Poisson distribution with mean $\lambda$. The following is the probability function.

$\displaystyle P(Y=y)=\frac{e^{-\lambda} \ \lambda^y}{y!} \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ y=0,1,2,\cdots$

In the 1910 radioactivity study, the number of $\alpha$-particles observed in a 7.5-second period has a Poisson distribution with the mean of $\lambda=3.87$ particles per 7.5 seconds.
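As an illustration, the probability function can be tabulated for $\lambda=3.87$. This is a small sketch; the column of expected interval counts is simply $2608 \cdot P(Y=y)$ computed from the formula and is not the data observed in the experiment.

```python
import math

def poisson_pmf(y, lam):
    """P(Y = y) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam**y / math.factorial(y)

lam = 3.87          # mean number of particles per 7.5-second interval
n_intervals = 2608  # number of intervals observed in the 1910 study

for y in range(11):
    p = poisson_pmf(y, lam)
    # Expected number of intervals with exactly y particles under the model
    print(f"y = {y:2d}  P(Y = y) = {p:.4f}  expected intervals = {n_intervals * p:7.1f}")
```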

Sometimes it may be necessary to count the random events not in a unit time interval but in a smaller or larger time interval of length $t$. In a sense, the new unit time is $t$ and the new average rate of the Poisson process is then $\lambda t$. Then the idea of taking granular binomial distributions will lead to a Poisson distribution. Let $Y_t$ be the number of occurrences of the random events in a time interval of length $t$. Then $Y_t$ follows a Poisson distribution with mean $\lambda t$. The following is the probability function.

$\displaystyle P(Y_t=y)=\frac{e^{-\lambda t} \ (\lambda t)^y}{y!} \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ y=0,1,2,\cdots$

In the 1910 radioactivity study, the number of $\alpha$-particles observed in a 3.75-second period has a Poisson distribution with the mean of $\lambda t=3.87 \cdot 0.5=1.935$ particles per 3.75 seconds. Here $t=0.5$ since the unit time interval is 7.5 seconds.
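The scaling of the mean by $t$ is consistent with the independence criterion. The sketch below splits the unit interval into two halves, each with mean $\lambda/2$, treats the two counts as independent, and checks that the distribution of their sum agrees with the Poisson distribution with mean $\lambda$. The helper names `count_pmf` and `two_halves_pmf` are hypothetical and chosen for illustration.

```python
import math

def poisson_pmf(y, mean):
    """P(Y = y) for a Poisson distribution with the given mean."""
    return math.exp(-mean) * mean**y / math.factorial(y)

def count_pmf(y, lam, t):
    """P(Y_t = y): y events in an interval of length t at rate lam per unit time."""
    return poisson_pmf(y, lam * t)

def two_halves_pmf(y, lam):
    """P(y events in total) when the unit interval is split into two independent halves."""
    return sum(
        count_pmf(k, lam, 0.5) * count_pmf(y - k, lam, 0.5)
        for k in range(y + 1)
    )

lam = 3.87
for y in range(6):
    print(y, round(two_halves_pmf(y, lam), 6), round(count_pmf(y, lam, 1.0), 6))
```

The two printed columns should agree up to rounding, since the sum of two independent Poisson counts is again a Poisson count whose mean is the sum of the two means.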

_______________________________________________________________________________________________

The Gamma Distribution as Derived from a Poisson Process

With the Poisson process and Poisson distribution properly set up and defined, we can now derive the gamma distribution. As before, we work with a Poisson process in which the random events arrive at an average rate of $\lambda$ per unit time. Let $W_1$ be the waiting time until the occurrence of the first random event, $W_2$ be the waiting time until the occurrence of the second random event and so on. First examine the random variable $W_1$.

Consider the probability $P(W_1>t)$. The event $W_1>t$ means that the first random event takes place after time $t$. This means that there must be no occurrence of the random event in question from time 0 to time $t$. It follows that

$P(W_1>t)=P(Y_t=0)=e^{-\lambda t}$

As a result, $P(W_1 \le t)=1-e^{-\lambda t}$, which is the cumulative distribution of $W_1$. Taking the derivative, the probability density function of $W_1$ is $f_{W_1}(t)=\lambda e^{-\lambda t}$. This is the density function of the exponential distribution with mean $\frac{1}{\lambda}$. Recall that $\lambda$ is the rate of the Poisson process, i.e. the random events arrive at the mean rate of $\lambda$ per unit time. Then the mean time between two consecutive events is $\frac{1}{\lambda}$.
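The survival function $e^{-\lambda t}$ can also be seen directly from the Bernoulli trial view of a Poisson process. In the sketch below, each unit of time is cut into $n$ short subintervals, each an independent Bernoulli trial with success probability $\lambda/n$, so no event occurs before time $t$ exactly when all of the roughly $nt$ trials fail. The function name `survival_discrete` and the choice of seconds as the time unit are assumptions made for this illustration.

```python
import math

def survival_discrete(lam, t, n):
    """P(no event in [0, t]) when each unit of time is cut into n Bernoulli trials."""
    trials = round(n * t)            # number of short subintervals covering [0, t]
    return (1 - lam / n) ** trials   # every one of the trials must be a failure

lam = 3.87 / 7.5  # particles per second in the 1910 study
t = 3.0           # seconds
for n in (10, 100, 1000, 100000):
    print(f"n = {n:>6}: {survival_discrete(lam, t, n):.6f}")
print(f"limit e^(-lam t): {math.exp(-lam * t):.6f}")
```

As $n$ increases, the discrete values should approach $e^{-\lambda t}$ from below.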

Consider the probability $P(W_2>t)$. The event $W_2>t$ means that the second random event takes place after time $t$. This means that there can be at most one occurrence of the random events in question from time 0 to time $t$. It follows that

$P(W_2>t)=P(Y_t=0)+P(Y_t=1)=e^{-\lambda t}+\lambda t \ e^{-\lambda t}$

As a result, $P(W_2 \le t)=1-e^{-\lambda t}-\lambda t \ e^{-\lambda t}$, which is the cdf of the waiting time $W_2$. Taking the derivative, the probability density function of $W_2$ is $f_{W_2}(t)=\lambda^2 \ t \ e^{-\lambda t}$. This is the density function of the gamma distribution with shape parameter 2 and rate parameter $\lambda$.
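To spell out the derivative step, differentiate the cdf of $W_2$ term by term, using the product rule on the last term. The two copies of $\lambda e^{-\lambda t}$ cancel:

$\displaystyle \frac{d}{dt} \left[ 1-e^{-\lambda t}-\lambda t \ e^{-\lambda t} \right]=\lambda e^{-\lambda t}-\left( \lambda e^{-\lambda t}-\lambda^2 \ t \ e^{-\lambda t} \right)=\lambda^2 \ t \ e^{-\lambda t}$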

By the same reasoning, the waiting time until the $n$th random event, $W_n$, follows a gamma distribution with shape parameter $n$ and rate parameter $\lambda$. The survival function, cdf and the density function are:

$\displaystyle P(W_n>t)=\sum \limits_{k=0}^{n-1} \frac{e^{-\lambda t} \ (\lambda t)^k}{k!}$

$\displaystyle P(W_n \le t)=1-\sum \limits_{k=0}^{n-1} \frac{e^{-\lambda t} \ (\lambda t)^k}{k!}=\sum \limits_{k=n}^{\infty} \frac{e^{-\lambda t} \ (\lambda t)^k}{k!}$

$\displaystyle f_{W_n}(t)=\frac{1}{(n-1)!} \ \lambda^n \ t^{n-1} \ e^{-\lambda t}$
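The density function follows from differentiating the cdf, and the computation is a telescoping sum. For $k \ge 1$, the product rule gives

$\displaystyle \frac{d}{dt} \left[ \frac{e^{-\lambda t} \ (\lambda t)^k}{k!} \right]=\lambda \ \frac{e^{-\lambda t} \ (\lambda t)^{k-1}}{(k-1)!}-\lambda \ \frac{e^{-\lambda t} \ (\lambda t)^k}{k!}$

and for $k=0$ the derivative is $-\lambda e^{-\lambda t}$. Therefore

$\displaystyle f_{W_n}(t)=-\frac{d}{dt} \sum \limits_{k=0}^{n-1} \frac{e^{-\lambda t} \ (\lambda t)^k}{k!}=\lambda e^{-\lambda t}+\sum \limits_{k=1}^{n-1} \left[ \lambda \ \frac{e^{-\lambda t} \ (\lambda t)^k}{k!}-\lambda \ \frac{e^{-\lambda t} \ (\lambda t)^{k-1}}{(k-1)!} \right]$

In the sum on the right, consecutive terms cancel, leaving the term with $k=n-1$ minus $\lambda e^{-\lambda t}$. What remains is

$\displaystyle f_{W_n}(t)=\lambda \ \frac{e^{-\lambda t} \ (\lambda t)^{n-1}}{(n-1)!}=\frac{1}{(n-1)!} \ \lambda^n \ t^{n-1} \ e^{-\lambda t}$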

The survival function $P(W_n>t)$ is identical to $P(Y_t \le n-1)$. The equivalence is through the translation: the event $W_n>t$ is equivalent to the event that there can be at most $n-1$ random events occurring from time 0 to time $t$.
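The equivalence between the gamma survival function and the Poisson cdf can be verified numerically. The sketch below integrates the density $f_{W_n}$ over $(0, t)$ with Simpson's rule and compares one minus that integral with the finite Poisson sum. The function names and the choice of Simpson's rule are illustrative assumptions; any reasonable numerical integration would do.

```python
import math

def erlang_pdf(t, n, lam):
    """Density of the waiting time until the n-th event at rate lam."""
    return lam**n * t**(n - 1) * math.exp(-lam * t) / math.factorial(n - 1)

def erlang_survival(t, n, lam):
    """P(W_n > t) = P(Y_t <= n - 1), a finite Poisson sum."""
    return sum(
        math.exp(-lam * t) * (lam * t)**k / math.factorial(k)
        for k in range(n)
    )

def integrate_pdf(t, n, lam, steps=10000):
    """Composite Simpson's rule for the integral of the density over [0, t].

    steps must be even.
    """
    h = t / steps
    total = erlang_pdf(0.0, n, lam) + erlang_pdf(t, n, lam)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * erlang_pdf(i * h, n, lam)
    return total * h / 3

lam, t = 3.87 / 7.5, 5.0
for n in (1, 2, 5):
    print(n, 1 - integrate_pdf(t, n, lam), erlang_survival(t, n, lam))
```

For each $n$, the two printed values should agree to many decimal places.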

Example 1
Here is a quick calculation with the gamma distribution. In the study by Rutherford and Geiger in 1910, the average rate of arrivals of $\alpha$-particles is 3.87 per 7.5-second period, giving the average rate of 0.516 particles per second. On average, it takes $0.516^{-1}$ = 1.94 seconds to wait for the next particle. The probability that it takes more than 3 seconds of waiting time for the first particle to arrive is $e^{-0.516 (3)}$ = 0.213.

How long would it take to wait for the second particle? On average it would take $2 \cdot 0.516^{-1}$ = 3.88 seconds. The probability that it takes more than 5 seconds of waiting time for the second particle to arrive is

$e^{-0.516 (5)}+0.516 \cdot 5 e^{-0.516 (5)}=3.58 \cdot e^{-2.58}$ = 0.271
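The numbers in Example 1 can be reproduced with a few lines of code. This is a small sketch in which the variable names are chosen only for readability.

```python
import math

rate = 3.87 / 7.5  # average number of particles per second

# Mean waiting times for the first and second particle
mean_wait_first = 1 / rate
mean_wait_second = 2 / rate

# P(W_1 > 3): no particle in the first 3 seconds
p_first_after_3 = math.exp(-rate * 3)

# P(W_2 > 5): at most one particle in the first 5 seconds
p_second_after_5 = math.exp(-rate * 5) * (1 + rate * 5)

print(f"rate per second        = {rate:.3f}")
print(f"mean wait, 1st         = {mean_wait_first:.2f}")
print(f"mean wait, 2nd         = {mean_wait_second:.2f}")
print(f"P(W_1 > 3 seconds)     = {p_first_after_3:.3f}")
print(f"P(W_2 > 5 seconds)     = {p_second_after_5:.3f}")
```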

_______________________________________________________________________________________________

Remarks

The above discussion shows that the gamma distribution arises naturally from a Poisson process, a random experiment that satisfies three assumptions that deal with independence and uniformity in time. The gamma distribution derived from a Poisson process has two parameters $n$ and $\lambda$ where $n$ is a positive integer and is the shape parameter and $\lambda$ is the rate parameter. If the random variable $W$ follows this distribution, its pdf is:

$\displaystyle f_W(w)=\frac{1}{(n-1)!} \ \lambda^n \ w^{n-1} \ e^{-\lambda w} \ \ \ \ \ \ \ \ \ \ \ \ w>0 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (1)$

The above pdf can be interpreted as the density function for the waiting time until the arrival of the $n$th random event in a Poisson process with an average arrival rate of $\lambda$ per unit time. The density function may be derived from an actual Poisson process or it may be just describing some random quantity that has nothing to do with any Poisson process. But the Poisson process interpretation is still useful. One advantage of the Poisson interpretation is that the survival function and the cdf have expressions in closed form.

$\displaystyle P(W>w)=\sum \limits_{k=0}^{n-1} \frac{e^{-\lambda w} \ (\lambda w)^k}{k!} \ \ \ \ \ \ \ \ \ \ \ \ w>0 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (2)$

$\displaystyle \begin{aligned} P(W \le w)&=1-\sum \limits_{k=0}^{n-1} \frac{e^{-\lambda w} \ (\lambda w)^k}{k!} \\&=\sum \limits_{k=n}^{\infty} \frac{e^{-\lambda w} \ (\lambda w)^k}{k!} \ \ \ \ \ \ \ \ \ \ \ \ w>0 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (3) \end{aligned}$

In the Poisson process interpretation, $P(W>w)$ is the probability that the $n$th random event occurs after time $w$. This means that in the interval $(0, w)$, there are at most $n-1$ random events. Thus the gamma survival function evaluated at $w$ is identical to the cdf of a Poisson distribution with mean $\lambda w$ evaluated at $n-1$. Even when $W$ is simply a model of some random quantity that has nothing to do with a Poisson process, this interpretation can still be used to derive the survival function and the cdf of such a gamma distribution.
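A simulation gives another way to see the identity. The sketch below relies on an assumption not derived in this post: the gaps between consecutive events in a Poisson process are independent exponential random variables with rate $\lambda$, so that the waiting time until the $n$th event is a sum of $n$ such gaps. The function names, the number of trials and the seed are arbitrary choices for illustration.

```python
import math
import random

def simulate_waiting_time(n, lam, rng):
    """Time of the n-th event, built as a sum of n independent exponential gaps."""
    return sum(rng.expovariate(lam) for _ in range(n))

def empirical_survival(w, n, lam, trials=20000, seed=0):
    """Fraction of simulated waiting times that exceed w."""
    rng = random.Random(seed)
    hits = sum(simulate_waiting_time(n, lam, rng) > w for _ in range(trials))
    return hits / trials

def poisson_cdf(m, mean):
    """P(Y <= m) for a Poisson distribution with the given mean."""
    return sum(math.exp(-mean) * mean**k / math.factorial(k) for k in range(m + 1))

lam, n, w = 3.87 / 7.5, 2, 5.0
print("simulated P(W > w):", empirical_survival(w, n, lam))
print("Poisson cdf at n-1:", poisson_cdf(n - 1, lam * w))
```

The simulated proportion should be close to the Poisson cdf value, with the difference shrinking as the number of trials grows.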

The gamma distribution described in the density function $(1)$ has a shape parameter that is a positive integer. This special case of the gamma distribution sometimes goes by the name Erlang distribution and is important in queuing theory.

In general the shape parameter does not have to be an integer; it can be any positive real number. For the more general gamma distribution, see the previous post.

_______________________________________________________________________________________________
$\copyright \ 2016 - \text{Dan Ma}$