Introducing the beta function

The beta distribution is defined from the beta function, just as the gamma distribution is defined from the gamma function. This post gives a brief introduction to the beta function. The goal is to establish one property that is the basis for defining the beta distribution.

_______________________________________________________________________________________________

The Beta Function

For any positive constants $a$ and $b$, the beta function is defined to be the following integral:

$\displaystyle B(a,b)=\int_0^1 t^{a-1} \ (1-t)^{b-1} \ dt \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (0)$

The beta function can be evaluated directly if the parameters $a$ and $b$ are not too large. For example, $B(3,2)$ is the integral $\displaystyle \int_0^1 t^2 (1-t) \ dt$, which is $1/12$. Evaluating $(0)$ on a case-by-case basis, however, sheds little light on the beta function. Direct calculation can also be cumbersome (e.g. for large integer parameters) or challenging (e.g. for fractional parameters $a$ and $b$). It turns out that the beta function $B(a,b)$ can be evaluated in terms of the gamma function.
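The value $B(3,2)=1/12$ can also be checked numerically. The following is a minimal sketch in Python that approximates the integral in $(0)$ with a composite Simpson's rule. The function name `beta_integral` and the number of subintervals are illustrative choices, and the sketch assumes $a \ge 1$ and $b \ge 1$ so that the integrand is finite at both endpoints.

```python
def beta_integral(a, b, n=1000):
    """Approximate B(a, b) by composite Simpson's rule on [0, 1].

    Assumes a >= 1 and b >= 1 so the integrand is finite at 0 and 1.
    n is the number of subintervals and must be even.
    """
    if n % 2 != 0:
        raise ValueError("n must be even")
    h = 1.0 / n

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    # Endpoints get weight 1, odd interior points 4, even interior points 2.
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, n):
        weight = 4 if i % 2 == 1 else 2
        total += weight * integrand(i * h)
    return total * h / 3


if __name__ == "__main__":
    print(beta_integral(3, 2))  # should be close to 1/12
```

A numerical approach like this works for a single pair of parameters, but it does not explain how $B(a,b)$ depends on $a$ and $b$, which is what the connection to the gamma function provides.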

_______________________________________________________________________________________________

Connection to the Gamma Function

The remainder of the post establishes the following formula for the beta function:

$\displaystyle B(a,b)=\int_0^1 t^{a-1} \ (1-t)^{b-1} \ dt=\frac{\Gamma(a) \ \Gamma(b)}{\Gamma(a+b)} \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (1)$
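Formula $(1)$ reduces the evaluation of the beta function to evaluations of the gamma function. The following Python sketch illustrates this using `math.lgamma` from the standard library. Working on the log scale is a choice made here to avoid overflow for large parameters, and the function name `beta_via_gamma` is hypothetical.

```python
import math


def beta_via_gamma(a, b):
    """Evaluate B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b) for a, b > 0.

    The log-gamma function is used so that large parameters do not
    overflow before the ratio is taken.
    """
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


if __name__ == "__main__":
    print(beta_via_gamma(3, 2))      # integer parameters: close to 1/12
    print(beta_via_gamma(0.5, 0.5))  # fractional parameters: close to pi
```

The second call uses the fact that $\Gamma(1/2)=\sqrt{\pi}$ and $\Gamma(1)=1$, so $B(1/2,1/2)=\pi$.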

To start the proof of $(1)$, let $X$ and $Y$ be two independent random variables such that $X$ follows a gamma distribution with shape parameter $a$ and rate parameter $\beta$ and that $Y$ follows a gamma distribution with shape parameter $b$ and rate parameter $\beta$. It does not matter what $\beta$ is, as long as it is the rate parameter for both $X$ and $Y$. Then the sum $S=X+Y$ has a gamma distribution with shape parameter $a+b$ and rate parameter $\beta$. The following is the density function for $S=X+Y$.

$\displaystyle f_S(s)=\frac{1}{\Gamma(a+b)} \ \beta^{a+b} \ s^{a+b-1} \ e^{-\beta s} \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (2)$

The density function of $S=X+Y$ can also be derived from the convolution formula using the density functions of $X$ and $Y$ as follows:

$\displaystyle \begin{aligned} f_S(s)&=\int_0^s f_Y(s-x) \ f_X(x) \ dx \ \ \ \ \ \text{(convolution)} \\&=\int_0^s \frac{1}{\Gamma(b)} \ \beta^{b} \ (s-x)^{b-1} \ e^{-\beta (s-x)} \ \frac{1}{\Gamma(a)} \ \beta^{a} \ x^{a-1} \ e^{-\beta x} \ dx \\&=\frac{\beta^{a+b}}{\Gamma(a) \ \Gamma(b)} \ e^{-\beta s} \ \int_0^s x^{a-1} \ (s-x)^{b-1} \ dx \\&=\frac{\beta^{a+b}}{\Gamma(a) \ \Gamma(b)} \ e^{-\beta s} \ s^{a+b-1} \ \int_0^1 t^{a-1} \ (1-t)^{b-1} \ dt \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (3) \end{aligned}$

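The last step in $(3)$ is obtained from the step immediately above it by the change of variable $x=st$. With $dx=s \ dt$, the integrand $x^{a-1} \ (s-x)^{b-1} \ dx$ becomes $s^{a+b-1} \ t^{a-1} \ (1-t)^{b-1} \ dt$, and the limits of integration change from $0$ and $s$ to $0$ and $1$. The last step in $(3)$ must equal $(2)$. Setting the two equal and canceling the common factor $\beta^{a+b} \ s^{a+b-1} \ e^{-\beta s}$ gives $\displaystyle \frac{B(a,b)}{\Gamma(a) \ \Gamma(b)}=\frac{1}{\Gamma(a+b)}$, which is the equality in $(1)$.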
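The agreement between $(2)$ and the convolution integral in $(3)$ can be spot-checked numerically at a single point. The following Python sketch approximates the convolution with a midpoint rule and compares it with the gamma density with shape $a+b$. The function names, the parameter values $a=2$, $b=3$, $\beta=1.5$, and the point $s=2$ are arbitrary choices made for illustration only.

```python
import math


def gamma_pdf(x, shape, rate):
    """Gamma density with the given shape and rate parameters, for x > 0."""
    return rate ** shape * x ** (shape - 1) * math.exp(-rate * x) / math.gamma(shape)


def convolution_pdf(s, a, b, rate, n=2000):
    """Midpoint-rule approximation of the integral of f_Y(s - x) f_X(x) over [0, s]."""
    h = s / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += gamma_pdf(s - x, b, rate) * gamma_pdf(x, a, rate)
    return total * h


if __name__ == "__main__":
    a, b, rate, s = 2.0, 3.0, 1.5, 2.0
    print(convolution_pdf(s, a, b, rate))  # convolution of the two densities
    print(gamma_pdf(s, a + b, rate))       # density (2) evaluated at s
```

The two printed values should agree to several decimal places. This is a check at one point, not a substitute for the derivation.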

Note that dividing the function $t^{a-1} \ (1-t)^{b-1}$ by the value $B(a,b)$ produces a function that integrates to 1 over the interval $(0,1)$. This function is therefore a density function, and the distribution it defines is the beta distribution. The following is the density function of the beta distribution.

$\displaystyle f(x)=\frac{\Gamma(a+b)}{\Gamma(a) \ \Gamma(b)} \ x^{a-1} \ (1-x)^{b-1}; \ \ \ \ \ \ \ \ 0<x<1$
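The beta density can be coded directly from the formula above. The following Python sketch does so and checks numerically that the density integrates to approximately 1. The names `beta_pdf` and `total_mass` are hypothetical. The midpoint rule is used for the check because it never evaluates the density at the endpoints 0 and 1.

```python
import math


def beta_pdf(x, a, b):
    """Density of the beta distribution with parameters a, b > 0 on (0, 1)."""
    if not 0.0 < x < 1.0:
        return 0.0
    # Normalizing constant Gamma(a + b) / (Gamma(a) Gamma(b)) on the log scale.
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def total_mass(a, b, n=10000):
    """Midpoint-rule approximation of the integral of beta_pdf over (0, 1)."""
    h = 1.0 / n
    return sum(beta_pdf((i + 0.5) * h, a, b) for i in range(n)) * h


if __name__ == "__main__":
    print(beta_pdf(0.5, 3, 2))  # 12 * 0.25 * 0.5 = 1.5, up to rounding
    print(total_mass(3, 2))     # should be close to 1
```

For $a=3$ and $b=2$ the normalizing constant is $1/B(3,2)=12$, which is consistent with the value $B(3,2)=1/12$ computed earlier in the post.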

The beta distribution is further examined in the next post.

_______________________________________________________________________________________________
$\copyright \ 2016 - \text{Dan Ma}$