The chi-squared distribution has a simple definition from a mathematical standpoint and yet plays an important role in statistical sampling theory. This post is the first in a three-part series that tells the mathematical story of the chi-squared distribution.
This post is an introduction. It highlights two facts: mathematically, the chi-squared distribution arises from the gamma distribution, and the chi-squared distribution has an intimate connection with the normal distribution. This post lays the groundwork for the subsequent posts.
The next post (Part 2) describes the roles played by the chi-squared distribution in forming the various sampling distributions related to the normal distribution. These sampling distributions are used for making inferences about the population from which the sample is taken. The population parameters of interest here are the population mean, variance, and standard deviation. The population from which the sample is taken is assumed to be modeled adequately by a normal distribution.
Part 3 describes the chi-squared test, which is used for making inferences on categorical data (versus quantitative data).
These three parts only scratch the surface with respect to the roles played by the chi-squared distribution in statistics. Thus the discussion in this series serves only as an introduction to the chi-squared distribution.
_______________________________________________________________________________________________
Defining the Chi-Squared Distribution
A random variable $Y$ is said to follow the chi-squared distribution with $k$ degrees of freedom if the following is the density function of $Y$.
$f_Y(y)=\frac{1}{\Gamma(\frac{k}{2}) \, 2^{k/2}} \, y^{\frac{k}{2}-1} \, e^{-\frac{y}{2}}, \quad y>0 \qquad (1)$
where $k$ is a positive integer. Some sources call it the $\chi^2(k)$ distribution. Essentially the distribution defined in (1) is a gamma distribution with shape parameter $\frac{k}{2}$ and scale parameter 2 (or rate parameter $\frac{1}{2}$). Note that the chi-squared distribution with 2 degrees of freedom (when $k=2$) is simply an exponential distribution with mean 2. The following figure shows the chi-squared density functions for degrees of freedom 1, 2, 3, 5 and 10.
Figure 1 – Chi-squared Density Curves
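As a quick illustration of the density, the following is a minimal Python sketch that evaluates the gamma form of the chi-squared density (shape equal to half the degrees of freedom, scale 2) and compares it with an exponential density with mean 2 when the degrees of freedom equal 2. The function names `chi2_pdf` and `exponential_pdf` are made up for this sketch, and the evaluation points are arbitrary.

```python
import math

def chi2_pdf(y, k):
    """Chi-squared density with k degrees of freedom, written as a
    gamma density with shape k/2 and scale 2 (zero for y <= 0)."""
    if y <= 0:
        return 0.0
    # Work on the log scale to avoid overflow in the gamma function.
    log_density = ((k / 2 - 1) * math.log(y) - y / 2
                   - math.lgamma(k / 2) - (k / 2) * math.log(2))
    return math.exp(log_density)

def exponential_pdf(y, mean):
    """Exponential density with the given mean (zero for y <= 0)."""
    if y <= 0:
        return 0.0
    return math.exp(-y / mean) / mean

# With 2 degrees of freedom the two densities should coincide.
for y in (0.5, 1.0, 3.0):
    print(y, chi2_pdf(y, 2), exponential_pdf(y, 2.0))
```

The log-scale computation is a design choice for numerical stability; for small degrees of freedom a direct evaluation with `math.gamma` would work equally well.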
Just from the gamma connection, the mean and variance of a chi-squared random variable $Y$ with $k$ degrees of freedom are $E[Y]=k$ and $Var[Y]=2k$. In other words, the mean of a chi-squared distribution is the same as the degrees of freedom and its variance is always twice the degrees of freedom. As a gamma distribution, the higher moments $E[Y^n]$ are also known. Consequently the properties that depend on $E[Y^n]$ can be easily computed. See here for the basic properties of the gamma distribution. The following gives the mean, variance and the moment generating function (MGF) for the chi-squared random variable $Y$ with $k$ degrees of freedom.
$E[Y]=k$
$Var[Y]=2k$
$M_Y(t)=\left(\frac{1}{1-2t}\right)^{k/2}, \quad t<\frac{1}{2}$
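The mean, variance and MGF can be checked by simulation. The following is a rough sketch, assuming chi-squared values are generated as gamma variates with shape equal to half the degrees of freedom and scale 2 (Python's `random.gammavariate` takes the shape and then the scale). The seed, the choice of 5 degrees of freedom, the sample size and the point `t = 0.1` are arbitrary choices made for illustration.

```python
import math
import random
import statistics

def sample_chi_squared(k, size, rng):
    """Draw chi-squared values with k degrees of freedom as gamma
    variates with shape k/2 and scale 2."""
    return [rng.gammavariate(k / 2, 2.0) for _ in range(size)]

rng = random.Random(2024)   # fixed seed so the run is repeatable
k = 5                       # degrees of freedom (arbitrary choice)
draws = sample_chi_squared(k, 100_000, rng)

sample_mean = statistics.fmean(draws)
sample_var = statistics.variance(draws)

# Monte Carlo estimate of the MGF at t = 0.1 versus (1 / (1 - 2t))^(k/2).
t = 0.1
mgf_estimate = statistics.fmean(math.exp(t * y) for y in draws)
mgf_theory = (1 / (1 - 2 * t)) ** (k / 2)

print(f"mean     {sample_mean:.3f}  (theory {k})")
print(f"variance {sample_var:.3f}  (theory {2 * k})")
print(f"MGF(0.1) {mgf_estimate:.4f}  (theory {mgf_theory:.4f})")
```

The simulated values should land close to the theoretical ones, with the usual sampling error of a Monte Carlo estimate.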
_______________________________________________________________________________________________
Independent Sum of Chi-Squared Distributions
In general, the MGF of an independent sum $Y_1+Y_2+\cdots+Y_n$ is simply the product of the MGFs of the individual random variables $Y_1,Y_2,\dots,Y_n$. Note that the product of chi-squared MGFs is also a chi-squared MGF, with the exponent being the sum of the individual exponents. This brings up another point that is important for the subsequent discussion: the independent sum of chi-squared distributions is also a chi-squared distribution. The following theorem states this fact more precisely.
Theorem 1
If $Y_1,Y_2,\dots,Y_n$ are independent chi-squared random variables with degrees of freedom $k_1,k_2,\dots,k_n$, respectively, then the sum $Y_1+Y_2+\cdots+Y_n$ has a chi-squared distribution with $k_1+k_2+\cdots+k_n$ degrees of freedom.
Thus the result of summing independent chi-squared distributions is another chi-squared distribution, with degrees of freedom equal to the total of all the individual degrees of freedom. This follows from the fact that if gamma distributions have an identical scale parameter, then their independent sum is a gamma distribution whose shape parameter is the sum of the individual shape parameters. This point is discussed in more detail here.
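Theorem 1 can be illustrated with a small simulation. The sketch below adds independent chi-squared draws with 4, 5 and 6 degrees of freedom (hypothetical values chosen so that they total 15) and checks two things about the sums: the average should be near 15, and about 10% of the sums should fall at or below 8.547, the 10th percentile of the chi-squared distribution with 15 degrees of freedom used in the table lookup later in this post. The seed and number of trials are arbitrary.

```python
import random

rng = random.Random(7)      # fixed seed so the run is repeatable
dfs = [4, 5, 6]             # hypothetical degrees of freedom, total 15
n_trials = 100_000

def chi_squared_draw(k):
    """One chi-squared draw with k degrees of freedom
    (gamma variate with shape k/2 and scale 2)."""
    return rng.gammavariate(k / 2, 2.0)

# Each trial adds one independent draw per entry in dfs.
sums = [sum(chi_squared_draw(k) for k in dfs) for _ in range(n_trials)]

mean_of_sums = sum(sums) / n_trials
frac_below = sum(s <= 8.547 for s in sums) / n_trials

print(f"mean of sums: {mean_of_sums:.3f} (theory {sum(dfs)})")
print(f"fraction <= 8.547: {frac_below:.4f} (theory about 0.10)")
```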
_______________________________________________________________________________________________
The Connection with Normal Distributions
As shown in the above section, the chi-squared distribution is simple from a mathematical standpoint. Since it is a gamma distribution, it possesses all the properties that are associated with the gamma family. Of course, the gamma connection is far from the whole story. One important fact is that the chi-squared distribution arises naturally from sampling from a normal distribution.
Theorem 2
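Suppose that the random variable $X$ follows a standard normal distribution, i.e. the normal distribution with mean 0 and standard deviation 1. Then $Y=X^2$ follows a chi-squared distribution with 1 degree of freedom.
Proof
By definition, the following is the cumulative distribution function (CDF) of $Y=X^2$, for $y>0$.
$F_Y(y)=P(X^2 \le y)=P(-\sqrt{y} \le X \le \sqrt{y})=2 \int_0^{\sqrt{y}} \frac{1}{\sqrt{2 \pi}} \, e^{-x^2/2} \, dx$
Upon differentiating $F_Y(y)$ with respect to $y$, the density function is obtained.
$f_Y(y)=2 \cdot \frac{1}{\sqrt{2 \pi}} \, e^{-y/2} \cdot \frac{1}{2 \sqrt{y}}=\frac{1}{\Gamma(\frac{1}{2}) \, 2^{1/2}} \, y^{\frac{1}{2}-1} \, e^{-\frac{y}{2}}$
Since $\Gamma(\frac{1}{2})=\sqrt{\pi}$, the density is that of a chi-squared distribution with 1 degree of freedom.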
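Theorem 2 can also be checked numerically. The CDF of the square of a standard normal variable can be written with the error function as erf(sqrt(y/2)), which is the same quantity as twice the standard normal CDF at sqrt(y) minus 1. The sketch below compares that closed form with the empirical proportion from simulated squared standard normal draws at y = 2.706, the 90th percentile for 1 degree of freedom used in the table lookup later in this post. The function name, seed and sample size are arbitrary choices for illustration.

```python
import math
import random

def cdf_of_squared_standard_normal(y):
    """P(X^2 <= y) for a standard normal X, written with the
    error function: 2 * Phi(sqrt(y)) - 1 = erf(sqrt(y / 2))."""
    if y <= 0:
        return 0.0
    return math.erf(math.sqrt(y / 2))

rng = random.Random(11)     # fixed seed so the run is repeatable
squares = [rng.gauss(0.0, 1.0) ** 2 for _ in range(100_000)]

y = 2.706
exact = cdf_of_squared_standard_normal(y)
empirical = sum(s <= y for s in squares) / len(squares)

print(f"closed form: {exact:.4f}")
print(f"simulation:  {empirical:.4f}")
```

Both numbers should be close to 0.90, which is what a chi-squared distribution with 1 degree of freedom assigns to the interval up to 2.706.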
With the basic results in Theorems 1 and 2, there are more ways to obtain chi-squared distributions from sampling from normal distributions. For example, first normalizing a sample item from normal sampling and then squaring it will produce a chi-squared observation with 1 degree of freedom. Similarly, by performing the same normalizing on each sample item in a normal sample and by squaring each normalized observation, the resulting sum has a chi-squared distribution. These statements are made more precise in the following corollaries.
Corollary 3
Suppose that the random variable $X$ follows a normal distribution with mean $\mu$ and standard deviation $\sigma$. Then $\left(\frac{X-\mu}{\sigma}\right)^2$ follows a chi-squared distribution with 1 degree of freedom.
Corollary 4
Suppose that $X_1,X_2,\dots,X_n$ is a random sample drawn from a normal distribution with mean $\mu$ and standard deviation $\sigma$. Then the following random variable follows a chi-squared distribution with $n$ degrees of freedom.
$\sum_{i=1}^n \left(\frac{X_i-\mu}{\sigma}\right)^2$
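A simulation makes Corollary 4 concrete. The sketch below repeatedly draws a normal sample of size 15, standardizes each observation with the known population mean and standard deviation, squares the results and adds them up. If the corollary holds, the resulting statistic should average about 15, and roughly 10% of its values should fall at or below 8.547 (the 10th percentile for 15 degrees of freedom used in the table lookup later in this post). The population mean of 50 and standard deviation of 8 are hypothetical, as are the seed and the number of trials.

```python
import random

rng = random.Random(31)     # fixed seed so the run is repeatable
mu, sigma = 50.0, 8.0       # hypothetical population mean and std deviation
n = 15                      # sample size, so 15 degrees of freedom
n_trials = 20_000

def sum_of_squared_z_scores(sample, mu, sigma):
    """Standardize each observation, square it, and add the squares."""
    return sum(((x - mu) / sigma) ** 2 for x in sample)

statistic_values = []
for _ in range(n_trials):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    statistic_values.append(sum_of_squared_z_scores(sample, mu, sigma))

mean_stat = sum(statistic_values) / n_trials
frac_below = sum(v <= 8.547 for v in statistic_values) / n_trials

print(f"mean of statistic: {mean_stat:.3f} (theory {n})")
print(f"fraction <= 8.547: {frac_below:.4f} (theory about 0.10)")
```

Note that the statistic uses the true population mean and standard deviation, not the sample estimates, which is what the corollary assumes.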
_______________________________________________________________________________________________
Calculating Chi-Squared Probabilities
In working with the chi-squared distribution, it is necessary to evaluate the cumulative distribution function (CDF). In hypothesis testing, it is necessary to calculate the p-value given the value of the chi-squared statistic. In confidence interval estimation, it is necessary to determine the critical value at a given confidence level. The standard procedure at one point in time was to use a chi-squared table. A typical chi-squared table can be found here. We demonstrate how to find chi-squared probabilities first using the table approach and subsequently using software (Excel in particular).
The table gives the probabilities on the right tail. The table in the link given above gives the chi-squared value $\chi_{\alpha}^2$ (on the x-axis) for a given area of the right tail ($\alpha$) per df. This table lookup is illustrated in the diagram below.
Figure 2 – Right Tail of Chi-squared Distribution
For df = 1, $\chi_{0.10}^2=2.706$, thus $P(Y>2.706)=0.10$ and $P(Y \le 2.706)=0.90$, where $Y$ denotes a chi-squared random variable with the stated df. So for df = 1, the 90th percentile of the chi-squared distribution is 2.706. The following shows more table lookups.

df = 2, $\chi_{0.01}^2=9.210$.
$P(Y>9.210)=0.01$ and $P(Y \le 9.210)=0.99$
The 99th percentile of the chi-squared distribution with df = 2 is 9.210.
df = 15, $\chi_{0.90}^2=8.547$.
$P(Y>8.547)=0.90$ and $P(Y \le 8.547)=0.10$
The 10th percentile of the chi-squared distribution with df = 15 is 8.547.
The choices for the right-tail area $\alpha$ in the table are limited. Software offers a wider selection of $\alpha$ and gives more precise values. For example, Microsoft Excel provides the following two functions.

=CHISQ.DIST(x, deg_freedom, cumulative)
=CHISQ.INV(probability, deg_freedom)
The two functions in Excel give information about the left tail of the chi-squared distribution. The function CHISQ.DIST returns the left-tailed probability of the chi-squared distribution. The parameter cumulative is either TRUE or FALSE, with TRUE meaning that the result is the cumulative distribution function and FALSE meaning that the result is the probability density function. On the other hand, the function CHISQ.INV returns the inverse of the left-tailed probability of the chi-squared distribution.
If the goal is to find the probability given an x-value, use the function CHISQ.DIST. On the other hand, if the goal is to find the x-value given the left-tailed probability, use the function CHISQ.INV. In the table approach, once the value is found, the interplay between the probability ($\alpha$) and the x-value is clear. In the case of Excel, one must first choose the function depending on the goal. The following gives the equivalent results for the table lookups presented above.

=CHISQ.DIST(2.706, 1, TRUE) = 0.900028622
=CHISQ.INV(0.9, 1) = 2.705543454
=CHISQ.DIST(9.21, 2, TRUE) = 0.989998298
=CHISQ.INV(0.99, 2) = 9.210340372
=CHISQ.DIST(8.547, 15, TRUE) = 0.100011427
=CHISQ.INV(0.1, 15) = 8.546756242
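For readers without Excel, the same left-tail calculations can be sketched in plain Python. The code below is not how Excel computes these values; it is a minimal sketch that evaluates the chi-squared CDF through a standard power series for the regularized lower incomplete gamma function, and inverts it by bisection. The function names `chi2_cdf` and `chi2_inv` are made up for this sketch, and the series is only intended for moderate x-values such as those in the table.

```python
import math

def chi2_cdf(x, df):
    """Left-tail probability P(Y <= x) for a chi-squared variable with
    df degrees of freedom, via the series for the regularized lower
    incomplete gamma function P(a, z) with a = df/2 and z = x/2."""
    if x <= 0:
        return 0.0
    a, z = df / 2, x / 2
    term, total, n = 1.0, 1.0, 0
    # Sum z^n / ((a+1)(a+2)...(a+n)) until the terms are negligible.
    while term > 1e-16 * total:
        n += 1
        term *= z / (a + n)
        total += term
    # Multiply by z^a * e^(-z) / Gamma(a + 1), computed on the log scale.
    return total * math.exp(a * math.log(z) - z - math.lgamma(a + 1))

def chi2_inv(p, df):
    """Smallest x with P(Y <= x) >= p, found by bisection (0 < p < 1)."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < p:     # expand until the root is bracketed
        hi *= 2
    for _ in range(100):            # halve the bracket repeatedly
        mid = (lo + hi) / 2
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Same lookups as the Excel calls above.
for x, df in ((2.706, 1), (9.21, 2), (8.547, 15)):
    print(f"chi2_cdf({x}, {df}) = {chi2_cdf(x, df):.9f}")
for p, df in ((0.9, 1), (0.99, 2), (0.1, 15)):
    print(f"chi2_inv({p}, {df}) = {chi2_inv(p, df):.9f}")
```

As with the Excel functions, the choice between `chi2_cdf` and `chi2_inv` depends on whether the x-value or the left-tail probability is the known quantity. The printed values should agree with the Excel results listed above to several decimal places.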
_______________________________________________________________________________________________