This is part 2 of a 3-part series on the chi-squared distribution. In this post, we discuss several theorems, all centered around the chi-squared distribution, that play important roles in inferential statistics for the population mean and population variance of normal populations. These theorems are the basis for the test statistics used in the inferential procedures.

We first discuss the setting for the inference procedures. We then discuss the pivotal theorem (Theorem 5), followed by the theorems that produce the test statistics for $\mu$, the population mean of a normal population, and for $\mu_X-\mu_Y$, the difference of two population means from two normal populations. The discussion then shifts to the inference procedures on population variance.

_______________________________________________________________________________________________

**The Settings**

To facilitate the discussion, we use the notation $N(\mu,\sigma^2)$ to denote the normal distribution with mean $\mu$ and variance $\sigma^2$. Whenever the random variable $X$ follows such a distribution, we use the notation $X \sim N(\mu,\sigma^2)$.

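The setting for making inference on one population is that we have a random sample $X_1,X_2,\ldots,X_n$, drawn from a normal population $N(\mu,\sigma^2)$. The sample mean $\bar{X}$ and the sample variance $S^2$ are unbiased estimators of $\mu$ and $\sigma^2$, respectively, given by:

$$\bar{X}=\frac{1}{n}\sum_{i=1}^n X_i \qquad\qquad S^2=\frac{1}{n-1}\sum_{i=1}^n (X_i-\bar{X})^2$$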

The goal is to use the information obtained from the sample, namely $\bar{X}$ and $S^2$, to estimate or make decisions about the unknown population parameters $\mu$ and $\sigma^2$.

Because the sample is drawn from a normal population, the sample mean $\bar{X}$ has a normal distribution, more specifically $\bar{X} \sim N(\mu,\sigma^2/n)$, which has two unknown parameters. To perform inferential procedures on the population mean $\mu$, it is preferable to have a test statistic that depends on $\mu$ only. To this end, a t-statistic is used (see Theorem 7), which has the t-distribution with $n-1$ degrees of freedom (one less than the sample size). Because the parameter $\sigma^2$ is replaced by the sample variance $S^2$, the t-statistic has only $\mu$ as the unknown parameter.

On the other hand, to perform inferential procedures on the population variance $\sigma^2$, we use a statistic that has a chi-squared distribution and that has only one unknown parameter, $\sigma^2$ (see Theorem 5).

We now turn to the setting for performing inference on two normal populations. Let $X_1,X_2,\ldots,X_n$ be a random sample drawn from the distribution $N(\mu_X,\sigma_X^2)$. Let $Y_1,Y_2,\ldots,Y_m$ be a random sample drawn from the distribution $N(\mu_Y,\sigma_Y^2)$, independent of the first sample. Because the two samples are independent, the difference of the sample means $\bar{X}-\bar{Y}$ has a normal distribution. Specifically, $\bar{X}-\bar{Y} \sim N\left(\mu_X-\mu_Y,\ \frac{\sigma_X^2}{n}+\frac{\sigma_Y^2}{m}\right)$. Theorem 8 gives a t-statistic in terms of the difference $\mu_X-\mu_Y$ in which the two unknown population variances are replaced by the pooled sample variance. This is done with the simplifying assumption that the two population variances are identical.

On the other hand, for inference on the population variances $\sigma_X^2$ and $\sigma_Y^2$, a statistic that has the F-distribution can be used (see Theorem 10). One caveat is that this test statistic is sensitive to non-normality.

_______________________________________________________________________________________________

**Connection between Normal Distribution and Chi-squared Distribution**

There is an intimate relation between the sample items from a normal distribution and the chi-squared distribution. This is discussed in Part 1. Let's recall this connection. If we normalize one sample item and then square it, we obtain a chi-squared random variable with df = 1. Likewise, if we normalize each of the $n$ sample items and then square it, the sum of the squares will be a chi-squared random variable with df = $n$. The following results are discussed in Part 1 and are restated here for clarity.

*Theorem 2*

Suppose that the random variable $Z$ follows a standard normal distribution, i.e. the normal distribution with mean 0 and standard deviation 1. Then $Z^2$ follows a chi-squared distribution with 1 degree of freedom.

*Corollary 3*

Suppose that the random variable $X$ follows a normal distribution with mean $\mu$ and standard deviation $\sigma$. Then $\left(\frac{X-\mu}{\sigma}\right)^2$ follows a chi-squared distribution with 1 degree of freedom.

*Corollary 4*

Suppose that $X_1,X_2,\ldots,X_n$ is a random sample drawn from a normal distribution with mean $\mu$ and standard deviation $\sigma$. Then the following random variable follows a chi-squared distribution with $n$ degrees of freedom.

$$\sum_{i=1}^n \left(\frac{X_i-\mu}{\sigma}\right)^2$$
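Corollary 4 can be checked numerically. The following is a minimal simulation sketch, not part of the original discussion. It relies on the standard fact that a chi-squared distribution with $n$ degrees of freedom has mean $n$ and variance $2n$. The function names, the parameter values ($n=5$, $\mu=10$, $\sigma=2$) and the seed are arbitrary choices made for illustration.

```python
import random


def corollary4_statistic(sample, mu, sigma):
    """Sum of squared standardized observations: sum(((x - mu) / sigma) ** 2)."""
    return sum(((x - mu) / sigma) ** 2 for x in sample)


def simulate_corollary4(n=5, mu=10.0, sigma=2.0, reps=20000, seed=0):
    """Draw `reps` normal samples of size n and return the mean and the
    variance of the Corollary 4 statistic across those samples."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        values.append(corollary4_statistic(sample, mu, sigma))
    mean = sum(values) / reps
    var = sum((v - mean) ** 2 for v in values) / (reps - 1)
    return mean, var


mean_q, var_q = simulate_corollary4()
# Theory for chi-squared with n = 5 degrees of freedom: mean 5, variance 10.
print(f"simulated mean     = {mean_q:.3f}")
print(f"simulated variance = {var_q:.3f}")
```

Matching the first two moments does not prove the distributional claim, but it is a quick sanity check that the statistic behaves as the corollary says.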

_______________________________________________________________________________________________

**A Pivotal Theorem**

The statistic in Corollary 4 has two unknown parameters $\mu$ and $\sigma^2$. It turns out that the statistic will become more useful if $\mu$ is replaced by the sample mean $\bar{X}$. The cost is that one degree of freedom is lost in the chi-squared distribution. The following theorem gives the details. The result is a statistic that is a function of the sample variance $S^2$ and the population variance $\sigma^2$.

*Theorem 5*

Let $X_1,X_2,\ldots,X_n$ be a random sample drawn from a normal distribution with mean $\mu$ and variance $\sigma^2$. Then the following conditions hold.

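- The sample mean $\bar{X}$ and the sample variance $S^2$ are independent.
- The statistic $\displaystyle \frac{(n-1)S^2}{\sigma^2}=\frac{\sum_{i=1}^n (X_i-\bar{X})^2}{\sigma^2}$ has a chi-squared distribution with $n-1$ degrees of freedom.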

*Proof of Theorem 5*

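We do not prove the first bullet point. For a proof, see Exercise 13.93 in [2]. For the second bullet point, note that

$$\sum_{i=1}^n (X_i-\mu)^2=\sum_{i=1}^n \left[(X_i-\bar{X})+(\bar{X}-\mu)\right]^2=\sum_{i=1}^n (X_i-\bar{X})^2+n(\bar{X}-\mu)^2$$

Note that in expanding $\left[(X_i-\bar{X})+(\bar{X}-\mu)\right]^2$, the sum of the middle terms equals 0, because $\sum_{i=1}^n (X_i-\bar{X})=0$. Furthermore, after dividing both sides by $\sigma^2$, the result can be restated as follows:

$$Q=\frac{(n-1)S^2}{\sigma^2}+Z^2$$

where $Q=\sum_{i=1}^n \left(\frac{X_i-\mu}{\sigma}\right)^2$ and $Z=\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}$. Note that $Z$ is a standard normal random variable. Thus $Z^2$ has a chi-squared distribution with df = 1 (by Theorem 2). Since $Q$ is a sum of squares of independent standardized normal variables, $Q$ has a chi-squared distribution with $n$ degrees of freedom (by Corollary 4). Furthermore, since $\bar{X}$ and $S^2$ are independent, $Z^2$ and $S^2$ are independent. Let $W=\frac{(n-1)S^2}{\sigma^2}$. As a result, $W$ and $Z^2$ are independent. The following gives the moment generating function (MGF) of $Q$.

$$E\left[e^{tQ}\right]=E\left[e^{t(W+Z^2)}\right]=E\left[e^{tW}\right]\,E\left[e^{tZ^2}\right]$$

Since $Q$ and $Z^2$ follow chi-squared distributions, we can plug in the chi-squared MGFs to obtain the MGF of the random variable $W$.

$$(1-2t)^{-n/2}=E\left[e^{tW}\right]\,(1-2t)^{-1/2} \quad\Longrightarrow\quad E\left[e^{tW}\right]=(1-2t)^{-(n-1)/2}, \qquad t<\tfrac{1}{2}$$

The MGF for $W$ is that of a chi-squared distribution with $n-1$ degrees of freedom.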
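Both bullet points of Theorem 5 can be probed with a small simulation. The sketch below is an illustration under assumed parameter values ($n=6$, $\mu=3$, $\sigma=1.5$) and helper names that are not part of the original post. It checks that $(n-1)S^2/\sigma^2$ has mean $n-1$ and variance $2(n-1)$, and that the sample correlation between $\bar{X}$ and $S^2$ is near zero. Zero correlation is only a necessary consequence of independence, so this is a sanity check rather than a verification of the first bullet point.

```python
import random


def sample_mean_and_variance(sample):
    """Return (xbar, s2), where s2 uses the n - 1 divisor."""
    n = len(sample)
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
    return xbar, s2


def simulate_theorem5(n=6, mu=3.0, sigma=1.5, reps=20000, seed=1):
    """Simulate W = (n - 1) * S^2 / sigma^2 and return its mean, its variance,
    and the sample correlation between the sample mean and W."""
    rng = random.Random(seed)
    means, stats = [], []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar, s2 = sample_mean_and_variance(sample)
        means.append(xbar)
        stats.append((n - 1) * s2 / sigma ** 2)
    w_mean = sum(stats) / reps
    w_var = sum((w - w_mean) ** 2 for w in stats) / (reps - 1)
    m_mean = sum(means) / reps
    m_var = sum((m - m_mean) ** 2 for m in means) / (reps - 1)
    cov = sum((m - m_mean) * (w - w_mean)
              for m, w in zip(means, stats)) / (reps - 1)
    corr = cov / (m_var * w_var) ** 0.5
    return w_mean, w_var, corr


w_mean, w_var, corr = simulate_theorem5()
# Theory for chi-squared with n - 1 = 5 degrees of freedom: mean 5, variance 10.
print(f"mean of W = {w_mean:.3f}, variance of W = {w_var:.3f}")
print(f"correlation between sample mean and W = {corr:.4f}")
```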

*Remark*

It is interesting to compare the following two quantities:

$$\sum_{i=1}^n \left(\frac{X_i-\mu}{\sigma}\right)^2 \qquad\qquad \sum_{i=1}^n \left(\frac{X_i-\bar{X}}{\sigma}\right)^2=\frac{(n-1)S^2}{\sigma^2}$$

The first quantity is from Corollary 4 and has a chi-squared distribution with $n$ degrees of freedom. The second quantity is from Theorem 5 and has a chi-squared distribution with $n-1$ degrees of freedom. Thus the effect of Theorem 5 is that by replacing the population mean $\mu$ with the sample mean $\bar{X}$, one degree of freedom is lost in the chi-squared distribution.

Theorem 5 is a pivotal theorem that has wide applications. For our purposes at hand, it can be used for inference on both the mean and variance. Even though one degree of freedom is lost, the statistic $\frac{(n-1)S^2}{\sigma^2}$ is a function of one unknown parameter, namely the population variance $\sigma^2$. Since the sampling distribution is known (chi-squared), we can make probability statements about the statistic. Hence the statistic is useful for making inference about the population variance $\sigma^2$. As we will see below, in conjunction with other statistics, the statistic in Theorem 5 can be used for inference on two population variances as well as for inference on the mean (one sample and two samples).
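As a worked illustration of such a probability statement, here is the standard confidence interval for $\sigma^2$ that follows from Theorem 5. The quantile notation $\chi^2_{p,\,n-1}$, meaning the point with area $p$ to its left under the chi-squared density with $n-1$ degrees of freedom, is introduced here for illustration and is not part of the original discussion.

$$P\left(\chi^2_{\alpha/2,\,n-1} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{1-\alpha/2,\,n-1}\right)=1-\alpha \quad\Longrightarrow\quad \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}}$$

The interval on the right is obtained by solving the two inequalities on the left for $\sigma^2$, so it covers $\sigma^2$ with probability $1-\alpha$.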

_______________________________________________________________________________________________

**Basis for Inference on Population Mean**

Inference on the population mean of a single normal population and on the difference of the means of two independent normal populations relies on the t-statistic. Theorem 6 shows how to obtain a t-statistic using a chi-squared statistic and the standard normal statistic. Theorem 7 provides the one-sample t-statistic and Theorem 8 provides the two-sample t-statistic.

*Theorem 6*

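Let $Z$ be the standard normal random variable. Let $U$ be a random variable that has a chi-squared distribution with $r$ degrees of freedom. Suppose that $Z$ and $U$ are independent. Then the random variable

$$T=\frac{Z}{\sqrt{U/r}}$$

has a t-distribution with $r$ degrees of freedom and its probability density function (PDF) is

$$g(t)=\frac{\Gamma\left(\frac{r+1}{2}\right)}{\sqrt{\pi r}\ \Gamma\left(\frac{r}{2}\right)}\left(1+\frac{t^2}{r}\right)^{-\frac{r+1}{2}}, \qquad -\infty<t<\infty$$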

*Remark*

The probability density function given here is not important for the purpose at hand. For the proof of Theorem 6, see [2]. The following two theorems give two applications of Theorem 6.
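The construction in Theorem 6 can be mimicked directly: draw a standard normal value, draw an independent chi-squared value as a sum of squared standard normals (as in Corollary 4), and form the ratio. The sketch below is an illustration only. It uses the known fact that a t-distribution with $r>2$ degrees of freedom has mean 0 and variance $r/(r-2)$. The choice $r=10$, the function name and the seed are assumptions made for the example.

```python
import math
import random


def simulate_t_ratio(r=10, reps=40000, seed=2):
    """Simulate T = Z / sqrt(U / r), with Z standard normal and U an
    independent chi-squared variable with r degrees of freedom, and
    return the sample mean and sample variance of T."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        z = rng.gauss(0.0, 1.0)
        # U is built from r fresh standard normal draws, independent of z.
        u = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(r))
        values.append(z / math.sqrt(u / r))
    mean = sum(values) / reps
    var = sum((v - mean) ** 2 for v in values) / (reps - 1)
    return mean, var


t_mean, t_var = simulate_t_ratio()
# Theory for r = 10: mean 0 and variance 10 / 8 = 1.25, which is larger
# than the variance 1 of the standard normal distribution.
print(f"simulated mean = {t_mean:.4f}, simulated variance = {t_var:.4f}")
```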

*Theorem 7*

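Let $X_1,X_2,\ldots,X_n$ be a random sample drawn from a normal distribution with mean $\mu$ and variance $\sigma^2$. Let $S^2=\frac{1}{n-1}\sum_{i=1}^n (X_i-\bar{X})^2$ be the sample variance. Then the random variable

$$T=\frac{\bar{X}-\mu}{S/\sqrt{n}}$$

has a t-distribution with $n-1$ degrees of freedom.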

*Proof of Theorem 7*

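Consider the following statistics.

$$Z=\frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \qquad\qquad U=\frac{(n-1)S^2}{\sigma^2}$$

Note that $Z$ has the standard normal distribution. By Theorem 5, the quantity $U$ has a chi-squared distribution with df = $n-1$, and $Z$ and $U$ are independent because $\bar{X}$ and $S^2$ are independent. By Theorem 6, the following quantity has a t-distribution with df = $n-1$.

$$T=\frac{Z}{\sqrt{U/(n-1)}}=\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\cdot\frac{\sigma}{S}=\frac{\bar{X}-\mu}{S/\sqrt{n}}$$

The above result is obtained after performing algebraic simplification: the unknown $\sigma$ cancels.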
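To make the one-sample statistic concrete, the following sketch computes $T=(\bar{X}-\mu_0)/(S/\sqrt{n})$ for a small sample. The data values and the hypothesized mean $\mu_0=5$ are made up for illustration, and the function name is hypothetical. Turning the statistic into a p-value or a confidence interval would additionally require t-distribution tables or software, which is not shown here.

```python
import math


def one_sample_t(sample, mu0):
    """Return the one-sample t-statistic for hypothesized mean mu0,
    together with its degrees of freedom n - 1."""
    n = len(sample)
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)  # sample variance
    t = (xbar - mu0) / math.sqrt(s2 / n)
    return t, n - 1


data = [4.9, 5.1, 5.3, 4.8, 5.4]  # made-up measurements
t_stat, df = one_sample_t(data, mu0=5.0)
print(f"t = {t_stat:.4f} with {df} degrees of freedom")
```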

*Theorem 8*

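Let $X_1,X_2,\ldots,X_n$ be a random sample drawn from a normal distribution with mean $\mu_X$ and variance $\sigma_X^2$. Let $Y_1,Y_2,\ldots,Y_m$ be a random sample drawn from a normal distribution with mean $\mu_Y$ and variance $\sigma_Y^2$. Suppose that the two samples are independent and that $\sigma_X^2=\sigma_Y^2$. Then the following statistic:

$$T=\frac{(\bar{X}-\bar{Y})-(\mu_X-\mu_Y)}{S_p\sqrt{\frac{1}{n}+\frac{1}{m}}}$$

has a t-distribution with df = $n+m-2$ where $\displaystyle S_p^2=\frac{(n-1)S_X^2+(m-1)S_Y^2}{n+m-2}$.

Note that $S_p^2$ is the pooled variance of the two sample variances $S_X^2$ and $S_Y^2$.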

*Proof of Theorem 8*

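First, the sample mean $\bar{X}$ has a normal distribution with mean $\mu_X$ and variance $\frac{\sigma_X^2}{n}$. The sample mean $\bar{Y}$ has a normal distribution with mean $\mu_Y$ and variance $\frac{\sigma_Y^2}{m}$. Since the two samples are independent, $\bar{X}$ and $\bar{Y}$ are independent. Thus the difference $\bar{X}-\bar{Y}$ has a normal distribution with mean $\mu_X-\mu_Y$ and variance $\frac{\sigma_X^2}{n}+\frac{\sigma_Y^2}{m}$. The following is a standardized normal random variable:

$$Z=\frac{(\bar{X}-\bar{Y})-(\mu_X-\mu_Y)}{\sqrt{\frac{\sigma_X^2}{n}+\frac{\sigma_Y^2}{m}}}$$

On the other hand, by Theorem 5 the following quantities have chi-squared distributions with degrees of freedom $n-1$ and $m-1$, respectively.

$$\frac{(n-1)S_X^2}{\sigma_X^2} \qquad\qquad \frac{(m-1)S_Y^2}{\sigma_Y^2}$$

Because the two samples are independent, the two chi-squared statistics are independent. Then the following is a chi-squared statistic with $n+m-2$ degrees of freedom.

$$U=\frac{(n-1)S_X^2}{\sigma_X^2}+\frac{(m-1)S_Y^2}{\sigma_Y^2}$$

By Theorem 5 and the independence of the two samples, $Z$ and $U$ are independent. By Theorem 6, the following ratio

$$T=\frac{Z}{\sqrt{U/(n+m-2)}}$$

has a t-distribution with $n+m-2$ degrees of freedom. Here's where the simplifying assumption of $\sigma_X^2=\sigma_Y^2=\sigma^2$ is used. Plugging in this assumption gives the following:

$$T=\frac{(\bar{X}-\bar{Y})-(\mu_X-\mu_Y)}{\sigma\sqrt{\frac{1}{n}+\frac{1}{m}}}\cdot\frac{\sigma}{\sqrt{\frac{(n-1)S_X^2+(m-1)S_Y^2}{n+m-2}}}=\frac{(\bar{X}-\bar{Y})-(\mu_X-\mu_Y)}{S_p\sqrt{\frac{1}{n}+\frac{1}{m}}}$$

where $S_p^2=\frac{(n-1)S_X^2+(m-1)S_Y^2}{n+m-2}$ is the pooled sample variance of the two samples as indicated above.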
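The pooled two-sample statistic is easy to compute by hand or in a few lines of code. The sketch below is an illustration with made-up data and hypothetical function names. It assumes, as Theorem 8 does, that the two population variances are equal, and it tests the hypothesized difference $\mu_X-\mu_Y=0$ by default.

```python
import math


def sample_variance(sample):
    """Sample variance with the n - 1 divisor."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((v - mean) ** 2 for v in sample) / (n - 1)


def pooled_t(x, y, diff0=0.0):
    """Return (t, pooled variance, degrees of freedom) for the two-sample
    pooled t-statistic with hypothesized mean difference diff0."""
    n, m = len(x), len(y)
    sp2 = ((n - 1) * sample_variance(x)
           + (m - 1) * sample_variance(y)) / (n + m - 2)
    diff = sum(x) / n - sum(y) / m
    t = (diff - diff0) / math.sqrt(sp2 * (1.0 / n + 1.0 / m))
    return t, sp2, n + m - 2


x = [5.1, 4.9, 5.6, 5.8, 6.0]  # made-up sample from the first population
y = [4.8, 4.4, 5.0, 4.6]       # made-up sample from the second population
t_stat, sp2, df = pooled_t(x, y)
print(f"pooled variance = {sp2:.4f}, t = {t_stat:.4f}, df = {df}")
```

Note that the pooled variance weights each sample variance by its degrees of freedom, so the larger sample has more influence on the estimate of the common variance.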

_______________________________________________________________________________________________

**Basis for Inference on Population Variance**

As indicated above, the statistic given in Theorem 5 can be used for inference on the variance of a normal population. The following theorem gives the basis for the statistic used for comparing the variances of two normal populations.

*Theorem 9*

Suppose that the random variables $U$ and $V$ are independent chi-squared random variables with $r_1$ and $r_2$ degrees of freedom, respectively. Then the statistic

$$F=\frac{U/r_1}{V/r_2}$$

has an F-distribution with $r_1$ and $r_2$ degrees of freedom.

*Remark*

The F-distribution depends on two parameters $r_1$ and $r_2$. The order in which they are given is important. We regard the first parameter as the degrees of freedom of the chi-squared distribution in the numerator and the second parameter as the degrees of freedom of the chi-squared distribution in the denominator.

It is not important to know the probability density functions for both the t-distribution and the F-distribution (in both Theorem 6 and Theorem 9). When doing inference procedures with these distributions, either tables or software will be used.

Given two independent normal random samples $X_1,X_2,\ldots,X_n$ and $Y_1,Y_2,\ldots,Y_m$ (as discussed in the above section on the settings of inference), the sample variance $S_X^2$ is an unbiased estimator of the population variance $\sigma_X^2$ of the first population, and the sample variance $S_Y^2$ is an unbiased estimator of the population variance $\sigma_Y^2$ of the second population. It seems to make sense that the ratio $S_X^2/S_Y^2$ can be used to make inference about the relative magnitude of $\sigma_X^2$ and $\sigma_Y^2$. The following theorem indicates that this is a valid approach.

*Theorem 10*

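Let $X_1,X_2,\ldots,X_n$ be a random sample drawn from a normal distribution with mean $\mu_X$ and variance $\sigma_X^2$. Let $Y_1,Y_2,\ldots,Y_m$ be a random sample drawn from a normal distribution with mean $\mu_Y$ and variance $\sigma_Y^2$. Suppose that the two samples are independent. Then the statistic

$$\frac{S_X^2/\sigma_X^2}{S_Y^2/\sigma_Y^2}$$

has the F-distribution with degrees of freedom $n-1$ and $m-1$.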

*Proof of Theorem 10*

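By Theorem 5, $\frac{(n-1)S_X^2}{\sigma_X^2}$ has a chi-squared distribution with $n-1$ degrees of freedom and $\frac{(m-1)S_Y^2}{\sigma_Y^2}$ has a chi-squared distribution with $m-1$ degrees of freedom. The two are independent because the two samples are independent. By Theorem 9, the following statistic

$$\frac{\left[(n-1)S_X^2/\sigma_X^2\right]/(n-1)}{\left[(m-1)S_Y^2/\sigma_Y^2\right]/(m-1)}$$

has the F-distribution with $n-1$ and $m-1$ degrees of freedom. The statistic is further simplified to become the statistic as stated in the theorem.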
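The statistic in Theorem 10 can be explored with a short simulation. The sketch below is an illustration with assumed parameter values and hypothetical function names. It relies on the known fact that an F-distribution with $r_1$ and $r_2$ degrees of freedom has mean $r_2/(r_2-2)$ when $r_2>2$. With $n=6$ and $m=11$, the degrees of freedom are 5 and 10, so the theoretical mean is $10/8=1.25$, regardless of the two population variances.

```python
import random


def sample_variance(sample):
    """Sample variance with the n - 1 divisor."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((v - mean) ** 2 for v in sample) / (n - 1)


def f_statistic(x, y, var_ratio=1.0):
    """Theorem 10 statistic (S_X^2 / sigma_X^2) / (S_Y^2 / sigma_Y^2),
    written in terms of var_ratio = sigma_X^2 / sigma_Y^2."""
    return (sample_variance(x) / sample_variance(y)) / var_ratio


def simulate_f(n=6, m=11, sigma_x=2.0, sigma_y=0.5, reps=20000, seed=3):
    """Average the Theorem 10 statistic over `reps` pairs of normal samples."""
    rng = random.Random(seed)
    var_ratio = sigma_x ** 2 / sigma_y ** 2
    total = 0.0
    for _ in range(reps):
        x = [rng.gauss(0.0, sigma_x) for _ in range(n)]
        y = [rng.gauss(1.0, sigma_y) for _ in range(m)]
        total += f_statistic(x, y, var_ratio)
    return total / reps


f_mean = simulate_f()
# Theory: F with 5 and 10 degrees of freedom has mean 10 / (10 - 2) = 1.25.
print(f"simulated mean of the F statistic = {f_mean:.3f}")
```

In a hypothesis test of equal variances, `var_ratio` would be set to 1, so the statistic reduces to the ratio of the two sample variances.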

_______________________________________________________________________________________________

**Concluding Remarks**

Theorem 7 and Theorem 8 produce the one-sample t-statistic and two-sample t-statistic, respectively. They are the basis of the procedures for inference about one population mean and the difference of two population means, respectively. They can be used for estimation (e.g. construction of confidence intervals) or decision making (e.g. hypothesis testing). On the other hand, Theorem 5 produces a chi-squared statistic for inference about one population variance. Theorem 10 produces an F-statistic that can be used for inference about two population variances. Since the F-statistic involves the two population variances only through their ratio, it can be used for inference about the relative magnitude of the variances.

The purpose of this post is to highlight the important roles of the chi-squared distribution. We now discuss briefly the quality of the derived statistical procedures. The procedures discussed here (t or F) are exactly correct if the populations from which the samples are drawn are normal. Real life data usually do not exactly follow normal distributions. Thus the usefulness of these statistics in practice depends on how strongly they are affected by non-normality. In other words, if there is a significant deviation from the assumption of normal distribution, are these procedures still reliable?

A statistical inference procedure is called robust if the results drawn from the procedure are insensitive to deviations from its assumptions. For a non-robust procedure, the results would be distorted if there is a deviation from the assumptions. For example, the t procedures are not robust against outliers. The presence of outliers in the data can distort the results since the t procedures are based on the sample mean $\bar{X}$ and sample variance $S^2$, which are not resistant to outliers.

On the other hand, the t procedures for inference about means are quite robust against slight deviations from the normal population assumption. The F procedures for inference about variances are not so robust, so they must be used with care. Even if there is a slight deviation from normal assumptions, the results from the F procedures may not be reliable. For a more detailed but accessible discussion on robustness, see [1].
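The sensitivity of variance-based procedures to non-normality can be seen in a small experiment. The sketch below is an illustration, not part of the original discussion. It compares the spread of $(n-1)S^2/\sigma^2$ when the data are normal with its spread when the data come from an exponential distribution with rate 1 (which has variance 1). The exponential population, the sample size $n=10$ and the function names are arbitrary choices made for the example. Under normality the statistic is chi-squared with $n-1$ degrees of freedom, so its variance should be close to $2(n-1)=18$.

```python
import random


def variance_statistic(sample, sigma2):
    """(n - 1) * S^2 / sigma^2, i.e. the sum of squared deviations from
    the sample mean divided by the population variance."""
    n = len(sample)
    xbar = sum(sample) / n
    return sum((x - xbar) ** 2 for x in sample) / sigma2


def spread_of_statistic(draw, sigma2, n=10, reps=20000, seed=4):
    """Return the simulated variance of the statistic when each
    observation is produced by draw(rng)."""
    rng = random.Random(seed)
    values = [variance_statistic([draw(rng) for _ in range(n)], sigma2)
              for _ in range(reps)]
    mean = sum(values) / reps
    return sum((v - mean) ** 2 for v in values) / (reps - 1)


# Both populations have variance 1, so sigma2 = 1.0 in each call.
normal_var = spread_of_statistic(lambda rng: rng.gauss(0.0, 1.0), sigma2=1.0)
expo_var = spread_of_statistic(lambda rng: rng.expovariate(1.0), sigma2=1.0)
print(f"variance of the statistic, normal data:      {normal_var:.2f}")
print(f"variance of the statistic, exponential data: {expo_var:.2f}")
```

When the statistic is far more variable than the chi-squared distribution predicts, probability statements based on chi-squared (and hence F) tables are no longer accurate, which is one way to see why these procedures call for care.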

When the sample sizes are large, the distribution of the sample mean is close to a normal distribution (this result is called the central limit theorem). So the discussion about deviations from the normality assumption is no longer as important. When the sample sizes are large, simply use the Z statistic for inference about the means. On the other hand, when the sample sizes are large, the sample variance $S^2$ will be an accurate estimate of the population variance $\sigma^2$ regardless of the population distribution. This fact is related to the law of large numbers. Thus the statistical procedures described here are intended for small samples drawn from normal populations.

_______________________________________________________________________________________________

**Reference**

1. Moore D. S., McCabe G. P., Craig B. A., *Introduction to the Practice of Statistics*, 7th ed., W. H. Freeman and Company, New York, 2012
2. Wackerly D. D., Mendenhall III W., Scheaffer R. L., *Mathematical Statistics with Applications*, Thomson Learning, Inc, California, 2008

_______________________________________________________________________________________________