The chi-squared test is a useful and versatile test. It has several interpretations, which are discussed in three previous posts. These different uses of the same test can be confusing to students. This post connects the ideas in the three previous posts and supplements the earlier discussions.
The chi-squared test is based on the chi-squared statistic, which measures the magnitude of the difference between the observed counts and the expected counts in an experimental design that involves one or more categorical variables. The null hypothesis is the assumption that the observed counts come from the distribution that produces the expected counts, so that any differences between the two are due to chance alone. A large value of the chi-squared statistic gives evidence for the rejection of the null hypothesis.
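In symbols, the statistic described above can be written as follows. The notation is an assumption made here for illustration: O_i is the observed count in cell i, E_i is the expected count in that cell under the null hypothesis, and the sum runs over all cells.

```latex
\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}
```

Each squared difference is divided by its expected count, so a given gap between observed and expected counts weighs more heavily in a cell where few observations were expected.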
The chi-squared test is also simple to use. The chi-squared statistic has an approximate chi-squared distribution, which makes it easy to evaluate the sample data. The chi-squared test is included in various software packages. For applications with a small number of categories, the calculation can even be done with a hand-held calculator.
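The chi-squared approximation is what turns a computed statistic into a p-value. As a minimal sketch, assuming an even number of degrees of freedom (a case in which the upper-tail probability has a simple closed form), the p-value can be computed with the Python standard library alone. The function name chi2_sf_even_df is hypothetical; for general degrees of freedom, a statistics package would normally be used instead.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable.

    Only valid for positive even df, where the tail probability equals
    exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0   # current term (x/2)^i / i!, starting at i = 0
    total = 1.0  # running sum of the terms
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# 5.991 is the usual 5% critical value for 2 degrees of freedom,
# so the tail probability here should be close to 0.05.
p_value = chi2_sf_even_df(5.991, 2)
```

A small p-value corresponds to a large statistic, which is the evidence against the null hypothesis.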
The Goodness-of-Fit Test and the Test of Homogeneity
The three different uses of the test as discussed in the three previous posts can be kept straight by having a firm understanding of the underlying experimental design.
For the goodness-of-fit test, there is only one population involved. The experiment is to measure one categorical variable on one population. Thus only one sample is used in applying the chi-squared test. The one-sample data produce the observed counts for the categorical variable in question. Let’s say the variable has k cells. Then there would be k observed counts. The expected counts for the k cells would come from a hypothesized distribution of the categorical variable. The chi-squared statistic is then the sum, over the cells, of the squared differences between the observed and expected counts, each divided by the expected count. Essentially the hypothesized distribution is the null hypothesis. More specifically, the null hypothesis is the statement that the cell probabilities are derived from the hypothesized distribution.
As a quick example, we may want to answer the question of whether a given die is fair. We then observe n rolls of the die and classify the rolls into 6 cells (the values 1 to 6). The null hypothesis is that the values of the die follow a uniform distribution. Another way to state the hypothesis is that each cell probability is 1/6. Another example is testing whether the claim frequency of a group of insured drivers follows a Poisson distribution. The cell probabilities are then calculated based on the assumption of a Poisson distribution. In short, the goodness-of-fit test checks whether the observed counts for one categorical variable come from (or fit) a hypothesized distribution. See Example 1 and Example 2 in the post on goodness-of-fit test.
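The die example can be sketched in a few lines of Python. The counts below are hypothetical (made up for illustration, not taken from the earlier posts), and the function name goodness_of_fit_chi2 is likewise an assumption.

```python
def goodness_of_fit_chi2(observed, cell_probs):
    """Chi-squared statistic for one sample against hypothesized cell probabilities."""
    n = sum(observed)
    expected = [n * p for p in cell_probs]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts of the faces 1 to 6 in 120 rolls of a die.
observed = [25, 17, 15, 23, 24, 16]

# Null hypothesis: the die is fair, so each cell probability is 1/6
# and each expected count is 120 / 6 = 20.
chi2 = goodness_of_fit_chi2(observed, [1 / 6] * 6)

# With 6 cells there are 6 - 1 = 5 degrees of freedom; the usual 5%
# critical value for 5 degrees of freedom is about 11.07.
reject_fairness = chi2 > 11.07
```

For these made-up counts the statistic is 5.0, well below 11.07, so the data would not give evidence against the hypothesis of a fair die.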
In the test of homogeneity, the focus is to compare two or more populations (or two or more subpopulations of a population) on the same categorical variable, i.e. whether the categorical variable in question follows the same distribution across the different populations. For example, do two different groups of insured drivers exhibit the same claim frequency rates? Do adults with different educational attainment levels have the same proportions of current smokers, former smokers and never smokers? Are political affiliations similar across racial/ethnic groups? In this test, the goal is to determine whether the cells of the categorical variable have the same proportions across the populations, hence the name test of homogeneity. In the experiment, researchers sample each population (or group) separately on the categorical variable in question. Thus there will be multiple samples (one for each group), and the samples are independent.
In the test of homogeneity, the calculation of the chi-squared statistic involves adding up the squared differences between the observed counts and expected counts, each divided by the expected count, over all cells of all the samples. For illustration, see Example 1 and Example 2 in the post on test of homogeneity.
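A minimal sketch of this calculation follows. It makes the standard assumption that, under the null hypothesis, the expected counts for each sample come from the pooled cell proportions. The data (two groups of 100 insured drivers classified by 0, 1, or 2 or more claims) and the function name homogeneity_chi2 are hypothetical.

```python
def homogeneity_chi2(samples):
    """Chi-squared statistic and degrees of freedom for several independent samples.

    `samples` is a list of count lists, one per population, all over the
    same cells of the categorical variable.
    """
    n_cells = len(samples[0])
    sample_totals = [sum(s) for s in samples]
    cell_totals = [sum(s[j] for s in samples) for j in range(n_cells)]
    grand_total = sum(sample_totals)

    chi2 = 0.0
    for s, n_s in zip(samples, sample_totals):
        for j in range(n_cells):
            # Expected count if this sample followed the pooled proportions.
            expected = n_s * cell_totals[j] / grand_total
            chi2 += (s[j] - expected) ** 2 / expected

    df = (len(samples) - 1) * (n_cells - 1)
    return chi2, df


# Hypothetical claim counts (0 claims, 1 claim, 2+ claims) for two groups.
group_a = [60, 30, 10]
group_b = [40, 40, 20]
chi2, df = homogeneity_chi2([group_a, group_b])
```

For these made-up counts the statistic is about 8.76 with 2 degrees of freedom, which exceeds the usual 5% critical value of 5.991, so the two groups would be judged to have different claim frequency distributions.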
Test of Independence
The test of independence can be confused with the test of homogeneity. It is possible that the objectives for both tests are similar. For example, a test of hypothesis might seek to determine whether the proportions of smoking statuses (current smoker, former smoker and never smoker) are the same across the groups with different education levels. This sounds like a test of homogeneity since it seeks to determine whether the distribution of smoking status is the same across the different groups (levels of educational attainment). However, a test of independence can also have this same objective.
The difference between the test of homogeneity and the test of independence is one of experimental design. In the test of homogeneity, the researchers sample each group (or population) separately. For example, they would sample individuals from groups with various levels of education separately and classify the individuals in each group by smoking status. The chi-squared test to use in this case is the test of homogeneity. In this experimental design, the experimenter might sample 1,000 individuals who are not high school graduates, 1,000 individuals who are high school graduates, 1,000 individuals who have some college and so on. Then the experimenter would compare the distribution of smoking status across the different samples.
An experimenter using a test of independence might try to answer the same question but would proceed in a different way. The experimenter would take a single sample of individuals from a given population and observe two categorical variables (e.g. level of education and smoking status) for each individual.
Then the researchers would classify each individual into a cell in a two-way table. See Table 3b in the previous post on test of independence. The values of the level of education go across the columns of the table (the column variable). The values of the smoking status go down the rows (the row variable). Each individual in the sample belongs to exactly one cell in the table according to the values of the row and column variables. The two-way table helps determine whether the row variable and the column variable are associated in the given population. In other words, the experimenter is interested in finding out whether one variable explains the other (or one variable affects the other).
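The classification step can be sketched as a cross-tabulation of individual records. The six records below are hypothetical and far too few for a real test; they only show how one sample with two variables per individual turns into a two-way table. The function name two_way_table is an assumption.

```python
from collections import Counter

# Hypothetical records: one (education, smoking status) pair per individual.
records = [
    ("high school", "current"),
    ("high school", "never"),
    ("college", "never"),
    ("college", "former"),
    ("high school", "current"),
    ("college", "never"),
]


def two_way_table(records):
    """Count individuals per cell; rows are smoking statuses, columns are education levels."""
    cell_counts = Counter((smoking, education) for education, smoking in records)
    rows = sorted({smoking for _, smoking in records})
    cols = sorted({education for education, _ in records})
    table = [[cell_counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table


rows, cols, table = two_way_table(records)
# rows  -> ['current', 'former', 'never']
# cols  -> ['college', 'high school']
# table -> [[0, 2], [1, 0], [2, 1]]
```

Only the grand total (here 6) is fixed by the design; the row and column totals are outcomes of the sampling, which is what separates this design from sampling each education group separately.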
For ease of discussion, let’s say the column variable (level of education) is the explanatory variable. The experimenter would then be interested in whether the conditional distribution of the row variable (smoking status) is similar or different across the columns. If the conclusion is that it is similar, then the column variable does not affect the row variable (or the two variables are not associated). This would also mean that the distribution of smoking status is the same across the different levels of education (a conclusion of homogeneity).
If the conclusion is that the conditional distribution of the row variable (smoking status) is different across the columns, then the column variable does affect the row variable (or the two variables are associated). This would also mean that the distribution of smoking status is different across the different levels of education (a conclusion of non-homogeneity).
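To make the idea of comparing conditional distributions across columns concrete, here is a small sketch. The two-way table is hypothetical (rows are current, former and never smokers; columns are three education levels), and the function name column_conditionals is an assumption.

```python
def column_conditionals(table):
    """Conditional distribution of the row variable within each column.

    `table[i][j]` is the count in row i and column j. The result has one
    list per column, giving the proportion of that column in each row.
    """
    n_cols = len(table[0])
    col_totals = [sum(row[j] for row in table) for j in range(n_cols)]
    return [[row[j] / col_totals[j] for row in table] for j in range(n_cols)]


# Hypothetical counts: rows = current, former, never smokers;
# columns = three levels of education.
table = [
    [30, 20, 10],
    [20, 30, 20],
    [50, 50, 70],
]

conditionals = column_conditionals(table)
# Column 1: 30% current, 20% former, 50% never
# Column 2: 20% current, 30% former, 50% never
# Column 3: 10% current, 20% former, 70% never
```

If these proportions were roughly the same in every column, the data would be consistent with no association. The chi-squared test assesses whether the differences seen across columns are larger than chance alone would explain.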
The test of independence and the test of homogeneity are based on two different experimental designs. Hence the hypotheses they test and the way their results are interpreted are different, even though in both cases the chi-squared statistic is computed from a two-way table of counts. However, each design can be structured to answer similar questions.
2017 – Dan Ma