# Chi-Square Tests

Chi-square tests can be used to test for association between two partitions (categorical classifications) of a population. This two-dimensional classification is commonly displayed as a contingency table, or crosstabulation, in which rows represent one categorical variable and columns represent the other. The null hypothesis is that there is no association between the two categorical variables.

The main assumption of this group of tests is that each observation belongs to exactly one cell of the contingency table.

Row and column totals (marginal totals) are used to compute the count that would be expected in each cell if the null hypothesis were true: the expected count for a cell is its row total multiplied by its column total, divided by the overall total. A test statistic that is approximately distributed as a chi-square variable is calculated from the observed and expected frequencies. The larger the test statistic (for given degrees of freedom), the smaller the p-value and the stronger the evidence of an association between the two variables.
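The calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not VisualStat's implementation: the function name `pearson_chi_square` and the sample counts are assumptions made for the example, and the statistic is the usual Pearson form, the sum over cells of (observed - expected)² / expected.

```python
def pearson_chi_square(table):
    """Return (statistic, degrees_of_freedom, expected) for an r x c table of counts.

    Hypothetical helper for illustration; assumes every expected count is > 0.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under the null hypothesis: row total * column total / overall total
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


# Made-up 2x2 table of counts, used only to exercise the function
observed = [[10, 20], [30, 40]]
statistic, dof, expected = pearson_chi_square(observed)
```

For this made-up table the expected counts are 12, 18, 28 and 42, and the statistic has (2 - 1)(2 - 1) = 1 degree of freedom.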

VisualStat performs the following Chi-Square test:

Definitions and Notation

In this section of VisualStat Help, a two-way table represents the crosstabulation of two categorical variables. Let the rows of the table be numbered by the values $i = 1, 2, \ldots, r$, and the columns by $j = 1, 2, \ldots, c$. Let $n_{ij}$ denote the cell frequency in the $i$th row and the $j$th column. We then have the following definitions:

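- $n_{i\cdot} = \sum_{j} n_{ij}$ is the row totals
- $n_{\cdot j} = \sum_{i} n_{ij}$ is the column totals
- $n = \sum_{i} \sum_{j} n_{ij}$ is the overall total
- $p_{ij} = n_{ij} / n$ is the cell percentages
- $p_{i\cdot} = n_{i\cdot} / n$ is the row percentages
- $p_{\cdot j} = n_{\cdot j} / n$ is the column percentages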
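As a small illustration of these quantities, the sketch below computes the totals and percentages for a table stored as a list of rows. The function name `table_summary` and the dictionary keys are hypothetical, and the percentages are assumed to be expressed as proportions of the overall total (multiply by 100 for percent).

```python
def table_summary(table):
    """Marginal totals and proportions for an r x c table of counts.

    Hypothetical helper for illustration; proportions are relative to the
    overall total n (an assumption about how the percentages are defined).
    """
    row_totals = [sum(row) for row in table]        # totals over columns, one per row
    col_totals = [sum(col) for col in zip(*table)]  # totals over rows, one per column
    n = sum(row_totals)                             # overall total
    return {
        "row_totals": row_totals,
        "col_totals": col_totals,
        "n": n,
        "cell_prop": [[x / n for x in row] for row in table],
        "row_prop": [r / n for r in row_totals],
        "col_prop": [c / n for c in col_totals],
    }


# Made-up 2x2 table of counts
summary = table_summary([[10, 20], [30, 40]])
```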

Measures of Association

VisualStat computes several statistics that describe the association between the two variables of the contingency table.

Phi Coefficient
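The phi coefficient is a measure of association derived from the Pearson chi-square statistic. It has the range $-1 \le \phi \le 1$ for 2×2 tables. Otherwise, the range is $0 \le \phi \le \min(\sqrt{r-1}, \sqrt{c-1})$. The phi coefficient is computed as

$$\phi = \frac{n_{11} n_{22} - n_{12} n_{21}}{\sqrt{n_{1\cdot}\, n_{2\cdot}\, n_{\cdot 1}\, n_{\cdot 2}}}$$

for 2×2 tables, where $n_{i\cdot}$ and $n_{\cdot j}$ are the row and column totals, and

$$\phi = \sqrt{Q_P / n}$$

otherwise, where $Q_P$ is Pearson's chi-square statistic and $n$ is the overall total.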
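A minimal sketch of the phi coefficient follows, assuming the standard definitions: the signed form (n11·n22 - n12·n21) / sqrt(product of the four marginal totals) for 2×2 tables, and sqrt(QP / n) otherwise. The function name `phi_coefficient` and the sample tables are illustrative assumptions, not part of VisualStat.

```python
import math


def phi_coefficient(table):
    """Phi for an r x c table of counts (signed for 2x2, non-negative otherwise).

    Hypothetical helper; assumes all marginal totals are positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if len(row_totals) == 2 and len(col_totals) == 2:
        (n11, n12), (n21, n22) = table
        denom = math.sqrt(row_totals[0] * row_totals[1] * col_totals[0] * col_totals[1])
        return (n11 * n22 - n12 * n21) / denom
    # General case: sqrt(Q_P / n), with Q_P the Pearson chi-square statistic
    q_p = 0.0
    for i, r in enumerate(row_totals):
        for j, c in enumerate(col_totals):
            expected = r * c / n
            q_p += (table[i][j] - expected) ** 2 / expected
    return math.sqrt(q_p / n)


phi_2x2 = phi_coefficient([[10, 20], [30, 40]])                   # weak negative association
phi_3x3 = phi_coefficient([[10, 0, 0], [0, 10, 0], [0, 0, 10]])   # perfect association
```

The 3×3 example attains the upper bound min(sqrt(r-1), sqrt(c-1)) = sqrt(2), which shows why phi is not confined to [0, 1] for larger tables.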

Contingency Coefficient
The contingency coefficient is a measure of association derived from the Pearson chi-square. The contingency coefficient is computed as

$$P = \sqrt{\frac{Q_P}{Q_P + n}}$$

where $Q_P$ is Pearson's chi-square statistic and $n$ is the overall total.
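The contingency coefficient can be sketched as below, assuming the standard definition sqrt(QP / (QP + n)), with QP computed from observed and expected counts. The function name `contingency_coefficient` and the sample table are illustrative assumptions.

```python
import math


def contingency_coefficient(table):
    """Contingency coefficient sqrt(Q_P / (Q_P + n)) for an r x c table of counts.

    Hypothetical helper; assumes every expected count is > 0.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Pearson chi-square statistic from observed and expected counts
    q_p = 0.0
    for i, r in enumerate(row_totals):
        for j, c in enumerate(col_totals):
            expected = r * c / n
            q_p += (table[i][j] - expected) ** 2 / expected
    return math.sqrt(q_p / (q_p + n))


coef = contingency_coefficient([[10, 20], [30, 40]])
```

Because QP + n is always larger than QP, this coefficient is always strictly below 1, even for a perfectly associated table.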

Cramer's V
Cramer's V is a measure of association derived from the Pearson chi-square. It is designed so that the attainable upper limit is always 1. It has the range $-1 \le V \le 1$ for 2×2 tables; otherwise, the range is $0 \le V \le 1$. Cramer's V is computed as

$$V = \phi$$

for 2×2 tables, and

$$V = \sqrt{\frac{Q_P / n}{\min(r-1,\, c-1)}}$$

otherwise, where $Q_P$ is Pearson's chi-square statistic and $n$ is the overall total.
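The following sketch computes Cramer's V under the standard definitions: V equals the signed phi coefficient for 2×2 tables, and sqrt((QP / n) / min(r-1, c-1)) otherwise. The function name `cramers_v` and the sample tables are assumptions made for illustration.

```python
import math


def cramers_v(table):
    """Cramer's V for an r x c table of counts (signed for 2x2 tables).

    Hypothetical helper; assumes all marginal totals are positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    r, c = len(row_totals), len(col_totals)
    if r == 2 and c == 2:
        # For 2x2 tables V is the signed phi coefficient
        (n11, n12), (n21, n22) = table
        denom = math.sqrt(row_totals[0] * row_totals[1] * col_totals[0] * col_totals[1])
        return (n11 * n22 - n12 * n21) / denom
    q_p = 0.0
    for i, rt in enumerate(row_totals):
        for j, ct in enumerate(col_totals):
            expected = rt * ct / n
            q_p += (table[i][j] - expected) ** 2 / expected
    return math.sqrt((q_p / n) / min(r - 1, c - 1))


v_perfect = cramers_v([[10, 0, 0], [0, 10, 0], [0, 0, 10]])  # perfectly associated 3x3 table
v_2x2 = cramers_v([[10, 20], [30, 40]])                      # signed value for a 2x2 table
```

Dividing by min(r-1, c-1) is what rescales the statistic so that a perfectly associated table yields 1 regardless of its dimensions.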

Kappa Coefficient
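The simple kappa coefficient is a measure of agreement. Viewing the two response variables as two independent ratings of the n subjects, the kappa coefficient equals +1 when there is complete agreement of the raters. When the observed agreement exceeds chance agreement, the kappa coefficient is positive, with its magnitude reflecting the strength of agreement. Although unusual in practice, kappa is negative when the observed agreement is less than chance agreement. The minimum value of kappa is between -1 and 0, depending on the marginal proportions. The kappa coefficient is computed as

$$\hat{\kappa} = \frac{P_o - P_e}{1 - P_e}$$

where

$$P_o = \sum_{i} p_{ii}$$

is the observed proportion of agreement (the sum of the diagonal cell proportions $n_{ii}/n$), and

$$P_e = \sum_{i} p_{i\cdot}\, p_{\cdot i}$$

is the proportion of agreement expected by chance (the sum of the products of the corresponding row and column marginal proportions).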
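A short sketch of the simple kappa calculation is given below, assuming the standard form (Po - Pe) / (1 - Pe), where Po is the observed proportion of agreement (the diagonal of the table) and Pe is the agreement expected by chance from the marginal proportions. The function name `simple_kappa` and the sample ratings are hypothetical.

```python
def simple_kappa(table):
    """Simple kappa for a square table of counts (rater A in rows, rater B in columns).

    Hypothetical helper; assumes chance agreement P_e is less than 1.
    """
    k = len(table)
    if any(len(row) != k for row in table):
        raise ValueError("kappa requires a square table")
    n = sum(sum(row) for row in table)
    row_prop = [sum(row) / n for row in table]
    col_prop = [sum(col) / n for col in zip(*table)]
    p_o = sum(table[i][i] for i in range(k)) / n          # observed agreement
    p_e = sum(r * c for r, c in zip(row_prop, col_prop))  # chance agreement
    return (p_o - p_e) / (1 - p_e)


# Made-up ratings: the two raters agree on 35 of 50 subjects
kappa = simple_kappa([[20, 5], [10, 15]])
kappa_perfect = simple_kappa([[25, 0], [0, 25]])
```

In the first example the observed agreement is 0.7 and the chance agreement is 0.5, so kappa is about 0.4; the second example has complete agreement and gives 1.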