Chi-Square Tests

Chi-Square tests can be used to test the association between two partitions of a population. This two-dimensional classification is commonly displayed as a contingency table (cross-tabulation), where rows represent one categorical variable and columns represent the other. The null hypothesis is that there is no association between the two categorical variables.

The main assumption of this group of tests is that each observation belongs to exactly one cell of the contingency table.

Row and column totals (marginal totals) are used to predict the count that would be expected in each cell if the null hypothesis were true. A test statistic that is approximately distributed as a chi-square variable is calculated from the observed and expected frequencies. The larger the test statistic (for given degrees of freedom), the stronger the evidence against the null hypothesis of no association between the two variables.
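As a rough illustration of the calculation described above, the following Python sketch derives the expected counts from the marginal totals and computes Pearson's chi-square statistic with (r-1)(c-1) degrees of freedom. The function names and the example table are hypothetical and are not part of VisualStat; the sketch assumes that no row or column total is zero.

```python
def expected_counts(table):
    """Expected count per cell under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    expected = expected_counts(table)
    # Sum of (observed - expected)^2 / expected over all cells.
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical 2x2 table of observed counts.
observed = [[10, 20], [30, 40]]
statistic, df = pearson_chi_square(observed)
print(expected_counts(observed))
print(statistic, df)
```

The statistic is then compared with a chi-square distribution with df degrees of freedom to obtain a p-value.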

VisualStat performs the following Chi-Square tests:

Pearson's Chi-Square Test

Likelihood-Ratio Chi-Square Test

McNemar’s Chi-Square Test

Bowker's Chi-Square Test of Symmetry

 

 

Definitions and Notation

In this section of VisualStat Help, a two-way table represents the crosstabulation of two categorical variables. Let the rows of the table be numbered i = 1, 2, ..., r and the columns j = 1, 2, ..., c. Let n_ij denote the cell frequency in the ith row and the jth column. The following definitions apply:

 

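$n_{i\cdot} = \sum_{j} n_{ij}$

 are the row totals

$n_{\cdot j} = \sum_{i} n_{ij}$

 are the column totals

$n = \sum_{i} \sum_{j} n_{ij}$

 is the overall total

$p_{ij} = n_{ij} / n$

 are the cell percentages

$p_{i\cdot} = n_{i\cdot} / n$

 are the row percentages

$p_{\cdot j} = n_{\cdot j} / n$

 are the column percentages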
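The notation can be made concrete with a minimal Python sketch that computes these quantities for a small table. The function name, dictionary keys, and example counts are hypothetical, and the percentages are assumed here to be expressed as proportions of the overall total.

```python
def table_summaries(table):
    """Marginal totals and proportions for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]        # one total per row i
    col_totals = [sum(col) for col in zip(*table)]  # one total per column j
    n = sum(row_totals)                             # overall total
    return {
        "row_totals": row_totals,
        "col_totals": col_totals,
        "n": n,
        "cell_p": [[x / n for x in row] for row in table],
        "row_p": [t / n for t in row_totals],
        "col_p": [t / n for t in col_totals],
    }


# Hypothetical 2x2 table of counts.
summary = table_summaries([[10, 20], [30, 40]])
for name, value in summary.items():
    print(name, value)
```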

 

 

 

Measures of Association

VisualStat computes several statistics that describe the association between the two variables of the contingency table.

 

Phi Coefficient
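The phi coefficient is a measure of association derived from the Pearson chi-square statistic. For 2×2 tables its range is $-1 \le \phi \le 1$; otherwise, its range is $0 \le \phi \le \min(\sqrt{r-1}, \sqrt{c-1})$. The phi coefficient is computed as

$\phi = \dfrac{n_{11} n_{22} - n_{12} n_{21}}{\sqrt{n_{1\cdot}\, n_{2\cdot}\, n_{\cdot 1}\, n_{\cdot 2}}}$

for 2×2 tables

 

$\phi = \sqrt{Q_P / n}$

 otherwise, where

$Q_P$

is Pearson's chi-square statistic and n is the overall total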
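A minimal Python sketch of both cases follows. The function names and example tables are hypothetical; the helper that computes Pearson's statistic uses the standard sum of (observed - expected)^2 / expected and assumes no zero marginal totals.

```python
import math


def pearson_statistic(table):
    """Return (Q_P, n) for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    q_p = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            q_p += (observed - expected) ** 2 / expected
    return q_p, n


def phi_coefficient(table):
    """Signed phi for 2x2 tables, sqrt(Q_P / n) for larger tables."""
    if len(table) == 2 and len(table[0]) == 2:
        (n11, n12), (n21, n22) = table
        denominator = math.sqrt(
            (n11 + n12) * (n21 + n22) * (n11 + n21) * (n12 + n22)
        )
        return (n11 * n22 - n12 * n21) / denominator
    q_p, n = pearson_statistic(table)
    return math.sqrt(q_p / n)


print(phi_coefficient([[10, 20], [30, 40]]))                  # 2x2 case, signed
print(phi_coefficient([[10, 0, 0], [0, 10, 0], [0, 0, 10]]))  # 3x3 case
```

For the 3×3 example, which shows perfect association, phi reaches its upper limit of sqrt(2), which illustrates why phi is not bounded by 1 for larger tables.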

 

 

Contingency Coefficient
The contingency coefficient is a measure of association derived from the Pearson chi-square statistic. The contingency coefficient is computed as

$P = \sqrt{\dfrac{Q_P}{Q_P + n}}$

 where

$Q_P$

is Pearson's chi-square statistic and n is the overall total
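The calculation can be sketched in Python as shown below. The function names and example tables are hypothetical, and the helper assumes the standard Pearson statistic with no zero marginal totals.

```python
import math


def pearson_statistic(table):
    """Return (Q_P, n) for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    q_p = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            q_p += (observed - expected) ** 2 / expected
    return q_p, n


def contingency_coefficient(table):
    """Contingency coefficient: sqrt(Q_P / (Q_P + n))."""
    q_p, n = pearson_statistic(table)
    return math.sqrt(q_p / (q_p + n))


print(contingency_coefficient([[10, 20], [30, 40]]))
print(contingency_coefficient([[10, 0, 0], [0, 10, 0], [0, 0, 10]]))
```

Because n is positive, the ratio under the square root is always below 1, so the coefficient stays below 1 even for the perfectly associated 3×3 example.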

 

 

Cramer's V
Cramer's V is a measure of association derived from the Pearson chi-square statistic. It is designed so that the attainable upper limit is always 1. For 2×2 tables its range is $-1 \le V \le 1$; otherwise, its range is $0 \le V \le 1$. Cramer's V is computed as

 

$V = \phi$    for 2×2 tables

$V = \sqrt{\dfrac{Q_P / n}{\min(r-1,\, c-1)}}$

otherwise, where

$Q_P$

is Pearson's chi-square statistic, $\phi$ is the phi coefficient, and n is the overall total
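A short Python sketch of Cramer's V is given below, under the assumption that the 2×2 case uses the signed phi coefficient (cross-product difference divided by the square root of the product of the marginal totals). The function names and example tables are hypothetical, and no marginal total may be zero.

```python
import math


def pearson_statistic(table):
    """Return (Q_P, n) for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    q_p = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            q_p += (observed - expected) ** 2 / expected
    return q_p, n


def cramers_v(table):
    """Signed phi for 2x2 tables; otherwise sqrt((Q_P / n) / min(r-1, c-1))."""
    r, c = len(table), len(table[0])
    if r == 2 and c == 2:
        (n11, n12), (n21, n22) = table
        denominator = math.sqrt(
            (n11 + n12) * (n21 + n22) * (n11 + n21) * (n12 + n22)
        )
        return (n11 * n22 - n12 * n21) / denominator
    q_p, n = pearson_statistic(table)
    return math.sqrt((q_p / n) / min(r - 1, c - 1))


print(cramers_v([[10, 20, 30], [30, 20, 10]]))           # 2x3 table
print(cramers_v([[10, 0, 0], [0, 10, 0], [0, 0, 10]]))   # perfect association
```

Dividing by min(r-1, c-1) rescales phi so that the perfectly associated 3×3 example gives a value of 1 rather than sqrt(2).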

 

 

Kappa Coefficient
The simple kappa coefficient is a measure of agreement. Viewing the two response variables as two independent ratings of the n subjects, the kappa coefficient equals +1 when there is complete agreement of the raters. When the observed agreement exceeds chance agreement, the kappa coefficient is positive, with its magnitude reflecting the strength of agreement. Although unusual in practice, kappa is negative when the observed agreement is less than chance agreement. The minimum value of kappa is between -1 and 0, depending on the marginal proportions. The kappa coefficient is computed as

 

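$\kappa = \dfrac{P_o - P_e}{1 - P_e}$

where

 

  $P_o = \sum_{i} p_{ii}$  is the observed agreement and

  $P_e = \sum_{i} p_{i\cdot}\, p_{\cdot i}$  is the agreement expected by chance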
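The kappa calculation can be sketched in Python as follows, assuming a square table in which rows hold the first rater's categories and columns hold the second rater's categories in the same order. The function name and the example ratings are hypothetical.

```python
def kappa_coefficient(table):
    """Simple kappa: (P_o - P_e) / (1 - P_e) for a square table of counts."""
    size = len(table)
    if any(len(row) != size for row in table):
        raise ValueError("kappa requires a square table")
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Observed agreement: proportion of subjects on the main diagonal.
    p_o = sum(table[i][i] for i in range(size)) / n
    # Chance agreement: sum of products of matching marginal proportions.
    p_e = sum((row_totals[i] / n) * (col_totals[i] / n) for i in range(size))
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings of 50 subjects by two raters.
ratings = [[20, 5], [10, 15]]
print(kappa_coefficient(ratings))
```

In this example the observed agreement is 0.7 and the chance agreement is 0.5, so kappa is approximately 0.4. Kappa is undefined when the chance agreement equals 1, a case the sketch does not handle.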