Lecture Notes

By

J.M.Kasayira

The Chi-square test

The Chi-square is a non-parametric test of significance. It is appropriate when the data are in the form of frequency counts occurring in two or more mutually exclusive categories (nominal variables). The chi square test examines whether people are distributed across the categories as would be expected by chance (which would mean there is no relationship between the independent and dependent variables), when taking sampling errors into account. If there is no relationship, the frequencies should be equal across the categories. If there is a relationship, then people won’t be distributed as expected by chance (e.g., more students pass MSc HRM Statistics compared to those who fail).

The Chi-square test for factorial designs is used when frequencies are categorized along more than one dimension. It tests the null hypothesis that two variables are independent of one another in the population.

Hypotheses

Chi-square is typically used for non-directional (two-tailed) tests. [A one-tailed test makes sense only when we are dealing with a sample outcome (two categories) that can go in one of two possible directions.]

H0: The observed distribution of frequencies equals the expected distribution of frequencies in each category.

H1: The observed distribution of frequencies does not equal the expected distribution of frequencies.

Five assumptions made

1. Each observation must fall in one and only one category.

2. The observations in the sample are independent of one another.

3. The observations are measured as frequencies.

4. The expected frequency for each category must not be less than 5 for df ≥ 2 and not less than 10 for df = 1.

5. The observed values of X2 with one degree of freedom must be corrected for continuity in order to use the table of critical values of X2.

Decision rules for rejecting H0

(a) Reject H0 if: X2obs ≥ X2 crit (p, df)

(b) Do not reject H0 if: X2obs < X2crit (p, df)
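The decision rules can be sketched in Python as follows. This is only an illustration: the names `CRITICAL_05` and `reject_h0` are assumptions introduced here, and the dictionary holds standard table values for p = 0.05 rounded to two decimals.

```python
# Critical values of chi-square at the 0.05 level, keyed by degrees of freedom.
# Standard table values; the dictionary name is illustrative only.
CRITICAL_05 = {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49}


def reject_h0(x2_obs, df, table=CRITICAL_05):
    """Rule (a)/(b): reject H0 when X2obs >= X2crit, otherwise do not."""
    return x2_obs >= table[df]


# 4.20 is an arbitrary illustrative value, not a result from these notes.
print(reject_h0(4.20, df=1))  # compared with 3.84
print(reject_h0(4.20, df=2))  # compared with 5.99
```

Note that a value exactly equal to the critical value falls in the rejection region, as rule (a) states.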

One sample/one way Chi-square

The Chi-square test for a one-way design (i.e. where frequencies are categorised along one dimension) is called a goodness-of-fit test. It is so called because it examines how closely the observed frequencies, from a sample, fit the theoretically expected frequencies based on a null hypothesis. Thus with a one-way chi-square, participants are classified on only one dimension, such as preference for elective modules. The test examines whether participants are distributed across the categories as would be expected by chance (which would mean there is no relationship between the independent and dependent variables), when taking sampling errors into account. If there is no relationship, the frequencies should be equal across the categories. If there is a relationship, then participants won’t be distributed as expected by chance (e.g., more students would choose one module compared to the other).

Example1

A class consists of 15 females and 49 males. Is there a significant difference in the number of female and male students in the group? Use a significance level of 5%.

Calculation

Formula:

$$X^2 = \sum \frac{(O - E)^2}{E}$$

Where: O = Observed frequencies

E = Expected frequencies

Step1:

Find the (theoretical) expected frequency (E). For theoretical reasons, we would expect 50% of the group to be female and 50% to be male (i.e. 32 female and 32 male students).

Step 2:

Find whether the observed (O) frequency (i.e. the actual number of female and males) differs significantly from the expected frequency (E).

Both categories contribute to the sum, so we compute (O-E)^2/E for females and for males and add the two results.

From females: O = 15; E = 32

From males: O = 49; E = 32

Substitute in the formula:

$$X^2 = \sum \frac{(O - E)^2}{E} = \frac{(15 - 32)^2}{32} + \frac{(49 - 32)^2}{32} = \frac{289}{32} + \frac{289}{32} = 9.03 + 9.03 = 18.06$$

Therefore X2obs is 18.06. To find the corresponding critical value of X2 (i.e. X2crit) we need to know the number of degrees of freedom (df).

df = k-1 where k is the number of categories

Therefore df = 2-1 = 1

For 1 degree of freedom, two-tailed test at p = 0.05, X2crit = 3.84

Since our X2 obtained is bigger (18.06), the null hypothesis should be rejected. Thus it can be concluded that there is a significant difference between the number of male students and the number of female students in MSc HRM.

You are encouraged to use the X2 summary table of results as shown below.

X2 summary table of results

Computed value for X2 / X2 = 18.06
Degree of freedom / df = 1
Significance level / p = 0.05
Critical value for X2 / X2 = 3.84
Region for rejection / Values of X2 which are equal to, or greater than, 3.84
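The calculation for Example 1 can be checked with a short Python sketch. It sums (O - E)^2/E over both categories, as the ∑ in the formula requires: each category contributes about 9.03, so the total is about 18.06. The variable names are illustrative, and the critical value 3.84 is simply the table value quoted in the notes.

```python
observed = {"female": 15, "male": 49}
n = sum(observed.values())           # 64 students in the class
expected = n / len(observed)         # 32 per category under H0

# Contribution of each category: (O - E)^2 / E
terms = {k: (o - expected) ** 2 / expected for k, o in observed.items()}
x2_obs = sum(terms.values())         # sum over all categories
df = len(observed) - 1               # k - 1
x2_crit = 3.84                       # table value for df = 1, p = 0.05

print({k: round(v, 2) for k, v in terms.items()})
print(round(x2_obs, 2), df, x2_obs >= x2_crit)
```

The last line prints the statistic, the degrees of freedom and whether H0 is rejected under the decision rule.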

NB: to be consistent with assumption 5, Yates’ correction has to be used.

Use of Yates’ correction entails the following:

1. Subtract 0.5 from the observed frequency if the observed frequency is greater than the expected frequency; that is, if O > E, subtract 0.5 from O.

2. Add 0.5 to the observed frequency if the observed frequency is less than the expected frequency; that is, if O < E, add 0.5 to O.

3. Alternatively use the formula:

$$X^2 = \sum \frac{(|O - E| - 0.5)^2}{E}$$

However, some statisticians disagree (see Minium, King & Bear, 2004, p. 467).
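As a rough check, Yates’ correction can be sketched in Python in both forms: adjusting each observed frequency by 0.5 towards the expected frequency (steps 1 and 2), and using the alternative formula (step 3). The function names are assumptions introduced for illustration. Applied to both categories of Example 1 (15 females, 49 males, 32 expected in each), the two forms agree and give about 17.02, which is still above the critical value of 3.84.

```python
def adjust(o, e):
    """Steps 1 and 2: move O by 0.5 towards E."""
    if o > e:
        return o - 0.5
    if o < e:
        return o + 0.5
    return o


def chi_square_adjusted(observed, expected):
    """Ordinary formula applied to the adjusted observed frequencies."""
    return sum((adjust(o, e) - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_yates(observed, expected):
    """Step 3: sum of (|O - E| - 0.5)^2 / E over all categories."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))


# Example 1 data: 15 females and 49 males, 32 expected in each category.
print(round(chi_square_adjusted([15, 49], [32, 32]), 2))
print(round(chi_square_yates([15, 49], [32, 32]), 2))
```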

Example 2

Sixty-three MSU students reading for various Masters degree programmes were asked to choose days to present their seminar papers, so that multimedia projectors could be booked on one of three days. Thirty-two preferred the third day, 27 preferred the second day while only four preferred the first day. Is there a significant difference in the distribution of the students among the three days?

Calculation

Compute the degrees of freedom to determine whether Yates’ correction is needed, so as to be consistent with assumption 5.

df = k- 1

= 3-1

= 2

Since df is greater than 1, Yates’ correction is not needed. Therefore we apply the formula:

$$X^2 = \sum \frac{(O - E)^2}{E}$$

1. Remember the expected value is the theoretical distribution of frequencies. Thus to find E, simply divide the number of participants (N = 63) by the number of categories (3), which gives E = 21 for each day.

2. Compute (O-E)^2/E for each of the categories.

3. Set up a computational table to record the values as follows:

X2 computational table

Category / O / E / (O-E)^2/E
First day / 4 / 21 / 13.76
Second day / 27 / 21 / 1.71
Third day / 32 / 21 / 5.76
Total / 63 / 63 / 21.24

X2 = 13.762 + 1.714 + 5.762

= 21.238

= 21.24

Our df is 2.

At the .05 level, the critical value is 5.99. If the observed value of X2 is equal to or greater than the critical value, reject the null hypothesis. In this case we reject the null hypothesis.

X2 summary table of results

Computed value for X2 / X2 = 21.24
Degree of freedom / df = 2
Significance level / p = 0.05
Critical value for X2 / X2 = 5.99
Region for rejection / Values of X2 which are equal to, or greater than, 5.99

We can conclude that there is a significant difference in the preference of days by the MSU students reading for various Masters degree programmes.
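The computational table for Example 2 can be reproduced with a few lines of Python. This is a sketch under the assumption of equal expected frequencies (63/3 = 21 per day); the variable names are illustrative. For the third day the contribution is (32 - 21)^2/21 = 121/21, about 5.76, and the three contributions sum to about 21.24.

```python
days = ["First day", "Second day", "Third day"]
observed = [4, 27, 32]
n = sum(observed)                    # 63 students
expected = n / len(observed)         # 21 per day under H0

# One row per category: (label, O, E, (O - E)^2 / E)
rows = [(d, o, expected, (o - expected) ** 2 / expected)
        for d, o in zip(days, observed)]
for day, o, e, term in rows:
    print(f"{day} / {o} / {e:.0f} / {term:.2f}")

x2_obs = sum(row[3] for row in rows)
df = len(observed) - 1
print(f"X2 = {x2_obs:.2f}, df = {df}, reject H0: {x2_obs >= 5.99}")
```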

Two-sample and more than two-sample X2

In a two-sample (two-way) X2 design, groups that differ on one independent variable are measured on a dependent variable. If we take Example 2 above, the researcher may want to test whether males and females (independent variable: sex) differ significantly in their preferences (dependent variable: student preference). Thus the purpose of the X2 test for a two-way design is to determine whether the two variables in the design are independent of one another or related.

Example 3

Suppose out of the 64 students in our first example, 10 females and 25 males passed Statistics. Is there any significant sex difference?

Category / Passed / Failed
Female / 10 (cell 1) / 5 (cell 2)
Male / 25 (cell 3) / 24 (cell 4)

Column 1 = 35 Column 2 = 29

Alternatively you can have a table like the one below

Category / Passed / Failed / Total
Female / 10 (cell 1) / 5 (cell 2) / Row 1 =15
Male / 25 (cell 3) / 24 (cell 4) / Row 2 = 49
Total / 35 / 29 / 64

df = (r-1)(c-1)

Where r = number of rows

c = number of columns

Thus df = (2-1)(2-1)

= (1)(1)

= 1

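Since the number of degrees of freedom equals one we must use Yates’ correction for continuity. Hence we use the formula:

$$X^2 = \sum \frac{(|O - E| - 0.5)^2}{E}$$

For two samples the formula for E is:

$$E = \frac{\text{row total} \times \text{column total}}{N}$$

The formula has to be applied separately for each cell because the expected outcomes are different for each cell. For cell 1, for example, E = (15 × 35)/64 = 8.20.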
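The expected frequency for every cell can be obtained in one pass, as in the following Python sketch. The function name `expected_frequencies` is an assumption introduced here; the data are the observed counts of Example 3.

```python
def expected_frequencies(table):
    """Return E = row total * column total / N for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Example 3: rows are Female, Male; columns are Passed, Failed.
observed = [[10, 5], [25, 24]]
for row in expected_frequencies(observed):
    print([round(e, 2) for e in row])
```

The expected frequencies in each row and column add up to the same totals as the observed frequencies, which is a useful check on the arithmetic.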

X2 computational table

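Cell / O / E / |O-E| - 0.5 / (|O-E| - 0.5)^2 / (|O-E| - 0.5)^2/E
Cell 1 / 10 / 8.20 / 1.3 / 1.69 / 0.206
Cell 2 / 5 / 6.80 / 1.3 / 1.69 / 0.249
Cell 3 / 25 / 26.80 / 1.3 / 1.69 / 0.063
Cell 4 / 24 / 22.20 / 1.3 / 1.69 / 0.076
Total / 64 / 64.00 / 5.2 / 6.76 / 0.594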

X2 = 0.206+0.249+0.063+0.076

= 0.594

= 0.59

For a two-tailed test at p = 0.05, X2crit (0.05, 1) = 3.84.

X2 summary table of results

Computed value for X2 / X2 = 0.59
Degree of freedom / df = 1
Significance level / p = 0.05
Critical value for X2 / X2 = 3.84
Region for rejection / Values of X2 which are equal to, or greater than, 3.84

As the X2 obtained (0.59) is less than the critical value (3.84), we fail to reject the null hypothesis. We conclude that there is no significant sex difference in students’ performance in Statistics.

Example 4

Suppose there are 98 students in a Statistics class. Seventy of these are females. If 40 female students passed, compared to 20 male students passing Statistics, will there be any significant sex difference?

Calculation

Category / Pass / Fail / Total
Females / 40 / 30 / 70
Males / 20 / 8 / 28
Totals / 60 / 38 / 98

df = (2-1)(2-1) = 1. Thus we use the formula with Yates’ correction:

$$X^2 = \sum \frac{(|O - E| - 0.5)^2}{E}$$

X2 computational table

Cell / O / E / |O-E| - 0.5 / (|O-E| - 0.5)^2/E
Males pass / 20 / 17.143 / 2.357 / 0.324
Females pass / 40 / 42.857 / 2.357 / 0.130
Females fail / 30 / 27.143 / 2.357 / 0.205
Males fail / 8 / 10.857 / 2.357 / 0.512
Total / 98 / 98.000 / 9.428 / 1.171

X2 = 1.171

X2 summary table of results

Computed value for X2 / X2 = 1.171
Degree of freedom / df = 1
Significance level / p = 0.05
Critical value for X2 / X2 = 3.84

Region for rejection / Values of X2 which are equal to, or greater than, 3.84

As the X2 obtained (1.171) is less than the critical value (3.84), we fail to reject the null hypothesis.

We conclude that there is no significant sex difference in students’ performance in Statistics.
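The whole of Example 4 can be checked with the Python sketch below, which computes each expected frequency from the row and column totals and adds up the Yates-corrected terms. Variable names are illustrative. Without intermediate rounding the statistic comes to about 1.17, in line with the table.

```python
observed = [[40, 30], [20, 8]]       # rows: Females, Males; columns: Pass, Fail

row_totals = [sum(r) for r in observed]
col_totals = [sum(c) for c in zip(*observed)]
n = sum(row_totals)                  # 98 students

x2_obs = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n      # expected frequency
        x2_obs += (abs(o - e) - 0.5) ** 2 / e      # Yates-corrected term

df = (len(observed) - 1) * (len(observed[0]) - 1)  # (r - 1)(c - 1)
print(round(x2_obs, 2), df, x2_obs >= 3.84)
```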

Example 5

Forty-six Statistics students were sampled from three Zimbabwean universities: MSU, MASU and NUST. They were given a common Statistics test. Given the data below, is there a significant difference in their performance?

Category / Pass / Fail / Total
MSU / 15 / 2 / 17
MASU / 7 / 6 / 13
NUST / 4 / 12 / 16
Total / 26 / 20 / 46

Calculation

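df = (3-1)(2-1) = 2. Since df is greater than 1, Yates’ correction is not needed. Therefore we apply the formula:

$$X^2 = \sum \frac{(O - E)^2}{E}$$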

X2 computational table

O / E / (O-E)2/E
15 / 9.61 / 3.02
7 / 7.35 / 0.02
4 / 9.04 / 2.81
2 / 7.39 / 3.93
6 / 5.65 / 0.02
12 / 6.96 / 3.91
 / 46 / 46 / 13.71

X2 = 3.04 + 0.022 + 2.78 + 3.94 + 0.016 + 3.57

X2 =13.71

For two degrees of freedom, the critical value at the 0.05 level is 5.99.

X2summary table of results

Computed value for X2 / X2 = 3.71
Degree of freedom / df = 2
Significance level / p = 0.05
Critical value for X2 / X2 = 5.99

Region for rejection

/ Values ofX2 whichare equal to, or greater than, 5.99

Since 13.71 is greater than 5.99, we can reject the null hypothesis and conclude that the differences are statistically significant. We conclude that there is no significant difference in the performance of students from the three universities.
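Example 5 can be verified with the following Python sketch of the two-way test without Yates’ correction (df = 2). The function name `chi_square_independence` is an assumption introduced for illustration. Computed without intermediate rounding, the NUST fail cell contributes (12 - 6.96)^2/6.96, about 3.66, and the six contributions sum to about 13.46, which exceeds 5.99.

```python
def chi_square_independence(table):
    """Return (X2, df) for a two-way table, with no continuity correction."""
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected frequency
            x2 += (o - e) ** 2 / e
    df = (len(table) - 1) * (len(table[0]) - 1)    # (r - 1)(c - 1)
    return x2, df


# Rows: MSU, MASU, NUST; columns: Pass, Fail.
x2_obs, df = chi_square_independence([[15, 2], [7, 6], [4, 12]])
print(round(x2_obs, 2), df, x2_obs >= 5.99)
```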

Meeting the requirement of assumption number 4

Sometimes the given data may not satisfy assumption number 4: that is, the expected frequency for each category must not be less than 5 for df ≥ 2 and not less than 10 for df = 1. This applies to one-way and two-way designs. When one or more expected frequencies do not meet these criteria, it might be possible to collapse adjacent cells (one-way designs) or to collapse adjacent rows, columns, or both (two-way designs) in order to increase the expected frequency to its minimally required value. When collapsing, the combinations of cells, rows or columns must make conceptual sense.

Example 6

A random sample of 100 MSU lecturers was performance appraised according to the following ratings: Outstanding, Competent, Satisfactory, Unsatisfactory and Poor. Data on the relationship between sex and job performance are provided in the table below.

Category / Male / Female / Total
Outstanding / 5 / 2 / 7
Competent / 30 / 13 / 43
Satisfactory / 25 / 8 / 33
Unsatisfactory / 5 / 3 / 8
Poor / 5 / 4 / 9
 / 70 / 30 / 100

In order to determine whether or not to collapse, the smallest expected frequency is calculated using the formula:

$$E_{\text{smallest}} = \frac{\text{smallest row total} \times \text{smallest column total}}{N}$$

Since the smallest row total is 7 and the smallest column total is 30,

$$E_{\text{smallest}} = \frac{7 \times 30}{100} = 2.1$$

Since this expected frequency is less than 5 with df = (5-1)(2-1) = 4, we should consider collapsing adjacent performance ratings. The top two and the bottom two categories could be combined quite logically as shown in the table below.

Category / Male / Female / Total
Outstanding or Competent / 35 / 15 / 50
Satisfactory / 25 / 8 / 33
Unsatisfactory or Poor / 10 / 7 / 17
Total / 70 / 30 / 100

Now the smallest expected frequency is

$$E_{\text{smallest}} = \frac{17 \times 30}{100} = 5.1$$

Since df 2 and the smallest expected frequency exceeds 5, the requirements of assumption number 4 are now met. Therefore we can now do our calculations as before.

*Now carry out a complete hypothesis test for Example 6, then do Example 7 below. E-mail your answers to me before the end of the semester.

Example 7

The following data were collected in a study of relationship between average number of hours spent studying per day and performance in Statistics at MSU.

Hours per day / Performance: Pass / Performance: Fail / Total
1 / 2 / 7 / 9
2 / 58 / 25 / 83
3 / 65 / 13 / 78
4 / 25 / 5 / 30
Total / 150 / 50 / 200
