1. Members of a certain club are required to register for one of three games: billiards, snooker or darts.
The number of club members of each gender choosing each game in a particular year is shown in the table below.
Billiards / Snooker / Darts
Male / 39 / 16 / 8
Female / 21 / 14 / 17
(a) Use a χ² (chi-squared) test at the 5% significance level to test whether choice of game is independent of gender. State clearly the null and alternative hypotheses tested, the expected values, and the number of degrees of freedom used.
(13)
The following year the choice of games was widened and the figures for that year are as follows:
Billiards / Snooker / Darts / Fencing
Male / 4 / 15 / 8 / 10
Female / 10 / 21 / 17 / 37
(b)If the test were applied to this new set of data,
(i)why would it be necessary to combine billiards with another game?
Result is less than 5 for male billiards
(ii)which other game would you combine with billiards and why?
(2)
A club member is to be selected at random.
(c)What is the probability that the club member selected is a
(i)female who chose billiards or snooker?
(ii)male or female who chose darts or fencing?
(2)
(Total 17 marks)
(a)
Billiards / Snooker / Darts / Totals
Male (expected) / 32.9 / 16.4 / 13.7 / 63
Female (expected) / 27.1 / 13.6 / 11.3 / 52
Totals / 60 / 30 / 25 / 115
(A3)
H0: Choice of game is independent of gender(A1)
H1: Choice of game is not independent of gender(A1)
Degree of freedom: (3 – 1)(2 – 1) = 2(A1)
χ² = Σ (fo – fe)²/fe (M2)
= 7.77 (3 s.f.) [or 7.79 from GDC] (A1)
But χ²(2) = 5.99 (from table) (M1)
χ² = 7.77 > χ²(2) = 5.99, so we reject H0 (A1)(R1)
Hence: Choice of game is dependent on gender.(A1)13
(b)(i)The frequency for males choosing Billiards is less than 5(R1)
(ii)Snooker – In order to preserve the diversity of games(R1)
OR
Darts – it has the next smallest number of members(R1)2
(c)(i) 31/122 or 0.254 (3 s.f.) (A1)
(ii) 72/122 or 0.590 (3 s.f.) (A1) 2
[17]
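The calculation in part (a) of Question 1 can be checked with a short script. This is a minimal sketch in plain Python and is not part of the mark scheme; the function name `chi_squared_independence` is an illustrative choice. It builds each expected frequency from the row and column totals and sums (fo – fe)²/fe over every cell.

```python
def chi_squared_independence(observed):
    """Return (expected, statistic, degrees_of_freedom) for a contingency table.

    `observed` is a list of rows of counts. This helper is a hypothetical
    illustration, not something defined in the worksheet.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected count for a cell = row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Chi-squared statistic: sum of (fo - fe)^2 / fe over all cells.
    statistic = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(observed, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    dof = (len(observed) - 1) * (len(col_totals) - 1)
    return expected, statistic, dof


# Question 1(a): rows are Male, Female; columns are Billiards, Snooker, Darts.
games = [[39, 16, 8], [21, 14, 17]]
expected, chi2, dof = chi_squared_independence(games)
print([[round(fe, 1) for fe in row] for row in expected])
print(round(chi2, 2), dof)
```

With unrounded expected values the statistic comes out at about 7.79, the GDC figure quoted in the mark scheme; the 7.77 arises when the expected values are first rounded to one decimal place.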
2.For his Mathematical Studies Project a student gave his classmates a questionnaire to fill out. The results for the question on the gender of the student and specific subjects taken by the student are given in the table below, which is a 2 × 3 contingency table of observed values.
History / Biology / French
Female / 22 / 20 / 18 / (60)
Male / 20 / 11 / 9 / (40)
(42) / (31) / (27)
The following is the table for the expected values.
History / Biology / French
Female / p / 18.6 / 16.2
Male / q / r / 10.8
(a)Calculate the values of p, q and r.
(3)
The chi-squared test is used to determine if the choice of subject is independent of gender, at the 5% level of significance.
(b)(i)State a suitable null hypothesis H0.
(ii)Show that the number of degrees of freedom is two.
(iii)Write down the critical value of chi-squared at the 5% level of significance.
(3)
(c)The calculated value of chi-squared is 1.78. Do you accept H0? Explain your answer.
(2)
(Total 8 marks)
(a) p = 25.2, q = 16.8, r = 12.4 (A1)(A1)(A1) 3
(b)(i)H0: There is no connection between gender and subject taken.(C1)
(ii)Degrees of freedom = (3 – 1)(2 – 1) = 2 × 1(M1)
= 2(AG)
(iii) χ²(2) = 5.99 (A1) 3
(c)Accept H0(C1)
Since 1.78 < 5.99(R1)2
[8]
3.A survey was conducted in a company to determine whether position in upper management was independent of gender. The results of this survey are tabulated below.
Managers / Junior executives / Senior executives / Totals
Male / 95 / 130 / 75 / 300
Female / 65 / 110 / 25 / 200
Totals / 160 / 240 / 100 / 500
The table below shows the expected number of males and females at each level, if they were represented proportionally to the total numbers of males and females employed.
Managers / Junior executives / Senior executives / Totals
Male / a / c / 60 / 300
Female / b / d / 40 / 200
Totals / 160 / 240 / 100 / 500
(a)(i)Show that the expected number of Male Managers (a) is 96.
(ii)Hence find the values of b, c and d.
(5)
(b)(i)Write a suitable null hypothesis for this data.
(ii)Write a suitable alternate hypothesis for this data.
(2)
(c)(i)Perform a chi-squared test of independence for this data to show the value of χ2 is 12.8 to 3 significant figures.
(ii)Calculate the number of degrees of freedom, and write down the critical value of χ2 at the 5% significance level.
(iii)What conclusion can be drawn regarding gender and position in upper management?
(6)
(Total 13 marks)
(a)(i) Expected number of male managers
= (300/500) × (160/500) × 500 = (300 × 160)/500 (M1)(A1)
= 96 (AG)
(ii)b = 160 – 96 = 64(A1)
c = 300 – 96 – 60 = 144(A1)
d= 240 – 144 = 96(A1)5
(b)(i)H0: Position is independent of gender(A1)
(ii)H1: Position is dependent on gender(A1)2
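(c)(i) χ² = (95 – 96)²/96 + (130 – 144)²/144 + (75 – 60)²/60 + (65 – 64)²/64 + (110 – 96)²/96 + (25 – 40)²/40 (M1)(A2)
Note: Award (M1) for using the χ² formula Σ(fo – fe)²/fe, (A2) for all values correct.
Special case: Award (A1) if only 1 value is incorrect.
= 12.8 (AG)
(ii) 2 degrees of freedom (A1)
Critical value of χ² = 5.991 (A1)
(iii) 12.8 > 5.991, so reject H0 and accept H1: (R1) 6
Position is dependent on gender.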
[13]
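The working for Question 3 can be reproduced step by step. The sketch below is an illustration only, with variable names chosen here rather than taken from the paper. It computes a from the margins, obtains b, c and d by subtraction as in the mark scheme, and then sums the six χ² terms.

```python
# Question 3: observed counts, in column order Managers, Junior, Senior.
observed = {"Male": [95, 130, 75], "Female": [65, 110, 25]}
row_totals = {"Male": 300, "Female": 200}
col_totals = [160, 240, 100]
grand_total = 500

# Part (a)(i): expected male managers = row total * column total / grand total.
a = row_totals["Male"] * col_totals[0] / grand_total

# Part (a)(ii): the remaining cells follow from the margins by subtraction,
# using the expected value 60 for male senior executives given in the table.
b = col_totals[0] - a
c = row_totals["Male"] - a - 60
d = col_totals[1] - c

expected = {"Male": [a, c, 60], "Female": [b, d, 40]}

# Part (c)(i): sum (fo - fe)^2 / fe over the six cells.
chi2 = sum(
    (fo - fe) ** 2 / fe
    for gender in observed
    for fo, fe in zip(observed[gender], expected[gender])
)
print(a, b, c, d, round(chi2, 1))
```

Only two cells need to be computed directly before the margins fix the rest, which is one way to see why a 2 × 3 table has 2 degrees of freedom.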
4.In the small town of Joinville, population 1000, an election was held.
The results were as follows:
Candidate A / 295 / 226
Candidate B / 313 / 166
In (a) to (c) below we will use a chi-squared test to decide whether the choice of candidate depends on where the voter lives.
Null Hypothesis H0:The choice of candidate is independent of where
the voter lives.
(a)(i)Write down the alternative hypothesis.
(ii) Use the information above to fill in a and b in the table below.
Cell / fo / fe / fo – fe / (fo – fe)²
1 / 295 / 317 / –22 / 484
2 / 226 / 204 / 22 / 484
3 / 313 / 291 / 22 / 484
4 / 166 / a / b / 484
(3)
(b)(i)Calculate the chi-squared statistic.
(ii)Write the number of degrees of freedom.
(iii) At the 5% significance level, state the chi-squared critical value.
(5)
(c)(i)Hence, state your conclusion.
(ii)Explain why you reached this conclusion.
(2)
(Total 10 marks)
(a)(i)Alternative hypothesis: Choice of candidates depends on
voter location.(A1)
(ii)a =188(A1)
b = –22(A1)3
(b)(i) χ² = 484/317 + 484/204 + 484/291 + 484/188 (M1)
= 8.14 (accept 8.13) (A1)
Note: Award (G2) for 7.97.
(ii)v = 1(A1)
(iii) χ²(0.95, 1) (M1)
= 3.84(A1)5
(c)(i)Where voters live does affect how they vote.(A1)
(ii) χ² = 8.14 > 3.84, so we reject the null hypothesis. (R1) 2
[10]
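The mark scheme for Question 4 quotes two values, 8.14 from the table and 7.97 awarded (G2). A plausible reading, sketched below as an assumption rather than a statement from the paper, is that 8.14 uses the expected values rounded to whole numbers as printed in the table, while 7.97 is what a calculator gives from unrounded expected values.

```python
# Question 4: rows are Candidate A and Candidate B; columns are the two areas.
observed = [[295, 226], [313, 166]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)


def chi_squared(obs, exp):
    """Sum (fo - fe)^2 / fe over every cell of two equally shaped tables."""
    return sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(obs, exp)
        for fo, fe in zip(obs_row, exp_row)
    )


# Unrounded expected values, as a calculator would use them.
exact = [[r * c / grand_total for c in col_totals] for r in row_totals]
# Whole-number expected values, as printed in the question's table.
rounded = [[round(fe) for fe in row] for row in exact]

chi2_rounded = chi_squared(observed, rounded)
chi2_exact = chi_squared(observed, exact)
print(rounded, round(chi2_rounded, 2), round(chi2_exact, 2))
```

Both values exceed the critical value 3.84, so the conclusion is the same either way.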
5.A bag containing 60 sweets is opened. The bag contains sweets of the following colours.
Colour / Frequency
Red / 18
Orange / 17
Green / 10
Purple / 9
Blue / 6
According to the manufacturer, the various colours should have the following percentages.
Colour / Percentage
Red / 35%
Orange / 25%
Green / 20%
Purple / 15%
Blue / 5%
(a)Calculate the expected number of sweets of each colour in a bag containing exactly 60 sweets.
(3)
Before you can perform the chi-squared test on this data, it is necessary to combine the data for one of the colours with that of another colour.
(b)Which colour is this and why is this necessary?
(2)
(c)Using the chi-squared test at the 5% significance level, investigate the hypothesis that the sweets in the packet may be regarded as a random sample. Remember to state the null hypothesis, the number of degrees of freedom and the critical value of chi-squared.
(7)
(Total 12 marks)
(a)
Colour / % / Expected
Red / 35 / 21
Orange / 25 / 15
Green / 20 / 12
Purple / 15 / 9
Blue / 5 / 3
(A3) 3
Note: Award (A3) for all 5 correct expected values, (A2) for 4 correct and (A1) for 3 correct.
(b)Blue(A1)
Its expected frequency (3) is less than 5 (R1) 2
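(c) H0: The sweets in the packet may be regarded as a random sample (A1)
d.o.f. = (4 – 1)(2 – 1) = 3 (A1)
critical value = 7.815 (A1)
χ² = (18 – 21)²/21 + (17 – 15)²/15 + (10 – 12)²/12 + (15 – 12)²/12, with Purple and Blue combined (M1)
= 1.78 (A1)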
1.78 < 7.815(A1)
Therefore, accept H0(A1)7
[12]
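The steps of Question 5 can be sketched in a few lines of Python. The variable names are illustrative. Merging Blue with Purple is an assumption inferred from the mark scheme's value of 1.78, which this pairing reproduces; the paper itself does not name the second colour.

```python
observed = {"Red": 18, "Orange": 17, "Green": 10, "Purple": 9, "Blue": 6}
percentages = {"Red": 35, "Orange": 25, "Green": 20, "Purple": 15, "Blue": 5}
bag_size = sum(observed.values())

# Part (a): expected count = bag size * manufacturer's percentage.
expected = {colour: bag_size * pct / 100 for colour, pct in percentages.items()}

# Part (b): colours whose expected count is below 5 must be merged.
too_small = [colour for colour, fe in expected.items() if fe < 5]

# Assumption: Blue is merged with Purple (inferred, not stated in the paper).
groups = [("Red",), ("Orange",), ("Green",), ("Purple", "Blue")]
merged_observed = [sum(observed[c] for c in group) for group in groups]
merged_expected = [sum(expected[c] for c in group) for group in groups]

# Part (c): goodness-of-fit statistic over the four remaining categories.
chi2 = sum((fo - fe) ** 2 / fe for fo, fe in zip(merged_observed, merged_expected))
dof = len(groups) - 1
print(too_small, merged_observed, merged_expected, round(chi2, 2), dof)
```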
6.The veterinarian has gathered the following data about the weight of dogs and the weight of their puppies.
Dog: Heavy / Dog: Light / Total
Puppy: Heavy / 36 / 27 / 63
Puppy: Light / 22 / 35 / 57
Total / 58 / 62 / 120
The veterinarian wishes to test the following hypotheses.
H0: A puppy’s weight is independent of its parent’s weight.
H1: A puppy’s weight is related to the weight of its parent.
(a) The table below sets out the elements required to calculate the χ² value for this data.
fo / fe / fe – fo / (fe – fo)² / (fe – fo)²/fe
heavy/heavy / 36 / 30.45 / –5.55 / 30.8025 / 1.012
heavy/light / 27 / 32.55 / 5.55 / 30.8025 / 0.946
light/heavy / 22 / 27.55 / 5.55 / 30.8025 / 1.118
light/light / 35 / a / b / c / d
(i)Write down the values of a, b, c, and d.
(4)
(ii) What is the value of χ² for this data?
(1)
(iii)How many degrees of freedom exist for the contingency table?
(1)
(iv) Write down the critical value of χ² for the 5% significance level.
(1)
(b)Should H0 be accepted? Explain why.
(2)
(Total 9 marks)
(a)(i) a = 29.45, b = –5.55, c = 30.8025, d = 1.046 (A4)
(ii) χ² = 1.012 + 0.946 + 1.118 + 1.046 = 4.12 (A1)
(iii) degrees of freedom = (2 – 1)(2 – 1) = 1 (A1)
(iv) critical value of χ² = 3.84 (A1) 7
(b)Do not accept H0. The weight of a puppy is related to the weight
of the parent.(A1)(R1)2
[9]
7.The following table of observed results gives the number of candidates taking a Mathematics examination classified by gender and grade obtained.
Grade 5, 6 or 7 / Grade 3 or 4 / Grade 1 or 2 / Total
Males / 5000 / 3400 / 600 / 9000
Females / 6000 / 4000 / 1000 / 11000
Total / 11000 / 7400 / 1600 / 20000
The question posed is whether gender and grade obtained are independent.
(a)Show clearly that the expected number of males achieving a grade of 5, 6 or 7 is 4950.
(2)
(b) A χ² test is set up.
(i)State the Null hypothesis.
(1)
(ii)State the number of degrees of freedom.
(1)
(iii) The calculated χ² value at the 5% test level is 39.957.
Write down the critical value of χ² at the 5% level of significance.
(1)
(iv)What can you say about gender and grade obtained?
(1)
(Total 6 marks)
(a) Males = (9000 × 11000)/20000 (M1)(A1)
= 4950(AG)2
(b)(i)That gender and grade obtained are independent.(A1)
(There is no connection between gender and grade obtained.)
(ii)(3 – 1)(2 – 1) = 2(A1)
(iii) χ² = 5.991 (A1)
(iv) Calculated χ² = 39.957 > 5.991
Therefore, reject the null hypothesis. Gender and grade obtained are dependent (or there is a connection between gender and grade). (R1) 4
[6]
8. A researcher consulted 500 men and women to see if the colour of the car they drove was independent of gender. The colours were red, green, blue, black and silver. A χ² test was conducted at the 5% significance level and the χ² value was found to be 8.73.
(a)Write down the null hypothesis.
(b)Find the number of degrees of freedom for this test.
(c)Write down the critical value for this test.
(d)Is car colour independent of gender? Give a clear reason for your answer
(Total 6 marks)
(a)Colour of car and gender are independent(A1)(C1)
(b)(2 – 1) (5 – 1)(M1)
= 4(A1)
OR
4(A2)(C2)
(c) χ² = 9.488 (A1)(C1)
(d)Yes. Test statistic is smaller than the critical value.(A1)(R1)(C2)
[6]
9.Tom performs a chi-squared test to see if there is any association between the time to prepare for a penalty kick (short time, medium time and long time) and the outcome (scores a goal, doesn’t score a goal). Tom performs this test at the 10% level.
(a)Write down the null hypothesis.
(b)Find the number of degrees of freedom for this test.
(c)The p-value for this test is 0.073. What conclusion can Tom make? Justify your answer.
(Total 6 marks)
(a)Time to prepare is independent of outcome, or, there is no association
between time to prepare and the outcome(A1)(C1)
(b)2(A1)(C1)
(c)0.073 < 0.10 For comparing 0.073 with 0.10 or 10%(M1)
For < or saying “less than”(M1)
Reject H0(A1)
Time and outcome are not independent of each other or equivalent
in words relating to the question(A1)(C4)
[6]
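The decision rule used in part (c) of Question 9 is the same whenever a p-value is given: reject H0 when the p-value is below the significance level. A minimal sketch follows; the function name `reject_null` is hypothetical.

```python
def reject_null(p_value, significance_level):
    """Return True when H0 is rejected, i.e. the p-value is below the level."""
    return p_value < significance_level


# Tom's test: p-value 0.073 at the 10% level.
print(reject_null(0.073, 0.10))

# For comparison only: the same p-value would not reject H0 at the 5% level.
print(reject_null(0.073, 0.05))
```

The second call is an illustration of why the chosen level matters and is not part of the question.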
10.The eye colour and gender of 500 students are noted and the results are indicated in the table below.
Blue / Brown / Green
Male / 18 / 152 / 50
Female / 40 / 180 / 60
It is believed that eye colour is related to gender in a school in Banff. It is decided to test this hypothesis by using a χ² test at the 5% level of significance.
(a)Write down the null hypothesis for this experiment.
(1)
(b)Show that the number of degrees of freedom is 2.
(1)
(c)Write down the critical value for the degrees of freedom.
(1)
(d)Calculate the test statistic for this data.
(2)
(e)Does the evidence suggest that eye colour is related to gender in this school? Give a clear reason for your answer.
(2)
(Total 7 marks)
(a)Eye colour and gender are independent.
OR
There is no relationship (association) between eye colour and gender.(A1)1
(b)(2 – 1)(3 – 1)(M1)
= 2(AG)1
(c)5.991 (5.99)(A1)1
(d)4.48(G2)2
(e) For comparing χ² test statistic with χ² critical value (A1)
No, eye colour is not related to gender
χ² test statistic < χ² critical value (R1)
OR
For comparing their p-value with 0.05
No, eye colour is not related to gender(A1)
p-value of 0.106 > 0.05(R1)2
[7]
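The p-value of 0.106 quoted for Question 10 can be checked without tables. For an even number of degrees of freedom the upper-tail probability of the χ² distribution has a closed form: exp(–x/2) multiplied by the sum of (x/2)^k / k! for k from 0 to dof/2 – 1. The sketch below uses only the standard library; the function name is an illustrative choice and the formula does not cover odd degrees of freedom.

```python
import math


def chi2_p_value_even_dof(statistic, dof):
    """Upper-tail chi-squared probability, valid only for even positive dof."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this closed form only covers positive even dof")
    half = statistic / 2
    # Partial sum of the exponential series with dof/2 terms.
    series = sum(half ** k / math.factorial(k) for k in range(dof // 2))
    return math.exp(-half) * series


# Question 10: test statistic 4.48 with 2 degrees of freedom.
p_value = chi2_p_value_even_dof(4.48, 2)
print(round(p_value, 3))

# Sanity check: the tabulated 5% critical value should give a p-value near 0.05.
print(round(chi2_p_value_even_dof(5.991, 2), 3))
```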
11.In a competition the number of males and females taking part in different swimming races is given in the table of observed values below.
Backstroke (100 m) / Freestyle (100 m) / Butterfly (100 m) / Breaststroke (100 m) / Relay (4 × 100 m)
Male / 30 / 90 / 31 / 29 / 20
Female / 28 / 63 / 20 / 37 / 12
The Swimming Committee decides to perform a χ² test at the 5% significance level in order to test if the number of entries for the various strokes is related to gender.
(a)State the null hypothesis.
(1)
(b)Write down the number of degrees of freedom.
(1)
(c)Write down the critical value of χ2.
(1)
The expected values are given in the table below:
Backstroke (100 m) / Freestyle (100 m) / Butterfly (100 m) / Breaststroke (100 m) / Relay (4 × 100 m)
Male / 32 / a / 28 / 37 / 18
Female / 26 / 68 / 23 / b / 14
(d) Calculate the values of a and b.
(2)
(e)Calculate the χ2 value.
(3)
(f)State whether or not you accept your null hypothesis and give a reason for your answer.
(2)
(Total 10 marks)
(a)H0 : number of entries is independent of gender.(A1)1
(b)4(A1)1
(c)9.488(A1)1
(d)a = 85, b = 29(A1)(A1)2
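(e) χ² = (30 – 32)²/32 + (90 – 85)²/85 + (31 – 28)²/28 + (29 – 37)²/37 + (20 – 18)²/18 + (28 – 26)²/26 + (63 – 68)²/68 + (20 – 23)²/23 + (37 – 29)²/29 + (12 – 14)²/14 (M1)(A1)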
= 6.10 (using given values)(A1)
OR
5.80 (from calculator)(G3)3
(f) Do not reject the null hypothesis as the χ² value is less than the critical value.
So, gender and stroke are independent.(A1)(R1)2
(Also allow “accept”).
[10]
12.(a)For his Mathematical Studies project, Marty set out to discover if stress was related to the amount of time that students spent travelling to or from school. The results of one of his surveys are shown in the table below.
Travel time (t mins) / Number of students: high stress / moderate stress / low stress
t ≤ 15 / 9 / 5 / 18
15 < t ≤ 30 / 17 / 8 / 28
30 < t / 18 / 6 / 7
He used a χ² test at the 5% level of significance to find out if there was any relationship between student stress and travel time.
(i)Write down the null and alternative hypotheses for this test.
(2)
(ii)Write down the table of expected values. Give values to the nearest integer.
(3)
(iii)Show that there are 4 degrees of freedom.
(1)
(iv)Calculate the χ2 statistic for this data.
(2)
The χ² critical value for 4 degrees of freedom at the 5% level of significance is 9.488.
(v)What conclusion can Marty draw from this test? Give a reason for your answer.
(2)
(b)Marty asked some of his classmates to rate their level of stress out of 10, with 10 being very high. He also asked them to measure the number of minutes it took them to get from home to school. A random selection of his results is listed below.
Travel time (x) / 13 / 24 / 22 / 18 / 36 / 16 / 14 / 20 / 6 / 12
Stress rating (y) / 3 / 7 / 5 / 4 / 8 / 8 / 4 / 8 / 2 / 6
(i)Write down the value of the (linear) coefficient of correlation for
this information.
(1)
(ii)Explain what a positive value for the coefficient of correlation indicates.
(1)
(iii)Write down the linear regression equation of y on x in the form y = ax + b
(2)
(iv)Use your equation in part (iii) to determine the stress rating for a student who takes three quarters of an hour to travel to school.
(2)
(v)Can your answer in part (iv) be considered reliable? Give a reason for your answer.
(2)
(Total 18 marks)
(a)(i)H0 : level of stress is independent of travel time(A1)
H1 : level of stress is not independent of travel time(A1)(ft)2
(or reasonable equivalents)
(ii) 12.1 / 5.24 / 14.6
20.1 / 8.68 / 24.2
11.8 / 5.08 / 14.2 (M1)(A1)(G2)
Note: (M1) for attempting to calculate expected values by hand, e.g. (32 × 44)/116
12 / 5 / 15
20 / 9 / 24
12 / 5 / 14
Nearest integers (A1)(G3) 3
(iii)df = (r – 1) (c – 1) = (3 – 1)(3 – 1) = 4(M1)(AG)1
(iv) χ² = 9.83(1) (G2) 2
OR χ² = 9.278 if calculated from integer values (M1)(A1)
(v) For χ² = 9.83: Do not accept H0 (A1)(ft)
(Level of stress is not independent of travel time or reasonable equivalent)
because χ² calculated > χ² critical (9.83 > 9.488) or p-value < 0.05 (R1)(ft)
OR
For χ² = 9.278: Accept H0 (A1)(ft)
because χ² calculated < χ² critical (9.278 < 9.488) or p-value > 0.05 (R1)(ft) 2
Note: a correct reason must be given for the (A1) to be awarded.
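The two alternative answers in Question 12(a) arise because rounding the expected values to the nearest integer moves the statistic across the critical value. The sketch below, with illustrative names, computes both versions from the observed table.

```python
# Question 12(a): rows are travel-time bands; columns are high, moderate, low stress.
observed = [[9, 5, 18], [17, 8, 28], [18, 6, 7]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)


def chi_squared(obs, exp):
    """Sum (fo - fe)^2 / fe over every cell of two equally shaped tables."""
    return sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(obs, exp)
        for fo, fe in zip(obs_row, exp_row)
    )


exact = [[r * c / grand_total for c in col_totals] for r in row_totals]
integers = [[round(fe) for fe in row] for row in exact]

chi2_exact = chi_squared(observed, exact)
chi2_integer = chi_squared(observed, integers)
critical_value = 9.488  # given in the question for 4 degrees of freedom at 5%

print(round(chi2_exact, 2), chi2_exact > critical_value)
print(round(chi2_integer, 3), chi2_integer > critical_value)
```

This is why the mark scheme accepts either conclusion, provided the reason matches the value used.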
(b)(i)r = 0.667(A1)1
(ii)Stress rating increases as travel time increases
(or reasonable equivalent eg y increases as x increases).(R1)1
Note:Do not accept “positive correlation”
(iii)y = 0.181x + 2.22
for 0.181x and(A1)
for 2.22(A1)2
Note:For y = 2.22x + 0.181, award (A0)(A1)(ft)
(iv) Putting x = 45 (M1)
0.181× 45 + 2.22
= 10.365 (10.4)(A1)(ft)(G2)2
Notes:Allow 10 or 11 only if the method is shown and is correct.
Allow follow through only if method shown.
(v)not reliable …(A1)
Because result is outside the data range or because the
correlation coefficient not high or the sample is small or
responses are subjective.(R1)2
Note:Award (R1) for any of the above. A correct reason must be given to award the (A1).
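The values in Question 12(b) are expected to come from a GDC, but they can also be reproduced by hand with the least-squares formulas. The following sketch uses only the standard library; the variable names are chosen here for illustration.

```python
import math

travel_time = [13, 24, 22, 18, 36, 16, 14, 20, 6, 12]  # x, in minutes
stress = [3, 7, 5, 4, 8, 8, 4, 8, 2, 6]                # y, rating out of 10

n = len(travel_time)
mean_x = sum(travel_time) / n
mean_y = sum(stress) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in travel_time)
syy = sum((y - mean_y) ** 2 for y in stress)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(travel_time, stress))

r = sxy / math.sqrt(sxx * syy)          # part (i): correlation coefficient
slope = sxy / sxx                       # part (iii): a in y = ax + b
intercept = mean_y - slope * mean_x     # part (iii): b in y = ax + b
prediction = slope * 45 + intercept     # part (iv): three quarters of an hour

print(round(r, 3), round(slope, 3), round(intercept, 2), round(prediction, 1))
print(45 > max(travel_time))  # part (v): is x = 45 outside the data range?
```

The prediction exceeds the maximum rating of 10 and lies beyond the largest observed travel time, which supports the "not reliable" answer in part (v).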
13.Oral tests are conducted by three examiners A, B and C separately. The results of the examination are classified as Credit, Pass or Fail. A χ2 test is applied to the data collected in order to test whether or not the examiners differ in their standard of awards.
(a)State the null hypothesis, H0, for this data.
(b)Write down the number of degrees of freedom.
Of the 135 students who sit the exam, 30 get Credit and 45 are tested by examiner A.
(c)Calculate the expected number of students who get a Credit and are tested by examiner A.
Using a 5% level of significance, the p-value is found to be 0.0327 correct to 3 s.f.
(d)State whether H0 should be accepted. Justify your answer.
(Total 6 marks)
(a)H0 = The standard of award is independent of the examiner (or equivalent)(A1)(C1)
(b)4(A1)(C1)
(c) (30 × 45)/135 (M1)
= 10 (A1)(C2)
(d)No, because the p-value is less than the significance level.(A2)
OR
No, because 0.0327 < 0.05 (A2)(C2)
[6]
14. The local park is used for walking dogs. The sizes of the dogs are observed at different times of the day. The table below shows the numbers of dogs present, classified by size, at three different times last Sunday.
Small Medium Large
(a)Write a suitable null hypothesis for a χ2 test on this data.
(b)Write down the value of χ2 for this data.
(c)The number of degrees of freedom is 4. Show how this value is calculated.
The critical value, at the 5% level of significance, is 9.488.
(d)What conclusion can be drawn from this test? Give a reason for your answer.
(Total 6 marks)
(a)Ho: The size of dog is independent of the time of day, (or equivalent)(A1)(C1)
Note:Award (A0) for ‘no correlation’
(b) χ² = 4.33 (accept 4.328) (M1)(A1)(C2)
Note: GDC use is anticipated but candidates might calculate this by hand. (M1) can be awarded for a reasonable attempt to use the formula.
(c)(3–1)(3–1) = 4(A1)(C1)
Note:Award mark for left hand side seen.
(d)The hypothesis should not be rejected, (allow ‘accept Ho’)
OR
The size of dog is independent of the time of day(A1)(ft)
4.33 < 9.488 or 0.363 > 0.05(R1)(ft)(C2)
Notes: Allow χ²calc < χ²crit only if a value for χ²calc is seen somewhere.
Award (R1)(ft) for comparing the values and (A1)(ft) if the conclusion is valid according to the comparison given. If no reason is given, or if the reason is wrong both marks are lost. Note that (A0)(R1)(ft) can be awarded but (A1)(R0) cannot.
[6]
15.Manuel conducts a survey on a random sample of 751 people to see which television programme type they watch most from the following: Drama, Comedy, Film, News. The results are as follows.
Drama / Comedy / Film / News
Males under 25 / 22 / 65 / 90 / 35
Males 25 and over / 36 / 54 / 67 / 17
Females under 25 / 22 / 59 / 82 / 15
Females 25 and over / 64 / 39 / 38 / 46
Manuel decides to ignore the ages and to test at the 5% level of significance whether the most watched programme type is independent of gender.
(a)Draw a table with 2 rows and 4 columns of data so that Manuel can perform a chi-squared test.
(3)
(b)State Manuel’s null hypothesis and alternative hypothesis.
(1)
(c)Find the expected frequency for the number of females who had “Comedy” as their most-watched programme type. Give your answer to the nearest whole number.
(2)
(d)Using your graphic display calculator, or otherwise, find the chi-squared statistic for Manuel’s data.
(3)
(e)(i)State the number of degrees of freedom available for this calculation.
(ii)State the critical value for Manuel’s test.
(iii)State his conclusion.
(3)
(Total 12 marks)
(a)
Drama / Comedy / Film / News
Males / 58 / 119 / 157 / 52
Females / 86 / 98 / 120 / 61
(M1)(M1)(A1)3
(b)H0: favourite TV programme is independent of gender or no association between favourite TV programme and gender
H1: favourite TV programme is dependent on gender (must have both)(A1)1
(c) (365 × 217)/751 (M1)
= 105(A1)(ft)(G2)2
(d)12.6 (accept 12.558)(G3)3
(e)(i)3(A1)
(ii)7.815 (accept 7.82)((ft) from their (i))(A1)(ft)
(iii)reject H0 or equivalent statement (eg accept H1)(A1)(ft)3
[12]
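Parts (a), (c) and (d) of Question 15 can be checked together. The sketch below is illustrative only: it collapses the four age-and-gender rows into two gender rows, then computes the expected frequency for females choosing Comedy and the χ² statistic. The dictionary layout and names are assumptions made for this example.

```python
# Question 15: counts in column order Drama, Comedy, Film, News.
survey = {
    ("Male", "under 25"): [22, 65, 90, 35],
    ("Male", "25 and over"): [36, 54, 67, 17],
    ("Female", "under 25"): [22, 59, 82, 15],
    ("Female", "25 and over"): [64, 39, 38, 46],
}

# Part (a): ignore age by adding the two rows for each gender.
by_gender = {"Male": [0, 0, 0, 0], "Female": [0, 0, 0, 0]}
for (gender, _age), counts in survey.items():
    by_gender[gender] = [t + c for t, c in zip(by_gender[gender], counts)]

row_totals = {gender: sum(counts) for gender, counts in by_gender.items()}
col_totals = [sum(col) for col in zip(*by_gender.values())]
grand_total = sum(row_totals.values())

# Part (c): expected frequency for females whose most-watched type is Comedy.
expected_female_comedy = row_totals["Female"] * col_totals[1] / grand_total

# Part (d): chi-squared statistic over all eight cells.
chi2 = sum(
    (fo - row_totals[gender] * ct / grand_total) ** 2
    / (row_totals[gender] * ct / grand_total)
    for gender, counts in by_gender.items()
    for fo, ct in zip(counts, col_totals)
)
dof = (len(by_gender) - 1) * (len(col_totals) - 1)
print(by_gender, round(expected_female_comedy), round(chi2, 3), dof)
```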
16. A survey of 400 people is carried out by a market research organization in two different cities, Buenos Aires and Montevideo. The people are asked which brand of cereal they prefer out of Chocos, Zucos or Fruti. The table below summarizes their responses.
