Chi-squared test
Supplemental Instruction
Iowa State University
Leader: Kalyca
Course: BIOL 313
Instructor: Vollbrecht

Introduction: This note sheet discusses how to use the chi-squared test.

Material

The chi-squared test is a statistical test that will either show that your expected genotypic or phenotypic ratio is wrong or that it is not necessarily wrong. Technically, we don’t actually show that our proposed ratio is right; we just show that we can’t prove that we are wrong. For the sake of simplicity though, the test and this worksheet will talk about accepting the null hypothesis as shorthand for this idea of failing to prove ourselves wrong.

So where do we start with the chi-squared test? We have, at the very least, a set of data that matches counts of offspring to certain phenotypes or genotypes. Most often, the question will provide you with an expected ratio to work from, but we will also discuss how to recognize ratios here. Let’s use the following data from a fruit fly cross:

Normal Wings / Miniature Wings
70 / 30

The first thing we’ll assume here is that the miniature wing phenotype is recessive to normal wings. None of the standard ratios would fit the data otherwise, because the recessive phenotype never makes up the majority of the offspring in them. For normal dominant – recessive allele relationships, there are three phenotypic ratios you’ll run into:

  1. 100% for a certain trait. This ratio (1:0) implies that at least one of the parents was homozygous dominant. You can’t tell anything about the other parent.
  2. 3:1 is the ratio indicative of a heterozygote – heterozygote cross.
  3. 1:1 is the ratio indicative of a heterozygote crossed with a recessive homozygote.

For monohybrid crosses, this is all you will see under standard conditions. You can develop a few more ratios if you can distinguish between genotypes, but this is sufficient for now. So, we now need to decide on our assumed ratio. For the above data, 1:0 and 1:1 are both clearly far off: out of 100 offspring they would predict 100:0 and 50:50, while 3:1 predicts 75:25. The 3:1 ratio is most likely, so we will proceed under the assumption that both parents were heterozygous.

To perform the calculation for chi-squared, we need to know what values we would have expected from the above cross. So, add up the total number of offspring:

70 + 30 = 100

Now, use the ratio you assumed to solve for the expected number of offspring of each type. In this case, we expected 3 dominant: 1 recessive in terms of traits expressed. This is the same as ¾ dominant and ¼ recessive.

Expected Dominant = ¾ * 100 = 75

Expected Recessive = ¼ * 100 = 25
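The same scaling step can be sketched in a few lines of Python. This is only an illustration: the function name expected_counts is made up for this sheet, and it assumes the ratio is given as whole-number parts such as [3, 1].

```python
from fractions import Fraction

def expected_counts(observed, ratio):
    """Scale a hypothesized ratio (e.g. [3, 1]) to the observed total."""
    total = sum(observed)   # total number of offspring
    parts = sum(ratio)      # 3 + 1 = 4 parts for a 3:1 ratio
    # Fraction keeps 3/4 and 1/4 exact until the final conversion to float.
    return [float(Fraction(r, parts) * total) for r in ratio]

observed = [70, 30]         # normal wings, miniature wings
print(expected_counts(observed, [3, 1]))   # [75.0, 25.0]
```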

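Now we can do the chi-squared calculation. For each category, take (Observed – Expected)² / Expected, then add the results together:

χ² = (70 – 75)²/75 + (30 – 25)²/25 = 25/75 + 25/25 = 0.33 + 1.00 = 1.33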
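As a cross-check on the arithmetic, here is a minimal Python sketch of the chi-squared sum for the wing data. The function name chi_squared is just illustrative; it assumes the observed and expected counts are listed in the same order.

```python
def chi_squared(observed, expected):
    """Sum (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Observed 70 normal : 30 miniature, expected 75 : 25 under a 3:1 ratio.
stat = chi_squared([70, 30], [75, 25])
print(round(stat, 2))   # 1.33
```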

So our chi-squared value is 1.33. Now, in order to see what that number means, we need to use a chi-squared distribution table like below:

Chi-squared values are located below the second bold line marked by the arrow. These relate to the number we just calculated. The probability values (p-values) associated with each chi-squared value are located above that bolded line. A p-value is the probability that, if your expected ratio were correct, chance alone would produce a difference between your observed and expected values at least as large as the one you saw. We want high p-values, which go along with low chi-squared values.

So, to read the table fully, we need to know our degrees of freedom. To find your degrees of freedom, count the number of categories your data fall into and subtract 1. In our example, we had 2 categories – normal wings and miniature wings – so we have 2 – 1 = 1 degree of freedom. This means we’re only going to be looking at the top row of chi-squared values.
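Reading the table can also be sketched as a lookup. The critical values below are the standard ones for 1 to 3 degrees of freedom, rounded to three decimals; which p-value columns to include is an assumption made for this sketch, and the names P_VALUES, CRITICAL, degrees_of_freedom and p_value_bracket are hypothetical.

```python
# Excerpt of a standard chi-squared table (rows = degrees of freedom).
P_VALUES = [0.90, 0.75, 0.50, 0.25, 0.10, 0.05, 0.01]
CRITICAL = {
    1: [0.016, 0.102, 0.455, 1.323, 2.706, 3.841, 6.635],
    2: [0.211, 0.575, 1.386, 2.773, 4.605, 5.991, 9.210],
    3: [0.584, 1.213, 2.366, 4.108, 6.251, 7.815, 11.345],
}

def degrees_of_freedom(observed):
    """Number of categories minus 1."""
    return len(observed) - 1

def p_value_bracket(stat, df):
    """Return (upper_p, lower_p): the p-value lies between these two columns.

    None on one side means the statistic falls off that edge of the excerpt.
    """
    upper, lower = None, None
    for p, crit in zip(P_VALUES, CRITICAL[df]):
        if stat >= crit:
            upper = p    # statistic is past this column, so p-value <= p
        else:
            lower = p    # first column the statistic has not reached
            break
    return upper, lower

df = degrees_of_freedom([70, 30])
print(df, p_value_bracket(1.33, df))   # 1 (0.25, 0.1)
```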

A bit of vocab: your null hypothesis is that you are right, that the ratio you expected is the ratio you are observing. Rejecting the null hypothesis means your ratio is wrong. Accepting the null hypothesis means your ratio is consistent with the data (remember, this is shorthand for failing to prove ourselves wrong). In this case, our value of 1.33 falls between 1.32 and 2.71 in the chi-squared values of the first line of the table above. This corresponds to a p-value between 0.25 and 0.10. Most often, the critical p-value we will use, the p-value at which we’ll reject our null hypothesis, is 0.05. So we accept our null hypothesis if our p-value is greater than 0.05, and we reject it if our p-value is equal to or less than 0.05. For our example, anything from 0.10 to 0.25 is greater than 0.05, so we accept our null hypothesis; a 3:1 ratio explains the data.
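If you would rather compute the p-value than bracket it from a table, the sketch below uses the closed-form upper-tail probability for whole-number degrees of freedom, with only Python’s standard library. This is a swapped-in technique, not the table method used in this sheet, and the names chi2_p_value and decide are made up here.

```python
import math

def chi2_p_value(stat, df):
    """Upper-tail probability of a chi-squared distribution with integer df."""
    half = stat / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over i < df/2 of (x/2)^i / i!
        terms = (half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * sum(terms)
    # Odd df: erfc(sqrt(x/2)) plus a finite correction sum.
    tail = math.erfc(math.sqrt(half))
    correction = sum(half ** (i - 0.5) / math.gamma(i + 0.5)
                     for i in range(1, (df - 1) // 2 + 1))
    return tail + math.exp(-half) * correction

def decide(stat, df, alpha=0.05):
    """Apply the rule from this sheet: reject when p <= alpha."""
    return "reject" if chi2_p_value(stat, df) <= alpha else "fail to reject"

p = chi2_p_value(4 / 3, 1)   # wing example: chi-squared = 1.33, 1 degree of freedom
print(round(p, 3), decide(4 / 3, 1))
```

The computed p-value for the wing example lands between 0.10 and 0.25, which agrees with the table reading.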

A few examples now to hammer home what’s going on:

Example 1:

AA / Aa / aa
34 / 62 / 36

Here, we’re looking at genotypes, so our ratios can be somewhat more involved. Hopefully, it should be fairly obvious that this looks like a 1:2:1 ratio which is indicative of a heterozygote – heterozygote cross. So let’s count the number of offspring in total:

34 + 62 + 36 = 132

Now, solve for our expected counts:

AA: 1 part out of 1 + 2 + 1 = 4, so ¼

¼ * 132 = 33

Aa: 2 parts out of 4, so 2/4 = ½

½ * 132 = 66

aa: 1 part out of 4, so ¼

¼ * 132 = 33

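Now calculate your chi-squared:

χ² = (34 – 33)²/33 + (62 – 66)²/66 + (36 – 33)²/33 = 1/33 + 16/66 + 9/33 ≈ 0.55

For the last steps, you need to figure out the degrees of freedom and then compare your chi-squared value to the chi-squared table.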

Degrees of freedom = 3 categories (AA, Aa, aa) – 1 = 3 – 1 = 2

So we’re looking at the second row of chi-squared values to determine whether our ratio fits. Our chi-squared of 0.55 falls between 0.211 and 0.575, which corresponds to p-values of 0.90 and 0.75. Since both 0.75 and 0.90 are greater than 0.05, we accept our null hypothesis; a 1:2:1 ratio explains the data.
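To see how much each genotype contributes to the total, Example 1 can be worked with exact fractions. This is a sketch under the 1:2:1 assumption above; the variable names are illustrative only.

```python
from fractions import Fraction

observed = {"AA": 34, "Aa": 62, "aa": 36}
ratio = {"AA": 1, "Aa": 2, "aa": 1}       # hypothesized 1:2:1

total = sum(observed.values())            # 132 offspring
parts = sum(ratio.values())               # 4 parts

contributions = {}
for genotype, count in observed.items():
    expected = Fraction(ratio[genotype], parts) * total
    # Each category adds (observed - expected)^2 / expected to the total.
    contributions[genotype] = (count - expected) ** 2 / expected

chi_sq = sum(contributions.values())
df = len(observed) - 1

for genotype, term in contributions.items():
    print(genotype, term)                 # AA 1/33, Aa 8/33, aa 3/11
print(round(float(chi_sq), 2), df)        # 0.55 2
```

The Aa and aa classes account for nearly all of the (small) chi-squared value; the AA count is almost exactly what was expected.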

Example 2:

AB / Ab / aB / ab
141 / 40 / 49 / 10

Here we have a tougher chi-squared test to figure out. We have two traits with two alleles each: A with the dominant A and recessive a alleles; B with the dominant B and recessive b alleles. Now, just by observation you may be able to tell that this looks roughly like a 9:3:3:1 ratio. To make it clearer, though, consider the two traits one at a time:

A (AB + Ab) / a (aB + ab)
181 / 59

Since we’re choosing between a 1:0 (100%), 1:1, and 3:1 ratio here, 3:1 appears to be the most accurate. So we’ll assume that our parents are Aa and Aa.

B (AB + aB) / b (Ab + ab)
190 / 50

Once again, this most closely resembles a 3:1 ratio (an exact 3:1 split of 240 offspring would be 180 to 60). We’ll assume that the parents are Bb and Bb. So their total genotypes are AaBb and AaBb, just as we would assume for a 9:3:3:1 ratio.
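Pooling the four classes into one trait at a time can be sketched as follows. It assumes each class label is two letters, with the A-gene letter first and the B-gene letter second; the helper name marginal is made up for this example.

```python
counts = {"AB": 141, "Ab": 40, "aB": 49, "ab": 10}

def marginal(counts, position):
    """Pool classes by the letter at `position` (0 = A gene, 1 = B gene)."""
    totals = {}
    for label, n in counts.items():
        key = label[position]
        totals[key] = totals.get(key, 0) + n
    return totals

a_gene = marginal(counts, 0)   # {'A': 181, 'a': 59}
b_gene = marginal(counts, 1)   # {'B': 190, 'b': 50}

# Dominant-to-recessive ratio for each trait; both are closest to 3:1.
print(round(a_gene["A"] / a_gene["a"], 2))   # 3.07
print(round(b_gene["B"] / b_gene["b"], 2))   # 3.8
```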

So now let’s figure out our expected values:

Total count = 141 + 40 + 49 + 10 = 240

AB Expected = 9/16 * 240 = 135

Ab Expected = 3/16 * 240 = 45

aB Expected = 3/16 * 240 = 45

ab Expected = 1/16 * 240 = 15

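Remember, all of those fractions come from the ratio we’re guessing. The denominator is the sum of the parts of the ratio (9 + 3 + 3 + 1 = 16), so the AB class, with 9 parts, is expected to make up 9/16 of the total. That is the same as 9 AB to 7 of everything else, which is what 9:3:3:1 becomes if we’re only worrying about the 9. If you want help with figuring out the ratios, feel free to email me.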

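So now that we have our expected values, let’s calculate chi-squared:

χ² = (141 – 135)²/135 + (40 – 45)²/45 + (49 – 45)²/45 + (10 – 15)²/15 = 36/135 + 25/45 + 16/45 + 25/15 ≈ 2.84

Now we have our chi-squared value! Let’s find degrees of freedom and finish it up: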

Degrees of freedom = 4 categories (AB, Ab, aB, ab) – 1 = 4 – 1 = 3

So we’re looking to see where 2.84 lies in the third row of the table. It falls between 2.366 and 4.11, which corresponds to p-values of 0.50 and 0.25. Since both 0.25 and 0.50 are greater than 0.05, we accept our null hypothesis again; a 9:3:3:1 ratio explains the data.
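Example 2 can also be checked from start to finish in one short script. Instead of bracketing the p-value, this sketch compares the statistic directly with 7.815, the standard chi-squared value for p = 0.05 at 3 degrees of freedom; that shortcut and the variable names are assumptions of this example, not part of the table walk-through above.

```python
observed = [141, 40, 49, 10]     # AB, Ab, aB, ab
ratio = [9, 3, 3, 1]             # hypothesized 9:3:3:1

total = sum(observed)            # 240 offspring
expected = [r * total / sum(ratio) for r in ratio]
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

CRITICAL_05_DF3 = 7.815          # chi-squared value at p = 0.05, 3 degrees of freedom
reject = chi_sq >= CRITICAL_05_DF3

print(expected)                      # [135.0, 45.0, 45.0, 15.0]
print(round(chi_sq, 2), df, reject)  # 2.84 3 False
```

A chi-squared value at or above the 0.05 column means a p-value at or below 0.05, so this comparison gives the same decision as reading the p-value range from the table.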

These are all the examples I’ll do. If you have any questions about any of this, let me know!