Tips for Students Preparing to Take the AP Statistics Exam
Rich Lambert, UNC Charlotte
Brenda Goforth, Shelby High School
David Wilcox, Christ Church Episcopal School
Most of the questions have multiple parts. Read all of the parts before you start answering. Do not assume that you know the general topic of a question and launch into a strategy without reading the entire question. For example, a regression question will often include some component that involves descriptive statistics.
Multiple part questions - If the last part asks you to answer a question based on your results to the previous parts, be sure to use your prior results in your answer. If you could not do the previous part required for the question, make up an answer and explain what you would have done.
Use complete sentences throughout your written responses. Write as clearly and simply as possible. Use vocabulary carefully. Statistics terms have specific meanings and there is no poetic license.
Make sure that your answers are written in the context of the problem. This is especially important when defining symbols and variables, and when writing conclusions. For example, on question 6a of the 2005 exam, which asked for an interpretation of a confidence interval, the response “I am 95% confident that the true population mean lies between -16.57 and -9.434” was counted incorrect even though the student made a true statement and used the correct interval. The correct answer would have been “I am 95% confident that the true population difference in the mean amount of lead on a child’s dominant hand after an hour of play inside versus an hour of play outside is between -16.57 and -9.434”.
Do not simply make lists of all things you think you know about a topic. You may be penalized for extraneous information, particularly if it is incorrect or not relevant to the problem. The graders use what is called the "parallel solutions rule". This means they are trained to count only the weakest answer if you provide several different answers or approaches to the same question. They have no way of knowing whether you know which answer is correct so in order to be fair, they have to grade the weakest answer.
Show all your steps. A correct answer without the steps may receive little or no credit. Similarly, a correct answer without adequate explanations may receive little credit. However, an incorrect answer with all the correct steps shown and correct interpretation but some minor mathematical errors may receive substantial credit.
Do not simply copy the screen of the calculator in isolation. Demonstrate with your explanations that you know what the screen means.
Link conclusions to your numbers. Don't just say "I reject Ho and conclude that the mean height is greater than 65 inches." This does not say WHY you rejected Ho. Instead, say "Since the p-value of 0.0025 is less than the alpha level of 0.05, I reject Ho and ...".
Be consistent. Make sure your hypotheses and conclusion match. If you find an error in your computations, change your conclusion if necessary. Even if your numbers are wrong, you will normally get credit for a conclusion that is correct for your numbers.
Interpreting a confidence interval is different from interpreting the confidence level. Confidence interval: "I am 95% confident that the proportion of students who own cell phones is between 90% and 95%". Confidence level: "If this procedure were repeated many times, approximately 95% of the intervals produced would contain the true proportion of students who own a cell phone".
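The confidence level statement above can be checked with a small simulation. The sketch below is only an illustration: the true proportion of 0.30, the sample size of 120, the number of repetitions, and the function names are all assumptions chosen for the example, and the interval is the usual large-sample interval for a proportion with z = 1.96.

```python
import math
import random

def proportion_interval(successes, n, z=1.96):
    """Large-sample interval for a proportion; z = 1.96 is roughly 95%."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def coverage(true_p, n, repetitions, seed=0):
    """Fraction of simulated intervals that contain the true proportion."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        # Draw one random sample of size n and count the successes.
        successes = sum(rng.random() < true_p for _ in range(n))
        low, high = proportion_interval(successes, n)
        if low <= true_p <= high:
            hits += 1
    return hits / repetitions

# Hypothetical population value and sample size, chosen only for illustration.
rate = coverage(true_p=0.30, n=120, repetitions=2000)
print(f"{rate:.3f} of the simulated intervals contained the true proportion")
```

Each single interval either contains the true proportion or it does not. The confidence level describes the long-run behavior of the procedure, which is why the printed fraction should land near 0.95 rather than exactly on it.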
Check assumptions/conditions. Checking assumptions is not the same thing as merely stating them. Checking means actually showing that the assumptions are met. For example, writing "np>10" is not sufficient. Instead, write "np=120(.30)=36>10".
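As a rough sketch of what "showing" the condition looks like, the snippet below plugs in the numbers from the example (n = 120, p = 0.30) and prints the arithmetic instead of only the rule. The function name is hypothetical. The matching check on n(1 - p) is not part of the example above; it is included here as an assumption because the large-counts condition is usually stated for both successes and failures.

```python
def large_counts_check(n, p, threshold=10):
    """Return both expected counts and whether each exceeds the threshold."""
    expected_successes = n * p
    expected_failures = n * (1 - p)
    passed = expected_successes > threshold and expected_failures > threshold
    return expected_successes, expected_failures, passed

# Values taken from the example above: n = 120, p = 0.30.
successes, failures, passed = large_counts_check(n=120, p=0.30)
print(f"np = 120(0.30) = {successes:g}, compared with 10")
print(f"n(1 - p) = 120(0.70) = {failures:g}, compared with 10")
print("Condition met" if passed else "Condition not met")
```

On the exam, the written line with the numbers substituted in is what earns credit. The code only mirrors that substitution.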
Don't always assume normality. It is perfectly reasonable to expect a question in which the data are not normally distributed.
Hypotheses are about populations. The point of a hypothesis test is to reach a conclusion about a population based on a sample. We don't need to make hypotheses about samples. When writing hypotheses, conclusions, and formulas, exercise care with your wording and symbols so that you do not incorrectly use population and sample.
Stay committed to a position that you take in any of your answers. Defend your interpretations with statistical evidence from analyses or procedures and from the information given in the problem. The only exception to this rule will be a question that specifically asks you to identify weaknesses in a study design or the potential influence of confounding variables.
Remember that correlation is not causality. Do not overstate your conclusions. Similarly, single studies rarely PROVE anything. Replication and validation are key elements of the scientific process. Therefore, be careful not to use overly strong language in your interpretations. Study designs are rarely flawless and some questions will focus on the weaknesses of study design.
When examining a distribution for outliers, look at both sides. If you find outliers in the upper tail, that is not your clue to stop looking. Similarly, if asked to report confidence intervals, report both sides of the interval.
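One way to make sure both tails get checked is to compute a lower and an upper cutoff together. The sketch below assumes the common 1.5 x IQR rule for flagging outliers, which the tip above does not name, and uses made-up scores. Quartile conventions vary between textbooks and calculators, so the quartiles here (from Python's `statistics.quantiles` with its default method) may differ slightly from other tools.

```python
import statistics

def outliers_both_tails(values, k=1.5):
    """Flag values beyond k * IQR below Q1 or above Q3 (assumed rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    low = [v for v in values if v < lower_fence]
    high = [v for v in values if v > upper_fence]
    return low, high

# Hypothetical data with one unusual value in each tail.
scores = [2, 20, 21, 22, 23, 24, 25, 26, 27, 28, 50]
low, high = outliers_both_tails(scores)
print("Low outliers:", low)
print("High outliers:", high)
```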
A probability value cannot exceed one. Some students will forget to express probability values as numbers between 0 and 1. There are important differences between proportions, percentages, and percentiles. Watch your language here, as it is easy to interchange these terms.
Experiments vs. samples - Many students confuse experimentation with sampling or try to incorporate ideas from one into the other. The purpose of sampling is to estimate a population parameter by measuring a representative subset of the population. We create a representative sample by selecting subjects randomly using an appropriate technique. The purpose of an experiment is to demonstrate a cause and effect relationship by controlling extraneous factors. Sampling is involved in conducting an experiment, but a random sample alone does not allow you to make statements of cause and effect.
Blocking vs. stratifying - In each we divide up subjects before random assignment or selection, but the words are definitely not interchangeable. The easiest way to remember is that blocking is for experiments and stratifying is for samples.
Blocking - You block on some factor that you think will impact the response to the treatment. The blocking is not random. The randomization occurs within each block essentially by creating two or more miniature experiments. The blocks should be homogeneous with respect to the blocking factor.
Stratifying - In stratified sampling we divide the population based on some factor we believe is important. Then we randomly select subjects within each stratum.
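The difference between the two ideas can be sketched in code. Everything below is hypothetical: the roster, the grade-level factor, the treatment names, and both function names are made up for illustration. The first function randomly selects subjects within each stratum (sampling). The second randomly assigns treatments within each block (experiment).

```python
import random

def group_by(subjects, key):
    """Split subjects into groups sharing the same value of the factor."""
    groups = {}
    for subject in subjects:
        groups.setdefault(key(subject), []).append(subject)
    return groups

def stratified_sample(population, stratum_of, per_stratum, rng):
    """Sampling: randomly SELECT subjects from within each stratum."""
    strata = group_by(population, stratum_of)
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

def randomized_block_assignment(subjects, block_of, treatments, rng):
    """Experiment: randomly ASSIGN treatments within each block."""
    assignment = {}
    for members in group_by(subjects, block_of).values():
        shuffled = members[:]
        rng.shuffle(shuffled)  # the randomization happens inside the block
        for i, subject in enumerate(shuffled):
            assignment[subject] = treatments[i % len(treatments)]
    return assignment

# Hypothetical roster: 10 juniors and 10 seniors.
roster = [(f"student{i}", "junior" if i % 2 == 0 else "senior")
          for i in range(20)]
grade = lambda s: s[1]
rng = random.Random(42)

sample = stratified_sample(roster, grade, per_stratum=3, rng=rng)
assignment = randomized_block_assignment(
    roster, grade, ["treatment", "control"], rng)
```

In both functions the grouping step is not random. Only the selection or assignment inside each group is random, which matches the descriptions above.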
Keep a positive tone to your answers. Do not wander off into criticisms of the test, your teacher, your school, your parents, or anything else. Stick to the questions you are asked and do not answer questions you are not asked.
If you are using a statistical calculator that is not a TI, mention the calculator you are using in your answer. If you do not use a statistical calculator and you refer to a table provided in the test booklet, describe in detail how you are using the table. Do not leave the grader to guess where you found a table value.
Remember that a different grader will grade each question. Do not assume the grader will remember anything you said in an answer to a previous question. In almost all cases, the grader will not have seen any of your other answers except in passing as they turn the pages of your booklet to the question they are grading.
Budget your time. Remember that question 6 typically takes more time than the other questions. A good rule of thumb is to allow 25 minutes for #6 and 10 minutes each for the other five questions. This will leave some extra time for review. It may be best to read through 1-5, complete the questions you feel comfortable with, go on to #6, and then come back to the remaining questions. You do not want to run out of time before attempting #6 because it counts for more points.
Here is a simple memory device that may help you remember the components of a good discussion of your approach to a hypothesis testing problem.
Please - Population
Help - Hypotheses
A - Assumptions
Troubled - Test
Awkward - Alpha
Child - Computation
Do - Decision
Statistics - Statement of conclusions in context of the problem
Remember the IQR is a number. Many students write things like "The IQR goes from 11 to 19". Every grader knows exactly what you mean, namely, "The box in my boxplot goes from 11 to 19", but this statement is not correct. The IQR is defined as Q3-Q1 which gives a single value. Writing the statement above is like saying "8 goes from 11 to 19". It doesn't make any sense.
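A short sketch makes the distinction concrete. The data below are made up so that the quartiles match the 11 and 19 in the example. The quartiles come from Python's `statistics.quantiles` with its default method, which is an assumption about convention, since textbooks and calculators do not all compute quartiles the same way.

```python
import statistics

# Hypothetical data chosen so that Q1 = 11 and Q3 = 19.
times = [9, 11, 13, 15, 17, 19, 24]

q1, median, q3 = statistics.quantiles(times, n=4)
iqr = q3 - q1  # a single number, not a range

print(f"The box in the boxplot goes from Q1 = {q1:g} to Q3 = {q3:g}.")
print(f"The IQR is Q3 - Q1 = {q3:g} - {q1:g} = {iqr:g}.")
```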
Practice constructing simple graphs by hand. Label, label, label. Any graph that you are asked to draw should have clearly labeled axes with appropriate scales.
Refer to graphs explicitly. When answering questions based on one or more graphs, you need to be specific. Don't just say "The female times are clearly higher than the male times." Instead say "The median female time is higher than the first quartile of the male times." Back up your statements by marking on the graph. Graders look at everything you write, and if you mark on the graph it can make the difference between two scores.