Trajectory: Data Analysis
Grade 7 / Grade 8 / Algebra I / Algebra II and Beyond
Sampling and Design
Formulate questions that can be addressed with data and collect, organize, and display relevant data to answer questions / 7.SP.1 Understand that statistics can be used to gain information about a population by examining a sample of the population; generalizations about a population from a sample are valid only if the sample is representative of that population. Understand that random sampling tends to produce representative samples and support valid inferences.
7.SP.2 Use data from a random sample to draw inferences about a population. Generate multiple samples (or simulated samples) of the same size to gauge the variation in estimates or predictions. For example, estimate the mean word length in a book by randomly sampling words from the book; predict the winner of a school election based on randomly sampled survey data. Gauge how far off the estimate or prediction might be. / S.ID.1 Represent data with plots on the real number line (dot plots, histograms, and box plots). / S.IC.3 Recognize the purposes of and differences among sample surveys, experiments, and observational studies; explain how randomization relates to each.
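As a hedged illustration of the 7.SP.2 idea of generating multiple samples of the same size, the Python sketch below draws repeated random samples from a small made-up list of word lengths and reports how much the sample means vary. The data, the function name sample_means, and the fixed seed are assumptions for illustration and are not part of the standards.

```python
import random
import statistics

def sample_means(population, sample_size, num_samples, seed=0):
    """Return the mean of each of num_samples random samples of equal size."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [statistics.mean(rng.sample(population, sample_size))
            for _ in range(num_samples)]

# Hypothetical word lengths standing in for the words of a book.
word_lengths = [3, 5, 2, 7, 4, 6, 3, 8, 4, 5, 1, 9, 4, 3, 6, 5, 2, 7, 4, 5]

means = sample_means(word_lengths, sample_size=5, num_samples=200)
print("population mean:", statistics.mean(word_lengths))
print("smallest and largest sample mean:", min(means), max(means))
```

The spread between the smallest and largest sample mean gives a rough sense of how far off a single estimate might be.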
Data Distributions
Select and use statistical methods to analyze data / 7.SP.3 Informally assess the degree of visual overlap of two numerical data distributions with similar variabilities (variation), measuring the difference between the centers by expressing it as a multiple of a measure of variability. For example, the mean height of players on the basketball team is 10 cm greater than the mean height of players on the soccer team, about twice the variability (mean absolute deviation) on either team; on a dot plot, the separation between the two distributions of height is noticeable. / 8.SP.1 Construct and interpret scatter plots for bivariate measurement data to investigate patterns of association between two quantities.
Describe patterns such as clustering, outliers, positive or negative association, linear association, and nonlinear association.
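The 7.SP.3 comparison, where the difference between two centers is expressed as a multiple of a measure of variability, can be sketched as follows. The heights are invented for illustration and chosen so that the two teams have the same mean absolute deviation; the helper name mean_absolute_deviation is likewise an assumption.

```python
import statistics

def mean_absolute_deviation(values):
    """Average distance of the values from their mean."""
    center = statistics.mean(values)
    return statistics.mean([abs(v - center) for v in values])

# Hypothetical player heights in centimeters.
basketball = [182, 186, 190, 194, 198]
soccer = [172, 176, 180, 184, 188]

difference = statistics.mean(basketball) - statistics.mean(soccer)
mad = mean_absolute_deviation(basketball)  # the soccer team has the same MAD
ratio = difference / mad
print("difference in means:", difference)
print("difference as a multiple of the MAD:", round(ratio, 2))
```

With these numbers the means differ by 10 cm, which is about twice the mean absolute deviation of either team.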
8.SP.2 Know that straight lines are widely used to model relationships between two quantitative variables. For scatter plots that suggest a linear association, informally fit a straight line, and informally assess the model fit by judging the closeness of the data points to the line.
8.SP.3 Use the equation of a linear model to solve problems in the context of bivariate measurement data, interpreting the slope and intercept. For example, in a linear model for a biology experiment, interpret a slope of 1.5 cm/hr as meaning that an additional hour of sunlight each day is associated with an additional 1.5 cm in mature plant height.
8.SP.4 Understand that patterns of association can also be seen in bivariate categorical data by displaying frequencies and relative frequencies in a two-way table. Construct and interpret a two-way table summarizing data on two categorical variables collected from the same subjects. Use relative frequencies calculated for rows or columns to describe possible association between the two variables. For example, collect data from students in your class on whether or not they have a curfew on school nights and whether or not they have assigned chores at home. Is there evidence that those who have a curfew also tend to have chores? / S.ID.2 Use statistics appropriate to the shape of the data distribution to compare center (median, mean) and spread (interquartile range, standard deviation) of two or more different data sets.
S.ID.5 Summarize categorical data for two categories in two-way frequency tables. Interpret relative frequencies in the context of the data (including joint, marginal, and conditional relative frequencies). Recognize possible associations and trends in the data.
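The joint, marginal, and conditional relative frequencies named in 8.SP.4 and S.ID.5 can be illustrated with a small two-way table. The counts below are hypothetical classroom data for the curfew and chores example, and the variable names are assumptions made for this sketch.

```python
# Hypothetical counts: (curfew status, chores status) -> number of students.
counts = {
    ("curfew", "chores"): 12,
    ("curfew", "no chores"): 3,
    ("no curfew", "chores"): 5,
    ("no curfew", "no chores"): 5,
}
total = sum(counts.values())

def row_total(curfew_status):
    """Number of students with the given curfew status."""
    return sum(n for (curfew, _), n in counts.items() if curfew == curfew_status)

# Joint: share of all students who have both a curfew and chores.
joint = counts[("curfew", "chores")] / total
# Marginal: share of all students who have chores.
marginal_chores = sum(n for (_, chores), n in counts.items()
                      if chores == "chores") / total
# Conditional: share with chores within each curfew group.
chores_given_curfew = counts[("curfew", "chores")] / row_total("curfew")
chores_given_no_curfew = counts[("no curfew", "chores")] / row_total("no curfew")

print(joint, marginal_chores, chores_given_curfew, chores_given_no_curfew)
```

In this made-up class, 80% of students with a curfew have chores compared with 50% of those without, which suggests a possible association.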
S.ID.6 Represent data on two quantitative variables on a scatter plot, and describe how the variables are related.
a. Fit a function to the data; use functions fitted to data to solve problems in the context of the data. Use given functions or choose a function suggested by the context. Emphasize linear and exponential models.
b. Informally assess the fit of a function by plotting and analyzing residuals.
c. Fit a linear function for a scatter plot that suggests a linear association.
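One possible way to carry out the fitting and residual analysis described in S.ID.6 is an ordinary least-squares line. The standards do not prescribe a fitting method, so least squares is an assumption here, as are the data values and the function name fit_line.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: hours of sunlight per day and plant height in cm.
hours = [1, 2, 3, 4, 5]
heights = [2.0, 3.4, 5.1, 6.4, 8.1]

slope, intercept = fit_line(hours, heights)
# Residual = observed value minus the value predicted by the line.
residuals = [y - (slope * x + intercept) for x, y in zip(hours, heights)]
print("slope:", round(slope, 2), "intercept:", round(intercept, 2))
print("residuals:", [round(r, 2) for r in residuals])
```

Residuals that scatter around zero with no clear pattern are informal evidence that a linear function is a reasonable fit.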
S.ID.7 Interpret the slope (rate of change) and the intercept (constant term) of a linear model in the context of the data.
S.ID.8 Compute (using technology) and interpret the correlation coefficient of a linear fit.
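A minimal sketch of computing the correlation coefficient named in S.ID.8, written out from the usual Pearson formula rather than taken from any particular calculator or package. The data values and the function name correlation are assumptions for illustration.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
print("r =", round(r, 3))
```

A value of r close to 1 or -1 indicates a strong linear association; it does not by itself show that one variable causes the other.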
S.ID.9 Distinguish between correlation and causation.
Interpret Results
Develop and evaluate inferences and predictions based on data / 7.SP.4 Use measures of center and measures of variability for numerical data from random samples to draw informal comparative inferences about two populations. For example, decide whether the words in a chapter of a seventh-grade science book are generally longer than the words in a chapter of a fourth-grade science book. / S.ID.3 Interpret differences in shape, center, and spread in the context of the data sets, accounting for possible effects of extreme data points (outliers). / S.ID.4 Use the mean and standard deviation of a data set to fit it to a normal distribution and to estimate population percentages. Recognize that there are data sets for which such a procedure is not appropriate.
Use calculators, spreadsheets, and tables to estimate areas under the normal curve.
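As one example of the technology S.ID.4 has in mind, Python's standard library can estimate areas under a normal curve. The mean of 170 and standard deviation of 10 below are hypothetical summary values, assumed to come from a data set for which a normal model is appropriate.

```python
from statistics import NormalDist

# Hypothetical model: heights with mean 170 cm and standard deviation 10 cm.
heights = NormalDist(mu=170, sigma=10)

# Estimated share of the population below 180 cm (one SD above the mean).
below_180 = heights.cdf(180)
# Estimated share within one standard deviation of the mean.
within_one_sd = heights.cdf(180) - heights.cdf(160)

print("below 180 cm:", round(below_180, 4))
print("between 160 and 180 cm:", round(within_one_sd, 4))
```

The second value reflects the familiar rule of thumb that roughly 68% of a normal distribution lies within one standard deviation of the mean.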
S.IC.1 Understand statistics as a process for making inferences about population parameters based on a random sample from that population.
S.IC.2 Decide if a specified model is consistent with results from a given data-generating process, e.g., using simulation. For example, a model says a spinning coin falls heads up with probability 0.5. Would a result of 5 tails in a row cause you to question the model?
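The coin question in S.IC.2 can be explored both exactly and by simulation. Under the stated model, five tails in a row has probability 0.5^5. The sketch below compares that value with a simulated estimate; the function name, trial count, and seed are assumptions made for illustration.

```python
import random

def simulate_all_tails(num_flips, trials, seed=0):
    """Estimate the chance that num_flips fair-coin flips are all tails."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Treat a random number below 0.5 as tails.
        if all(rng.random() < 0.5 for _ in range(num_flips)):
            hits += 1
    return hits / trials

exact = 0.5 ** 5  # probability of 5 tails in a row under the model
estimate = simulate_all_tails(num_flips=5, trials=20000)
print("exact:", exact)
print("simulated:", estimate)
```

An outcome with a probability of about 3% is unusual but not impossible, so whether it should cast doubt on the model is a judgment students can discuss.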
S.IC.4 Use data from a sample survey to estimate a population mean or proportion; develop a margin of error through the use of simulation models for random sampling.
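A hedged sketch of the simulation approach in S.IC.4: repeatedly simulate random samples from a population whose true proportion equals the observed sample proportion, then take two standard deviations of the simulated proportions as a margin of error. The factor of two is one common convention and is an assumption here, as are the survey numbers and the function name.

```python
import random
import statistics

def simulated_margin_of_error(p_hat, sample_size, num_simulations=2000, seed=0):
    """Two standard deviations of simulated sample proportions."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(num_simulations):
        # Each respondent says "yes" with probability p_hat.
        successes = sum(rng.random() < p_hat for _ in range(sample_size))
        proportions.append(successes / sample_size)
    return 2 * statistics.stdev(proportions)

# Hypothetical survey: 60 of 100 randomly sampled students favor a candidate.
moe = simulated_margin_of_error(p_hat=0.6, sample_size=100)
print("estimate: 0.60, margin of error about", round(moe, 3))
```

For these numbers the simulated margin of error should land near 0.10, that is, about ten percentage points.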
S.IC.5 Use data from a randomized experiment to compare two treatments; use simulations to decide if differences between parameters are significant.
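One way to realize the simulation in S.IC.5 is a randomization (permutation) test: shuffle the pooled outcomes, split them into two groups again, and see how often the shuffled difference in means is at least as large as the observed one. The outcome values, the function names, and the number of shuffles are assumptions for this sketch.

```python
import random

def mean(values):
    return sum(values) / len(values)

def randomization_p_value(group_a, group_b, num_shuffles=5000, seed=0):
    """Share of random regroupings with a difference at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(num_shuffles):
        rng.shuffle(pooled)
        new_a = pooled[:len(group_a)]
        new_b = pooled[len(group_a):]
        if abs(mean(new_a) - mean(new_b)) >= observed:
            extreme += 1
    return extreme / num_shuffles

# Hypothetical outcomes from a randomized experiment with two treatments.
treatment_a = [12, 15, 14, 16, 18]
treatment_b = [10, 11, 13, 9, 12]

p_value = randomization_p_value(treatment_a, treatment_b)
print("observed difference:", mean(treatment_a) - mean(treatment_b))
print("approximate p-value:", p_value)
```

A small proportion means that a difference this large rarely arises from random assignment alone, which is informal evidence that the treatments differ.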
S.IC.6 Evaluate reports based on data.
January 24, 2011
Version 1.0