Project AMP Dr. Antonio Quesada – Director, Project AMP
Regression and Data
Lesson Lab Summary Page
Developed by Barbara Adler and Jenny Walls (Akron Firestone High School)
Subject: Core Plus Math I/Algebra II
Grade: 9th/10th
Strands: Algebra and Functions
Data Analysis and Probability
Topic: Regression equations and evaluating data
Objectives:
Strand: Algebra and Functions
- Create and analyze graphs of linear and simple non-linear functions.
Strand: Data Analysis and Probability
- Create, interpret and/or analyze tables, charts, and graphs involving data
- Choose and apply measures of central tendency (mean, median, and mode) and variability (range and visual displays of distribution).
Materials: Graphing calculators, internet access, printer, worksheets
Expected Time: 3 class periods
Regression and Data
Lesson Lab Plan
Algebra Two
Developed by Barbara Adler and Jenny Walls (Akron Firestone High School)
Concepts/Learning and Proficiency Objectives
Strand: Algebra and Functions
- Create and analyze graphs of linear and simple non-linear functions.
Strand: Data Analysis and Probability
- Create, interpret and/or analyze tables, charts, and graphs involving data
- Choose and apply measures of central tendency (mean, median, and mode) and variability (range and visual displays of distribution).
Task Overview
This activity requires students to select statistical data from a given website and then make a statistical analysis of the data. Students create scatterplots, calculate correlation, and find regression models.
Students work in groups of one or two and need access to the internet and to a graphing calculator. When designing this lesson, the assumption was made that students understand the differences and characteristics of types of functions: linear, exponential, quadratic, power, and logarithmic. Students also need to have calculator and conceptual skills relating to entering data into lists, drawing scatterplots on a calculator, calculating regression, and finding regression models.
This lesson could be useful as a summary of a study of the types of functions. Suggested time is three class periods.
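As an illustration of the function-type characteristics assumed above, here is a minimal Python sketch. The helper names and the sample sequences are made up for this sketch and are not part of the lesson. For equally spaced x-values, a linear function has constant first differences, a quadratic function has constant second differences, and an exponential function has constant successive ratios.

```python
def differences(values):
    """Return the differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


def ratios(values):
    """Return the ratios of consecutive values (values must be nonzero)."""
    return [b / a for a, b in zip(values, values[1:])]


def is_constant(values, tolerance=1e-9):
    """True when every entry is within `tolerance` of the first entry."""
    return all(abs(v - values[0]) <= tolerance for v in values)


# Hypothetical sample outputs for x = 0, 1, 2, 3, 4.
linear = [2, 5, 8, 11, 14]        # y = 3x + 2
quadratic = [1, 2, 5, 10, 17]     # y = x^2 + 1
exponential = [3, 6, 12, 24, 48]  # y = 3 * 2^x

print(is_constant(differences(linear)))                  # first differences
print(is_constant(differences(differences(quadratic))))  # second differences
print(is_constant(ratios(exponential)))                  # successive ratios
```

Real data are noisy, so with measured values the tolerance would need to be much looser, and the check is only a guide to which regression models are worth trying.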
Integration Learning Strategies
After distributing the assignment sheet and example, the teacher will give an introduction. The students will then be split into groups of one or two.
The groups will go to the website. One of the two activities requires data from that site's link to U.S. statistics; the other requires data from its link to health. The groups will print a copy of their data tables to use for the rest of the activity and then continue with the worksheet. The teacher will ensure that group members are on task and will monitor results.
The teacher will also wrap up the lesson, highlighting mathematical concepts. Discussion should include the failure of some regression models in predicting future values.
As an extension, students could search the web for appropriate and meaningful statistics related to a current issue or problem of interest to them. They would use the statistics they find to evaluate the problem and then propose a solution. An example would be a discussion of the issue of gun control supported by statistics on death rates by firearms, by age, sex, and race.
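One way to make the prediction-failure point concrete in the wrap-up is sketched below in Python. It uses the Females column of the Median Age at First Marriage table from the example page, decade rows only. The 1950 cutoff and the function name `fit_line` are assumptions chosen for this sketch. A line is fitted to the early data and then asked to predict a much later year.

```python
# Females column of the example table, decade rows 1890-1990.
years = [1890, 1900, 1910, 1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990]
ages = [22.0, 21.9, 21.6, 21.2, 21.3, 21.5, 20.3, 20.3, 20.8, 22.0, 23.9]


def fit_line(xs, ys):
    """Least-squares line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Fit using only the rows up to 1950 (an assumed cutoff for illustration).
slope, intercept = fit_line(years[:7], ages[:7])

# Extrapolate forty years past the last year used in the fit.
predicted_1990 = slope * 1990 + intercept
actual_1990 = ages[years.index(1990)]

print(f"slope per year:     {slope:.4f}")
print(f"predicted for 1990: {predicted_1990:.2f}")
print(f"recorded for 1990:  {actual_1990:.2f}")
```

Because the early data trend downward, the fitted line keeps falling and lands well below the recorded 1990 value, which rose instead.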
Assessment
The assignment itself is a Type II assessment. Time should be allocated for oral presentation of the groups’ results.
Tools and Resources
(Ohio Labor Market Info)
(Current Population Survey)
(Ohio Department of Health)
(Bureau of Justice Statistics)
(Centers for Disease Control and Prevention)
Worksheet and Example
Activity worksheet and example are on the following pages.
Regression Lines and Data
Student Activity Page 1
Name(s): ______Date:______Per:____
In this activity, you will select interesting current data from a respected website. Once you choose your data, print the data, make a scatterplot, and analyze it. Your analysis will include topics such as correlation and regression. Use the back of the page if you need more room.
1) Go to the given website. Choose the U.S. statistics link and pick a topic that interests you. Your selection must have two lists of numerical data that are compared (e.g., age and income; year and poverty rate).
Website: ______
Title of Page: ______
Date accessed: ______
Website Sponsor: ______
2) Follow your teacher’s guidelines to set the page size prior to printing. Print out your data selection. Attach your printout to this page.
3) Identify the independent and dependent variables. Using your graphing calculator, enter your data into lists. Make a scatterplot of the data. Make an accurate drawing, labeling your axes with numbers and titles.
4) Are there any outliers? If so, what factual events or circumstances might explain them?
5) Do your data appear to follow a linear, exponential, power, or quadratic function? Why?
6) Choose an appropriate regression model and find the equation. Try at least two models.
Model One:
Model Two:
7) Which model fits the data best? Why? Include the correlation.
8) Enter the best model's equation into your calculator and graph it.
• Write a question that can be answered by interpolation. Answer it.
• Write a question that can be answered by extrapolation. Answer it.
Regression Lines and Data
Student Activity Page 2
Name(s): ______Date:______Per:____
In this activity, you will select interesting current data from a respected website. Once you choose your data, print the data, make a scatterplot, and analyze it. Your analysis will include topics such as correlation and regression. Use the back of the page if you need more room.
1) Go to the given website. Choose the health link and pick a topic that interests you. Your selection must have two lists of numerical data that are compared (e.g., age and income; year and poverty rate).
Website: ______
Title of Page: ______
Date accessed: ______
Website Sponsor: ______
2) Follow your teacher’s guidelines to set the page size prior to printing. Print out your data selection. Attach your printout to this page.
3) Identify the independent and dependent variables. Using your graphing calculator, enter your data into lists. Make a scatterplot of the data. Make an accurate drawing, labeling your axes with numbers and titles.
4) Are there any outliers? If so, what factual events or circumstances might explain them?
5) Do your data appear to follow a linear, exponential, power, or quadratic function? Why?
6) Choose an appropriate regression model and find the equation. Try at least two models.
Model One:
Model Two:
7) Which model fits the data best? Why? Include the correlation.
8) Enter the best model's equation into your calculator and graph it.
• Write a question that can be answered by interpolation. Answer it.
• Write a question that can be answered by extrapolation. Answer it.
Regression Lines and Data
Example
Name(s): Example Date:______Per:____
In this activity, you will select interesting current data from a respected website. Once you choose your data, print the data, make a scatterplot, and analyze it. Your analysis will include topics such as correlation and regression. Use the back of the page if you need more room.
1) Go to the given website. Choose the U.S. statistics link and pick a topic that interests you. Your selection must have two lists of numerical data that are compared (e.g., age and income; year and poverty rate).
Website:
Title of Page: Median Age of First Marriage
Date accessed: June 20, 2000
Website Sponsor: U.S. Census Bureau
2) Follow your teacher’s guidelines to set the page size prior to printing. Print out your data selection. Attach your printout to this page. (See next page)
3) Identify the independent and dependent variables. Using your graphing calculator, enter your data into lists. Make a scatterplot of the data. Make an accurate drawing, labeling your axes with numbers and titles.
[Scatterplot: horizontal axis labeled Year, with tick marks at 1880, 1920, 1960, and 2000]
4) Are there any outliers? If so, what factual events or circumstances might explain them?
Data for years 1950 and 1960 are distinctly low, perhaps a result of World War II.
5) Do your data appear to follow a linear, exponential, power, or quadratic function? Why?
The data appear to follow a quadratic or cubic model: the values decrease slightly and then increase quickly. We know the relationship is not linear because the rate of change is not constant. We know it is not exponential because consecutive values do not change by a constant ratio.
6) Choose an appropriate regression model and find the equation. Try at least two models.
Model One: Quadratic
y = 0.001x^2 - 0.107x + 25.7
Model Two: Cubic
y = 0.0000183x^3 - 0.105x^2 + 204.03x - 130888
7) Which model fits the data best? Why? Include the correlation.
The cubic model fits better: its correlation, R, is 0.98, compared with 0.92 for the quadratic model.
8) Enter the best model's equation into your calculator and graph it.
• Write a question that can be answered by interpolation. Answer it.
What was the median age at first marriage for a woman in 1955? 20.538 years
• Write a question that can be answered by extrapolation. Answer it.
What will be the median age in 2010? 28.896 years
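The distinction between the two questions above can be sketched in Python. The function names are hypothetical, and using the Females column is an assumption based on the wording of the 1955 question. A year inside the range of the data is an interpolation, and a year outside it is an extrapolation. Joining neighboring data points with straight segments also gives a rough cross-check on a model's interpolated answer.

```python
# Year and Females columns of the Median Age at First Marriage table.
years = [1890, 1900, 1910, 1920, 1930, 1940, 1950, 1960, 1970, 1980,
         1990, 1993, 1994, 1995, 1996, 1997, 1998]
females = [22.0, 21.9, 21.6, 21.2, 21.3, 21.5, 20.3, 20.3, 20.8, 22.0,
           23.9, 24.5, 24.5, 24.5, 24.8, 25.0, 25.0]


def classify(year, known_years):
    """Label a query year by whether it falls inside the data range."""
    if min(known_years) <= year <= max(known_years):
        return "interpolation"
    return "extrapolation"


def piecewise_linear(year, xs, ys):
    """Estimate a value by joining neighboring points with straight lines."""
    for i in range(len(xs) - 1):
        if xs[i] <= year <= xs[i + 1]:
            fraction = (year - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + fraction * (ys[i + 1] - ys[i])
    raise ValueError("year is outside the data range (extrapolation)")


print(classify(1955, years))
print(classify(2010, years))
print(piecewise_linear(1955, years, females))
```

The straight-segment estimate for 1955 sits between the two recorded neighbors (1950 and 1960), so it is a reasonable sanity check on the regression model's answer. No such check is available for 2010, which is why extrapolated answers deserve more caution.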
Median Age at First Marriage
Year Males Females
1890 26.1 22.0
1900 25.9 21.9
1910 25.1 21.6
1920 24.6 21.2
1930 24.3 21.3
1940 24.3 21.5
1950 22.8 20.3
1960 22.8 20.3
1970 23.2 20.8
1980 24.7 22.0
1990 26.1 23.9
1993 26.5 24.5
1994 26.7 24.5
1995 26.9 24.5
1996 27.1 24.8
1997 26.8 25.0
1998 26.7 25.0
Source: U.S. Bureau of the Census; Web:
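For readers without a graphing calculator, the quadratic and cubic fits can be approximated with the Python sketch below. It rests on several assumptions: it uses the Females column, it measures x in decades since 1890 to keep the arithmetic well conditioned (so its coefficients will not match the equations in the example), and it takes R to be the square root of the coefficient of determination R^2. All function names are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - known) / aug[i][i]
    return solution


def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; coefficients are lowest power first."""
    size = degree + 1
    normal = [[sum(x ** (i + j) for x in xs) for j in range(size)]
              for i in range(size)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(size)]
    return solve_linear_system(normal, rhs)


def polyval(coeffs, x):
    """Evaluate a polynomial given lowest-power-first coefficients."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def r_squared(xs, ys, coeffs):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_y = sum(ys) / len(ys)
    residual = sum((y - polyval(coeffs, x)) ** 2 for x, y in zip(xs, ys))
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - residual / total


years = [1890, 1900, 1910, 1920, 1930, 1940, 1950, 1960, 1970, 1980,
         1990, 1993, 1994, 1995, 1996, 1997, 1998]
females = [22.0, 21.9, 21.6, 21.2, 21.3, 21.5, 20.3, 20.3, 20.8, 22.0,
           23.9, 24.5, 24.5, 24.5, 24.8, 25.0, 25.0]

# Assumed rescaling: x = decades since 1890.
xs = [(year - 1890) / 10 for year in years]

quadratic = polyfit(xs, females, 2)
cubic = polyfit(xs, females, 3)
r2_quadratic = r_squared(xs, females, quadratic)
r2_cubic = r_squared(xs, females, cubic)

print(f"quadratic: R^2 = {r2_quadratic:.3f}, R = {r2_quadratic ** 0.5:.3f}")
print(f"cubic:     R^2 = {r2_cubic:.3f}, R = {r2_cubic ** 0.5:.3f}")
```

A higher-degree least-squares polynomial never fits the same data worse than a lower-degree one, so a larger R for the cubic is expected. Whether the cubic is also the more sensible model for prediction is a separate question worth raising in discussion.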