STAT 200 On-Line
Guided Exercise 1
Be sure to:
·  Please submit your answers in a Word file to Sakai at the same place you downloaded the file
·  Remember you can paste any Excel or JMP output into a Word File (use Paste Special for best results).
·  Put your name and the Assignment # on the file name: e.g. Ilvento Guided1.doc
Answer as completely as you can and show your work.
1.  A new popular statistical item is a poster of a collection of interesting statistics and graphics. The one on the right is a collection of numbers on aging in the U.S. It was put together by an insurance company to emphasize that we will be living longer and will therefore need to plan better for things like retirement, health care, and living arrangements. Some of the data are based on surveys, some on life tables, and some on models that project into the future (the general source is given on the graph; larger, more detailed pictures are given on the second and third pages). The statistics chosen and the way they are presented are designed to catch your interest and stimulate discussion. Please note that it is an insurance company presenting these ideas, and its goal is to sell products.
The details are better seen on the next two pages and the source is given below (you can also search for “Live Longer Slate” and find it)
http://www.slate.com/articles/health_and_science/prudential/2013/08/why_you_need_to_start_thinking_about_the_big_truths_with_living_longer.html
Review the data and answer the following questions.
a.  What figures stand out to you as being particularly well presented? In other words, which are effective in making their point?
No right or wrong answer here. I like comparing the probabilities of being left handed, blonde or playing an instrument with living to 100.
b.  Do any of the figures seem suspect to you, or based on an agenda, or perhaps presented in a way to distort or bias an issue? I don’t mean to imply there is anything wrong with the numbers, but one might quibble with what is presented (or not) or the way in which they are presented.
No right or wrong answer here. I would have liked more information on the oldest cities. Did cities need to be a certain size? Why did they choose 60+?
Jimmy Fallon already replaced Leno!!!!

2. A researcher in Delaware wanted to see the effect of an education program for hospital patients who had a heart attack on their likelihood of returning to the hospital within 30 days (referred to as recidivism). The education program consisted of more involved training on diet, exercise, weight, and sticking to the recommendations of the physician. The education program was given to a random sample of patients during 2011 and the results were compared to a control group who did not receive the training. An analysis of the data shows that the group receiving the training had a significant reduction in recidivism compared to the control group.

a)  What is the unit of analysis in this study?

The heart patient

b)  Identify the data collection method for this study

Experimental design. There is a treatment group and a control group, and the patients who received the training were selected at random.

c)  Would the study involve descriptive or inferential statistics?

The researcher wanted to describe the data, but she also was interested in inferring to all heart patients admitted to a hospital. So it is Inferential.

d)  What is the population (or sample) of interest to the researchers?

All heart patients admitted to a hospital.

3.  Todd Andrlik, founder and editor of Journal of the American Revolution (allthingsliberty.com), wrote a piece about how young many of the founding fathers were when the Declaration of Independence was first signed in 1776. There were 56 signers of the Declaration of Independence and their ages are given below, sorted by age.

OBS / Person / Age / Gender / State
1 / Thomas Lynch / 26 / Male / South Carolina
2 / Edward Rutledge / 26 / Male / South Carolina
3 / George Walton / 27 / Male / Georgia
4 / Thomas Heyward / 29 / Male / South Carolina
5 / Benjamin Rush / 30 / Male / Pennsylvania
6 / Elbridge Gerry / 31 / Male / Massachusetts
7 / Thomas Jefferson / 33 / Male / Virginia
8 / Thomas Stone / 33 / Male / Maryland
9 / William Hooper / 34 / Male / North Carolina
10 / Arthur Middleton / 34 / Male / South Carolina
11 / James Wilson / 34 / Male / Pennsylvania
12 / Samuel Chase / 35 / Male / Maryland
13 / William Paca / 35 / Male / Maryland
14 / John Penn / 35 / Male / North Carolina
15 / George Clymer / 37 / Male / Pennsylvania
16 / Thomas Nelson, Jr. / 37 / Male / Virginia
17 / Charles Carroll / 38 / Male / Maryland
18 / Francis Hopkinson / 38 / Male / New Jersey
19 / Carter Braxton / 39 / Male / Virginia
20 / John Hancock / 39 / Male / Massachusetts
21 / John Adams / 40 / Male / Massachusetts
22 / William Floyd / 41 / Male / New York
23 / Button Gwinnett / 41 / Male / Georgia
24 / Francis Lightfoot Lee / 41 / Male / Virginia
25 / Robert Morris / 42 / Male / Pennsylvania
26 / Thomas McKean / 42 / Male / Delaware
27 / George Read / 42 / Male / Delaware
28 / Samuel Huntington / 44 / Male / Connecticut
29 / Richard Henry Lee / 44 / Male / Virginia
30 / Robert Treat Paine / 45 / Male / Massachusetts
31 / Richard Stockton / 45 / Male / New Jersey
32 / William Williams / 45 / Male / Connecticut
33 / Josiah Bartlett / 46 / Male / New Hampshire
34 / Joseph Hewes / 46 / Male / North Carolina
35 / George Ross / 46 / Male / Pennsylvania
36 / William Whipple / 46 / Male / New Hampshire
37 / Caesar Rodney / 47 / Male / Delaware
38 / William Ellery / 48 / Male / Rhode Island
39 / Oliver Wolcott / 49 / Male / Connecticut
40 / Abraham Clark / 50 / Male / New Jersey
41 / Benjamin Harrison / 50 / Male / Virginia
42 / Lewis Morris / 50 / Male / New York
43 / George Wythe / 50 / Male / Virginia
44 / John Morton / 51 / Male / Pennsylvania
45 / Lyman Hall / 52 / Male / Georgia
46 / Samuel Adams / 53 / Male / Massachusetts
47 / John Witherspoon / 53 / Male / New Jersey
48 / Roger Sherman / 55 / Male / Connecticut
49 / James Smith / 56 / Male / Pennsylvania
50 / Philip Livingston / 60 / Male / New York
51 / George Taylor / 60 / Male / Pennsylvania
52 / Matthew Thornton / 62 / Male / New Hampshire
53 / Francis Lewis / 63 / Male / New York
54 / John Hart / 65 / Male / New Jersey
55 / Stephen Hopkins / 69 / Male / Rhode Island
56 / Benjamin Franklin / 70 / Male / Pennsylvania

a.  Create a stem and leaf plot of the data (you can do this by “hand” in Word in the table below). To do this you need to decide on the stems and then the leaves.

Stem / Leaf
1
2 / 6 6 7 9
3 / 0 1 3 3 4 4 4 5 5 5 7 7 8 8 9 9
4 / 0 1 1 1 2 2 2 4 4 5 5 5 6 6 6 6 7 8 9
5 / 0 0 0 0 1 2 3 3 5 6
6 / 0 0 2 3 5 9
7 / 0
8

Or

Stem / Leaf
2* / 6 6 7 9
3 / 0 1 3 3 4 4 4
3* / 5 5 5 7 7 8 8 9 9
4 / 0 1 1 1 2 2 2 4 4
4* / 5 5 5 6 6 6 6 7 8 9
5 / 0 0 0 0 1 2 3 3
5* / 5 6
6 / 0 0 2 3
6* / 5 9
7 / 0
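
The first stem and leaf plot above can also be checked with a few lines of code. The following is a minimal Python sketch, not part of the original exercise; the function name `stem_and_leaf` is hypothetical, and it assumes two-digit values so that the tens digit is the stem and the ones digit is the leaf.

```python
from collections import defaultdict

# Ages of the 56 signers, copied from the table above.
ages = [26, 26, 27, 29, 30, 31, 33, 33, 34, 34, 34, 35, 35, 35, 37, 37,
        38, 38, 39, 39, 40, 41, 41, 41, 42, 42, 42, 44, 44, 45, 45, 45,
        46, 46, 46, 46, 47, 48, 49, 50, 50, 50, 50, 51, 52, 53, 53, 55,
        56, 60, 60, 62, 63, 65, 69, 70]


def stem_and_leaf(values):
    """Group values by tens digit (stem) and collect the ones digits (leaves)."""
    rows = defaultdict(list)
    for v in sorted(values):
        rows[v // 10].append(v % 10)
    return dict(rows)


plot = stem_and_leaf(ages)

# Print every stem in the range, including any that have no leaves.
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} / {leaves}")
```

Splitting each stem into a low half (leaves 0 to 4) and a high half (leaves 5 to 9) would give the second version of the plot.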

b.  Calculate the mean, median, and mode for this data. The sum of all the values is Sum(X) = 2,479.

Mean = Sum(x)/n = 2479/56 = 44.27 or 44.3

Median is the middle value. Since n is even, the median is the average of the 28th and 29th values = 44

Mode is the most frequent value. Both 46 and 50 occur 4 times each, so there are two modes. Or you could say there is no single mode.
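
The three measures of center can be verified with Python's standard `statistics` module. This is only a sketch for checking the hand calculation; the variable names are illustrative, and `statistics.multimode` (Python 3.8 or later) is used because it returns every value tied for most frequent.

```python
import statistics

# Ages of the 56 signers, copied from the table above.
ages = [26, 26, 27, 29, 30, 31, 33, 33, 34, 34, 34, 35, 35, 35, 37, 37,
        38, 38, 39, 39, 40, 41, 41, 41, 42, 42, 42, 44, 44, 45, 45, 45,
        46, 46, 46, 46, 47, 48, 49, 50, 50, 50, 50, 51, 52, 53, 53, 55,
        56, 60, 60, 62, 63, 65, 69, 70]

n = len(ages)
total = sum(ages)                    # Sum(X)
mean = total / n                     # Sum(X) / n
median = statistics.median(ages)     # average of the 28th and 29th sorted values
modes = statistics.multimode(ages)   # all values tied for the highest count

print(f"n = {n}, Sum(X) = {total}")
print(f"mean = {mean:.2f}, median = {median}, modes = {modes}")
```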

c.  Briefly describe the distribution - focus on the shape of the distribution, and whether there are any outliers or strange values

The distribution appears to be a symmetrical, mound shaped distribution with the center in the mid-40s. There are no large outliers.

Below is the output from JMP software.

4. For the Signers of the Declaration of Independence data above, let’s now focus on two nominal level variables – Gender and State.

a.  For gender, how would you summarize the distribution for this variable? Think in terms of how we might describe data to talk about this variable. Is it in fact a variable?

Gender is not a variable, it is a constant. All the signers were male.

b.  For State, there were 13 original colonies. Use the table below to make a frequency table of the information. Then summarize the results in words. You can decide how you might organize the states – alphabetically, by north and south, or frequency order. How you organize the states will help how you can use cumulative frequencies to describe the data.

State / Frequency / Relative Frequency / Cumulative Relative Frequency
Connecticut / 4 / 4/56 = .0714 / .0714
Delaware / 3 / .0536 / .1250
Georgia / 3 / .0536 / .1786
Maryland / 4 / .0714 / .2500
Massachusetts / 5 / .0893 / .3393
New Hampshire / 3 / .0536 / .3929
New Jersey / 5 / .0893 / .4821
New York / 4 / .0714 / .5536
North Carolina / 3 / .0536 / .6071
Pennsylvania / 9 / .1607 / .7679
Rhode Island / 2 / .0357 / .8036
South Carolina / 4 / .0714 / .8750
Virginia / 7 / .1250 / 1.0000

JMP can organize it alphabetically or in ascending (or descending) order of frequency.
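
The relative and cumulative relative frequencies in the table can be reproduced from the counts alone. The sketch below is one way to do it in Python; the names `counts` and `table` are illustrative. It accumulates the counts first and divides by n at the end, which avoids adding up rounded proportions.

```python
# Number of signers per state, taken from the frequency table above
# (alphabetical order).
counts = {
    "Connecticut": 4, "Delaware": 3, "Georgia": 3, "Maryland": 4,
    "Massachusetts": 5, "New Hampshire": 3, "New Jersey": 5,
    "New York": 4, "North Carolina": 3, "Pennsylvania": 9,
    "Rhode Island": 2, "South Carolina": 4, "Virginia": 7,
}

n = sum(counts.values())

table = []
running = 0
for state, freq in counts.items():
    running += freq  # cumulative count so far
    table.append((state, freq, freq / n, running / n))

for state, freq, rel, cum in table:
    print(f"{state} / {freq} / {rel:.4f} / {cum:.4f}")

# Frequency order makes the largest delegations easy to spot.
by_frequency = sorted(counts.items(), key=lambda item: item[1], reverse=True)
print(by_frequency[:2])
```

In frequency order, Pennsylvania (9) and Virginia (7) come first; together they account for 16 of the 56 signers, or about 29%.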

5. Below is the data for infant mortality for 44 countries, and the same data for OECD countries. The Organization for Economic Co-operation and Development (OECD) is an international economic organization of 34 countries, founded in 1961 to stimulate economic progress and world trade. It is a forum of countries describing themselves as committed to democracy and the market economy, providing a platform to compare policy experiences, seek answers to common problems, identify good practices and coordinate domestic and international policies of its members (Wikipedia, https://en.wikipedia.org/wiki/Organisation_for_Economic_Co-operation_and_Development). OECD’s web site provided some data on infant mortality for 44 countries. Infant mortality (the rate of death of children under 1 year of age per 1,000 live births) is a measure of development. The table below has the data for 44 countries and the 34 OECD countries.

a. Create a stem and leaf plot of the data (you can do this by “hand” in Word in the table below by typing in the stems and the leaves). Do this for the 44 countries and the 34 OECD countries.

b. Calculate the mean, median, and mode for this data

c. Briefly describe the distribution - focus on the shape of the distribution, and whether there are any outliers or strange values

The sum of the values, Sum(x), is 292.30 for all 44 countries and 128.20 for the 34 OECD countries.

The mean for the 44 countries is 6.64 (292.30/44 = 6.64) while the median is 3.60. Since n = 44 is even, the median is the average of the two middle values, the 22nd (3.6) and the 23rd (3.6): (3.6 + 3.6)/2 = 3.6. The mean is pulled upward by the extreme values in the data. The most extreme values are for Indonesia (24.5), South Africa (32.8), and India (41.4). There is no single modal value. Three values occur 3 times: 2.5, 2.9 and 3.5.
This distribution is highly skewed to the right with a few extreme outliers.

COUNTRY / IM / Stem / Leaf
Iceland / 1.3
Finland / 1.7
Slovenia / 1.7 / 1 / 3 7 7
Estonia / 2.0 / 2 / 0 0 3 4 4 5 5 5 6 8 9 9 9
Japan / 2.0 / 3 / 1 3 5 5 5 6 6 7 7
Norway / 2.3 / 4 / 0 4 4 5 8
Spain / 2.4 / 5 / 0 0 1
Sweden / 2.4 / 6
Czech Rep. / 2.5 / 7 / 0
Denmark / 2.5 / 8 / 2 4
Israel / 2.5 / 9
Austria / 2.6 / 10 / 2 9
Germany / 2.8 / 11
Italy / 2.9 / 12 / 3
Korea / 2.9 / 13 / 0
Portugal / 2.9 / 14
Australia / 3.1 / 15
Switzerland / 3.3 / 16
Belgium / 3.5 / 17 / 5
Ireland / 3.5 / 18
United Kingdom / 3.5 / 19
France / 3.6 / 20
Luxembourg / 3.6 / 21
Greece / 3.7 / 22
Lithuania / 3.7 / 23
Netherlands / 4.0 / 24 / 5
New Zealand / 4.4 / 25
Latvia / 4.4 / 26
Poland / 4.5 / 27
Canada / 4.8 / 28
Hungary / 5.0 / 29
United States / 5.0 / 30
Slovak Rep. / 5.1 / 31
Chile / 7.0 / 32 / 8
Russian Fed. / 8.2 / 33
Costa Rica / 8.4 / 34
Turkey / 10.2 / 35
China / 10.9 / 36
Brazil / 12.3 / 37
Mexico / 13.0 / 38
Colombia / 17.5 / 39
Indonesia / 24.5 / 40
South Africa / 32.8 / 41 / 4
India / 41.4
The mean for the 34 OECD countries is 3.77 (128.20/34 = 3.77) while the median is 3.20. Since n = 34 is even, the median is the average of the two middle values, the 17th (3.1) and the 18th (3.3): (3.1 + 3.3)/2 = 3.2. The two measures of center are close, but the mean is pulled upward somewhat by a few extreme values in the data. The most extreme values are for Mexico (13.0), Turkey (10.2), and Chile (7.0). There is no single modal value. Three values occur 3 times: 2.5, 2.9 and 3.5.
This distribution is slightly skewed to the right with a few extreme outliers. However, compared with the data for all 44 countries, the skew is mild.

OECD Country / IM / Stem / Leaf
Iceland / 1.3
Finland / 1.7
Slovenia / 1.7 / 1 / 3 7 7
Estonia / 2.0 / 2 / 0 0 3 4 4 5 5 5 6 8 9 9 9
Japan / 2.0 / 3 / 1 3 5 5 5 6 6 7
Norway / 2.3 / 4 / 0 4 5 8
Spain / 2.4 / 5 / 0 0 1
Sweden / 2.4 / 6
Czech Rep. / 2.5 / 7 / 0
Denmark / 2.5 / 8
Israel / 2.5 / 9
Austria / 2.6 / 10 / 2
Germany / 2.8 / 11
Italy / 2.9 / 12
Korea / 2.9 / 13 / 0
Portugal / 2.9 / 14
Australia / 3.1
Switzerland / 3.3
Belgium / 3.5
Ireland / 3.5
United Kingdom / 3.5
France / 3.6
Luxembourg / 3.6
Greece / 3.7
Netherlands / 4.0
New Zealand / 4.4
Poland / 4.5
Canada / 4.8
Hungary / 5.0
United States / 5.0
Slovak Rep. / 5.1
Chile / 7.0
Turkey / 10.2
Mexico / 13.0
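
The comparison between the two groups can be checked with a short Python sketch. It is not part of the original exercise; the helper `summarize` is a hypothetical name, and the values are typed in from the two tables above. The gap between the mean and the median is reported as a rough indicator of how far the large values pull the mean.

```python
import statistics

# Infant mortality (deaths per 1,000 live births) for all 44 countries.
all_countries = [
    1.3, 1.7, 1.7, 2.0, 2.0, 2.3, 2.4, 2.4, 2.5, 2.5, 2.5, 2.6, 2.8, 2.9,
    2.9, 2.9, 3.1, 3.3, 3.5, 3.5, 3.5, 3.6, 3.6, 3.7, 3.7, 4.0, 4.4, 4.4,
    4.5, 4.8, 5.0, 5.0, 5.1, 7.0, 8.2, 8.4, 10.2, 10.9, 12.3, 13.0, 17.5,
    24.5, 32.8, 41.4,
]

# The same measure for the 34 OECD countries only.
oecd = [
    1.3, 1.7, 1.7, 2.0, 2.0, 2.3, 2.4, 2.4, 2.5, 2.5, 2.5, 2.6, 2.8, 2.9,
    2.9, 2.9, 3.1, 3.3, 3.5, 3.5, 3.5, 3.6, 3.6, 3.7, 4.0, 4.4, 4.5, 4.8,
    5.0, 5.0, 5.1, 7.0, 10.2, 13.0,
]


def summarize(values):
    """Return n, mean, median, and all tied modes for a list of numbers."""
    return {
        "n": len(values),
        "mean": sum(values) / len(values),
        "median": statistics.median(values),
        "modes": statistics.multimode(values),
    }


s_all = summarize(all_countries)
s_oecd = summarize(oecd)

for label, s in (("All 44", s_all), ("OECD 34", s_oecd)):
    gap = s["mean"] - s["median"]  # a large positive gap suggests right skew
    print(f"{label}: n={s['n']}, mean={s['mean']:.2f}, "
          f"median={s['median']:.2f}, modes={s['modes']}, "
          f"mean-median={gap:.2f}")
```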
