LSP 121-405

Homework #3

Due: Thursday, October 22nd, 2009 by 11:59 pm

50 points

You will need the SPSS PASW software to solve the following problems. This software is installed only in the DePaul labs, but you can also reach it remotely through the ctiterminals.cti.depaul.edu terminal servers, as described in the terminal servers document on the COL web site. There is also a free, open-source alternative to PASW called PSPP, written primarily for Linux. It can be run on Windows or Mac OS X, but it is challenging to install and get running.

Part A. Basic Descriptive Statistics

The following data sets give waiting times in minutes for 15 customers at bus terminals in Atlanta and Boston.

Atlanta: 5.5 6.0 4.5 5.0 7.0 6.5 5.0 7.5 5.5 4.0 8.0 9.2 5.5 8.3 4.2

Boston: 5.5 8.0 2.0 5.0 8.5 12.0 1.5 6.5 9.5 10.0 6.0 6.8 9.3 12.5 7.7

1.  (4 points) Find the mean, median, and range for each of the two data sets.

2.  (4 points) Give the five-number summary for each.

3.  (4 points) Find the standard deviation for each of the two data sets.

4.  (4 points) There is a rule of thumb that says you can estimate the standard deviation of a data set by dividing its range by 4. Use this rule to estimate the standard deviation of each of the two data sets. How close is each estimate to the actual standard deviation?

5.  (4 points) Based on all your results, compare the two data sets in terms of their central measures (mean and median) and dispersion measures (quartiles and standard deviation).
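
For readers who want to check their PASW output by hand, the quantities in Part A can be sketched in Python. This is only an illustration under stated assumptions: the data are hypothetical (not the Atlanta or Boston values), the helper name `describe` is made up for this example, and the quartiles use the default method of Python's `statistics.quantiles`, which may not match the convention PASW uses.

```python
import statistics

# Hypothetical waiting times in minutes (NOT the assignment data).
waits = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0, 12.0]


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    ordered = sorted(values)
    # Quartiles with the default "exclusive" method; other packages
    # may use a different quartile convention.
    q1, q2, q3 = statistics.quantiles(ordered, n=4)
    data_range = ordered[-1] - ordered[0]
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "range": data_range,
        "five_number": (ordered[0], q1, q2, q3, ordered[-1]),
        # Sample standard deviation (divides by n - 1).
        "stdev": statistics.stdev(ordered),
        # Rule-of-thumb estimate: range divided by 4.
        "range_rule_estimate": data_range / 4,
    }


summary = describe(waits)
for name, value in summary.items():
    print(name, value)
```

Comparing the `stdev` and `range_rule_estimate` entries shows how far the rule of thumb can be from the computed value for a small data set.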

Part B. Correlation

You will now look at the correlation between several variables using scatterplots and correlation analysis. In PASW, to create a scatterplot from a Data window, go to Graphs->Chart Builder, click Scatter/Dot, drag the first example plot (leftmost in the top row) up into the preview area, drag each of the variables to be graphed to the X-axis and Y-axis areas of the preview, and click OK. To calculate the Pearson R correlation value for two variables, click Analyze->Correlate->Bivariate, move the two variables to be analyzed into the Variables box, and click OK.
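
To see what the Pearson R value measures, the calculation can be sketched in Python. This is a minimal sketch, not a substitute for the PASW output you are asked to paste: the function name `pearson_r` and both data lists are hypothetical, and the formula used is the standard one, the sum of products of deviations divided by the square root of the product of the two sums of squared deviations.

```python
import math


def pearson_r(xs, ys):
    """Compute the Pearson correlation coefficient for paired values."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements (not taken from any course file).
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

r = pearson_r(x_values, y_values)
print(round(r, 3))
```

A value near +1 or -1 means the points in the scatterplot fall close to a straight line, and a value near 0 means there is little linear relationship.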

6.  First, you will address the question “Is there any correlation between smoking by pregnant women and low birth weight?” Download the file CigarettesBirthweight.xls from the QRC site. Edit the Excel file to clean it up and then import it into PASW.

a.  (5 points) Draw a scatterplot of the two variables “# Cigarettes per day” and “Birth weight (lbs)”. Copy and paste the scatterplot into your answers.

b.  (5 points) Do a correlation analysis to determine the Pearson R value. Copy and paste the results from PASW into your answers. What is the R value? Is there any correlation between these variables? What type of correlation is it (positive or negative), and how strong is it (low, medium, or high)?

7.  Now let’s check if there’s any correlation between eating lots of animal fat and dying from certain cancers. Download the file CancerAnimalFat.xls from the Olderdata section of the QRC site (click Olderdata at the bottom of the Excel Files page). Clean it up in Excel and import it into PASW.

a.  (4 points) Draw a scatterplot of the two variables “Animal Fat Intake (gm/day)” and “Age Adjusted Mortality”. Copy and paste the scatterplot into your answers.

b.  (4 points) Do a correlation analysis to determine the Pearson R value. Copy and paste the results from PASW into your answers. What is the R value? Is there any correlation between these variables? What type of correlation is it (positive or negative), and how strong is it (low, medium, or high)?

8.  Download the file SurveySpring98.xls from the Olderdata section of the QRC site (click Olderdata at the bottom of the Excel Files page). Clean it up in Excel and import it into PASW. This file contains results of a survey given to DePaul students in 1998.

a.  (4 points) Does beer drinking affect hours spent studying? Do the PASW scatterplot and correlation analysis of the variables “Beers drunk during typical week” vs. “Hours study per week” and copy the results into your answers. Is there a correlation? If so, what is its direction and how strong is it?

b.  (4 points) Do students who flirt a lot tend to cheat on their boyfriends or girlfriends? Do the PASW scatterplot and correlation analysis of the variables “Flirt with others while dating someone” vs. “I have cheated on my boyfriend/girlfriend in the past” and copy the results into your answers. Is there a correlation? If so, what is its direction and how strong is it?

c.  (4 points) Do older students tend to be more or less satisfied with DePaul than younger students? Do the PASW scatterplot and correlation analysis of the variables “Age” vs. “Satisfied with the education I am receiving at DePaul” and copy the results into your answers. Is there a correlation? If so, what is its direction and how strong is it?
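
The questions in Part B ask for the direction and strength of each correlation. One way to make that reading consistent is sketched below in Python. The cutoffs of 0.3 and 0.7 are an assumption chosen only for illustration and are not stated in this assignment, so use whatever thresholds were given in class. The function name `describe_correlation` is also hypothetical.

```python
def describe_correlation(r, weak_cutoff=0.3, strong_cutoff=0.7):
    """Label a Pearson R value with a direction and a strength.

    The default cutoffs are illustrative assumptions, not course rules.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a Pearson R value must lie between -1 and 1")
    # Direction comes from the sign of R.
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    # Strength comes from the absolute size of R.
    size = abs(r)
    if size < weak_cutoff:
        strength = "low"
    elif size < strong_cutoff:
        strength = "medium"
    else:
        strength = "high"
    return direction, strength


print(describe_correlation(-0.85))
print(describe_correlation(0.45))
```

Remember that even a strong correlation does not by itself show that one variable causes the other.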