Statistics for Those Who (Think They) Hate Statistics

Week 2 Exercises - Deb Davis

Chapter 4 – Questions 1-5

C4-Q1-P72

Data set on web -- complete the following:

1a: Frequency Distribution & Histogram

1b: Why the class interval you selected?

I selected intervals of 5 because it made for a reasonable size group.
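As a rough illustration of grouping scores into class intervals of 5, here is a minimal Python sketch. The scores are hypothetical (they are not the textbook data set), the function name `frequency_distribution` is made up for this example, and whole-number scores are assumed.

```python
from collections import Counter

def frequency_distribution(scores, width=5):
    """Count how many whole-number scores fall in each class interval."""
    # Map each score to the lower limit of its interval, e.g. 13 -> 10 for width 5.
    counts = Counter((score // width) * width for score in scores)
    # Return (lower limit, upper limit, frequency) rows in ascending order.
    # Intervals that contain no scores are simply omitted.
    return [(low, low + width - 1, counts[low]) for low in sorted(counts)]

# Hypothetical scores, for illustration only (not the textbook data set).
example_scores = [3, 7, 8, 12, 13, 14, 16, 21, 22, 24]

for low, high, freq in frequency_distribution(example_scores):
    # Text histogram: one asterisk per score in the interval.
    print(f"{low:>2}-{high:<2} {'*' * freq}")
```

Changing `width` shows how the choice of class interval changes the shape of the histogram.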

1c: Is this distribution skewed? How would you know?

This distribution is skewed. This is visually obvious in the Cross Validation from the lower limit of the first bin. It is also apparent from the above bar graph that the distribution is not symmetrical, ergo, it is skewed.

C4-Q2-P72

Frequency distribution given:

Create a histogram:

C4-Q3-P73

Identify these distributions as negatively skewed, positively skewed, or not skewed at all, and why.

3a. This talented group of athletes scored very high on the vertical jump task.

This distribution is negatively skewed: because they are a talented group, most scores cluster at the high end, leaving a tail toward the lower scores.

3b. On this incredibly crummy test, everyone received the same score. This distribution is not skewed at all as despite the “crumminess” of the test, all scores were equal.

3c. On the most difficult spelling test of the year, the third graders wept as the scores were delivered. It is impossible to tell if this is a skewed distribution because the third graders may have wept for joy, for pity, or for either. There is no quantifier to the distribution to indicate what the scores were.

C4-Q4-P73

For each of the following, indicate whether you would use a pie, line, or bar chart, and why.

4a. Proportion of freshmen, sophomores, juniors, and seniors in a particular university would easily lend itself to a pie chart. To visualize these groups by pieces of a pie is very straightforward.

4b. Change in GPA over four semesters would likely render best in a bar chart, as the change in grades could be color coded by term, which would make tracking extremely visual.

4c. Number of applicants for four summer jobs would again render in a bar chart for the same reasons.

4d. Reaction time to different stimuli would probably best render in a line chart, as the details could cloud an otherwise clear pie or bar chart.

4e. Number of scores in each of 10 categories could be rendered in any method, but I would probably use a bar chart because of the clarity of image.

C4-Q5-P73

Provide an example for each of the below and then draw the chart accordingly.

5a. A line graph is well suited to large groups of numbers from which trends may be gathered. For my area, I would use a line graph to chart the scores on midterms taken over a term of teaching.

For example, with a possible score of 200, the following scores were received over the last two terms.

5b. A bar graph gives excellent comparatives, such as midterm paper grades to final paper grades.

5c. A pie graph would give great information for group totals.

Chapter 5 – Questions 1-8

C5-Q1-P93

Using the following data:

1a: Compute the Pearson product-moment correlation coefficient by hand and show work.

Sum of all correct (X) is 156

Sum of all Attitude (Y) is 797

Sum of each X-squared is 2476

Sum of each Y-squared is 64727

Sum of Products of X and Y is 12568

THEREFORE:

$$
\begin{aligned}
r &= \frac{(10 \times 12568) - (156 \times 797)}{\sqrt{[(10 \times 2476) - 156^2][(10 \times 64727) - 797^2]}} \\
  &= \frac{125680 - 124332}{\sqrt{[24760 - 24336][647270 - 635209]}} \\
  &= \frac{1348}{\sqrt{[424][12061]}} = \frac{1348}{\sqrt{5113864}} = \frac{1348}{2261.39} = 0.59609
\end{aligned}
$$
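The hand computation can be checked with a short Python sketch of the same raw-score formula. The function name `pearson_r` and its parameter names are made up for this example; the numbers passed in are the sums listed for this question.

```python
import math

def pearson_r(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Pearson r from the summary sums used in the hand computation."""
    # Numerator: n * (sum of XY) minus (sum of X) * (sum of Y).
    numerator = n * sum_xy - sum_x * sum_y
    # Denominator: square root of the product of the two bracketed terms.
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Sums for the 10 pairs of correct (X) and attitude (Y) scores.
r = pearson_r(n=10, sum_x=156, sum_y=797, sum_x2=2476, sum_y2=64727, sum_xy=12568)
print(round(r, 5))
```

The same function can be reused for the other correlation questions by swapping in their sums.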

1b. Construct a scatter plot for these 10 values by hand. Would you expect the correlation to be direct or indirect?

Direct correlation, since the computed r of 0.596 is positive. The relationship is moderate.

C5-Q2-P94

Use the below data for 2a and 2b.

Sum of all speed (X) is 235.4

Sum of all strength (Y) is 1730

Sum of each X-squared is 5677.74

Sum of each Y-squared is 313210

Sum of Products of X and Y is 41095.2

THEREFORE:

$$
\begin{aligned}
r &= \frac{(10 \times 41095.2) - (235.4 \times 1730)}{\sqrt{[(10 \times 5677.74) - 235.4^2][(10 \times 313210) - 1730^2]}} \\
  &= \frac{410952 - 407242}{\sqrt{[56777.4 - 55413.16][3132100 - 2992900]}} \\
  &= \frac{3710}{\sqrt{[1364.24][139200]}} = \frac{3710}{\sqrt{189902208}} = \frac{3710}{13780.5} = 0.26922
\end{aligned}
$$

2b. A low correlation (.27) indicates that speed and strength are only weakly related.

C5-Q3-P94

1a: Compute the Pearson product-moment correlation coefficient.

Sum of all Increase (X) is 38

Sum of all Acht (Y) is 104

Sum of each X-squared is 194

Sum of each Y-squared is 1536

Sum of Products of X and Y is 507

THEREFORE:

$$
\begin{aligned}
r &= \frac{(10 \times 507) - (38 \times 104)}{\sqrt{[(10 \times 194) - 38^2][(10 \times 1536) - 104^2]}} \\
  &= \frac{5070 - 3952}{\sqrt{[1940 - 1444][15360 - 10816]}} \\
  &= \frac{1118}{\sqrt{[496][4544]}} = \frac{1118}{\sqrt{2253824}} = \frac{1118}{1501.27} = 0.7447
\end{aligned}
$$

The correlation is strong and positive (.74), indicating a relationship between increased budget and increased scores.

Sum of all hours (X) is 153

Sum of all GPA (Y) is 38.73

Sum of each X-squared is 2513

Sum of each Y-squared is 150.099

Sum of Products of X and Y is 593.16

THEREFORE:

$$
\begin{aligned}
r &= \frac{(10 \times 593.16) - (153 \times 38.73)}{\sqrt{[(10 \times 2513) - 153^2][(10 \times 150.099) - 38.73^2]}} \\
  &= \frac{5931.6 - 5925.69}{\sqrt{[25130 - 23409][1500.99 - 1500.0129]}} \\
  &= \frac{5.91}{\sqrt{[1721][0.9771]}} = \frac{5.91}{\sqrt{1681.589}} = \frac{5.91}{41.007} = 0.1441
\end{aligned}
$$

A low correlation such as this would indicate a lack of relationship.

Accordingly, the plot is random.

C5-Q5-P05 - The coefficient of determination between two variables is 0.64. The Pearson correlation is therefore .8 (the square root of .64); the relationship is quite strong, and the variance unaccounted for is .36 (1 - .64).
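The arithmetic here can be sketched in a few lines of Python, assuming the 0.64 is the coefficient of determination (r squared). The variable names are illustrative only, and the square root gives the size of r but not its sign.

```python
import math

r_squared = 0.64                 # coefficient of determination from the question
r = math.sqrt(r_squared)         # magnitude of the Pearson correlation
unexplained = 1 - r_squared      # proportion of variance unaccounted for

print(round(r, 2), round(unexplained, 2))
```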

Chapter 6 - Questions 2-5

C6-Q2-P118

Provide an example of when you would want to establish test-retest and parallel forms reliability.

C6-Q3-P118

You are developing an instrument that measures vocational preferences and you need to administer the test several times during the year while students are attending a vocational program. You need to assess the test-retest reliability of the test and the data from two administrations (Ch6 data set 1) -- one fall and one spring. Would you call this a reliable test? Why or why not?

C6-Q4-P118

How can a test be reliable and not valid, and not valid unless it is reliable?

C6-Q5-P118

When testing any experimental hypothesis, why is it important that the test you use to measure the outcome be both reliable and valid?

Chapter 7 - Questions 1-7 (Note: Teacher will provide the articles for #1)

C7-Q1-P113

Select five empirical research articles and detail the following information:

a-What is the null hypothesis?

b-What is the research hypothesis?

c-Create a null and research hypothesis for own area.

d-identify articles with no clear stated or implied hypothesis. Can a research hypothesis be crafted?

C7-Q2-P113

Why does the scientific method work?

Steps:

Observe

Question

Hypothesize

Experiment

Accept or Reject

Change Hypothesis?

Experiment

Accept or Reject

Etc.

-- The scientific method generally works because it is cyclical: each result is used to accept, reject, or revise the hypothesis, which is then tested again.

C7-Q3-P113

Why do good samples make for good tests of research hypotheses?

Good samples make for good tests of research hypotheses because good samples are directed to incorporate specifics of a directed hypothesis (an educated guess).

C7-Q4-P113

For the following, create one null hypothesis, one directional research hypothesis, and one nondirectional research hypothesis.

a-What are the effects of attention on out-of-seat classroom behavior?

-Diagnostically Severe ADHD students would have the same out-of-seat frequency as those determined to be not ADHD-Severe.

-Diagnostically Severe ADHD students would have more out-of-seat frequency than those determined to be not ADHD-Severe.

-Diagnostically Severe ADHD students will differ in out-of-seat frequency from those determined to be not ADHD-Severe.

b-What is the relationship between the quality of a marriage and the quality of the spouses' relationships with their siblings?

-Those with a strong quality of marriage will always have a weak quality of sibling relationships.

-Those with a strong quality of marriage will always have a strong quality of sibling relationships.

-Those with a strong quality of marriage will have varying quality of sibling relationships.

c-What’s the best way to treat an eating disorder?

- The best way to treat an eating disorder is always calories-in-calories-out.

- The best way to treat an eating disorder is never calories-in-calories-out.

- The best way to treat an eating disorder is completely dependent upon the cause of the disorder, and even then, treatment may or may not be effective.

C7-Q5-P113

What do we mean when we say that the null hypothesis acts as a starting point?

To start at the null hypothesis allows for all possibilities. When there are a number of unknowns, to start by eliminating as many variables as possible allows for individual test methods.

C7-Q6-P113

Evaluate the hypotheses from C7-Q1 in terms of the five criteria discussed at the end of the chapter.

Hypotheses should:

Be stated in a declarative form

Posit a relationship between variables

Reflect a theory or a body of literature on which they are based

Be brief and to the point, and

Be testable!

C7-Q7-P113

Why does the null hypothesis presume no relationship between variables?

That defines “null” – having no relationship!

Chapter 8 - Questions 1-9

C8-Q1-P151

What are the characteristics of the normal curve? The three characteristics of a bell curve are: 1) it is not skewed; 2) it is perfectly symmetrical about the mean; 3) the tails are asymptotic (they approach the axis but never quite reach it).

What human behavior is distributed normally? Generally, height and weight are distributed normally in a population. In my classroom, grades turn from a reverse bell to a bell through the course of the term.

C8-Q2-P151

Standard scores, such as z scores, allow us to make comparisons across different samples. Why? A z score is the result of dividing the amount that a raw score differs from the mean of the distribution by the standard deviation. So, scores below the mean will have negative z scores, and scores above the mean will have positive z scores. Positive z scores always fall to the right of the mean, and negative always fall to the left. Remember that z scores across different distributions are comparable.

C8-Q3-P151

Why is a z score a standard score, and why can standard scores be used to compare scores from different distributions with one another? A z score is a standard score because it is based on the degree of variability within its distribution.

C8-Q4-P151

Compute the z scores for the following raw scores where the X-bar is 50 and the standard deviation is 5.

z = (raw score – mean) / standard deviation

a. 55

(55-50)/5 = 5/5 = 1

b. 50

(50-50)/5 = 0/5 = 0

c. 60

(60-50)/5 = 10/5 = 2

d. 57.5

(57.5 – 50)/5 = 7.5/5=1.5

e. 46

(46-50)/5 = -4/5 = -.8
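The same z score computations can be sketched in Python as a check on the hand work. The function name `z_score` is made up for this example; the mean of 50 and standard deviation of 5 come from the question.

```python
def z_score(raw, mean, sd):
    """Standard score: distance from the mean in standard deviation units."""
    return (raw - mean) / sd

# Raw scores from parts a through e.
raw_scores = [55, 50, 60, 57.5, 46]
z_scores = [z_score(x, mean=50, sd=5) for x in raw_scores]

for x, z in zip(raw_scores, z_scores):
    print(x, z)
```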

5. For the following set of scores, fill in the cells. The mean is 70 and the standard deviation is 8.

z = (raw score – mean) / standard deviation

Raw Score / z score / Work
68.0 / -0.25 / (68 - 70)/8 = -2/8
57.2 / -1.6 / (x - 70)/8 = -1.6
82.0 / 1.5 / (82 - 70)/8
84.4 / 1.8 / (x - 70)/8 = 1.8
69.0 / -0.125 / (69 - 70)/8
66.0 / -0.5 / (x - 70)/8 = -0.5
85.0 / 1.875 / (85 - 70)/8
83.6 / 1.7 / (x - 70)/8 = 1.7
72.0 / 0.25 / (72 - 70)/8
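For the rows where the z score is given and the raw score is the unknown, the formula is rearranged to raw score = mean + (z x standard deviation). A minimal Python sketch, with the made-up function name `raw_from_z`:

```python
def raw_from_z(z, mean, sd):
    """Invert the z score formula to recover the raw score."""
    return mean + z * sd

# The z scores that were given in the table (raw score unknown).
given_z = [-1.6, 1.8, -0.5, 1.7]
raw = [round(raw_from_z(z, mean=70, sd=8), 1) for z in given_z]

print(raw)
```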

6. Questions 6a through 6d are based on a distribution of scores with a mean of 75 and a standard deviation of 6.38.

z = (raw score – mean) / standard deviation

a. What is the probability of a score falling between a raw score of 70 and 80?

b. What is the probability of a score falling above a raw score of 80?

c. What is the probability of a score falling between a raw score of 81 and 81?

d. What is the probability of a score falling below a raw score of 63?
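One possible way to work out these probabilities, instead of looking up areas in a z table, is Python's standard-library `statistics.NormalDist`. This is only a sketch and assumes the scores are normally distributed; the variable names are illustrative. Part 6c is left out because its two bounds are printed as the same value.

```python
from statistics import NormalDist

# Normal distribution with the mean and standard deviation from the question.
scores = NormalDist(mu=75, sigma=6.38)

# 6a: area between raw scores of 70 and 80.
p_between_70_and_80 = scores.cdf(80) - scores.cdf(70)
# 6b: area above a raw score of 80.
p_above_80 = 1 - scores.cdf(80)
# 6d: area below a raw score of 63.
p_below_63 = scores.cdf(63)

print(round(p_between_70_and_80, 4), round(p_above_80, 4), round(p_below_63, 4))
```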

7. Jake needs to score in the top 10% in order to earn a physical fitness certificate. The class mean is 78 and the standard deviation is 5.5. What raw score does he need to get that valuable piece of paper?

(x - 78)/5.5 = 1.28 (the z score that cuts off the top 10% of a normal distribution)

x = 78 + (1.28 x 5.5) = 85.04 minimum required
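A cutoff for the top 10% can also be found from the inverse of the normal curve rather than a z table. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library and assumes the class scores are normally distributed; the variable names are illustrative.

```python
from statistics import NormalDist

# Class distribution from the question: mean 78, standard deviation 5.5.
fitness = NormalDist(mu=78, sigma=5.5)

# The top 10% begins at the 90th percentile of the distribution.
cutoff = fitness.inv_cdf(0.90)

print(round(cutoff, 2))
```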

8. So, why doesn’t it make sense to simply combine, for example, course grades across different topics – just take an average and call it a day? Each raw score belongs to a different distribution, with its own mean and standard deviation, so raw scores are not directly comparable until they are converted to standard scores.

9. Who is the better student, relative to his or her classmates? Here’s all the information you ever needed to know . . . .

MATH

Class Mean: 81

Class Standard Deviation: 2

READING

Class Mean: 87

Class Standard Deviation: 10

z = (raw score – mean) / standard deviation

Score / Raw / Mean / SD / z
math-n / 85 / 81 / 2 / 2
math-t / 87 / 81 / 2 / 3
rdg-n / 88 / 87 / 10 / 0.1
rdg-t / 81 / 87 / 10 / -0.6

Student / Math z / Reading z / Sum / Average z
avg-n / 2 / 0.1 / 2.1 / 1.05
avg-t / 3 / -0.6 / 2.4 / 1.2

Talya is the better student.
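The comparison can be sketched in Python by converting each raw score to a z score within its own class and averaging. The labels `n` and `t` follow the table above; the function and variable names are made up for this example.

```python
def z_score(raw, mean, sd):
    """Standard score relative to the class distribution."""
    return (raw - mean) / sd

# Class mean and standard deviation for each subject.
classes = {"math": (81, 2), "reading": (87, 10)}

# Raw scores per student, keyed by the labels used in the table.
raw = {
    "n": {"math": 85, "reading": 88},
    "t": {"math": 87, "reading": 81},
}

def average_z(student_scores):
    """Average a student's z scores across subjects."""
    zs = [z_score(score, *classes[subject]) for subject, score in student_scores.items()]
    return sum(zs) / len(zs)

averages = {student: average_z(scores) for student, scores in raw.items()}
best = max(averages, key=averages.get)
print(averages, best)
```

Averaging z scores rather than raw scores keeps each subject on the same scale, which is the point made in question 8.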