3.

a. What score was earned by more students than any other score? Why? 84. It's the mode.

b. How many students scored between 68 and 94 on the exam? 25 students, because 50% of the students lie between the 1st and 3rd quartiles and there are 50 students.

c. What was the highest score earned on the exam? 98 (see below for calculations)

(a+b)/2 = 72 (that's the midrange)

b-a = 52 (that's the range)

So we can solve these equations: b= 52+a

(2a+52)/2 = 72 --> a+26 = 72 --> a=46 --> b=98.

d. What was the lowest score earned on the exam? 46 (see above for calculations)

e. How many students scored within three standard deviations of the mean? By Chebyshev's Theorem, at least 1-1/9 = 0.889, or 88.9%, of the students scored within that interval. That is 0.889*50 = 44.44, so at least 45 of the 50 students.
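The arithmetic in parts c through e can be checked with a short script. This is only a sketch: the variable names are made up for illustration, and it assumes the midrange of 72, the range of 52, and the class size of 50 quoted in the answers above.

```python
import math

# Summary statistics quoted in parts c and d.
midrange = 72      # (a + b) / 2
spread = 52        # b - a, the range
n_students = 50

# Solving the two linear equations for the lowest (a) and highest (b) score.
lowest = midrange - spread / 2
highest = midrange + spread / 2

# Chebyshev bound for k = 3 standard deviations (part e).
k = 3
min_fraction = 1 - 1 / k**2
# "At least 44.44 students" means at least 45 whole students.
min_students = math.ceil(min_fraction * n_students)

print(lowest, highest)                        # 46.0 98.0
print(round(min_fraction, 3), min_students)   # 0.889 45
```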

Show your calculations leading to your standard deviation and variance on #2.
2.A math test was given with the following results:
80, 69, 92, 75, 88, 37, 98, 92, 90, 81, 32, 50, 59, 66, 67, 66
Find the range, standard deviation, and variance for the scores.

To find the range we subtract the minimum number from the maximum number:

32, 37, 50, 59, 66, 66, 67, 69, 75, 80, 81, 88, 90, 92, 92, 98

Range = 98-32 = 66

Finding the variance, it’s helpful to use a table:

x / x - x-bar / (x - x-bar)^2
32 / -39.375 / 1550.391
37 / -34.375 / 1181.641
50 / -21.375 / 456.891
59 / -12.375 / 153.141
66 / -5.375 / 28.891
66 / -5.375 / 28.891
67 / -4.375 / 19.141
69 / -2.375 / 5.641
75 / 3.625 / 13.141
80 / 8.625 / 74.391
81 / 9.625 / 92.641
88 / 16.625 / 276.391
90 / 18.625 / 346.891
92 / 20.625 / 425.391
92 / 20.625 / 425.391
98 / 26.625 / 708.891
x-bar / 71.375 / 5787.75 / <--- sum of this column
385.85 / <--- divide the sum by n-1

So the variance is 385.85.

The standard deviation is simply the square root of the variance: sqrt(385.85) = 19.64
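The table above can be cross-checked with Python's standard `statistics` module. This is a sketch rather than part of the required work; `statistics.variance` and `statistics.stdev` use the same n-1 divisor as the table.

```python
import statistics

scores = [80, 69, 92, 75, 88, 37, 98, 92, 90, 81, 32, 50, 59, 66, 67, 66]

score_range = max(scores) - min(scores)
mean = statistics.mean(scores)
variance = statistics.variance(scores)   # sample variance, divides by n - 1
std_dev = statistics.stdev(scores)       # square root of the sample variance

print(score_range)           # 66
print(mean)                  # 71.375
print(round(variance, 2))    # 385.85
print(round(std_dev, 2))     # 19.64
```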

Be sure to include your calculations to help me find your errors. #6
6.An animal trainer obtained the following data (Table A) in a study of reaction times of dogs (in seconds) to a specific stimulus. He then selected another group of dogs that were much older than the first group and measured their reaction times to the same stimulus. The data is shown in Table B.
Classes (Table A) / Frequency (Table A) / Classes (Table B) / Frequency (Table B)
2.3 – 2.9 / 10 / 2.3 – 2.9 / 1
3.0 – 3.6 / 12 / 3.0 – 3.6 / 3
3.7 – 4.3 / 6 / 3.7 – 4.3 / 4
4.4 – 5.0 / 8 / 4.4 – 5.0 / 16
5.1 – 5.7 / 4 / 5.1 – 5.7 / 14
5.8 – 6.4 / 2 / 5.8 – 6.4 / 4
Find the variance and standard deviation for the two distributions above. Compare the variation of the data sets. Decide if one data set is more variable than the other.
Be sure to include your calculations.

The formula for the variance of grouped data is:

s^2 = (sum(f*m^2) - ((sum(f*m))^2)/n) / (n-1)

Where f is the frequency in a particular category, m is the midpoint of that category, and n = sum(f) is the total number of observations.

We can use a table to find this answer as well (note that there are 42 observations in each table):

TABLE A

Class
Low / High / Midpoint, m / Frequency, f / f*m / f*m^2
2.3 / 2.9 / 2.6 / 10 / 26 / 67.6
3 / 3.6 / 3.3 / 12 / 39.6 / 130.68
3.7 / 4.3 / 4 / 6 / 24 / 96
4.4 / 5 / 4.7 / 8 / 37.6 / 176.72
5.1 / 5.7 / 5.4 / 4 / 21.6 / 116.64
5.8 / 6.4 / 6.1 / 2 / 12.2 / 74.42
sum(f*m)= / 161 / 662.06 / <---- sum of f*m^2
(sum(f*m))^2= / 25921
((sum(f*m))^2)/n= / 617.167

So the variance is (662.06 - 617.167)/(42-1) = 44.893/41 = 1.095, and the standard deviation is sqrt(1.095) = 1.046.

TABLE B

Class
Low / High / Midpoint, m / Frequency, f / f*m / f*m^2
2.3 / 2.9 / 2.6 / 1 / 2.6 / 6.76
3 / 3.6 / 3.3 / 3 / 9.9 / 32.67
3.7 / 4.3 / 4 / 4 / 16 / 64
4.4 / 5 / 4.7 / 16 / 75.2 / 353.44
5.1 / 5.7 / 5.4 / 14 / 75.6 / 408.24
5.8 / 6.4 / 6.1 / 4 / 24.4 / 148.84
sum(f*m)= / 203.7 / 1013.95 / <---- sum of f*m^2
(sum(f*m))^2= / 41493.69
((sum(f*m))^2)/n= / 987.945

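And the variance is (1013.95 - 987.945)/(42-1) = 26.005/41 = 0.634, and the standard deviation is sqrt(0.634) = 0.796.

Table A has the larger variance and standard deviation (1.046 vs. 0.796), so the reaction times of the first group are more variable than those of the older dogs.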
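The grouped-data computation for both tables can be reproduced with a short script. This is a sketch under the assumption that the sample variance is (sum(f*m^2) - (sum(f*m))^2/n)/(n-1), which is what the column totals above are set up for; the function name `grouped_variance` is invented for illustration.

```python
import math

def grouped_variance(classes):
    """Sample variance for grouped data given (low, high, frequency) rows."""
    n = sum(f for _, _, f in classes)
    # Pair each class midpoint with its frequency.
    midpoints = [((low + high) / 2, f) for low, high, f in classes]
    sum_fm = sum(f * m for m, f in midpoints)
    sum_fm2 = sum(f * m**2 for m, f in midpoints)
    return (sum_fm2 - sum_fm**2 / n) / (n - 1)

bounds = [(2.3, 2.9), (3.0, 3.6), (3.7, 4.3), (4.4, 5.0), (5.1, 5.7), (5.8, 6.4)]
freq_a = [10, 12, 6, 8, 4, 2]
freq_b = [1, 3, 4, 16, 14, 4]

table_a = [(lo, hi, f) for (lo, hi), f in zip(bounds, freq_a)]
table_b = [(lo, hi, f) for (lo, hi), f in zip(bounds, freq_b)]

var_a = grouped_variance(table_a)
var_b = grouped_variance(table_b)

print(round(var_a, 3), round(math.sqrt(var_a), 3))   # 1.095 1.046
print(round(var_b, 3), round(math.sqrt(var_b), 3))   # 0.634 0.796
```

The larger value for Table A indicates that the first group of dogs has the more variable reaction times.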

1.Explain the difference between a discrete and a continuous random variable. Give two examples of each type of random variable.

A discrete random variable takes on a countable set of values. For example, the number of traffic accidents at an intersection on a particular day, or the number of defects on a production line on a particular day.

A continuous random variable can take on any value in a particular interval. For example, the time between calls at a phone bank, or the weight of a particular item.

2.Determine whether each of the distributions given below represents a probability distribution. Justify your answer.
a)
x 1 2 3 4
P (x) 1/8 1/8 3/8 1/8

No, because the probabilities sum to less than 1.

b)
x 3 6 8
P (x) 0.2 0 1

No, because the probabilities sum to more than 1.
c)
x 20 30 40 50
P (x) 0.3 0.2 0.1 0.4

Yes, the probabilities sum to 1.
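The three checks above can be sketched in code. The helper name `is_probability_distribution` is made up for illustration; exact fractions are used so that the comparison with 1 is not affected by floating-point rounding.

```python
from fractions import Fraction

def is_probability_distribution(probs):
    """True when every probability is in [0, 1] and they sum to exactly 1."""
    return all(0 <= p <= 1 for p in probs) and sum(probs) == 1

dist_a = [Fraction(1, 8), Fraction(1, 8), Fraction(3, 8), Fraction(1, 8)]
dist_b = [Fraction("0.2"), Fraction("0"), Fraction("1")]
dist_c = [Fraction("0.3"), Fraction("0.2"), Fraction("0.1"), Fraction("0.4")]

for name, dist in [("a", dist_a), ("b", dist_b), ("c", dist_c)]:
    # Prints the label, the total probability, and the verdict.
    print(name, sum(dist), is_probability_distribution(dist))
# a 3/4 False
# b 6/5 False
# c 1 True
```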

3.Four cards are selected, one at a time, from a standard deck of 52 cards. Let x represent the number of aces drawn in a set of 4 cards.
a.If this experiment is completed without replacement, explain why x is not a binomial random variable.
When you don't replace the cards in the deck, the number of cards in the deck changes, so the probability of drawing an ace isn’t the same on the second draw as it was on the first draw. The binomial distribution requires a series of identical, independent Bernoulli trials.
b.If this experiment is completed with replacement, explain why x is a binomial random variable.
This is because each draw is identical in probability to the last and the results of the previous draw do not influence the current draw. That satisfies all the requirements of the binomial distribution: independent, identical Bernoulli trials.
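The difference between the two sampling schemes can be illustrated numerically. This sketch is not part of the original answer: the with-replacement case uses the binomial formula, while the without-replacement case uses what is usually called the hypergeometric distribution, a name the answer above does not use. Both function names are hypothetical.

```python
from math import comb

def p_aces_with_replacement(k, draws=4, p=4 / 52):
    """Binomial: every draw has the same ace probability p."""
    return comb(draws, k) * p**k * (1 - p) ** (draws - k)

def p_aces_without_replacement(k, draws=4, aces=4, deck=52):
    """Hypergeometric: the deck shrinks, so the draws are dependent."""
    return comb(aces, k) * comb(deck - aces, draws - k) / comb(deck, draws)

# Side-by-side probabilities of drawing k aces in 4 cards.
for k in range(5):
    print(k,
          round(p_aces_with_replacement(k), 5),
          round(p_aces_without_replacement(k), 5))
```

The two columns are close but not equal, which is the practical consequence of the draws no longer being identical and independent.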

4.How does the bell-shaped curve for the sampling distribution of sample means for samples of size n = 100 compare to the bell-shaped curve for the sampling distribution of sample means for samples of size n = 60?

The bell shaped curve for the sample mean for a sample of 100 is taller and narrower than the one with sample size 60 because the variance of the sample average is var(x)/n. Larger sample sizes result in smaller variances, so the curve is higher and more narrow when the sample size is 100 than it is when it is 60.
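A small numerical sketch of this comparison follows. The population standard deviation `sigma = 1.0` is a hypothetical value chosen only for illustration; the ordering of the results does not depend on it.

```python
import math
from statistics import NormalDist

def standard_error(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / math.sqrt(n)

sigma = 1.0  # hypothetical population standard deviation
se_60 = standard_error(sigma, 60)
se_100 = standard_error(sigma, 100)

# Height of each bell curve at its center (the mean is set to 0 here).
peak_60 = NormalDist(0, se_60).pdf(0)
peak_100 = NormalDist(0, se_100).pdf(0)

print(round(se_60, 4), round(se_100, 4))   # 0.1291 0.1
print(peak_100 > peak_60)                  # True
```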

5.What are the characteristics of the normal distribution? Why is the normal distribution important in statistical analysis? Provide an example of an application of the normal distribution.

It is a continuous, symmetric distribution with a tell-tale bell shape. Its two parameters are the mean and the standard deviation.

The Normal distribution is frequently used in working with the sample mean. The central limit theorem gives the result that the sample average is approximately normal when the sample size is large (e.g., over 30). So, one of the most common applications of the normal distribution is in working with the sample average. Other applications include regression, the t-test, and ANOVA.

6.In your own words describe the standard normal distribution. Explain why it can be used to find probabilities for all normal distributions.

The standard normal distribution is a special case of the normal distribution. It has mean=0 and standard deviation =1. It can be used to find probabilities for all normal distributions because the formula z=(x - mean)/(standard deviation) converts any observation x from any normal distribution to the standard normal distribution.

7.Explain why the normal distribution can be used as an approximation to the binomial distribution. What conditions must be met to use the normal distribution to approximate the binomial distribution? Why is a correction for continuity necessary?

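As long as n*p>5 and n*(1-p)>5, we can approximate the binomial distribution with the normal distribution. The central limit theorem allows this: a binomial count is a sum of n independent Bernoulli trials, so it is approximately normal when n is large. Typically, these requirements are met through having a large sample size, also one of the requirements of the central limit theorem. A correction for continuity is necessary because the binomial distribution is discrete while the normal distribution is continuous. A single value such as x = 2 has zero probability under a continuous curve, so we use the interval from 1.5 to 2.5 instead.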

8.Consider a binomial distribution with 15 identical trials, and a probability of success of 0.5
a.Find the probability that x = 2 using the binomial tables

P(X=2) = 0.0032

b.Use the normal approximation to find the probability that x = 2

Using the continuity correction factor, we need to calculate P(1.5<x<2.5) where x follows a normal distribution with mean = n*p = 7.5 and a standard deviation = sqrt(n*p*(1-p)) = 1.936.
So, finding the z-scores gives:
Z= (1.5 - 7.5)/1.936 = -3.098
Z= (2.5 - 7.5)/1.936 = -2.582
We can find, from the table, that P(-3.098<z<-2.582) = 0.005- 0.001 = 0.004
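Both parts can be checked without tables. This is a sketch using `math.comb` for the exact binomial probability and `statistics.NormalDist` for the approximation with the continuity correction; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, k = 15, 0.5, 2

# Part a: exact binomial probability P(X = 2).
exact = comb(n, k) * p**k * (1 - p) ** (n - k)

# Part b: normal approximation with mean n*p and sd sqrt(n*p*(1-p)),
# using the continuity-corrected interval (k - 0.5, k + 0.5).
approx_dist = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
approx = approx_dist.cdf(k + 0.5) - approx_dist.cdf(k - 0.5)

print(round(exact, 4))    # 0.0032
print(round(approx, 3))   # 0.004
```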

9.The diameters of oranges in a certain orchard are normally distributed with a mean of 5.26 inches and a standard deviation of 0.50 inches.
a) Find the z-score:
(4.5-5.26)/0.5 = -1.52
From the table we find that P(z<-1.52) = 0.064.
So 6.4% of the oranges have diameter less than 4.5 inches.
b) Find the z-score:
(5.12-5.26)/0.5 = -0.28
From the table, we find that P(z>-0.28) = 0.610
So, we expect 61% of the oranges to have a diameter of more than 5.12 inches.
c.A random sample of 100 oranges is gathered and the mean diameter obtained was 5.12. If another sample of 100 is taken, what is the probability that its sample mean will be greater than 5.12 inches?
Since we know that the original observations follow a normal distribution, we know that the sample mean follows a normal distribution as well. In this case, the mean is 5.26 inches with a standard deviation of 0.5/sqrt(100) = 0.05
So, we can find the z-score:
(5.12-5.26)/0.05 = -2.8
And we can find from the table that P(z>-2.8) = 0.997
That is, we can expect that 99.7% of the time, a sample of 100 oranges will have a mean diameter of more than 5.12 inches.
d.Why is the z-score used in answering (a), (b), and (c)?
Because the z-score converts a value from any normal distribution to the standard normal distribution, for which we have a table. Without the standard normal table, we wouldn't be able to find the probabilities.
e.Why is the formula for z used in (c) different from that used in (a) and (b)?
In (a) and (b) we are working with probabilities that relate to individual observations. In problem (c) we are working with the sample mean. They follow a different distribution (i.e., they have a different standard deviation) so the formula is slightly different to calculate their z-score.
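The three probabilities in this problem can be reproduced with `statistics.NormalDist`. This is a sketch; it takes the cut-offs of 4.5 and 5.12 inches from the answers above, and the only difference between the individual-orange and sample-mean cases is the standard deviation passed in.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 5.26, 0.50

single = NormalDist(mu, sigma)                    # one orange
sample_mean = NormalDist(mu, sigma / sqrt(100))   # mean of 100 oranges

p_a = single.cdf(4.5)              # P(x < 4.5)
p_b = 1 - single.cdf(5.12)         # P(x > 5.12)
p_c = 1 - sample_mean.cdf(5.12)    # P(x-bar > 5.12)

print(round(p_a, 3), round(p_b, 3), round(p_c, 3))   # 0.064 0.61 0.997
```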

10.Assume that the population of heights of male college students is approximately normally distributed with mean m of 68 inches and standard deviation s of 3.75 inches. A random sample of 16 heights is obtained.
x is normally distributed with a mean of 68 inches and a standard deviation of 3.75 inches.
b.Find the proportion of male college students whose height is greater than 70 inches.
(70-68)/3.75 = 0.533
P(z>0.533)=0.297
So, the proportion of male college students whose height is greater than 70 inches is 0.297
c.Describe the distribution of x-bar, the mean of samples of size 16.
The sample average is normally distributed with mean of 68 inches and standard deviation of 3.75/sqrt(16) = 0.9375
d.Find the mean and standard error of the distribution.
The sample average is normally distributed with mean of 68 inches and standard deviation of 3.75/sqrt(16) = 0.9375
e.Find P (x-bar > 70) = 0.016 (see work below):
z=(70-68)/0.9375 = 2.133
P(z>2.133) = 0.016
f.Find P (x-bar < 67) = 0.143 (see work below):
z=(67-68)/0.9375 = -1.067
P(z<-1.067) = 0.143
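The tail probabilities for parts b, e, and f can be checked as follows. This is a sketch; the distribution objects are named for illustration only.

```python
from math import sqrt
from statistics import NormalDist

heights = NormalDist(mu=68, sigma=3.75)                  # individual heights
mean_of_16 = NormalDist(mu=68, sigma=3.75 / sqrt(16))    # sample means, n = 16

p_b = 1 - heights.cdf(70)       # P(x > 70)
p_e = 1 - mean_of_16.cdf(70)    # P(x-bar > 70)
p_f = mean_of_16.cdf(67)        # P(x-bar < 67)

print(mean_of_16.stdev)                               # 0.9375
print(round(p_b, 3), round(p_e, 3), round(p_f, 3))    # 0.297 0.016 0.143
```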

Part I T/F & Multiple Choice
1.False

2.False.

3.True

4.False

5.True

6.False

7.C. 0.714

8.D.The result of one trial does not affect the probability of success on any other trial

9.B.P(z < 0)

10.C.n = 100, p = 0.05

Part II. Short Answers & Computational Questions
1. Find the following probabilities:
a. Events A and B are mutually exclusive events defined on a common sample space. If P (A) = 0.4 and P(A or B) = 0.9, find P(B).

Mutually exclusive means that P(A and B) = 0.

So P(A or B) = P(A) + P(B) - P(A and B) --> P(B) = 0.9 - 0.4 = 0.5

b. Events A and B are defined on a common sample space. If P(A) = 0.20, P(B) = 0.40, and P(A or B) = 0.56, find P(A and B)

By the same rule

P(A or B) = P(A) + P(B) - P(A and B)

0.56 = 0.2 + 0.4 - P(A and B) --> P(A and B) = 0.04

2.Classify the following as discrete or continuous random variables
a. continuous

b. discrete

c. discrete

d. continuous

3.A small bag of Skittles candies has the following assortment: red (10), blue (2), orange (5), brown (21), green (0), and yellow (18). Construct the probability distribution for x.

What is x? To what does it refer? The count of reds drawn? Blues? Greens?

4.Find the mean and standard deviation of the following probability distribution:
x 1 2 3
P(x) 0.3 0.5 0.2
Mean = 1*0.3 + 2*0.5 + 3*0.2 = 0.3 + 1 + 0.6 = 1.9

Standard deviation = sqrt of (0.3*(1-1.9)^2+0.5*(2-1.9)^2 + 0.2*(3-1.9)^2) = 0.7
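The same mean and standard deviation can be computed directly from the table. This is a minimal sketch with illustrative variable names.

```python
from math import sqrt

xs = [1, 2, 3]
ps = [0.3, 0.5, 0.2]

# Mean is the probability-weighted sum of the values.
mean = sum(x * p for x, p in zip(xs, ps))
# Variance is the probability-weighted sum of squared deviations from the mean.
variance = sum(p * (x - mean) ** 2 for x, p in zip(xs, ps))
std_dev = sqrt(variance)

print(round(mean, 2), round(variance, 2), round(std_dev, 2))   # 1.9 0.49 0.7
```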

5. In testing a new drug, researchers found that 5% of all patients using it will have a mild side effect. A random sample of 11 patients using the drug is selected. Find the probability that:

a) exactly two will have this mild side effect

11!/(9!*2!) * 0.05^2 * 0.95^9 = 0.0867

b) at least one will have this mild side effect.

1-(0.95^11) = 0.4312
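Both answers follow from the binomial formula with n = 11 and p = 0.05, which can be sketched as below. The helper `binom_pmf` is a hypothetical name, not a library function.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 11, 0.05
p_exactly_two = binom_pmf(2, n, p)
p_at_least_one = 1 - binom_pmf(0, n, p)   # complement of "no side effects"

print(round(p_exactly_two, 4), round(p_at_least_one, 4))   # 0.0867 0.4312
```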

6. A large shipment of TV sets is accepted upon delivery if an inspection of ten randomly selected TV sets yields no more than one defective TV.

a) Find the probability that this shipment is accepted if 5% of the total shipment is defective.

(10 * 0.05 * 0.95^9) + 0.95^10 = 0.914

b) Find the probability that this shipment is not accepted if 10% of this shipment is defective

1 - [(10 * 0.1 * 0.90^9) + 0.90^10] = 1 - 0.736 = 0.264
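The acceptance rule (at most one defective set among ten) can be sketched as a function of the defect rate. The name `p_accept` is invented for illustration. The probability that the shipment is not accepted is the complement of the acceptance probability.

```python
from math import comb

def p_accept(defect_rate, sample_size=10, max_defective=1):
    """Probability of seeing at most max_defective defective sets in the sample."""
    return sum(
        comb(sample_size, k) * defect_rate**k * (1 - defect_rate) ** (sample_size - k)
        for k in range(max_defective + 1)
    )

print(round(p_accept(0.05), 3))       # 0.914  accepted at a 5% defect rate
print(round(p_accept(0.10), 3))       # 0.736  accepted at a 10% defect rate
print(round(1 - p_accept(0.10), 3))   # 0.264  not accepted at a 10% defect rate
```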

7. X has a normal distribution with a mean of 75.0 and a standard deviation of 2.5. Find the following probabilities:

a) P(x < 70.0) = 0.022750132

b) P(72.5 < x < 80.0) = 0.818594614

c) P(x >82.5) = 0.001349898
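These three values can be reproduced with `statistics.NormalDist`, shown here as a sketch and rounded to four decimal places.

```python
from statistics import NormalDist

x = NormalDist(mu=75.0, sigma=2.5)

p_a = x.cdf(70.0)                 # P(x < 70.0)
p_b = x.cdf(80.0) - x.cdf(72.5)   # P(72.5 < x < 80.0)
p_c = 1 - x.cdf(82.5)             # P(x > 82.5)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))   # 0.0228 0.8186 0.0013
```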

8. Find the value of z such that 40% of the distribution lies between it and the mean.

We need the value a such that P(0<z<a) = 0.4. Since half of the distribution lies below the mean, this is the same as

P(z<a) = 0.5 + 0.4 = 0.9

Using the table, we find that P(z<1.28) = 0.9

Since the distribution is symmetric, the negative version of this number works as well.

So z = 1.28 (or z = -1.28): 40% of the distribution lies between it and the mean.
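Table lookups of this kind can be checked with the inverse CDF of the standard normal. The sketch below computes the z-value that has 40% of the area between it and the mean, and, for comparison, the 30th percentile, which has only 20% of the area between it and the mean.

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1

z_40 = std_normal.inv_cdf(0.5 + 0.40)   # 40% of the area lies between 0 and z_40
z_30th = std_normal.inv_cdf(0.30)       # 30th percentile

print(round(z_40, 3))                            # 1.282
print(round(z_30th, 3))                          # -0.524
print(round(std_normal.cdf(z_40) - 0.5, 2))      # 0.4
print(round(0.5 - std_normal.cdf(z_30th), 2))    # 0.2
```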

9. Assume that the average annual salary for a worker in the United States is $27500 and that the annual salaries for Americans are normally distributed with a standard deviation equal to $6250. Find the following:

a) What percentage of Americans earn below $18000?

P(x<18000) = 0.0643

b)What percentage of Americans earn above $40000?

P(x>40000) = 0.0228