Dr. Nafez M. Barakat

Lecture no.1

Descriptive measures

·  Measure of center

o  Mean

Definition: the mean of a data set is the sum of the observations divided by the number of observations. It is symbolized by $\bar{x}$:

$$\bar{x} = \frac{\sum x_i}{n}$$

Example 1 (exam scores): let x1 = 88, x2 = 75, x3 = 95, and x4 = 100

Compute the mean

Solution: $\bar{x} = (88 + 75 + 95 + 100)/4 = 358/4 = 89.5$
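The same calculation can be checked with a few lines of Python. This is only a minimal sketch; the variable names are chosen for illustration and are not part of the lecture.

```python
# Exam scores from Example 1
scores = [88, 75, 95, 100]

# Mean = sum of the observations divided by the number of observations
mean = sum(scores) / len(scores)
print(mean)
```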

o  Median

Definition: median of a data set

Arrange the data in increasing order.

- If the number of observations is odd, then the median is the observation exactly in the middle of the ordered list.

- If the number of observations is even, then the median is the mean of the two middle observations in the ordered list.

Example 2: weekly salaries

Find the median for each of the two weekly salary data sets

Table (1.1)

Data set I (n = 13, odd)
300 / 300 / 300 / 940 / 300
300 / 400 / 300 / 400 / 450
800 / 450 / 1050

Data set II (n = 10, even)
300 / 300 / 940 / 450 / 400
400 / 300 / 300 / 1050 / 300

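Solution: in the ordered list for data set (I) the middle (7th) observation is 400, so the median is $400. For data set (II) the two middle (5th and 6th) observations are 300 and 400, so the median is (300 + 400)/2 = $350.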
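The odd/even rule in the definition can be sketched in Python as follows. The two lists are the values of Table 1.1 as read from the handout, and the helper name `median` is an illustrative choice, not something defined in the lecture.

```python
import statistics

# Weekly salaries from Table 1.1 (as read from the handout)
data_set_1 = [300, 300, 300, 940, 300, 300, 400, 300, 400, 450, 800, 450, 1050]
data_set_2 = [300, 300, 940, 450, 400, 400, 300, 300, 1050, 300]

def median(values):
    """Median by the rule in the definition: sort, then take the middle."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the observation exactly in the middle
        return ordered[mid]
    # Even n: the mean of the two middle observations
    return (ordered[mid - 1] + ordered[mid]) / 2

median_1 = median(data_set_1)
median_2 = median(data_set_2)
print(median_1, median_2)

# The standard library gives the same answers
print(statistics.median(data_set_1), statistics.median(data_set_2))
```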

o  Mode

Definition: the mode is the value that occurs most frequently in a data set

Example: find the mode of data set (I) in Table 1.1

Solution: the frequency distribution of the data is shown in Table 1.2. The salary 300 has the highest frequency (6), so the mode is $300.

Table 1.2

Salary / 300 / 400 / 450 / 800 / 940 / 1050
Frequency / 6 / 2 / 2 / 1 / 1 / 1
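A frequency table like Table 1.2 can be built with `collections.Counter` from the Python standard library. This is a small sketch; the list is Data set I as read from Table 1.1.

```python
from collections import Counter

# Data set I from Table 1.1 (as read from the handout)
data_set_1 = [300, 300, 300, 940, 300, 300, 400, 300, 400, 450, 800, 450, 1050]

# Frequency distribution: salary -> number of occurrences
freq = Counter(data_set_1)
for salary in sorted(freq):
    print(salary, freq[salary])

# The mode is the value with the highest frequency
mode, count = freq.most_common(1)[0]
print("mode:", mode, "frequency:", count)
```

If several values tie for the highest frequency, `most_common(1)` returns only one of them, so a data set with more than one mode needs an extra check.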

NOTE:

Right-skewed: median < mean
Symmetric: median = mean
Left-skewed: mean < median
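As an illustration of this note, the comparison of mean and median can be applied to Data set I of Table 1.1. The function name `skew_direction` is hypothetical, and the comparison is only a rule of thumb, not a formal test of skewness.

```python
import statistics

# Data set I from Table 1.1 (as read from the handout)
data_set_1 = [300, 300, 300, 940, 300, 300, 400, 300, 400, 450, 800, 450, 1050]

def skew_direction(values):
    """Rule of thumb only: compare the mean with the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right-skewed"
    if mean < median:
        return "left-skewed"
    return "symmetric"

direction_1 = skew_direction(data_set_1)
print(statistics.mean(data_set_1), statistics.median(data_set_1), direction_1)
```

For Data set I the mean (about 483.85) is larger than the median (400), which is consistent with a few large salaries pulling the mean to the right.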

·  Measure of variation (measure of spread)

o  Range

Definition: the range of the data set is the difference between its maximum and minimum observations : Range = Max - Min
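The lecture gives no worked example for the range, so the sketch below simply reuses the exam scores from Example 1 as an illustration.

```python
# Exam scores from Example 1
scores = [88, 75, 95, 100]

# Range = Max - Min
data_range = max(scores) - min(scores)
print(data_range)
```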

o  Standard deviation

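Definition: the standard deviation is the square root of the average of the squared deviations from the arithmetic mean. For a sample it is denoted by s, and the sum of squared deviations is divided by n - 1:

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$

or, equivalently (computing formula),

$$s = \sqrt{\frac{\sum x_i^2 - (\sum x_i)^2 / n}{n - 1}}$$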

o  Variance

Definition: the variance equals the square of the standard deviation

Example: the heights, in inches, of the five players on team II are 67, 72, 76, 76, and 84. Obtain the standard deviation of these heights

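Solution: the mean is $\bar{x} = 375/5 = 75$ inches.

x / $x - \bar{x}$ / $(x - \bar{x})^2$
67 / -8 / 64
72 / -3 / 9
76 / 1 / 1
76 / 1 / 1
84 / 9 / 81
Sum / / 156

$s = \sqrt{156/(5 - 1)} = \sqrt{39} \approx 6.2$ inches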

Try to use the computing formula to compute the standard deviation
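The sketch below works through the heights example twice, once from the squared deviations and once from the sums of x and x squared. It assumes the sample form of the standard deviation (division by n - 1), which is what Python's `statistics.stdev` computes; dividing by n instead gives the population form (`statistics.pstdev`).

```python
import math
import statistics

# Heights (inches) of the five players on team II
heights = [67, 72, 76, 76, 84]
n = len(heights)
mean = sum(heights) / n

# Defining formula: sum of squared deviations from the mean
sum_sq_dev = sum((x - mean) ** 2 for x in heights)
s_defining = math.sqrt(sum_sq_dev / (n - 1))

# Computing formula: uses only the sum of x and the sum of x squared
sum_x = sum(heights)
sum_x2 = sum(x * x for x in heights)
s_computing = math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

# Variance is the square of the standard deviation
variance = s_defining ** 2

print(mean, sum_sq_dev, round(s_defining, 3), round(s_computing, 3))
print(round(statistics.stdev(heights), 3))
```

Both formulas give the same value, about 6.2 inches, and the sum of squared deviations matches the total of 156 in the table.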

o  Inter quartile range

Definition: the interquartile range (IQR) is the difference between the third and first quartiles, that is,

IQR = Q3 – Q1

Example: find the IQR for these data

25 / 41 / 27 / 32 / 43 / 66 / 35 / 31 / 15 / 5
34 / 26 / 32 / 38 / 16 / 30 / 38 / 30 / 20 / 21

Solution: ordering the data and taking the medians of the lower and upper halves gives Q1 = 23 and Q3 = 36.5

IQR = 36.5 - 23 = 13.5
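The quartiles in this solution can be reproduced by splitting the ordered data into a lower and an upper half and taking the median of each. This is a sketch under that assumption; the helper names are illustrative, and for an odd number of observations this version leaves the middle value out of both halves, which is only one of several conventions.

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Q1 and Q3 as the medians of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[len(ordered) - half:]
    return median(lower), median(upper)

data = [25, 41, 27, 32, 43, 66, 35, 31, 15, 5,
        34, 26, 32, 38, 16, 30, 38, 30, 20, 21]

q1, q3 = quartiles(data)
iqr = q3 - q1
print(q1, q3, iqr)
```

Library quantile functions often use other interpolation rules and may return slightly different quartiles for the same data.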

Hypothesis Test for One Population Mean

One sample t test for population Mean

Definition: the one-sample t test compares the mean of a sample to a known value. Usually, the known value is a hypothesized population mean.

Definition: null hypothesis and alternative hypothesis

Null hypothesis: a hypothesis to be tested. We use the symbol H0 to represent the null hypothesis.

Alternative hypothesis: a hypothesis to be considered as an alternative to the null hypothesis. We use the symbol Ha to represent the alternative hypothesis.

Hypotheses:
Null: There is no significant difference between the sample mean and the population mean.
Alternative: There is a significant difference between the sample mean and the population mean.

We present two step-by-step procedures for performing a one-sample t test. Procedure (I) covers the critical-value approach, and Procedure (II) covers the P-value approach.

·  One sample t test for population Mean

(critical-value approach)

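Assumptions

1.  Normal population or large sample

2.  $\sigma$ unknown

Step 1: the null hypothesis is $H_0: \mu = \mu_0$ and the alternative hypothesis is

$H_a: \mu \ne \mu_0$ (two-tailed) or $H_a: \mu < \mu_0$ (left-tailed) or $H_a: \mu > \mu_0$ (right-tailed)

Step 2: decide on the significance level, $\alpha$

Step 3: compute the value of the test statistic

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

Step 4: the critical value(s) are

$\pm t_{\alpha/2}$ (two-tailed)

or

$-t_{\alpha}$ (left-tailed)

or

$t_{\alpha}$ (right-tailed)

with degrees of freedom (df = n - 1)

Step 5: if the value of the test statistic falls in the rejection region, reject H0; otherwise, fail to reject H0

Step 6: interpret the results of the hypothesis test.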

Example: the table below shows the pH levels for 15 lakes. Test whether the mean pH of the lakes is greater than 6 at the 5% significance level (use the critical-value approach).

7.2 / 7.3 / 6.1 / 6.9 / 6.6 / 7.3 / 6.3 / 5.5
6.3 / 6.5 / 5.7 / 6.9 / 6.7 / 7.9 / 5.8

Solution :

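Step 1: state the null and alternative hypotheses

$H_0: \mu = 6$ (mean pH level is not greater than 6)

$H_a: \mu > 6$ (mean pH level is greater than 6)

Step 2: decide on the significance level, $\alpha = 0.05$

Step 3: compute the value of the test statistic. From the data, n = 15, $\bar{x} = 6.6$ and $s \approx 0.672$, so

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{6.6 - 6}{0.672 / \sqrt{15}} \approx 3.458$$

Step 4: the critical value for a right-tailed test is $t_{0.05} = 1.761$ (from the table) with df = 15 - 1 = 14

Step 5: the value of the test statistic found in step 3, t = 3.458, falls in the rejection region (t > 1.761). Consequently, we reject H0

Step 6: at the 5% significance level, the data provide sufficient evidence to conclude that the mean pH level of the lakes is greater than 6.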
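The critical-value procedure for the lake data can be sketched with the Python standard library. The critical value 1.761 (right tail, 5% level, 14 degrees of freedom) is hard-coded from a t table, as in the lecture, rather than computed; the variable names are illustrative.

```python
import math
import statistics

# pH levels of the 15 lakes
ph = [7.2, 7.3, 6.1, 6.9, 6.6, 7.3, 6.3, 5.5,
      6.3, 6.5, 5.7, 6.9, 6.7, 7.9, 5.8]

mu_0 = 6.0      # hypothesized mean under H0
t_crit = 1.761  # t_0.05 with df = 14, taken from a t table

n = len(ph)
x_bar = statistics.mean(ph)
s = statistics.stdev(ph)  # sample standard deviation (divides by n - 1)

# Test statistic t = (x_bar - mu_0) / (s / sqrt(n))
t = (x_bar - mu_0) / (s / math.sqrt(n))

# Right-tailed test: reject H0 when t exceeds the critical value
reject_h0 = t > t_crit
print(round(x_bar, 2), round(s, 3), round(t, 3), reject_h0)
```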

·  One sample t test for population Mean

(P-Value Approach)

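Assumptions

1.  Normal population or large sample

2.  $\sigma$ unknown

Step 1: the null hypothesis is $H_0: \mu = \mu_0$ and the alternative hypothesis is $H_a: \mu \ne \mu_0$ (two-tailed) or $H_a: \mu < \mu_0$ (left-tailed) or $H_a: \mu > \mu_0$ (right-tailed)

Step 2: decide on the significance level, $\alpha$

Step 3: compute the value of the test statistic

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

Step 4: find the P-value by using the t table

with degrees of freedom (df = n - 1)

Step 5: if the P-value is less than or equal to the significance level ($P \le \alpha$), reject H0; otherwise, fail to reject H0

Step 6: interpret the results of the hypothesis test.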

Example: using the pH levels for the 15 lakes given in the previous example, test whether the mean pH of the lakes is greater than 6 at the 5% significance level (use the P-value approach).

Solution :

Step 1: state the null and alternative hypotheses

$H_0: \mu = 6$ (mean pH level is not greater than 6)

$H_a: \mu > 6$ (mean pH level is greater than 6)

Step 2: decide on the significance level, $\alpha = 0.05$

Step 3: compute the value of the test statistic

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{6.6 - 6}{0.672 / \sqrt{15}} \approx 3.458$$

Step 4: the P-value = P(t ≥ 3.458) = 0.00192 (with df = 15 - 1 = 14)

Step 5: the P-value is less than 0.05, so we reject H0
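A printed t table only brackets the P-value, so the sketch below approximates it numerically instead. It integrates the t density with Simpson's rule using only the standard library; the function names are hypothetical, and in practice a statistics package would normally be used for this step.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_p(t, df, steps=2000):
    """Approximate P(T >= t) for t >= 0 with Simpson's rule on [0, t].

    The density is symmetric about 0, so the area to the right of 0 is 0.5
    and the upper tail is 0.5 minus the area between 0 and t.
    steps must be even.
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        weight = 4 if i % 2 == 1 else 2
        total += weight * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t

alpha = 0.05
p_value = upper_tail_p(3.458, 14)  # right-tailed test, df = 15 - 1
reject_h0 = p_value <= alpha
print(round(p_value, 5), reject_h0)
```

For a two-tailed test the same function can be used by doubling the upper-tail area of the absolute value of t.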

Interval Estimation

Interval Estimation of a Population Mean: $\sigma$ Unknown

·  Interval Estimate

$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$$

where $1 - \alpha$ = the confidence coefficient

$t_{\alpha/2}$ = the t value providing an area of $\alpha/2$ in the upper tail of a t distribution with n - 1 degrees of freedom

s = the sample standard deviation

n = sample size

Example:

Suppose that a sample of employee salaries gives the following information: n = 10, mean = $550, standard deviation = $60. We want to construct a 95% confidence interval for the population mean, assuming the population is normally distributed.

Solution:

At 95% confidence, $1 - \alpha = 0.95$, $\alpha = 0.05$, and $\alpha/2 = 0.025$.

$t_{0.025}$ is based on n - 1 = 10 - 1 = 9 degrees of freedom.

In the t distribution table we see that $t_{0.025} = 2.262$.

Interval estimate of the population mean:

$$\bar{x} \pm t_{0.025} \frac{s}{\sqrt{n}} = 550 \pm 2.262 \cdot \frac{60}{\sqrt{10}} = 550 \pm 42.92$$

or $507.08 to $592.92

We are 95% confident that the mean salary of the population is between $507.08 and $592.92.
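The interval in this example can be recomputed in a few lines. This is a minimal sketch in which the t value 2.262 is taken from the table, as in the solution, rather than computed.

```python
import math

n = 10          # sample size
x_bar = 550.0   # sample mean salary ($)
s = 60.0        # sample standard deviation ($)
t_crit = 2.262  # t_0.025 with n - 1 = 9 degrees of freedom, from a t table

# Margin of error: t_{alpha/2} * s / sqrt(n)
margin = t_crit * s / math.sqrt(n)

lower = x_bar - margin
upper = x_bar + margin
print(round(margin, 2), round(lower, 2), round(upper, 2))
```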

Using the SPSS program

Example 1: use the SPSS program to perform the hypothesis test from the previous example

Step 1: enter the data as shown below

Step 3 : the result shown below

Example 2: use the SPSS file called training to test whether the mean training time equals 60 days; also find a 95% confidence interval for the population mean

Solution:

Step 1: state the null and alternative hypotheses

$H_0: \mu = 60$ (mean training time equals 60 days)

$H_a: \mu \ne 60$ (mean training time does not equal 60 days)

Step 2: decide on the significance level, $\alpha = 0.05$

Step 3: compute the value of the test statistic,

from the output t = -3.482

Step 4: the P-value = 2 P(t ≥ 3.482) = 0.004 (with df = 15 - 1 = 14)

Step 5: the value of the test statistic found in step 3, t = -3.482, falls in the rejection region (t < -2.14 or t > 2.14). Consequently, we reject H0

or the P-value = 0.004 < 0.05, so we reject H0

SPSS output :

95% confidence interval for the mean population

SPSS OUTPUT

95% Confidence Interval for Mean
Lower Bound / 50.09
Upper Bound / 57.65

95% C.I. = [50.09, 57.65]

Note that the test value 60 is not included in the C.I., so we reject the null hypothesis

NONPARAMETRIC TEST

Use Sign Test (Binomial Test)
