HYPOTHESIS TESTING

Hypothesis testing is concerned with the parameters of a distribution.

The statistical test is based on:

  • parameters estimated from sample data,
  • a theoretical statistical distribution, determined by the hypothesis to be tested

The null hypothesis (H0) is an assumption concerning one or more parameters characterising a phenomenon.

The alternative hypothesis (H1) is an assumption opposed to the null hypothesis.

K is the subset of the sample space that supports H0.

W (with K ∩ W = ∅) is the subset of the sample space that supports H0 not being true.

(The sample space is the range of all possible values of a random variable (r.v.) generated by sampling.)

Given a sample and a statistic computed from it, the question is:

How likely is this sample statistic if H0 is true?

If the probability is too low, we are inclined to reject H0.

Since an alternative hypothesis exists, it is possible to make an error both when rejecting and when not rejecting H0.

The probability of wrongly rejecting H0 is kept small: it is exactly the probability of the rejection area under H0 (α, the significance level).

In summary:

  • choice of H0
  • choice of the test statistic T = t(X1, X2, …, Xn), whose probability distribution is known under H0
  • selection of a region of T values that, under H0, can be considered close to the parameter of interest with high probability
  • comparison of the T observed in the sample (tobs = t(x1, x2, …, xn)) with that region, and a decision based on where it falls
  • definition of a threshold t and, as a consequence, of a region for rejecting the hypothesis:

a) one-tail test: H0: θ = θ0

H1: θ > θ0

(the T values leading to rejection are those far from θ0 and higher than θ0, how far depending on the variability of the distribution):

P(T ≥ t_α | H0) = α    rejection region: [t_α, +∞)

b) one-tail test: H0: θ = θ0

H1: θ < θ0

(the T values leading to rejection are those far from θ0 and lower than θ0, how far depending on the variability of the distribution):

P(T ≤ −t_α | H0) = α    rejection region: (−∞, −t_α]

c) two-tail test: H0: θ = θ0

H1: θ ≠ θ0

(the T values leading to rejection are those far from θ0, either higher or lower than θ0, how far depending on the variability of the distribution):

P(T ≤ −t_{α/2} | H0) = P(T ≥ t_{α/2} | H0) = α/2

rejection region: (−∞, −t_{α/2}] ∪ [t_{α/2}, +∞)

In summary:

Once the value of T in the sample (tobs) is known:

a) one-tail test: if tobs ≥ t_α, reject H0

b) one-tail test: if tobs ≤ −t_α, reject H0

c) two-tail test: if tobs ≤ −t_{α/2} or tobs ≥ t_{α/2} (that is, |tobs| ≥ t_{α/2}), reject H0
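The three decision rules a), b) and c) can be sketched in a few lines of Python. This is only an illustration under one assumption not made in the notes: that T follows a standard normal distribution under H0, so that the thresholds can be taken from `statistics.NormalDist`. The function names `rejection_region` and `reject` are made up for this example.

```python
from statistics import NormalDist


def rejection_region(alpha, tail):
    """Return the rejection region as a list of (low, high) intervals.

    Assumption for this sketch: T ~ N(0, 1) under H0.
    """
    std = NormalDist()  # standard normal: mean 0, standard deviation 1
    inf = float("inf")
    if tail == "right":  # case a), H1: theta > theta0
        return [(std.inv_cdf(1 - alpha), inf)]
    if tail == "left":  # case b), H1: theta < theta0
        return [(-inf, std.inv_cdf(alpha))]
    if tail == "two":  # case c), H1: theta != theta0
        t = std.inv_cdf(1 - alpha / 2)
        return [(-inf, -t), (t, inf)]
    raise ValueError("tail must be 'right', 'left' or 'two'")


def reject(t_obs, alpha, tail):
    """True when the observed statistic falls inside the rejection region."""
    return any(low <= t_obs <= high
               for low, high in rejection_region(alpha, tail))
```

With α = 0.05, an observed value of 1.7 would be rejected by the right-tail rule but not by the two-tail rule, because the two-tail threshold (about 1.96) is further out than the one-tail threshold (about 1.645).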

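Observed significance level (p-value)

pobs = P(|T| ≥ |tobs| | H0)    (two-tail test)

pobs = P(T ≥ tobs | H0) or P(T ≤ tobs | H0)    (one-tail test, using the tail indicated by H1)

In both cases: if pobs ≤ α, reject H0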
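As a rough illustration of the p-value rule, the sketch below computes pobs for an observed statistic, again assuming (this is not stated in the notes) that T is standard normal under H0. The function name `p_value` and its `tail` argument are hypothetical.

```python
from statistics import NormalDist


def p_value(t_obs, tail="two"):
    """Observed significance level, assuming T ~ N(0, 1) under H0."""
    std = NormalDist()
    if tail == "two":  # P(|T| >= |t_obs| | H0)
        return 2 * (1 - std.cdf(abs(t_obs)))
    if tail == "right":  # P(T >= t_obs | H0)
        return 1 - std.cdf(t_obs)
    if tail == "left":  # P(T <= t_obs | H0)
        return std.cdf(t_obs)
    raise ValueError("tail must be 'two', 'right' or 'left'")


# Decision: reject H0 when the p-value does not exceed alpha.
alpha = 0.05
decision = "reject H0" if p_value(2.3) <= alpha else "do not reject H0"
```

Comparing pobs with α gives the same decision as comparing tobs with the threshold; the p-value simply reports how extreme the observed statistic is on a probability scale.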

Type I error

When H0 is true and we reject it.

Type II error

When H0 is not true and we do not reject it.

It is necessary to strike a balance between the two kinds of error.
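The trade-off between the two errors can be made concrete with a small calculation. The sketch below assumes a right-tail test on the mean of a normal population with known standard deviation, and a specific true mean `mu1` under the alternative; both the setting and the numbers used in the example are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu0, mu1, sigma, n, alpha):
    """Probability of not rejecting H0: mu = mu0 when the true mean is mu1.

    Assumes a right-tail z test (H1: mu > mu0) with known sigma.
    """
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha)  # threshold fixed by the type I error
    shift = (mu1 - mu0) / (sigma / sqrt(n))  # mean of Z when mu = mu1
    return std.cdf(z_crit - shift)  # P(Z < z_crit | mu = mu1)


# Hypothetical numbers: lowering alpha raises the type II error probability.
beta_05 = type_ii_error(mu0=0.0, mu1=0.5, sigma=1.0, n=25, alpha=0.05)
beta_01 = type_ii_error(mu0=0.0, mu1=0.5, sigma=1.0, n=25, alpha=0.01)
```

For a fixed sample size, making the type I error probability smaller moves the threshold outwards and therefore increases the probability of a type II error, which is why a balance has to be chosen.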

Test on the Mean When the Variance Is Known

Test statistic:

We need to standardise the sample mean X̄ (so that probabilities are easy to calculate), taking into account that its standard deviation is:

σ_X̄ = σ / √n

Z = (X̄ − μ0) / (σ / √n) ~ N(0, 1)    under H0: μ = μ0

zobs is:

zobs = (x̄ − μ0) / (σ / √n)

pobs = P(Z ≥ zobs | H0)

We reject when:

1) alternative hypothesis H1: μ > μ0 (one tail)

given α, look up in the N(0, 1) table the value z*_α corresponding to:

P(Z ≥ z*_α | μ = μ0) = α

If zobs ≥ z*_α, reject

If zobs < z*_α, do not reject

or

If pobs ≤ α, reject

If pobs > α, do not reject

Note

if x̄ ≥ μ0 + z*_α · σ/√n, reject

(x̄ far from μ0, that is zobs far from 0: reject)

2) alternative hypothesis H1: μ < μ0 (one tail)

given α, look up in the N(0, 1) table the value z*_α corresponding to:

P(Z ≥ z*_α | μ = μ0) = α

Owing to the symmetry of N(0, 1):

P(Z ≥ z*_α | μ = μ0) = P(Z ≤ −z*_α | μ = μ0) = α

If zobs ≤ −z*_α, reject

If zobs > −z*_α, do not reject

3) alternative hypothesis H1: μ ≠ μ0 (two tails)

Given α, to be split equally between the two tails

Look up in the N(0, 1) table the value z*_{α/2} such that:

P(Z ≥ z*_{α/2} | μ = μ0) = α/2, corresponding to P(Z ≤ −z*_{α/2} | μ = μ0) = α/2

If zobs ≥ z*_{α/2} or if zobs ≤ −z*_{α/2}, reject

If −z*_{α/2} < zobs < z*_{α/2}, do not reject

The pobs is:

pobs = P(|Z| ≥ |zobs| | H0) = 2 P(Z ≥ |zobs| | H0)

Then

If pobs ≤ α, reject

If pobs > α, do not reject

Note

"if zobs ≥ z*_{α/2} or if zobs ≤ −z*_{α/2}, reject"

corresponds to the statement:

If μ0 − z*_{α/2} · σ/√n < x̄ < μ0 + z*_{α/2} · σ/√n, do not reject

(a formal statement equivalent to the informal one: "when x̄ is far from μ0, that is zobs is far from 0, reject")
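The three cases of the test on the mean with known variance can be gathered in one small function. This is a minimal sketch: the function name `z_test_mean`, the argument names and the labels "greater", "less" and "two-sided" are choices made for this example, and the numbers in the usage line are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(x_bar, mu0, sigma, n, alpha, alternative="two-sided"):
    """z test for H0: mu = mu0 with known standard deviation sigma.

    Returns (z_obs, p_obs, reject).
    """
    std = NormalDist()
    z_obs = (x_bar - mu0) / (sigma / sqrt(n))  # standardised sample mean
    if alternative == "greater":  # case 1), H1: mu > mu0
        p_obs = 1 - std.cdf(z_obs)
        reject = z_obs >= std.inv_cdf(1 - alpha)
    elif alternative == "less":  # case 2), H1: mu < mu0
        p_obs = std.cdf(z_obs)
        reject = z_obs <= -std.inv_cdf(1 - alpha)
    elif alternative == "two-sided":  # case 3), H1: mu != mu0
        p_obs = 2 * (1 - std.cdf(abs(z_obs)))
        reject = abs(z_obs) >= std.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("unknown alternative")
    return z_obs, p_obs, reject


# Hypothetical data: sample mean 52 from n = 100 observations, sigma = 10.
z_obs, p_obs, reject_h0 = z_test_mean(52, 50, 10, 100, alpha=0.05)
```

In the hypothetical example the standard error is 10/√100 = 1, so zobs = 2.0, which exceeds the two-tail threshold of about 1.96 and leads to rejection at α = 0.05.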

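Testing the Difference Between Two Means – Variances Are Known

Let a sample of size n1, with mean x̄1, be selected from a population N(μ1, σ1²).

Let a sample of size n2, with mean x̄2, be selected from a population N(μ2, σ2²).

We want to test:

H0: μ1 = μ2 vs H1: μ1 ≠ μ2

Knowing that, under H0,

Z = (X̄1 − X̄2) / √(σ1²/n1 + σ2²/n2) ~ N(0, 1)    [1]

Then, with zobs the value of [1] computed on the two samples:

If zobs ≤ −z*_{α/2} or

If zobs ≥ z*_{α/2}

Then H0 is rejected

If −z*_{α/2} < zobs < z*_{α/2}

Then H0 is not rejected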
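A possible sketch of the two-sample test with known variances follows. It assumes the usual statistic, the difference of the sample means divided by √(σ1²/n1 + σ2²/n2), and a two-tail alternative. The function name `z_test_two_means` and the numbers in the example are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_means(x_bar1, x_bar2, var1, var2, n1, n2, alpha):
    """Two-tail z test for H0: mu1 = mu2 with known variances var1, var2.

    Returns (z_obs, z_crit, reject).
    """
    # Standard deviation of the difference between the two sample means
    se_diff = sqrt(var1 / n1 + var2 / n2)
    z_obs = (x_bar1 - x_bar2) / se_diff
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z_obs, z_crit, abs(z_obs) >= z_crit


# Hypothetical data: means 10 and 9, variances 4 and 5, sizes 100 and 125.
z_obs, z_crit, reject_h0 = z_test_two_means(10, 9, 4, 5, 100, 125, alpha=0.05)
```

Here both variance terms equal 0.04, so the standard deviation of the difference is √0.08 ≈ 0.283 and zobs ≈ 3.54, well inside the rejection region.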

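Testing the Difference Between Two Means – Variances Are Unknown

If σ1² and σ2² are not known but it is possible to assume σ1² = σ2², then a common variance can be estimated using:

s_p² = [(n1 − 1) s1² + (n2 − 1) s2²] / (n1 + n2 − 2)

where s1² and s2² are the sample variances and n1, n2 the sample sizes.

The non-rejection region is built considering

T = (X̄1 − X̄2) / (s_p · √(1/n1 + 1/n2)) ~ t with n1 + n2 − 2 degrees of freedom under H0,    [2]

If tobs ≤ −t_{α/2} or

if tobs ≥ t_{α/2}

then H0 is rejected

If −t_{α/2} < tobs < t_{α/2}

then H0 is not rejected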
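The pooled-variance calculation can be sketched as follows, using only the Python standard library. Because the standard library has no Student t distribution, the critical value is read from a t table and passed in by hand. The function name `pooled_t_statistic` and the two small samples are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(sample1, sample2):
    """Return (t_obs, pooled_variance, degrees_of_freedom).

    Assumes the two populations share a common, unknown variance.
    """
    n1, n2 = len(sample1), len(sample2)
    s1_sq = variance(sample1)  # sample variance, divisor n1 - 1
    s2_sq = variance(sample2)  # sample variance, divisor n2 - 1
    df = n1 + n2 - 2
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    t_obs = (mean(sample1) - mean(sample2)) / sqrt(sp_sq * (1 / n1 + 1 / n2))
    return t_obs, sp_sq, df


# Hypothetical samples; 2.306 is the tabulated t value for 8 degrees of
# freedom and alpha/2 = 0.025.
t_obs, sp_sq, df = pooled_t_statistic([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
reject_h0 = abs(t_obs) >= 2.306
```

In this made-up example both sample variances are 2.5, the pooled variance is 2.5 and tobs = −1, which lies inside the non-rejection region for 8 degrees of freedom.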

Exercise 6

From a census survey we know that 70% of households shop in big stores.

After 3 years we carry out a survey with a sample of 600 households (HHs) and find that 406 shop in big stores. Do we have enough evidence to say that households behave as they did in the census year? (Choose a high confidence level.)

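Solution

- The sample is large, so a normal distribution can be assumed for the sample proportion.

- The null hypothesis is H0: p = 0.70; the alternative hypothesis is H1: p ≠ 0.70.

- From the sample data we have p̂ = 406/600 ≈ 0.677, rounded to 0.68 in the calculation that follows.

- Under the null hypothesis the sample proportion has a normal distribution with mean p0 = 0.70 and variance p0(1 − p0)/n = 0.70 · 0.30 / 600 = 0.00035.

- We can refer to the Z test; in the sample zobs = (0.68 − 0.70) / √0.00035 ≈ −1.07 (about −1.25 if p̂ is not rounded).

- The rejection (unlikely) area, at significance level α = 0.01, is zobs ≤ −2.576 or zobs ≥ 2.576.

- −1.07 lies in the likely region (−1.07 > −2.576), so the null hypothesis is not rejected.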
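The exercise can be checked numerically with the short script below. It assumes a two-tail test at α = 0.01 (one reading of "high confidence", consistent with the threshold 2.576) and uses the unrounded proportion 406/600, so the statistic comes out near −1.25 rather than −1.07, the value obtained when the proportion is rounded to 0.68. The conclusion is the same in both cases.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.70        # proportion known from the census
n = 600          # sample size
successes = 406  # households in the sample that shop in big stores
alpha = 0.01     # assumption: two-tail test at the 1% level

p_hat = successes / n                      # sample proportion
std_error = sqrt(p0 * (1 - p0) / n)        # standard deviation under H0
z_obs = (p_hat - p0) / std_error           # standardised sample proportion
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z_obs) >= z_crit

print(f"p_hat = {p_hat:.4f}, z_obs = {z_obs:.3f}, z_crit = {z_crit:.3f}")
print("reject H0" if reject_h0 else "do not reject H0")
```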