HYPOTHESIS TESTING
Hypothesis testing concerns the parameters of a population.
A statistical test is based on:
- estimates of the parameters computed from sample data,
- a theoretical statistical distribution, derived from the hypothesis to be tested.
The null hypothesis $H_0$ is an assumption concerning one or more parameters characterising a phenomenon, for example $H_0: \theta = \theta_0$.
The alternative hypothesis $H_1$ is an assumption opposed to the null hypothesis.
K is the subset of the sample space that supports $H_0$,
W (with $K \cap W = \emptyset$) is the subset of the sample space that supports $H_0$ being false.
(The sample space is the range of all possible values of a random variable generated by sampling.)
Given a sample and a sample statistic, the question is:
How likely is the observed value of this statistic if $H_0$ is true?
If the probability is too low, we are inclined to reject $H_0$.
Because an alternative hypothesis exists, errors are possible both when rejecting and when not rejecting $H_0$.
The probability of wrongly rejecting a true $H_0$ is kept small: it is exactly the probability of the rejection region under $H_0$ ($\alpha$, the significance level).
In summary:
- choice of $H_0$
- choice of the test statistic $T = t(X_1, X_2, \ldots, X_n)$, whose probability distribution is known under $H_0$
- looking at the values of T and selecting a region of values that, under $H_0$, fall close to the hypothesised parameter value with high probability
- comparing the value of T observed in the sample, $t_{obs} = t(x_1, x_2, \ldots, x_n)$, and taking a decision according to the region where it falls
- defining a threshold t and, as a consequence, a region for rejecting the hypothesis:
a) one-tail test: $H_0: \theta = \theta_0$
$H_1: \theta > \theta_0$
(the values of T that lead to rejection are far from $\theta_0$ and higher than $\theta_0$, depending on the variability of the distribution):
$P(T \ge t_\alpha \mid H_0) = \alpha$, rejection region: $[t_\alpha, +\infty)$
b) one-tail test: $H_0: \theta = \theta_0$
$H_1: \theta < \theta_0$
(the values of T that lead to rejection are far from $\theta_0$ and lower than $\theta_0$, depending on the variability of the distribution):
$P(T \le -t_\alpha \mid H_0) = \alpha$, rejection region: $(-\infty, -t_\alpha]$
c) two-tail test: $H_0: \theta = \theta_0$
$H_1: \theta \ne \theta_0$
(the values of T that lead to rejection are far from $\theta_0$ and either higher or lower than $\theta_0$, depending on the variability of the distribution):
$P(T \le -t_{\alpha/2} \mid H_0) = P(T \ge t_{\alpha/2} \mid H_0) = \alpha/2$
rejection region: $(-\infty, -t_{\alpha/2}] \cup [t_{\alpha/2}, +\infty)$
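In summary:
Let $t_{obs}$, the value of T in the sample, be known:
a) one-tail test: if $t_{obs} \ge t_\alpha$, reject $H_0$
b) one-tail test: if $t_{obs} \le -t_\alpha$, reject $H_0$
c) two-tail test: if $t_{obs} \le -t_{\alpha/2}$ or $t_{obs} \ge t_{\alpha/2}$ (that is, $|t_{obs}| \ge t_{\alpha/2}$), reject $H_0$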
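The three decision rules above can be sketched as a small helper. This is only an illustration: the function name `reject_h0` and the labels "right", "left" and "two" are hypothetical, and the positive threshold is assumed to have been read from a table beforehand.

```python
def reject_h0(t_obs: float, t_crit: float, tail: str) -> bool:
    """Return True when t_obs falls in the rejection region.

    t_crit is the positive threshold: t_alpha for the one-tail tests,
    t_{alpha/2} for the two-tail test. The labels for `tail` are
    hypothetical names chosen for this sketch:
      "right" -> H1: theta > theta0, region [t_crit, +inf)
      "left"  -> H1: theta < theta0, region (-inf, -t_crit]
      "two"   -> H1: theta != theta0, union of both regions
    """
    if tail == "right":
        return t_obs >= t_crit
    if tail == "left":
        return t_obs <= -t_crit
    if tail == "two":
        return abs(t_obs) >= t_crit
    raise ValueError(f"unknown tail: {tail!r}")


# Illustrative values only (not taken from the notes).
print(reject_h0(2.1, 1.645, "right"))
print(reject_h0(2.1, 1.645, "left"))
print(reject_h0(-2.1, 1.960, "two"))
```

The same observed value can lead to different decisions depending on the alternative hypothesis, which is why the tail has to be fixed before looking at the data.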
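Observed significance level (p-value):
$p_{obs} = P(|T| \ge |t_{obs}| \mid H_0)$
If $p_{obs} \le \alpha$, reject $H_0$ (for a one-tail test, $p_{obs}$ is the probability of the corresponding tail only).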
Type I error
When $H_0$ is true and we reject it.
Type II error
When $H_0$ is not true and we do not reject it.
It is necessary to strike a balance between the two kinds of error.
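To make the balance between the two kinds of error concrete, here is a minimal sketch that assumes a right-tail test on a normal mean with known standard deviation. The function name `type_ii_error` and all numerical values are hypothetical; the point is only that lowering the Type I error probability raises the Type II error probability when everything else is fixed.

```python
from statistics import NormalDist


def type_ii_error(alpha: float, mu0: float, mu1: float, sigma: float, n: int) -> float:
    """P(do not reject H0 | true mean is mu1) for H0: mu = mu0 vs H1: mu > mu0.

    Assumes a normal population with known sigma, so the standardised mean
    is N(0, 1) under H0 and N(shift, 1) when the true mean is mu1.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)      # rejection threshold
    shift = (mu1 - mu0) / (sigma / n ** 0.5)     # mean of Z under mu1
    return std_normal.cdf(z_alpha - shift)       # P(Z < z_alpha | mu1)


# Hypothetical setting: mu0 = 100, true mean 103, sigma = 10, n = 25.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, mu0=100, mu1=103, sigma=10, n=25)
    print(f"alpha = {alpha:.2f}  beta = {beta:.3f}")
```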
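Test on the mean when the variance is known
Test statistic: $\bar{X}$
We need to standardise it (so that probabilities are easy to calculate), taking into account that, under $H_0: \mu = \mu_0$:
$E(\bar{X}) = \mu_0$, $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
$Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \sim N(0,1)$
The observed value $z_{obs}$ is:
$z_{obs} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
$p_{obs} = P(Z \ge z_{obs} \mid H_0)$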
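As a rough illustration of the standardisation, the sketch below computes the observed statistic and the right-tail observed significance level with Python's standard library. The sample mean, hypothesised mean, standard deviation and sample size are made-up values, and `z_statistic` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(x_bar: float, mu0: float, sigma: float, n: int) -> float:
    """Standardise the sample mean under H0: mu = mu0 with known sigma."""
    return (x_bar - mu0) / (sigma / sqrt(n))


# Made-up numbers for illustration only.
z_obs = z_statistic(x_bar=52.0, mu0=50.0, sigma=6.0, n=36)

# Right-tail observed significance level: P(Z >= z_obs | H0).
p_obs = 1 - NormalDist().cdf(z_obs)

print(f"z_obs = {z_obs:.3f}, p_obs = {p_obs:.4f}")
```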
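We reject $H_0$ when:
1) alternative hypothesis $H_1: \mu > \mu_0$ (one tail)
given $\alpha$, look up in the $N(0,1)$ table the value $z^*_\alpha$ corresponding to:
$P(Z \ge z^*_\alpha \mid H_0) = \alpha$
If $z_{obs} \ge z^*_\alpha$: reject $H_0$
If $z_{obs} < z^*_\alpha$: do not reject $H_0$
or
If $p_{obs} \le \alpha$: reject $H_0$
If $p_{obs} > \alpha$: do not reject $H_0$
Note
If $\bar{x} \ge \mu_0 + z^*_\alpha \, \sigma / \sqrt{n}$: reject $H_0$
($\bar{x}$ far from $\mu_0$, that is $\bar{x} - \mu_0$ large, reject $H_0$)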
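2) alternative hypothesis $H_1: \mu < \mu_0$ (one tail)
given $\alpha$, look up in the $N(0,1)$ table the value $z^*_\alpha$ corresponding to:
$P(Z \ge z^*_\alpha \mid H_0) = \alpha$
Owing to the symmetry of $N(0,1)$:
$P(Z \ge z^*_\alpha \mid H_0) = P(Z \le -z^*_\alpha \mid H_0) = \alpha$
If $z_{obs} \le -z^*_\alpha$: reject $H_0$
If $z_{obs} > -z^*_\alpha$: do not reject $H_0$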
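3) alternative hypothesis $H_1: \mu \ne \mu_0$ (two tails)
Given $\alpha$, to be split evenly between the two tails,
look up in the $N(0,1)$ table the value $z^*_{\alpha/2}$ such that:
$P(Z \ge z^*_{\alpha/2} \mid H_0) = \alpha/2$, corresponding to $P(Z \le -z^*_{\alpha/2} \mid H_0) = \alpha/2$
If $z_{obs} \ge z^*_{\alpha/2}$ or $z_{obs} \le -z^*_{\alpha/2}$: reject $H_0$
If $-z^*_{\alpha/2} < z_{obs} < z^*_{\alpha/2}$: do not reject $H_0$
The observed significance level $p_{obs}$ is:
$p_{obs} = P(|Z| \ge |z_{obs}| \mid H_0) = 2 P(Z \ge |z_{obs}| \mid H_0)$
Then
If $p_{obs} \le \alpha$: reject $H_0$
If $p_{obs} > \alpha$: do not reject $H_0$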
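The three alternatives can be collected in one sketch. It assumes a known standard deviation and uses `statistics.NormalDist` from the Python standard library in place of the printed table; the function name `z_test`, the labels "greater", "less" and "two-sided", and the example numbers are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test(x_bar, mu0, sigma, n, alpha=0.05, alternative="two-sided"):
    """Return (z_obs, p_obs, reject) for a test on the mean, sigma known.

    alternative: "greater"   -> H1: mu > mu0  (right tail)
                 "less"      -> H1: mu < mu0  (left tail)
                 "two-sided" -> H1: mu != mu0 (both tails)
    """
    std_normal = NormalDist()
    z_obs = (x_bar - mu0) / (sigma / sqrt(n))
    if alternative == "greater":
        p_obs = 1 - std_normal.cdf(z_obs)
    elif alternative == "less":
        p_obs = std_normal.cdf(z_obs)
    elif alternative == "two-sided":
        p_obs = 2 * (1 - std_normal.cdf(abs(z_obs)))
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z_obs, p_obs, p_obs <= alpha


# Made-up numbers for illustration only.
for alt in ("greater", "less", "two-sided"):
    z_obs, p_obs, reject = z_test(52.0, 50.0, 6.0, 36, alpha=0.05, alternative=alt)
    print(f"{alt:>9}: z_obs = {z_obs:.2f}, p_obs = {p_obs:.4f}, reject = {reject}")
```

Deciding through the observed significance level gives the same answer as comparing the observed statistic with the tabulated threshold, since both describe the same rejection region.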
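Note
The rule "if $z_{obs} \ge z^*_{\alpha/2}$ or $z_{obs} \le -z^*_{\alpha/2}$, reject $H_0$"
corresponds to the statement:
If $\mu_0 - z^*_{\alpha/2} \, \sigma / \sqrt{n} < \bar{x} < \mu_0 + z^*_{\alpha/2} \, \sigma / \sqrt{n}$: do not reject $H_0$
(a formal statement equivalent to the informal one: "when $\bar{x}$ is far from $\mu_0$, that is $\bar{x} - \mu_0$ is far from 0, reject $H_0$")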
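The equivalence in the note can be checked numerically: the two-tail rule on the standardised statistic corresponds to an interval around the hypothesised mean on the original scale. The sketch below is an illustration with made-up values; `non_rejection_interval` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def non_rejection_interval(mu0, sigma, n, alpha=0.05):
    """Interval of sample means for which a two-tail test does not reject H0."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # z*_{alpha/2}
    half_width = z_crit * sigma / sqrt(n)
    return mu0 - half_width, mu0 + half_width


# Made-up numbers for illustration only.
low, high = non_rejection_interval(mu0=50.0, sigma=6.0, n=36, alpha=0.05)
x_bar = 52.0
print(f"do not reject H0 when {low:.2f} < x_bar < {high:.2f}")
print("reject H0" if not (low < x_bar < high) else "do not reject H0")
```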
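Testing the difference between two means: variances are known
Let $X_1, \ldots, X_{n_1}$ be a sample selected from a population with mean $\mu_1$ and variance $\sigma_1^2$.
Let $Y_1, \ldots, Y_{n_2}$ be a sample selected from a population with mean $\mu_2$ and variance $\sigma_2^2$.
We want to test:
$H_0: \mu_1 = \mu_2$ vs $H_1: \mu_1 \ne \mu_2$
Knowing $\sigma_1^2$ and $\sigma_2^2$ [1], under $H_0$ (for normal populations or large samples): $Z = \frac{\bar{X} - \bar{Y}}{\sqrt{\sigma_1^2 / n_1 + \sigma_2^2 / n_2}} \sim N(0,1)$
Then:
If $z_{obs} \ge z^*_{\alpha/2}$ or $z_{obs} \le -z^*_{\alpha/2}$,
that is, if $|z_{obs}| \ge z^*_{\alpha/2}$,
then $H_0$ is rejected.
If $-z^*_{\alpha/2} < z_{obs} < z^*_{\alpha/2}$,
then $H_0$ is not rejected.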
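A minimal sketch of the two-sample comparison with known variances follows. It assumes a two-tail alternative and a standard normal test statistic under the null hypothesis; the function name `two_sample_z` and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(x_bar1, x_bar2, var1, var2, n1, n2, alpha=0.05):
    """Two-tail z test for H0: mu1 = mu2 when both variances are known.

    Returns (z_obs, z_crit, reject).
    """
    standard_error = sqrt(var1 / n1 + var2 / n2)
    z_obs = (x_bar1 - x_bar2) / standard_error
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z_obs, z_crit, abs(z_obs) >= z_crit


# Made-up numbers for illustration only.
z_obs, z_crit, reject = two_sample_z(10.5, 10.0, var1=1.0, var2=1.0, n1=50, n2=50)
print(f"z_obs = {z_obs:.3f}, threshold = {z_crit:.3f}, reject H0: {reject}")
```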
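Testing the difference between two means: variances are unknown
If $\sigma_1^2$ and $\sigma_2^2$ are not known but it is possible to assume $\sigma_1^2 = \sigma_2^2$, then a common variance can be estimated from the sample variances $s_1^2$ and $s_2^2$ using: $s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$
The non-rejection region is built considering the two sample means $\bar{X}$, $\bar{Y}$ and
$T = \frac{\bar{X} - \bar{Y}}{s_p \sqrt{1/n_1 + 1/n_2}} \sim t_{n_1 + n_2 - 2}$, [2]
If $t_{obs} \ge t^*_{\alpha/2}$ or $t_{obs} \le -t^*_{\alpha/2}$,
that is, if $|t_{obs}| \ge t^*_{\alpha/2}$,
then $H_0$ is rejected.
If $-t^*_{\alpha/2} < t_{obs} < t^*_{\alpha/2}$,
then $H_0$ is not rejected.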
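The pooled-variance calculation can be sketched as follows, assuming equal population variances and a two-tail alternative. The Python standard library has no Student t distribution, so the threshold is passed in as a number read from a t table; the function name `pooled_t` and the two small samples are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Return (t_obs, degrees_of_freedom) using the pooled variance estimate."""
    n1, n2 = len(sample1), len(sample2)
    s1_sq = variance(sample1)   # sample variance, divisor n1 - 1
    s2_sq = variance(sample2)   # sample variance, divisor n2 - 1
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    t_obs = (mean(sample1) - mean(sample2)) / sqrt(sp_sq * (1 / n1 + 1 / n2))
    return t_obs, n1 + n2 - 2


# Invented data for illustration only.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.2, 4.8, 4.5, 5.0, 4.4]

t_obs, dof = pooled_t(group_a, group_b)

# Threshold taken from a t table: 8 degrees of freedom, alpha = 0.05, two tails.
t_crit = 2.306
reject = abs(t_obs) >= t_crit
print(f"t_obs = {t_obs:.3f}, dof = {dof}, reject H0: {reject}")
```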
Exercise 6
From a census survey we know that 70% of households shop in big stores.
Three years later we take a survey with a sample of 600 households and find that 406 shop in big stores. Do we have enough evidence to say that households behave as they did in the census year? (Choose a high confidence level.)
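Solution
- The sample is large, so a normal distribution can be assumed for the sample proportion.
- The null hypothesis is $H_0: p = 0.70$; the alternative hypothesis is $H_1: p \ne 0.70$.
- From the sample data we have $\hat{p} = 406/600 \approx 0.677$.
- Under the null hypothesis the sample proportion has a normal distribution with mean $p_0 = 0.70$ and variance $p_0 (1 - p_0)/n = 0.70 \cdot 0.30 / 600 = 0.00035$.
- We can refer to the test statistic $Z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0)/n}}$; in the sample $z_{obs} \approx -1.25$.
- With $\alpha = 0.01$, the rejection (unlikely) region is $|z| \ge 2.576$.
- $z_{obs} \approx -1.25$ lies in the likely region ($-1.25 > -2.576$), so the null hypothesis is not rejected.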
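The exercise can be checked with a few lines of Python. This sketch assumes a two-tail test at a 1% significance level (consistent with the threshold 2.576) and uses the unrounded sample proportion 406/600, so the statistic may differ slightly from a hand calculation that rounds the proportion first.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.70          # proportion from the census
n = 600            # sample size
successes = 406    # households shopping in big stores

p_hat = successes / n
z_obs = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

alpha = 0.01       # assumed significance level (two tails)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject = abs(z_obs) >= z_crit

print(f"p_hat = {p_hat:.4f}, z_obs = {z_obs:.3f}, threshold = {z_crit:.3f}")
print("reject H0" if reject else "do not reject H0")
```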