EBM Z master: Eric Aubel
Dr. Warner
Degrees of Freedom: Justin Reid
2/27/01 1:00
We interrupt your regularly scheduled hour of Case Control Studies to bring you…
Homework Hour
Yep, that’s right folks, Dr. Warner spent most of the hour showing us how to work the homework from last week. So, if you had problems with that homework, or can’t get enough of those “crazy” Z tables, then this scribe is for you.
Exercise 5.1
In one group of 62 patients with iron deficiency anemia the hemoglobin level was 12.2 g/dl, standard deviation 1.8 g/dl. In another group of 35 patients it was 10.9 g/dl, standard deviation 2.1 g/dl.
What is the null hypothesis here? It would be that there is no difference. When testing the null hypothesis, we want to look at one population, which is a population of differences, to see if the mean of that population of differences is zero.
What is the standard error of the difference between the two means?
-He said most people didn’t have a problem with this, but I will work it all out for completeness.
SE diff. = √(SD1²/n1 + SD2²/n2) = √(1.8²/62 + 2.1²/35) = .42 g/dl
What is the difference (of the means)?
12.2 – 10.9 = 1.3 g/dl
How do you go from here to a Z or t value? (which is the spot on the normal distribution curve so you can make a decision) – you use the wacky Z formula.
Z value = difference in means / SE diff.
Z = 1.3 / .42 = 3.095 (he says 3.08, which is what you get if you don’t round the SE first: 1.3 / .4222 = 3.08)
Why did we use Z, instead of t? Because we are dealing with relatively large samples. For our purposes:
samples of 31 and larger = Z
samples of 0 – 30 = t
What is the significance of the difference?
-Since Z = 3.095, we take that value and find its associated probability using the Z table (on the web site). We find that our Z value falls in between the values of 3.0 and 3.291, with their associated probabilities of .0027 and .0010 respectively. Therefore we write the probability as
.0010 < p < .0027
-This means that the probability of getting a difference this big by chance alone (if the null hypothesis is true) is somewhere between .0010 and .0027.
Give an approximate 95% confidence interval for the difference.
.48 to 2.12 g/dl
1.3 + (1.96)(.42) = 2.12
1.3 – (1.96)(.42) = .48
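If you'd rather let a computer do the arithmetic, here is a minimal Python sketch of Exercise 5.1. It was not part of the lecture; the function name `two_sample_z` is made up for illustration, and the p value comes from the normal curve via `math.erfc` instead of the Z table.

```python
import math

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample Z test for the difference between two means."""
    # Standard error of the difference between the two means
    se_diff = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    diff = mean1 - mean2
    z = diff / se_diff
    # Two-tailed probability from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    # Approximate 95% confidence interval for the difference
    ci = (diff - 1.96 * se_diff, diff + 1.96 * se_diff)
    return se_diff, diff, z, p, ci

se_diff, diff, z, p, ci = two_sample_z(12.2, 1.8, 62, 10.9, 2.1, 35)
print(f"SE diff = {se_diff:.2f} g/dl, Z = {z:.2f}, p = {p:.4f}")
print(f"95% CI = {ci[0]:.2f} to {ci[1]:.2f} g/dl")
```

Because the code never rounds the SE, its second decimals differ a little from the hand calculation: Z comes out at about 3.08 (Dr. Warner's number) and the interval at roughly .47 to 2.13 g/dl.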
Exercise 7.1
In 22 patients with an unusual liver disease the plasma alkaline phosphatase was found by a certain laboratory to have a mean value of 39 King-Armstrong units, standard deviation 3.4 units.
What is the 95% confidence interval within which the mean of the population of such cases whose specimens come to the same laboratory may be expected to lie?
-There are only 22 patients, so we’re going to use the t table (on the web site). To find our t value we calculate the degrees of freedom (df) as 22 – 1 = 21, and since we want a 95% CI we use p = .05, giving us a t value of 2.08.
SE = 3.4/√22 = .725
95% CI = 39 ± (2.08)(.725)
= 37.5 to 40.5
What is this a confidence interval of? The real mean value of all similar patient specimens coming to that laboratory. (Basically, we don’t know the true mean of the entire population, but we are 95% confident, assuming the data was collected correctly, that the true mean lies in our range.) It is NOT a 95% reference range of all values.
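The same calculation for Exercise 7.1 can be sketched in a few lines of Python. This is only an illustration: `t_confidence_interval` is a made-up helper, and the t value of 2.08 is typed in from the t table (df = 21, p = .05) because the Python standard library has no t distribution.

```python
import math

def t_confidence_interval(mean, sd, n, t_value):
    """CI for a single mean; t_value is looked up in the t table for df = n - 1."""
    se = sd / math.sqrt(n)   # standard error of the mean
    margin = t_value * se
    return se, (mean - margin, mean + margin)

# 2.08 is the t table entry for df = 22 - 1 = 21 at p = .05
se, (low, high) = t_confidence_interval(mean=39, sd=3.4, n=22, t_value=2.08)
print(f"SE = {se:.3f}, 95% CI = {low:.1f} to {high:.1f}")
```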
Quote of the day: “Garbage in, garbage out.” Basically meaning that we should appreciate that we could plug in all sorts of incorrect numbers into these formulas and get answers. (Not necessarily precise or accurate answers though.)
On the exam – He expects us to know the difference between the Z and t tables. 30 is the magic number.
31 and larger = Z table
30 or less = t table
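For anyone who likes their rules of thumb spelled out, the magic number can be written as a one-line check. This is just a sketch of the class rule; `which_table` is a made-up name.

```python
def which_table(n):
    """Class rule of thumb: samples of 31 or more use Z, 30 or fewer use t."""
    return "Z" if n >= 31 else "t"

for n in (62, 35, 22, 10):
    print(n, "->", which_table(n), "table")
```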
Exercise 7.4
A new treatment for varicose ulcer is compared with a standard treatment on ten matched pairs of patients, where treatment between pairs is decided using random numbers. The outcome is the number of days from start of treatment to healing of ulcer. One doctor is responsible for treatment and a second doctor assesses healing without knowing which treatment each patient had. The following treatment times were recorded.
Standard treatment: 35, 104, 27, 53, 72, 64, 97, 121, 86, 41 days;
New treatment: 27, 52, 46, 33, 37, 82, 51, 92, 68, 62 days.
A huge clue in how to start this is the fact that the data is in matched pairs. Meaning that we are going to be looking at one population of the differences. What would be the null hypothesis? It would be that there is no difference in the treatments, therefore the difference should be zero. Our alternate hypothesis would be that there is a difference in treatment, and zero would not be included in our CI.
What is the mean difference in the healing time?
-The differences within each pair (standard – new) add up as follows: 8+52+(-19)+20+35+(-18)+46+29+18+(-21) = 150
150 / 10 = 15 days
-This would be a two-tailed test. We are looking for a difference one way or another.
SD = 26.98
SE = 26.98 / √10 = 8.53
The value of t?
t = mean / SE = 15 / 8.53 = 1.758
-Remember, we used t because the sample size was 30 or less.
The number of degrees of freedom?
df = 10 – 1 = 9
The probability that the difference occurred by chance?
-Looking at the t table with df = 9, our t value of 1.758 puts us between 1.833 and .703, with probabilities of .1 and .5 respectively. Therefore, .1 < p < .5
What is the 95% confidence interval for the difference?
-95% CI = 15 ± (2.262)(8.53) = -4.3 to 34.3 days
Was there a difference? NO, because zero is included in the confidence interval. Therefore, we cannot reject the null hypothesis, because at the 95% confidence level we cannot be confident that the true difference is not zero.
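The whole of Exercise 7.4 can be checked with a short Python sketch. It is only an illustration, not something shown in class: the variable names are made up, and the 2.262 is typed in from the t table (df = 9, p = .05) rather than computed.

```python
import math
import statistics

standard = [35, 104, 27, 53, 72, 64, 97, 121, 86, 41]
new = [27, 52, 46, 33, 37, 82, 51, 92, 68, 62]

# Matched pairs: work with the single population of differences
diffs = [s - n for s, n in zip(standard, new)]

mean_diff = statistics.mean(diffs)   # mean difference in healing time
sd = statistics.stdev(diffs)         # sample SD (divides by n - 1)
se = sd / math.sqrt(len(diffs))      # standard error of the mean difference
t = mean_diff / se
df = len(diffs) - 1

T_TABLE_VALUE = 2.262                # t table, df = 9, p = .05 (two-tailed)
ci = (mean_diff - T_TABLE_VALUE * se, mean_diff + T_TABLE_VALUE * se)

# The null hypothesis survives if zero sits inside the interval
reject_null = not (ci[0] <= 0 <= ci[1])

print(f"mean = {mean_diff}, SE = {se:.2f}, t = {t:.3f}, df = {df}")
print(f"95% CI = {ci[0]:.1f} to {ci[1]:.1f} days, reject null: {reject_null}")
```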
In this instance, we were interested in absolute differences because it was a two-tailed test. There was no indication that one treatment was better than the other, so we’re looking for differences on either side of the normal distribution curve. There are instances where a one-tailed test would be used. For example, when testing differences in antibody levels of people with infection, you would think that the people with infection would have higher antibody levels than your controls. Therefore, you would use a one-tailed test, because you wouldn’t want to waste the power of your test to see if cases had fewer antibodies than controls.
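To see what "wasting power" means in numbers, here is a small Python sketch comparing one-tailed and two-tailed probabilities for the same test statistic. It assumes a large-sample Z value, not the t from Exercise 7.4, and the Z of 1.645 is a made-up example; `z_p_values` is a hypothetical helper.

```python
import math

def z_p_values(z):
    """One- and two-tailed probabilities for a Z value from the normal curve."""
    two_tailed = math.erfc(abs(z) / math.sqrt(2))
    # One-tailed: only the tail in the predicted direction counts
    one_tailed = two_tailed / 2
    return one_tailed, two_tailed

one, two = z_p_values(1.645)   # hypothetical Z value
print(f"one-tailed p = {one:.3f}, two-tailed p = {two:.3f}")
```

The same Z of 1.645 lands at about p = .05 one-tailed but about p = .10 two-tailed, so the one-tailed test reaches the .05 cutoff with a smaller difference. The catch is that the direction has to be chosen before looking at the data.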
We now return you to your regularly scheduled lecture, already in progress…
Hierarchy of epidemiological types of studies
Correlation – looks at population numbers of two factors and sees if they correlate
Case Series – (ex. Article about SIDS babies)
In both of these studies, you can’t test a hypothesis. Correlation studies have no information on individuals, and case series studies have no comparison group.
Observational – Case Control studies and Cohort studies
Experimental – you as a clinician have some say as to who is exposed. (Randomized Clinical Trials)
Case control, cohort, and experimental studies allow testing of hypotheses.
This lecture will focus on Case Control studies.
Outbreak investigations are almost exclusively case control studies.
In case control, you begin with a certain amount of disease (or cases affected).
There must be a good case definition, because your case definition is a test.
-It can be sensitive and/or specific. Ex.) If you were investigating a food borne illness and your case definition was just diarrhea, the test would be so sensitive that you would pick up a lot of background noise (people with diarrhea from other causes). You have to find a happy medium.
-The case definition should include some constellation of symptoms and a time element.
-If you include a risk factor in your definition, you are building in an association. (not good)
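Since the case definition acts like a test, its sensitivity and specificity can be worked out the usual way. The Python sketch below uses completely made-up counts (none of these numbers come from the lecture) to show how a broad definition trades specificity for sensitivity; `sensitivity_specificity` is a hypothetical helper.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Sensitivity and specificity of a case definition treated as a test."""
    sensitivity = true_pos / (true_pos + false_neg)   # truly ill people it catches
    specificity = true_neg / (true_neg + false_pos)   # truly well people it excludes
    return sensitivity, specificity

# Hypothetical counts: 50 people truly ill from the outbreak, 100 not
broad = sensitivity_specificity(true_pos=48, false_neg=2, true_neg=60, false_pos=40)
narrow = sensitivity_specificity(true_pos=35, false_neg=15, true_neg=95, false_pos=5)

print(f"broad definition (any diarrhea): sens = {broad[0]:.2f}, spec = {broad[1]:.2f}")
print(f"narrow definition (symptoms + time): sens = {narrow[0]:.2f}, spec = {narrow[1]:.2f}")
```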
The next step is to find comparable controls.
-3 typical places for controls:
- Hospital based – classic
- Occupational based
- Population based – *this is the best one*
-You typically want 1 to 3 times as many controls as cases for a solid study.