BIOST/STAT 579 – Autumn 2007 Chapter 1 - 9/26/08

BIOST/STAT 579

Chapter 1: Writing Scientific Papers

The Structure of a Scientific Paper

Abstract

Introduction – include statement of purpose

Methods – include description of study design

Results – primary results only

Conclusions

Introduction

Background & Significance

Previous Research – emphasize gaps in knowledge

Purpose

Methods

Study Design

Study Procedures (how you got the data)

Measures

Statistical Analysis

Results

“Descriptives” (use a specific, more descriptive heading than this)

“Primary Analyses” (see above)

“Secondary Analyses” (see above)

Discussion

Conclusions

Implications

Limitations

Future Research

Acknowledgements - individuals, financial support

References

Appendices - highly technical material (but only include if needed and if cited in body of paper)

Structure of a Scientific Paper – Example 1

Source: Peterson AV Jr, Kealey KA, Mann SL, Marek PM, Sarason IG. Hutchinson Smoking Prevention Project: long-term randomized trial in school-based tobacco use prevention--results on smoking. J Natl Cancer Inst. 2000 Dec 20;92(24):1979-91.

Abstract

Background

Methods

Results

Conclusion

Introduction

Subjects and Methods

Study Population and Sample Size

Randomized Assignment

Intervention

Implementation

Follow-up and Data Collection

Measures

Statistical Methods

Reporting the Design and Results of the Trial

Results

Baseline Comparability

Implementation Compliance

Follow-up/Data Acquisition Rates

Cotinine Validation of Self-Reported Tobacco Use

Results at Grade 12

Results at 2 Years After High School (Plus 2)

Results for a Priori-Hypothesized Subgroup Variables

Discussion

References

Notes

Structure of a Scientific Paper – Example 2

Source: Bauman KE, Foshee VA, Ennett ST, Pemberton M, Hicks KA, King TS, Koch GG. The influence of a family program on adolescent tobacco and alcohol use. Am J Public Health 2001 Apr;91(4):604-10.

Abstract

Objectives

Methods

Results

Conclusions

Introduction

Family Matters

Methods

Design, Sample, and Data Collection

Measures

Analyses

Sample Assessment

Results

Discussion

Conclusions

Contributors

Acknowledgements

References

Notes

Writing the Abstract

The abstract states the major purpose, methods and results.

Outline for an abstract (headings are optional):

Background: brief statement of significance, background, purpose or scientific question

Methods: study design, sample size, essential (eg, nonstandard) features of data collection and/or statistical analysis methods

Results: results for major scientific question(s) only

Discussion: implications, major limitations

Examples (Before class, read these abstracts and decide what is good or bad about the highlighted sections.)

Example 1

Women's Healthy Lifestyle Project: A Randomized Clinical Trial: Results at 54 Months. Kuller LH, Simkin-Silverman LR, Wing RR, Meilahn EN, Ives DG. University of Pittsburgh, Departments of Epidemiology (L.H.K., L.R.S.-S., E.N.M., D.G.I.) and Psychiatry (R.R.W.), Pittsburgh, Pa. Circulation 2001 Jan 2;103(1):32-37.

BACKGROUND:-The Women's Healthy Lifestyle Project Clinical Trial tested the hypothesis that reducing saturated fat and cholesterol consumption and preventing weight gain by decreased caloric and fat intake and increased physical activity would prevent the rise in LDL cholesterol and weight gain in women during perimenopause to postmenopause. Methods and Results-There were 275 premenopausal women randomized into the assessment only group and 260 women into the intervention group. The mean age of participants at baseline was 47 years, and 92% of the women were white. The mean LDL cholesterol was 115 mg/dL at baseline, and mean body mass index was 25 kg/m(2). The follow-up through 54 months was excellent. By 54 months, 35% of the women had become postmenopausal. At the 54-month examination, there was a 3.5-mg/dL increase in LDL cholesterol in the intervention group and an 8.9-mg/dL increase in the assessment-only group (P:=0.009). Weight decreased 0.2 lb in the intervention and increased 5.2 lb in the assessment-only group (P:=0.000). Triglycerides and glucose also increased significantly more in the assessment-only group than in the intervention group. Waist circumference decreased 2.9 cm in the intervention compared with 0.5 cm in the assessment-only group (P:=0.000). CONCLUSIONS:-The trial was successful in reducing the rise in LDL cholesterol during perimenopause to postmenopause but could not completely eliminate the rise in LDL cholesterol. The trial was also successful in preventing the increase in weight from premenopause to perimenopause to postmenopause. The difference in LDL cholesterol between the assessment and intervention groups was most pronounced among postmenopausal women and occurred among hormone users and nonusers.

Example 2

A prospective study of physical activity and risk of prostate cancer in US physicians. Liu S, Lee IM, Linson P, Ajani U, Buring JE, Hennekens CH. Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02215, USA. Int J Epidemiol 2000 Feb;29(1):29-35

BACKGROUND: Exercise can suppress androgen production and may thus decrease the risk of prostate cancer. However, findings from epidemiological studies assessing physical activity and risk of prostate cancer are inconsistent. METHODS: We prospectively examined the association between physical activity and prostate cancer risk in the Physicians' Health Study (PHS), a randomized trial of low-dose aspirin and beta-carotene among 22,071 men aged 40-84 without self-reported myocardial infarction, stroke and cancer. At baseline in 1982, men were asked about the frequency of exercise vigorous enough to work up a sweat. Physical activity was assessed in a similar fashion again at 36 months of follow-up. RESULTS: During 11.1 years of follow-up (258 779 person-years), 982 cases of prostate cancer occurred and were confirmed by medical record review. After adjustment for potential confounding factors (including age, height, randomized treatment assignment, smoking status, alcohol intake, use of multivitamins, history of diabetes, history of hypertension and history of high cholesterol), the relative risks for prostate cancer associated with exercise vigorous enough to work up a sweat were 1.0 (referent) for frequency less than once per week, 1.02 (95% CI: 0.82-1.26) for once per week, 1.07 (95% CI: 0.90-1.27) for 2-4 times per week, and 1.11 (95% CI: 0.90-1.36) for 5+ times per week. Across all subgroups of men categorized by age, body mass index, smoking status, alcohol intake, use of multivitamins, history of diabetes, history of hypertension and history of high cholesterol, there were no significant associations between frequency of exercise vigorous enough to work up a sweat and prostate cancer risk. After excluding cases of prostate cancer that occurred during the first 36 months of follow-up, again, there was no significant association. Combining physical activity assessments at baseline and at 36 months also yielded no significant association with prostate cancer risk.
CONCLUSIONS: These observational data from the Physicians' Health Study do not support the hypothesis that increased physical activity reduces the risk of prostate cancer.

Example 3

Prevention of stroke in urban China: a community-based intervention trial. Fang XH, Kronmal RA, Li SC, Longstreth WT Jr, Cheng XM, Wang WZ, Wu S, Du XL, Siscovick D. Department of Neuroepidemiology, Beijing Neurosurgical Institute, Beijing, People's Republic of China. Stroke 1999 Mar;30(3):495-501.

BACKGROUND AND PURPOSE: Stroke has been the second leading cause of death in large cities in China since the 1980s. Meanwhile, the prevalences of hypertension and smoking have steadily increased over the last 2 decades. Therefore, a community-based intervention trial was initiated in 7 Chinese cities in 1987. The overall goal of the study was to evaluate the effectiveness of an intervention aimed at reducing multiple risk factors for stroke. The primary study objective was to reduce the incidence of stroke by 25% over 3.5 years of intervention. METHODS: In May 1987 in each of 7 the cities, 2 geographically separated communities with a registered population of about 10 000 each were selected as either intervention or control communities. In each community, a cohort containing about 2700 subjects (>/=35 years old) free of stroke was sampled, and a survey was administered to obtain baseline data and screen the eligible subjects for intervention. In each city, a program of treatment for hypertension, heart disease, and diabetes was instituted in the intervention cohort (n approximately 2700) and health education was provided to the full intervention community (n approximately 10 000). A follow-up survey was conducted in 1990. Comparisons of intervention and control cohorts in each city were pooled to yield a single summary. RESULTS: A total of 18 786 subjects were recruited to the intervention cohort and 18 876 to the control cohort from 7 cities. After 3.5 years, 174 new stroke cases had occurred in the intervention cohort and 253 in the control cohort. The 3.5-year cumulative incidence of total stroke was significantly lower in the intervention cohort than the control cohort (0.93% versus 1.34%; RR=0.69; 95% CI, 0.57 to 0.84). The incidence rates of nonfatal and fatal stroke, as well as ischemic and hemorrhagic stroke, were significantly lower in the intervention cohort than the control cohort. 
The prevalence of hypertension increased by 4.3% in the intervention cohort and by 7.8% in the control cohort. The average systolic and diastolic blood pressures increased more in the control cohort than in the intervention cohort. Among hypertensive individuals in the intervention cohort, awareness of hypertension increased by 6.7% and the percentage of hypertensives who regularly took antihypertensive medication increased 13.2%. All of these indices became worse in the control cohort. The prevalence of heart diseases and diabetes increased significantly in the both cohorts (P<0.01). The prevalence of consumption of alcohol increased slightly, and that of smoking remained constant in both cohorts. CONCLUSIONS: A community-based intervention for stroke reduction is feasible and effective in the cities of China. The reduction, due to the intervention, in the incidence of stroke in the intervention cohort was statistically significant after 3.5 years of intervention. The sharp reduction in the incidence of stroke may be due to the interventions having blunted the expected increase in hypertension that accompanies aging as well as to better and earlier treatment of hypertension, particularly borderline hypertension. Applied health education to all the residents of the community may have prevented some normotensive individuals from developing hypertension and improved overall health awareness and knowledge.

Writing the Introduction

  1. Start with the significance of the general problem or subject area, e.g., “smoking kills more than 400,000 people in the U.S. each year.”
  2. Focus the discussion on the specific subject to be addressed in the paper, i.e., don't review the literature in the entire field.
  3. Review the previous research on the scientific question of interest, focusing on (i) what is known, (ii) how well it is known, (iii) limitations and challenges of the research, (iv) theories that guide the research, and (v) the gap in the previous research that this study addresses.
  4. End with a statement of purpose or hypothesis.

Example

Source: Peterson et al., J Natl Cancer Inst. 2000 Dec 20;92(24):1979-91

“Cigarette smoking remains the number one cause of preventable, premature death…

Since the early 1980s, the National Cancer Institute (NCI),… has sponsored an extensive program of research… A major focus of this research has been school-based smoking prevention. …

… no long-term intervention impact has been observed to date. In addition, because of the challenges inherent in the school setting and in the youth populations themselves, school-based trials have suffered various methodological difficulties. …

The HSPP was the first randomized, controlled trial of smoking prevention among youth to … In this article, we present the HSPP trial’s results…”

Writing the Methods

Study Design: type of study design, study population, sample size (with justification).

Procedures: sampling, randomization, data collection, tracking and follow-up.

Measures: definitions of all variables used in the analyses, reliability of measurement, with units (continuous variables) or coding (categorical variables).

Statistical Analysis: description of analysis plan with enough detail to allow a competent statistician to duplicate the analysis (see key aspects below; use of the headings is optional).

Descriptive statistics: types of descriptives used, with computational method for any non-standard statistics.

Coding of categorical variables: were categorical variables dichotomized (and how), treated as ordinal, or as continuous? Give reasons for choices.

Primary analysis: statistical method, model used (if any), covariates controlled (with justification), effect size measure (odds ratio, etc.), method used to compute SEs, CIs and/or hypothesis tests, level of statistical significance used (e.g., .05), adjustment for multiple testing (if relevant), assumptions made.

Secondary analyses: same as for primary analysis, but less detail needed.

Writing the Results

Descriptive statistics: report only those directly relevant to the scientific questions, eg, assessment of potential confounding, missing data, descriptive analysis of treatment effect of interest.

Primary analysis: don’t linger over descriptives or unimportant results—get to the primary results quickly! Use an easily interpretable effect size measure (usually not a regression coefficient). Give a confidence interval for your effect size measure.

Secondary analyses: these don’t need as much detail as the primary analysis results.

Note on model diagnostics: even when diagnostics are needed, you usually don’t need to report them (other than to say you did them) unless they illustrate an unusual feature of the data that affects interpretation of the results.
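
As an illustration of reporting an interpretable effect size with a confidence interval, the Python sketch below computes a crude relative risk and a Wald-type 95% CI on the log scale from the event counts quoted in the stroke abstract (Example 3 above). The function name `relative_risk_ci` is hypothetical, and the calculation ignores the city-by-city pooling described in that abstract, so it should not be read as the authors' method. It is only meant to show where numbers of this kind come from.

```python
import math


def relative_risk_ci(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Crude relative risk with a Wald confidence interval on the log scale.

    Assumes two independent groups and simple binomial sampling
    (no clustering, no stratification).
    """
    risk_trt = events_trt / n_trt
    risk_ctl = events_ctl / n_ctl
    rr = risk_trt / risk_ctl
    # Standard error of log(RR) for two independent proportions
    se_log_rr = math.sqrt(
        1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Counts taken from the stroke abstract: intervention vs control cohort
rr, lower, upper = relative_risk_ci(174, 18786, 253, 18876)
print(f"RR = {rr:.2f} (95% CI, {lower:.2f} to {upper:.2f})")
```

With these counts the crude calculation gives RR = 0.69 with a 95% CI of 0.57 to 0.84, which agrees to two decimals with the values reported in the abstract. A relative risk stated this way is usually easier for readers to interpret than a raw regression coefficient.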

Guidelines for Tables & Figures

  • use to draw attention to most important results or to present complicated information (eg, to compare results in various sub-groups)
  • tables are better for a small set of results, graphs better for more complicated results—but be sure the extra detail is really necessary to report at all
  • refer the reader to the table/figure (if it’s not used, then delete it) and point out
      • what information it contains, and
      • what message it conveys
  • tables and figures are best placed either soon after they are first cited in the text or all together at the end of the manuscript
  • figure captions go at bottom, table captions go at top (just a convention)

Tables & Figures – Examples: the Good and the Bad*

Table 1 Baseline characteristics according to frequency of vigorous exercise, Physicians' Health Study.

                                            Frequency of vigorous exercise (times per week)
Characteristics                             <1            1             2–4           5+
                                            N = 6048      N = 4015      N = 8188      N = 3554
Mean age (years)                            54 ± 9        53 ± 9        53 ± 9        53 ± 9
Mean body mass index (kg/m²)                25.4 ± 3.3    25.2 ± 3.0    24.8 ± 2.9    24.2 ± 2.7
Mean height (in)                            69.7 ± 7.9    70.1 ± 7.5    70.2 ± 7.6    70.2 ± 7.3
Cigarette smoking (%)
  Never                                     47            50            51            51
  Past                                      39            37            40            42
  Current, <20 per day                      4             4             4             3
  Current, 20+ per day                      10            9             5             4
Alcohol consumption (%)
  Rarely                                    17            13            13            17
  Monthly                                   14            12            10            10
  Weekly                                    45            51            52            48
  Daily                                     24            25            25            26
Current use of multivitamin supplement (%)  18            17            21            25
History of diabetes (%)                     4             2             2             2
History of hypertension (%)                 15            14            13            13
History of high cholesterol (%)             8             7             7             6
Randomized to:
  Aspirin (%)                               49            50            51            49
  Beta-carotene (%)                         50            50            50            51

Source: A prospective study of physical activity and risk of prostate cancer. S. Liu et al. International Journal of Epidemiology 2000;29:29-35.

* Before class, decide what is good and what is bad about these figures and tables.

Fig. 4. (a) Effect of NMDA agonist (-cycloserine) on the development of acute tolerance to ethanol. Two groups received either cycloserine (¯¯ 30 mg/kg ip) or saline (----) 30 min prior to E (3 g/kg ip). E-induced motor impairment was assessed at 30, 45, 60, 75, 90, 105 and 120 min. BAC was measured immediately after each assessment. Acute tolerance development to ethanol was expressed as in Fig. 1. Results shown are mean ± S.E.M. (n=10 animals per group). (b) Effect of 5-HT depletion (p-CPA) on the development of acute tolerance to E. Live groups were pretreated with p-CPA (¯¯ 100 mg/kg daily) or water (----) by gavage for 5 days. On the test day, E-induced motor impairment was assessed at 30, 45, 60, 75, 90, 105 and 120 min. BAC measurement was taken immediately after each assessment. Development of acute tolerance to E was expressed as in Fig. 1. Results shown are mean ± S.E.M. (n=7–8 animals per group).

Source: Pharmacology Biochemistry and Behavior Volume 72, Issues 1-2 May 2002 Pages 291-298

Source: Peterson AV Jr, Kealey KA, Mann SL, Marek PM, Sarason IG. Hutchinson Smoking Prevention Project: long-term randomized trial in school-based tobacco use prevention--results on smoking. J Natl Cancer Inst. 2000 Dec 20;92(24):1979-91.

Source: Bauman KE, Foshee VA, Ennett ST, Pemberton M, et al. The influence of a family program on adolescent tobacco and alcohol use. Am J Public Health 2001 Apr;91(4):604-10.

Writing the Discussion

The discussion gives the “big picture” by

- summarizing the key points of the significance, study design, methods, results;

- stating the conclusions and their implications for researchers in the field, and for practitioners (if appropriate);

- stating the confidence in the conclusions based on the study’s critical strengths and weaknesses.

Writing Tips

  1. Be Clear and Concise

* When in doubt – cut

* Get to the point – most important results early then secondary results later.

* Start at the end and work backward - helps keep focus.

* Reduce your fog-factor (average number of 3-syllable words per sentence).

  2. Think of your reader

* What does he/she need to get out of your report?

* Use signposting (headings, subheadings, transitions, outlines, figures, tables) to guide your reader through your work.

* When reading others’ work, pay attention to what works and what doesn’t.

  3. Use strong subjects and verbs

* Use concrete nouns rather than pronouns, eg, “Blood pressure tends to rise with age” instead of “It has been noted that blood pressure …”

* Use active verbs, eg, “Mercury concentrations increased over time” not “There was an increase in mercury concentrations…”

  4. Know how to use punctuation effectively.
  5. Be prepared to revise many times, and expect that writing the report will take at least as long as the analysis.
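
The fog-factor mentioned under the first tip can be estimated mechanically. The Python sketch below is a rough illustration only. It assumes that "3-syllable words" means words of three or more syllables, and that syllables can be approximated by counting groups of vowels. The function names `count_syllables` and `fog_factor` are hypothetical, and the heuristic will miscount some words (for example, it overcounts "increased").

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as vowel groups, dropping a silent final 'e'."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent, except in endings such as "-le" or "-ee"
    if word.endswith("e") and not word.endswith(("le", "ee")) and n > 1:
        n -= 1
    return max(n, 1)


def fog_factor(text: str) -> float:
    """Average number of words with 3+ syllables per sentence."""
    # Naive sentence split on ., ! and ? (decimals such as 3.5 will be split)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    words = re.findall(r"[A-Za-z]+", text)
    long_words = [w for w in words if count_syllables(w) >= 3]
    return len(long_words) / len(sentences)


draft = "Mercury concentrations rose. The trend was clear."
print(fog_factor(draft))
```

A score like this is best used to compare successive drafts of the same paragraph rather than as an absolute target, since the syllable count is only approximate.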

Useful References:

  1. Ehrenberg ASC: Writing technical papers or reports. American Statistician 36:326-9, 1982.
  2. Ehrenberg ASC: The problem of numeracy. American Statistician 35:67-71, 1981.
  3. Gopen GD: The science of scientific writing. American Scientist 78:550-8, 1990.
  4. Strunk W, White EB: The Elements of Style. New York, NY: Macmillan, 1972.

OTHER USEFUL SOURCES: Instructions to authors in scientific journals