Spirometry Reference Values: a systematic review of published articles from 1998 to 2008

LEITE, Ana; PEIXOTO, Cláudia; MOURA, Diana; MARTINS, Diana; FERNANDES, Luís; ALMEIDA, Maria; BRITO, Nuno; ALMEIDA, Pedro; DIOGO, Pedro; MONTEIRO, Sara; PIMENTA, Sofia. E-mail:

Adviser: Tiago Jacinto, MD; Class 1

ABSTRACT

Background: Reference values are valuable diagnostic tools: comparing a patient's measured value with the reference range can indicate whether the individual is healthy. In spirometry, FVC and FEV1 are commonly used parameters, each with its own reference range. Several authors indicate that the spirometric reference values in use are inadequate or outdated.

Aims: To extract relevant data and evaluate the methodological quality of the studies about spirometric reference values published between 1998 and 2008 using a modified version of the STROBE checklist that we conceived.

Data sources: In this systematic review, articles published from 1998 to 2008 that had "Spirometry" and "Reference Values" as keywords were searched in Pubmed, ISI Web of Knowledge and SCOPUS.

Study selection: The 830 non-duplicate articles were screened by title and then by abstract. Eleven reviewers independently applied the eligibility criteria to different groups of articles, so that each article was reviewed twice by different reviewers. The title phase excluded 768 articles and the abstract phase excluded 45, both based on previously defined criteria. Of the remaining 17 articles, 9 were obtained and analyzed.

Data extraction and quality assessment: Relevant information about the design of the 9 studies was extracted. Each article was scored by one reviewer using an adapted version of the STROBE checklist and then scored again by a different reviewer. In case of disagreement on any item, a third reviewer decided.

Results: On average, the participants’ ages ranged from 26 to 71 years. The median number of participants per study was 627, with medians of 326 males and 327 females. Europe was the continent that produced the most studies. The average overall score was 73%, with overall scores ranging from 60% to 93%.

Conclusion: Few studies on this subject are conducted per year, and none of the articles retrieved combined a wide participant age range with a high methodological score. Larger sample sizes are also required. However, the samples are balanced in terms of gender, and all but one article follow the ATS or ERS criteria, which shows an effort to standardize spirometry testing. We suggest that similar studies be conducted in Portugal.

Keywords: Systematic review, Spirometry, Reference Values, Forced Expiratory Volume, Forced Vital Capacity

INTRODUCTION

Reference values are statistical standards obtained in large population studies that are used to determine a normal range of a clinical test’s results (Seaborg. 2007). A test result is considered normal if it falls within the range predicted for the age, sex, height, weight, and ethnic group of the patient. Therefore, reference values should be applied in populations closely related to the population of the study from which the normal ranges were obtained. It is crucial to have adequate reference values to reduce misdiagnoses (Seaborg. 2007).

Spirometry is a test that assesses pulmonary function using a spirometer, which measures the volume and flow of forcibly exhaled air over time (Lange. 2009; Mason. 2005; Ruppel. 2007). The Forced Vital Capacity or FVC (total amount of air exhaled) and the Forced Expiratory Volume in the first second or FEV1 are the two most commonly used values. Spirometric parameters vary with individual characteristics (Kerstjens. 1997), so ideally the expected values should be tailored to each person. Since there are no previous personal values at the time of diagnosis, the spirometric data are compared to values estimated from population studies: the reference values. Several authors have pointed out that healthcare professionals are using reference values that are inappropriate for their target populations, which compromises diagnosis (Baur. 1999; Garcia-Rio. 2004; Ip. 2006; Kuster. 2008).

Our aim is to evaluate the methodological quality of the studies on spirometric reference values published between 1998 and 2008, using a case-adapted STROBE checklist (von Elm. 2008). To this end, we performed a systematic review (Liberati. 2009) and quality assessment of these studies.

METHODS

Data sources

We searched Pubmed, Scopus and ISI Web of Knowledge for articles published from 1998 to 2008, using queries with the MeSH terms “Spirometry” and “Reference values”. Each database has its own search engine, so three different queries were built. Table 1 shows the queries used.

Table 1: Article search

Online databases / Query
Pubmed / ("1998"[Publication Date] : "2008"[Publication Date]) AND ("Spirometry"[tiab] AND ("Reference values"[mh] OR "Reference equations"))
ISI Web of Knowledge /
Scopus / (TITLE-ABS-KEY(spirometry) AND TITLE-ABS-KEY(“reference values” OR “reference equations” OR “normal values” OR “normative values”)) AND (LIMIT-TO(PUBYEAR, 2008) OR LIMIT-TO(PUBYEAR, 2007) OR LIMIT-TO(PUBYEAR, 2006) OR LIMIT-TO(PUBYEAR, 2005) OR LIMIT-TO(PUBYEAR, 2004) OR LIMIT-TO(PUBYEAR, 2003) OR LIMIT-TO(PUBYEAR, 2002) OR LIMIT-TO(PUBYEAR, 2001) OR LIMIT-TO(PUBYEAR, 2000) OR LIMIT-TO(PUBYEAR, 1999) OR LIMIT-TO(PUBYEAR, 1998))

Article selection (Appendix I)

The articles identified by our search were imported into the EndNote® X3 software to remove duplicates. In the exclusion phase, each of the 11 reviewers was given an equal group of article titles and checked their eligibility against the inclusion/exclusion criteria presented in Table 2. The groups of articles were then randomly reassigned among the reviewers and the procedure was repeated. The same process was afterwards applied to the articles’ abstracts. Disagreements were resolved by a different reviewer.

Data extraction and analysis

We entered into the PASW® Statistics 18 software the data that identified each article: the author, year of publication, title, number of participants and their gender, the minimum and maximum age of the participants, compliance with the American Thoracic Society or European Respiratory Society criteria (Hsia. 2010), the country where the study was conducted, the equipment used, the equations obtained, and the R2 and Residual Standard Deviation values. The countries were recoded into a variable that grouped them by continent, and the frequency table of this “Continent” variable was obtained. We calculated the total number of participants in each study and the median number of participants per study and per gender. The means of the minimum and maximum ages of the participants, and their respective ranges, were also calculated.

Quality assessment

The included articles were scored using an adapted STROBE checklist (von Elm. 2008). For each article, every checklist item was marked “S” (item achieved), “N” (item not achieved) or “N.A.” (item not applicable). The percentage of “S” items was calculated for each topic and for the entire article, along with the mean percentages across all articles. The articles were then scored a second time by a different reviewer following the same method. Where the two reviewers disagreed on an item, a third person decided.
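The scoring rule can be illustrated with a minimal sketch. The marks below are hypothetical, not taken from any reviewed article, and the sketch assumes that “N.A.” items are left out of the denominator, a detail the text does not state.

```python
from collections import Counter


def percent_achieved(marks):
    """Percentage of "S" marks among applicable items ("S" or "N")."""
    counts = Counter(marks)
    # Assumption: "N.A." items are excluded from the denominator.
    applicable = counts["S"] + counts["N"]
    if applicable == 0:
        return None  # no applicable item, so no score
    return 100.0 * counts["S"] / applicable


# Hypothetical marks for one article, grouped by checklist topic.
article = {
    "Methods": ["S", "S", "N", "N.A.", "S"],
    "Results": ["S", "N", "N", "S"],
    "Funding": ["N"],
}

# Partial score per topic, then the overall score over all items.
topic_scores = {topic: percent_achieved(m) for topic, m in article.items()}
overall = percent_achieved([m for marks in article.values() for m in marks])

print(topic_scores)
print(round(overall, 1))
```

Under this assumption the overall score is computed over all applicable items pooled together, so it is not simply the mean of the topic scores.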

RESULTS

Article selection (Appendix 2)

Of the 1308 articles found, 814 were from ISI Web of Knowledge, 218 from Pubmed and 276 from Scopus. Of this total, 478 were duplicates, which left 830 articles to analyze. The title phase excluded 768 articles, leaving 62. Of those, 4 were written in a language other than English or Portuguese, 10 had inappropriate age groups, 3 had inappropriate study designs, 4 did not calculate any new reference values, 18 were not related to the subject and 1 involved occupational exposure. One abstract was not available, and that article was automatically excluded. Of the final group of 17 articles, 8 full texts were obtained. The authors of the remaining articles were contacted by e-mail, and 2 of them sent the requested articles. One of these was excluded during the data extraction phase because of missing reference equations and the other was included. In the end, 9 articles were analyzed.
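The totals of the selection flow can be checked with a short sketch. It uses only the counts reported in this section; the variable names are ours.

```python
# Counts reported in the article selection results.
found = {"ISI Web of Knowledge": 814, "Pubmed": 218, "Scopus": 276}
duplicates = 478
excluded_by_title = 768
kept_after_abstract = 17
full_texts_retrieved = 8
full_texts_from_authors = 2
excluded_at_extraction = 1

total_found = sum(found.values())
screened = total_found - duplicates
after_title = screened - excluded_by_title
excluded_by_abstract = after_title - kept_after_abstract
analyzed = full_texts_retrieved + full_texts_from_authors - excluded_at_extraction

print(total_found, screened, after_title, excluded_by_abstract, analyzed)
# prints: 1308 830 62 45 9
```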

Agreement among reviewers: During the title phase, the first two reviewers agreed on 764 (92%) articles and disagreed on 66 (8%). During the abstract phase, they agreed on 48 (77%) and disagreed on 14 (23%).
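The agreement figures are simple percent agreement, which can be reproduced from the reported counts as in the sketch below. The function name is ours.

```python
def percent_agreement(agreed, total):
    """Share of articles on which both reviewers made the same decision."""
    return 100.0 * agreed / total


title_phase = percent_agreement(764, 830)    # title screening
abstract_phase = percent_agreement(48, 62)   # abstract screening

print(round(title_phase), round(abstract_phase))
# prints: 92 77
```

Percent agreement does not correct for agreement expected by chance. A chance-corrected measure such as Cohen's kappa would require the full table of include/exclude decisions per reviewer, which is not reported here.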

Data extraction and analysis

Europe was the continent that produced the most articles (44.4%, n=4), while America and Asia produced 2 each (22.2%) and Oceania produced 1 (11.1%). The median number of participants per study was 627, with medians of 326 male and 327 female participants. The mean minimum age of the participants was 26 years (ranging from 16 to 65), while the mean maximum age was 71 years (ranging from 44 to 86).
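The frequency table of the “Continent” variable can be reproduced from the reported counts. This is a sketch in Python rather than the PASW procedure actually used.

```python
# Number of included studies per continent, as reported above.
studies_per_continent = {"Europe": 4, "America": 2, "Asia": 2, "Oceania": 1}

total = sum(studies_per_continent.values())

# Relative frequency of each continent, as a percentage with one decimal.
shares = {c: round(100.0 * n / total, 1) for c, n in studies_per_continent.items()}

print(total)
print(shares)
```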

Quality assessment

Each article was analysed and scored by two different people using an adapted version of the STROBE checklist (von Elm. 2008), and the ratings obtained are displayed in Table 3. The overall scores ranged from 60% (Memon et al. and Falaschetti et al.) to 93% (Smolej et al.), and the average rating was 73%. The partial scores show that the “Funding” and “Results” topics had the lowest scores, with 44% and 58%, respectively. The “Introduction” (89%) and “Discussion” (91%) topics had the highest scores.

Table 3: Partial and total STROBE scores

Article / Title/abstract / Introduction / Methods / Results / Discussion / Funding / OVERALL SCORE
Gutierrez, 2004 / 100% / 100% / 63% / 56% / 100% / 0% / 69%
Roca, 2008 / 50% / 100% / 72% / 50% / 100% / 100% / 67%
Marsh, 2006 / 100% / 100% / 50% / 60% / 100% / 100% / 69%
Memon, 2007 / 100% / 100% / 64% / 30% / 100% / 0% / 60%
Falaschetti, 2004 / 100% / 100% / 73% / 20% / 100% / 100% / 60%
Smolej, 2008 / 100% / 100% / 100% / 80% / 100% / 100% / 93%
Langhammer, 2001 / 50% / 50% / 82% / 78% / 75% / 0% / 75%
Marion, 2001 / 100% / 50% / 82% / 78% / 75% / 0% / 72%
Boskabady, 2002 / 100% / 50% / 70% / 83% / 75% / 0% / 72%
AVERAGE SCORES / 83% / 89% / 73% / 58% / 91% / 44% / 73%

Agreement among reviewers: On average, the reviewers disagreed on about one-third (33%) of the STROBE checklist items per article. Nearly all (97%) of those disagreements occurred in items of the “Methods” or “Discussion” topics.

DISCUSSION

Few studies are conducted each year to obtain spirometric reference equations, and they are mainly conducted in developed countries. The articles analyzed show a balanced proportion of individuals of both genders. The median number of participants per article is 627, which suggests that studies with larger sample sizes are desirable (e.g.: Roca, J. et al). The two best scored articles (Smolej et al, 93% and Marion et al, 88%) derived their equations for age ranges that limit their use in the general population: 65 to 86 and 45 to 74 years old, respectively. The third best article (Langhammer et al., 75%) has a broader age range, from 20 to 80 years old, and can be applied to a larger number of people since it includes younger adults. In terms of age range, therefore, the best scored articles fall short. It would be useful to conduct studies with sizeable samples covering all adult age groups. The lowest STROBE topic scores are those of “Funding” (44%), “Results” (58%) and “Methods” (73%). This is a matter of concern, since “Methods” and “Results” are the most important topics in study planning and execution and may influence the equations obtained. We found that all but one study mention the criteria used to conduct the spirometry tests: they are methodologically standardized according to the ATS or ERS criteria, even though no two articles used the same spirometer.

Since the STROBE checklist scores are difficult to interpret, as there is no cut-off percentage that defines whether a study is good enough, we cannot affirm that these scores reflect poor study designs that negatively influenced the equations obtained in the past decade. However, we could not find a study that was very good in terms of both methodological quality and sample size, such that its equations could be applied extensively in a given population. None of the studies retrieved was conducted in Portugal, and none describes its sample in enough detail to establish similarities with the Portuguese population. It is therefore desirable to conduct an age-comprehensive study in our country.

We highlight the originality of this systematic review applied to the spirometry reference values. Similar and increasingly thorough studies should be conducted in this and other fields of Medicine in which reference values are used. The analysis of the way they are obtained through methods such as the one followed in this case (methodological quality assessment through the STROBE checklist) can become an important step to avoid the misapplication of the reference ranges and subsequent misdiagnosis in everyday Medicine.

ACKNOWLEDGMENTS

We thank our adviser, Tiago Jacinto, M.D., and Altamiro Pereira, M.D., PhD, for their help in preparing this research work. We also thank the staff of the Biostatistics and Medical Informatics Department for the facilities made available.

REFERENCES

Baur, X., S. Isringhausen-Bley, et al. (1999). "Comparison of lung-function reference values." International Archives of Occupational and Environmental Health 72(2): 69-83.

Boskabady, M. H., M. Keshmiri, et al. (2002). "Lung function values in healthy non-smoking urban adults in Iran." Respiration 69(4): 320-326.

Falaschetti, E., J. Laiho, et al. (2004). "Prediction equations for normal and low lung function from the health survey for England." European Respiratory Journal 23(3): 456-463.

Garcia-Rio, F., J. M. Pino, et al. (2004). "Spirometric reference equations for European females and males aged 65-85 yrs." European Respiratory Journal 24(3): 397-405.

Greenhalgh, T. and R. Peacock (2005). "Effectiveness and efficiency of search methods in systematic reviews of complex evidence: audit of primary sources." BMJ 331(7524): 1064-1065.