Green S, Higgins JPT, Alderson P, Clarke M, Mulrow CD, Oxman AD. Chapter 1: Introduction. Cochrane Handbook for Systematic Reviews of Interventions (Higgins JPT, Green S, Eds.). John Wiley & Sons, Chichester, UK: 2008.
- Systematic review = “reviews of a clearly formulated question that use explicit methods to identify, select, and critically appraise relevant research, and to collect and analyze data from the studies that are included in the review”.
- Features
o Clearly stated set of objectives
o Pre-defined eligibility criteria for studies
o Explicit, reproducible methodology
o A systematic search that attempts to identify all eligible studies.
o An assessment of the validity of the findings of the included studies
o A systematic presentation, and synthesis, of the characteristics and findings of the included studies.
- Systematic reviews may or may not synthesize studies with quantitative meta-analyses.
Grimshaw J, McAuley LM, Bero LA, Grilli R, Oxman AD, Ramsay C, Vale L, Zwarenstein M. Systematic reviews of the effectiveness of quality improvement strategies and programmes. Qual Saf Health Care, 12(4): 2003.
- Developing and following a detailed protocol protects against the bias that arises when review decisions are influenced by the data.
- Choose an estimation as opposed to a simple dichotomous hypothesis testing approach.
o Policy makers are interested in “how much”
o It avoids fallacies of p-value interpretation.
- Cluster RCTs are the most robust design.
- Quasi-experimental (not quasi-randomized) designs – ITS and CBA studies – may be necessary for certain interventions (e.g.: mass media campaigns, large scale policy initiatives)
- Reviewers must clearly define the intervention, because there is a lack of generally accepted classification or nomenclature.
- Lumping vs splitting
o Lumping – across settings and professionals, or across clinical outcomes –
§ Intended to identify the common generalizable features within similar reviews
§ Minor differences in study characteristics may not be crucially important, and with enough studies, may be treated as statistical noise.
§ Assess consistency of research findings across a wider range of studies and contexts – reduces the risk of bias or chance results.
§ Differences may be explored explicitly by stratification or by meta-regression.
o Splitting – It is only appropriate to include studies which are very similar in design, population, intervention, and outcome ascertainment.
§ E.g.: Jamtvedt et al. – SR of audit and feedback – only 3 studies relevant to diabetes – vulnerable to badly designed studies or chance findings – much more learnt from the overall SR.
- Identification of evidence – EPOC special register.
- Quality assessment – must include cluster RCTs and quasi-experimental designs.
- Multiple outcomes issues
- Unit of analysis issues
- Meta-analysis
o SRs often include studies exhibiting greater variability or heterogeneity due to differences in intervention operationalization, targeted behaviors, professionals, and study contexts.
o Meta-analysis may result in an artificial result which is potentially misleading, and of limited value.
- A qualitative synthesis may be preferred.
o One method is vote counting
§ Counts the positive studies – ignores effect size, and gives equal weight to studies of varying precision.
o Alternatives – median effect size, etc.
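A minimal sketch contrasting vote counting with the median effect size. The study values are hypothetical and invented for illustration only; `vote_count` and `median_effect` are names introduced here, not taken from the paper.

```python
from statistics import median

# Hypothetical studies: (absolute improvement in percentage points,
# whether the study reported a statistically significant positive result).
studies = [(2.0, True), (15.0, True), (1.0, False), (4.0, True), (-3.0, False)]

def vote_count(studies):
    """Return (number of positive studies, total studies).

    Effect sizes are ignored and every study counts equally,
    whatever its precision.
    """
    positive = sum(1 for _, significant in studies if significant)
    return positive, len(studies)

def median_effect(studies):
    """Return the median effect size across studies."""
    return median(effect for effect, _ in studies)

print(vote_count(studies))     # positive votes and total
print(median_effect(studies))  # typical size of the effect
```

The vote count reports that most studies were positive but says nothing about magnitude, whereas the median keeps the "how much" information that the estimation approach favors.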
- Exploring heterogeneity
o A priori modifiers explored using
§ Tables
§ Bubble plots
§ Whisker plots
o Meta-regression
o Associations may be spurious – over-fitting and confounding are dangers – essentially an observational study of studies.
- Quality issues in SRs (Grimshaw JM, Shirran L, Thomas R, Mowatt G, Fraser C, Bero L, Grilli R, Harvey E, Oxman A, O’Brien MA. Changing provider behavior: an overview of systematic reviews of interventions. Medical Care, 39(8 Suppl 2): 2001).
- Main limitation to QI SRs is the quality of evaluations of QI strategies.
Mullen PD, Ramirez G. The promise and pitfalls of systematic reviews. Annual Review of Public Health, 27: 2006.
- Objective – Provide a balanced commentary on the SR and public health – its achievements, promise, and pitfalls.
- Rationale for SRs
o Growing body of literature
o Lack of rigor and scientific method in the traditional subjective review.
o Recognition that reviews are among the most frequently cited reports – influential.
o Recognition that repeated studies are often performed – yet even when strict replication is attempted, results across studies are rarely identical. Locating and integrating them therefore involves inferences as central to validity as those involved in the primary studies themselves.
- Steps and nature of an SR – should parallel primary studies in methods, and measures taken to maintain transparency and replicability.
- Benefits of SRs
o Improved reporting of primary studies
§ CONSORT guidelines
§ Structured abstracts
§ Movement to confidence interval estimation and reporting of precision (as opposed to reject or fail to reject) in light of meta-analysis
o Improved precision
§ More efficient use of existing studies
§ Forego additional, unnecessary studies (i.e.: cumulative meta-analyses) (e.g.: IV streptokinase)
o Increased validity of estimation – sources of heterogeneity become random noise.
§ Only 50% of the variance in ES in a secondary analysis of over 100 meta-analyses was due to the actual treatment effect – study method and sampling error contributed 21% and 26%.
§ A single study will not typically provide a trustworthy indication of the effectiveness of a particular treatment.
o Better understanding of reporting and publication biases.
§ Development of controlled trials registries.
o Advice for the performance of primary studies, e.g.: importance of blinding.
§ Determined from meta-regression and stratified studies, and systematic measurement of study design features.
o Determination of evidence gaps
§ E.g.: Community Guide reviews often find lack of economic findings, or identify important populations missed by extant studies.
§ E.g.: Studies of adverse effects.
o Meaningful exploration of heterogeneity to generate new hypotheses – or confirm hypotheses regarding modifiers of intervention effectiveness.
§ Meta-regression analyses
§ SRs help explain why results might be “mixed” – systematic collection of data on study, intervention, setting, and participant characteristics.
- Pitfalls of SRs
o SRs vary in quality – there are points where subjective judgment must be exercised – SRs should be interpreted critically as well as primary studies.
§ QUOROM guidelines.
o SR evidence may be a-contextual, difficult to interpret from an implementation perspective.
§ Impedes uptake – researchers and policy makers operate in different contexts, motivated and constrained by different imperatives.
§ Narrow notion of “evidence” does not capture all that is of interest.
§ Questions selected may not be relevant – especially regarding the processes and applicability of complex interventions.
§ Meta-analyses may obscure important contextual differences between studies – a qualitative interrogation of the data may sometimes be more appropriate (e.g.: Norris’s review of case and disease management).
§ Lipsey – pointed out that initiatives such as the Community Guide assume that programs conducted as demonstration projects under conditions that facilitate the research will provide the same benefits in practice.
O’Connor D, Green S, Higgins JPT (Eds.). Chapter 5: Defining the review question and developing criteria for including studies. Cochrane Handbook for Systematic Reviews of Interventions (Higgins JPT, Green S, Eds.). John Wiley & Sons, Chichester, UK: 2008.
- Well-formulated questions will guide many aspects of the SR.
- Questions usually translate directly into eligibility criteria.
- A well-formulated question includes consideration of PICOS.
o People and populations
§ Disease and conditions
§ Broad population and setting of interest
§ Cochrane reviews should be globally relevant. Exclusion of studies based on population characteristics should be explained – scientific, practical, or heterogeneity concerns
§ Sub-groups should be included, with exploration of important differences in analysis, if it is uncertain whether important differences in effects exist.
§ Operational difficulties – e.g.: studies of patients with type 2 diabetes – what to do with studies that do not distinguish diabetes type?
o Interventions – which comparisons to make?
§ Comparators
· Active
· Inactive
§ Drugs – route of administration, dose, duration, frequency
§ Complex interventions – core features, intensity, frequency, personnel, training.
§ Operational difficulties – e.g.: How will variations, incompletely implemented, or co-interventions be handled?
o Outcomes
§ Should include all outcomes that are likely to be meaningful to a broad audience – at least 1 adverse outcome should be specified, and economic outcomes are helpful.
§ Eligibility is not usually conditional on outcomes reported.
§ Including all important outcomes allows the review to highlight gaps in the primary research.
§ Method of measurement (objective or subjective), timing of measurements.
§ Consistency with other reviews may be desirable.
§ Avoid surrogate and trivial outcomes.
§ Operational difficulties – e.g.: composite outcomes
§ Prioritize
· Main outcomes – primary and secondary.
· Non-main outcomes – secondary.
o Types of study
§ RCTs
· Unbiased
· Able to leverage current resources of the Cochrane Collaboration, and other search filters.
· Consider cluster and cross-over designs – include?
§ Trade off between sparsity of evidence and risk of bias.
o Scope
§ Scope (broad vs narrow) decisions must be made for
· Choice of participants
· Definition of an intervention
· Choice of interventions and comparisons
§ Generally, broad scoping allows
· Comprehensive summaries of the evidence
· Ability to generalize findings across participants or interventions.
· However – higher risk of mixing apples and oranges – heterogeneity complicates interpretation.
§ Narrow scoping is
· Manageable
· However – evidence may be sparse, therefore imprecise or vulnerable to bias.
· Findings may not generalize well.
· Scope may be chosen to produce a particular result.
Higgins JPT, Deeks JJ (Eds.). Chapter 7: Selecting studies and collecting data. Cochrane Handbook for Systematic Reviews of Interventions (Higgins JPT, Green S, Eds.). John Wiley & Sons, Chichester, UK: 2008.
- Screening and selection – Typical process
1. Examine titles and abstracts to remove obviously irrelevant reports.
2. Examine full-text reports
3. Make final decisions.
- Measuring agreement – kappa
o Kappa = (Po – Pe) / (1 – Pe)
o Pe = expected probability of agreement, calculated by multiplying the marginal probabilities.
o Kappa is a chance-adjusted measure of agreement.
o Cut-point (Landis and Koch)
§ Fair – [0.21, 0.40]
§ Moderate – [0.41, 0.60]
§ Substantial – [0.61, 0.80]
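The kappa formula above can be sketched for two reviewers screening the same reports. This is a minimal illustration: the function names and the 2x2 counts are hypothetical, and the labelling function covers only the three bands listed in these notes, treating each upper cut-point as inclusive.

```python
def cohens_kappa(both_include, a_only, b_only, both_exclude):
    """Chance-adjusted agreement between two reviewers on include/exclude."""
    n = both_include + a_only + b_only + both_exclude
    # Po: observed proportion of reports on which the reviewers agree.
    po = (both_include + both_exclude) / n
    # Marginal probability that each reviewer says "include".
    a_inc = (both_include + a_only) / n
    b_inc = (both_include + b_only) / n
    # Pe: agreement expected by chance, from multiplying the marginals.
    pe = a_inc * b_inc + (1 - a_inc) * (1 - b_inc)
    return (po - pe) / (1 - pe)

def landis_koch_label(kappa):
    """Label kappa using only the three bands recorded in the notes."""
    if 0.20 < kappa <= 0.40:
        return "fair"
    if 0.40 < kappa <= 0.60:
        return "moderate"
    if 0.60 < kappa <= 0.80:
        return "substantial"
    return "outside the bands listed"

# Hypothetical screening results for 100 reports.
k = cohens_kappa(both_include=20, a_only=5, b_only=10, both_exclude=65)
print(round(k, 3), landis_koch_label(k))
```

In this example the reviewers agree on 85% of reports, but 60% agreement would be expected by chance alone, so kappa is well below the raw agreement.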
- Data collection – what to collect?
o Basic study design characteristics
o Features required to evaluate risk of bias.
o Aspects of participants and setting that could affect the presence or magnitude of an effect
o Aspects of participants and settings that affect generalizability and applicability.
o Diagnostic criteria used to define conditions of interest.
o Intervention details that could affect the presence or magnitude of the effect, or that could help users assess applicability or implementation.
o Intervention integrity – the extent to which specified procedures or components of the intervention are implemented as planned.
§ Information needed to distinguish poorly conceptualized interventions from incomplete intervention delivery.
§ Adherence (Intervention components delivered as prescribed)
§ Exposure (e.g.: frequency, intensity, time span of intervention)
§ Quality of delivery
§ Participant responsiveness (participant process outcomes)
§ Program differentiation (lack of contamination)
- Data collection – Outcome measures
o Definition of outcome, timing of measurement, unit of measurement, scale characteristics (including validation citation)
o Adverse events
o Not necessary to collect results for all outcomes – but may be useful to list them to assist evaluation of risk of bias due to selective outcome reporting.
- Data collection – Results
o Collect sample sizes at randomization separate from those at analysis.
- Extraction
o Duplicate extraction – indirect evidence – one study observed that independent data extraction by two authors resulted in fewer errors than an extract-and-confirm approach. Extraction errors are common.
- Results – conversions
o Meta-analysis can proceed from either the outcome measures, or from measures of association. In both instances, each measure must be accompanied by an estimate of its variance.
o Dichotomous data – outcome measures – need 2x2 table at analysis
o Continuous data – outcome measures – need N, mean and SD at analysis for both groups.
§ SD from SE – SE = SD / root(N) so SD = SE * root(N)
§ SD from CI – SD = root(N) * (UCL – LCL) / (2*invnorm(1-(alpha/2))); for a 95% CI, alpha = 0.05 and the divisor is 3.92.
§ Use invttail for small sample sizes, since CIs and SEs were probably calculated using a t-distribution.
§ The standard deviation for each group may be calculated from statistics for the difference in means if it is assumed that the SD is the same between groups.
· Convert P-value to T-statistic using the invttail function.
· Convert T-statistic to SE using T = mean diff / SE.
· Convert SE to SD = SE / root((1/N1) + (1/N2)).
§ Medians, IQRs, and ranges should not be used to impute means and SDs. While these measures hold relations to means and SDs for normally distributed data, they are often employed precisely because the data are skewed.
· For normally distributed data, IQR width = 1.35 * SD (approximately).
§ SDs may be imputed from other studies, if no information is available.
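The SD conversions above can be sketched with the Python standard library. One loud assumption: this sketch uses the normal quantile throughout, which suits large samples only; for small samples the notes call for a t quantile (invttail), which would need a t-distribution routine from outside the standard library. Function names and the example numbers are hypothetical.

```python
import math
from statistics import NormalDist

def sd_from_se(se, n):
    """SD = SE * root(N), for a single group mean."""
    return se * math.sqrt(n)

def sd_from_ci(lcl, ucl, n, alpha=0.05):
    """SD = root(N) * (UCL - LCL) / (2 * z), for a CI around a single mean."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    return math.sqrt(n) * (ucl - lcl) / (2 * z)

def sd_from_mean_diff_p(p_two_sided, mean_diff, n1, n2):
    """Common SD from a two-sided P-value for a difference in means.

    Assumes the SD is the same in both groups and (large-sample
    assumption) replaces the t statistic with a normal quantile.
    """
    z = NormalDist().inv_cdf(1 - p_two_sided / 2)  # P-value -> test statistic
    se = abs(mean_diff) / z                        # statistic = mean diff / SE
    return se / math.sqrt(1 / n1 + 1 / n2)         # SE -> SD

print(sd_from_se(2.0, 25))
print(sd_from_ci(46.08, 53.92, 25))
print(sd_from_mean_diff_p(0.05, 3.92, 50, 50))
```

All three example calls recover an SD of roughly 10, which gives a quick consistency check between the three routes.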
o Continuous data measured as change scores
o Count data
§ Treat as dichotomous (need proportion experiencing at least 1 event)
§ Treat as continuous – but may be highly skewed.
§ Treat as time-to-event data.
§ Treat as rate data – ideal – because the Poisson distribution variance is equal to mean, all that is needed is the total number of events in each group and the total amount of person time accrued.
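A minimal sketch of the rate-data approach: because the Poisson variance equals the mean, the log rate ratio and its standard error follow from event totals and person-time alone. The SE formula root(1/E1 + 1/E2) is the standard large-sample result and is not stated in these notes; the function name and example numbers are hypothetical.

```python
import math

def log_rate_ratio(events_1, person_time_1, events_2, person_time_2):
    """Return (log rate ratio, SE) for group 1 versus group 2."""
    rate_1 = events_1 / person_time_1
    rate_2 = events_2 / person_time_2
    log_rr = math.log(rate_1 / rate_2)
    # Large-sample SE of the log rate ratio depends only on event counts.
    se = math.sqrt(1 / events_1 + 1 / events_2)
    return log_rr, se

# Hypothetical trial: 30 events in 1000 person-years vs 60 in 1000.
log_rr, se = log_rate_ratio(30, 1000.0, 60, 1000.0)
print(round(math.exp(log_rr), 3), round(se, 4))
```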
o Time-to-event data
§ Not possible to meta-analyze the cumulative probability provided by a Kaplan-Meier estimate.
§ Ideal – hazard ratio analysis from CPH model or study re-analysis using IPD.
§ Otherwise, estimate log HR = (O-E)/V, which has SE = 1/root(V), where O = observed number of events, E = log-rank expected number of events, O-E = log-rank statistic, and V = variance of the log-rank statistic.
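The log-rank approximation above can be sketched as follows. The function name and the example values of O, E, and V are hypothetical, and the 95% interval uses the normal quantile 1.96.

```python
import math

def log_hr_from_logrank(observed, expected, variance):
    """Return (log HR, SE) with log HR = (O - E) / V and SE = 1 / root(V)."""
    log_hr = (observed - expected) / variance
    se = 1 / math.sqrt(variance)
    return log_hr, se

# Hypothetical treatment arm: 20 events observed, 25 expected, V = 10.
log_hr, se = log_hr_from_logrank(observed=20, expected=25, variance=10)
hr = math.exp(log_hr)
ci = (math.exp(log_hr - 1.96 * se), math.exp(log_hr + 1.96 * se))
print(round(hr, 3), tuple(round(x, 3) for x in ci))
```

Working on the log scale keeps the estimate and its SE in the form a generic inverse-variance meta-analysis expects.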
o Analysis of measures of association
§ The measure of association is provided in the report.
§ The SE of the estimate is needed.
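When a report gives a ratio measure with a confidence interval but no SE, the SE on the log scale can usually be recovered from the interval width. This is a sketch under stated assumptions: it relies on the interval having been computed as estimate ± z * SE on the log scale, which is the usual convention but is not stated in these notes; the function name and example numbers are hypothetical.

```python
import math
from statistics import NormalDist

def se_log_ratio_from_ci(lcl, ucl, alpha=0.05):
    """SE of a log ratio (OR, RR, HR) from its confidence limits.

    Assumes a symmetric interval on the log scale:
    SE = (ln(UCL) - ln(LCL)) / (2 * z).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return (math.log(ucl) - math.log(lcl)) / (2 * z)

# Hypothetical report: odds ratio 0.5 with 95% CI 0.25 to 1.0.
se = se_log_ratio_from_ci(0.25, 1.0)
print(round(se, 4))
```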