Meta-Analysis on LCA of Forest Wood Products: Rationale and Methods
Nuno Lapa and Bart Muys
1. Brief review on methodologies of systematic reviews and meta-analyses
Authors from many fields of research, such as Behavioral Sciences, Medicine, Education Sciences, Biology and Ecology, have attempted to define methodologies for performing systematic reviews and meta-analyses in their fields; some examples are the works of Hedges (1992), Rosenthal and DiMatteo (2001), Field (2003), and Pullin and Stewart (2006).
As systematic reviews and meta-analyses have been performed in Health Sciences and Medicine for many years, the methodologies in these research fields are very well established, not only regarding the procedures for the bibliographic review and the coding process, but also regarding the statistical methods that must be used in different cases. Some of the most important methodologies are summarized hereafter.
Lipsey and Wilson (2001) wrote a book on how to apply meta-analysis when reviewing the literature. They detail the meta-analysis protocol in the following five steps:
a) Defining the problem and retrieving the bibliographic references – At this stage, the meta-analyst must clearly define the problem on which the meta-analysis will focus and its objective. The strategy for obtaining the bibliographic references relevant to the objective must also be clearly defined. Then, the bibliographic references must be identified and, if necessary, the search must be adapted based on the quality of the studies retrieved and their adequacy to the objective of the meta-analysis. In this step, very clear criteria for selecting the studies to be included in the meta-analysis must be used, in order to make the study transparent and to avoid bias due to the selection procedure. Although this is a very important step of systematic reviews and meta-analyses, Lipsey and Wilson (2001) do not discuss these criteria in much detail.
b) Coding the bibliographic references – In order to keep an up-to-date and easily accessible database of the retrieved bibliographic references, the meta-analyst must define a coding scheme for the bibliography and its technical content, and construct a database. For example, MS Access can be used to construct this database. Although the majority of the coded variables should be defined at the beginning of the meta-analysis, new variables can be added as the bibliographic references are analyzed and new information is obtained from them.
c) Selecting and computing the effect size statistic – The statistical variable or variables to be quantified, named the effect size statistic, have to be selected according to the objective of the meta-analysis and the characteristics of the studies in the bibliographic references to be meta-analyzed. Then, the information coded in the database must be retrieved, and the effect size statistic must be obtained directly from the database or calculated if it is not available from the bibliographic references.
The effect size statistics can be as simple as the arithmetic mean (eq. 1.1) with its standard error (eq. 1.2) and inverse variance weight (eq. 1.3), or as complex as the product-moment correlation coefficient r, using Fisher's Zr-transform (eq. 1.4), with its standard error (eq. 1.5) and inverse variance weight (eq. 1.6) (Lipsey and Wilson, 2001). Equations 1.1 to 1.6 are as follows:
$$ES_m = \bar{X} = \frac{\sum x_i}{n}$$ (eq. 1.1)
where $ES_m$ is the effect size statistic based on the mean, $\sum x_i$ is the sum of the values of the statistical variable $x_i$, considered important for the review, in the set of the i bibliographic references retrieved, and $n$ is the number of data used to calculate the mean, for example the number of bibliographic references used to calculate the mean.
$$SE_m = \frac{s}{\sqrt{n}}$$ (eq. 1.2)
where $SE_m$ is the standard error of the effect size statistic $ES_m$, $s$ is the standard deviation of the statistical variable $x_i$, and $n$ has the same meaning as defined above.
$$w_m = \frac{1}{SE_m^2} = \frac{n}{s^2}$$ (eq. 1.3)
where $w_m$ is the inverse variance weight of the effect size statistic $ES_m$. The other variables have the same meaning as defined above.
$$ES_{Zr} = 0.5 \ln\left(\frac{1+r}{1-r}\right)$$ (eq. 1.4)
where $ES_{Zr}$ is the effect size statistic based on Fisher's Zr-transform of the correlation coefficient $r$ of two variables x and y, and ln is the natural logarithm.
$$SE_{Zr} = \frac{1}{\sqrt{n-3}}$$ (eq. 1.5)
where $SE_{Zr}$ is the standard error of the effect size statistic $ES_{Zr}$, and $n$ is the total sample size used to estimate the correlation coefficient $r$.
$$w_{Zr} = \frac{1}{SE_{Zr}^2} = n-3$$ (eq. 1.6)
where $w_{Zr}$ is the inverse variance weight of the effect size statistic $ES_{Zr}$, and $n$ has the same meaning as defined for eq. 1.5.
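As an illustration, equations 1.1 to 1.6 can be sketched in a few lines of Python. This is only a minimal sketch: the function names and the input values are hypothetical, and the use of the sample standard deviation (n-1 in the denominator) for s is an assumption, since the text does not specify which estimator is meant.

```python
import math
from statistics import mean, stdev


def mean_effect_size(values):
    """Return (ES_m, SE_m, w_m) as in eqs. 1.1 to 1.3."""
    n = len(values)
    es = mean(values)        # eq. 1.1: arithmetic mean
    s = stdev(values)        # assumption: sample standard deviation (n - 1)
    se = s / math.sqrt(n)    # eq. 1.2: standard error of the mean
    w = 1.0 / se ** 2        # eq. 1.3: inverse variance weight (= n / s**2)
    return es, se, w


def fisher_zr_effect_size(r, n):
    """Return (ES_Zr, SE_Zr, w_Zr) as in eqs. 1.4 to 1.6."""
    es = 0.5 * math.log((1 + r) / (1 - r))  # eq. 1.4: Fisher's Zr-transform
    se = 1 / math.sqrt(n - 3)               # eq. 1.5
    w = n - 3                               # eq. 1.6
    return es, se, w


# Hypothetical values, for illustration only (not taken from any study).
print(mean_effect_size([10.0, 12.0, 14.0]))
print(fisher_zr_effect_size(0.5, 28))
```

Note that the weight in eq. 1.6 depends only on the sample size, whereas the weight in eq. 1.3 also depends on the spread of the data.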
Other effect size statistics are defined by Lipsey and Wilson (2001), but they cannot be summarized here. For further information, please see the original bibliographic reference.
Nevertheless, Lipsey and Wilson (2001) relate the selection of the effect size statistic to the variables retrieved from the bibliographic references and to whether these variables can change at a certain moment due to an experimental approach, a technological modification, a methodological change, or another cause. For example, if the meta-analysis focuses on the GHG emissions from a specific industrial process without considering any other variables, such as technological modifications, the problem can be considered a “One Variable Relationship” or “Central Tendency Description” problem, and the effect size statistics that may be used are the proportions (direct or logit methods) or the arithmetic mean. But if the systematic review or meta-analysis aims to quantify the GHG emissions from a specific industrial process taking into account that a technological modification was introduced in that process at a certain moment, then it can be viewed as a “Two Variable Relationship” or “Pre-Post Contrast” problem, and the effect size statistics that can be used are the unstandardized or the standardized mean gain.
Three other types of meta-analyses are identified by Lipsey and Wilson (2001), namely the “Two Variable Relationship – Group Contrasts”, “Two Variable Relationship – Association between Variables” and “Multivariate Relationships”. For further details on the effect size statistics for these types of systematic reviews or meta-analyses, please refer to Lipsey and Wilson (2001).
d) Performing the meta-analysis – Lipsey and Wilson (2001) define a specific statistical strategy to perform the meta-analysis that comprises four steps:
i) Effect size adjustment – In many meta-analyses, it can be appropriate to adjust the individual effect size statistics for bias, artifact, or error prior to the statistical analysis. The most used effect size statistics have transformations or bias corrections that must be applied. The Fisher’s Zr-transform of the correlation coefficient r is one of those transformations. For further details, please see Lipsey and Wilson (2001).
ii) Analysis of the effect size mean and distribution – This is the core of the meta-analysis. In this step it is necessary to (a) generate a set of independent effect size statistics, (b) calculate the weighted mean effect size, weighting by the inverse variance weights (eq. 1.7), and its standard error (eq. 1.8), (c) determine the confidence interval of the weighted mean (eqs. 1.9 and 1.10), and (d) analyze the homogeneity of the distribution (eq. 1.11).
$$\overline{ES} = \frac{\sum_{i=1}^{k} w_i \, ES_i}{\sum_{i=1}^{k} w_i}$$ (eq. 1.7)
where $\overline{ES}$ is the weighted mean effect size, $ES_i$ are the individual values of the effect size statistic used, $w_i$ is the inverse variance weight for effect size i, and $k$ is the total number of effect sizes.
$$SE_{\overline{ES}} = \sqrt{\frac{1}{\sum_{i=1}^{k} w_i}}$$ (eq. 1.8)
where $SE_{\overline{ES}}$ is the standard error of the weighted mean $\overline{ES}$, and the other variables have the same meaning as referred for eq. 1.7.
$$\overline{ES}_U = \overline{ES} + z_{1-\alpha} \, SE_{\overline{ES}}$$ (eq. 1.9)
$$\overline{ES}_L = \overline{ES} - z_{1-\alpha} \, SE_{\overline{ES}}$$ (eq. 1.10)
where $\overline{ES}_U$ and $\overline{ES}_L$ are the upper and lower limits, respectively, of the confidence interval of the weighted mean $\overline{ES}$, $z_{1-\alpha}$ is the critical value of the z-distribution for a confidence level of $1-\alpha$ (for example, $z_{1-\alpha} = 1.96$ for $\alpha = 0.05$, and $z_{1-\alpha} = 2.58$ for $\alpha = 0.01$), and $SE_{\overline{ES}}$ has the same meaning as defined for eq. 1.8.
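The calculation in equations 1.7 to 1.10 can be sketched as follows. The function name and the numerical inputs are hypothetical and serve only to show the arithmetic; the default critical value of 1.96 corresponds to the 95% confidence level quoted above.

```python
import math


def weighted_mean_effect_size(effect_sizes, weights, z=1.96):
    """Return (mean ES, SE, lower limit, upper limit) as in eqs. 1.7 to 1.10."""
    sum_w = sum(weights)
    # eq. 1.7: inverse-variance weighted mean of the effect sizes
    es_bar = sum(w * es for w, es in zip(weights, effect_sizes)) / sum_w
    # eq. 1.8: standard error of the weighted mean
    se = math.sqrt(1.0 / sum_w)
    # eqs. 1.9 and 1.10: upper and lower confidence limits
    upper = es_bar + z * se
    lower = es_bar - z * se
    return es_bar, se, lower, upper


# Hypothetical effect sizes and inverse variance weights, for illustration only.
es_bar, se, lower, upper = weighted_mean_effect_size(
    [0.2, 0.4, 0.6], [10.0, 20.0, 10.0]
)
print(f"mean ES = {es_bar:.3f}, SE = {se:.3f}, CI = [{lower:.3f}, {upper:.3f}]")
```

Effect sizes with larger weights (smaller variances) pull the weighted mean towards their own value, which is the purpose of the inverse variance weighting.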
In the homogeneity analysis step, it is necessary to test whether the various effect sizes that have generated the weighted mean $\overline{ES}$ all estimate the same population effect size, i.e., the homogeneity of the effect sizes is analyzed. The homogeneity analysis uses the Q statistic (eq. 1.11), which is distributed as a chi-square with (k-1) degrees of freedom (k is the number of effect sizes).
$$Q = \sum_{i=1}^{k} w_i \left(ES_i - \overline{ES}\right)^2$$ (eq. 1.11)
where the variables have the same meanings as stated in the previous equations. If the Q value exceeds the chi-square critical value with (k-1) degrees of freedom (significant Q value), then the null hypothesis of homogeneity is rejected, and the analysis of the heterogeneous distribution becomes necessary (the critical chi-square value can be obtained from a chi-square table in a standard statistics textbook). In this case, the meta-analyst may conclude that the differences between the effect size statistics are due not only to errors within the studies used in the meta-analysis (subject-level sampling error) but also to variations between studies (study-level sampling error), such as the different methodologies used.
It is also important to note that the Q test is based on a fixed effect model and therefore assumes, when the null hypothesis of homogeneity is not rejected (Q value < chi-square critical value; non-significant Q value), that the effect sizes are estimated with an error due only to subject-level sampling error. However, if the Q value is calculated from a small number of effect sizes (few degrees of freedom), or if these were obtained from a small number of samples (effect size statistics with higher error), the Q test has low statistical power, and its outcome may be unreliable whether the null hypothesis is accepted or rejected. Thus, an analysis of the heterogeneity of the distributions is always necessary regardless of the result of the Q test.
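The homogeneity test of eq. 1.11 can be sketched as below. The function names and input values are hypothetical. To keep the sketch free of external dependencies, the chi-square critical values are taken from a small hard-coded table for a significance level of 0.05 and 1 to 10 degrees of freedom; this table is a convenience assumed here, and a statistics library or a full chi-square table would be needed for other cases.

```python
# Chi-square critical values at a significance level of 0.05,
# indexed by degrees of freedom (standard table values, 1 to 10 only).
CHI2_CRIT_05 = {
    1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070,
    6: 12.592, 7: 14.067, 8: 15.507, 9: 16.919, 10: 18.307,
}


def q_statistic(effect_sizes, weights):
    """Return (Q, degrees of freedom) as in eq. 1.11."""
    sum_w = sum(weights)
    es_bar = sum(w * es for w, es in zip(weights, effect_sizes)) / sum_w
    q = sum(w * (es - es_bar) ** 2 for w, es in zip(weights, effect_sizes))
    return q, len(effect_sizes) - 1


def homogeneity_rejected(effect_sizes, weights):
    """True if Q exceeds the 0.05 critical value (homogeneity rejected)."""
    q, df = q_statistic(effect_sizes, weights)
    if df not in CHI2_CRIT_05:
        raise ValueError("critical value not tabulated for df = %d" % df)
    return q > CHI2_CRIT_05[df]


# Hypothetical effect sizes and weights, for illustration only.
es = [0.2, 0.4, 0.6]
w = [10.0, 20.0, 10.0]
print(q_statistic(es, w))
print(homogeneity_rejected(es, w))
```

With only three effect sizes, as in this hypothetical input, the test has the low statistical power discussed above, so a non-significant Q should not be read as proof of homogeneity.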
iii) Analysis of heterogeneous distributions – According to Lipsey and Wilson (2001) there are three main options that the meta-analyst may follow to analyze the heterogeneity of the distribution, which are as follows:
(a) To assume that the variability beyond subject-level sampling error is random, i.e., it derives from random differences among studies whose sources cannot be identified. In this situation, the meta-analyst should adopt a random effect model (eqs. 1.12 and 1.13);
(b) To assume that the variability beyond subject-level sampling error is systematic and derives from identifiable differences between studies. This is a fixed effect model, but it is assumed that the sources of the variance between effect size statistics can be identified. Therefore, this model is commonly named “partitioning effect size variance”. In this case, the meta-analyst tries to identify the effect size statistics that contribute to the variability and aggregates them into groups. Then, the meta-analyst may apply a statistical procedure similar to the ANOVA test (Analysis of Variance), in which the Q value is split into a $Q_B$ value (eq. 1.14), which represents the Q between groups, and a $Q_w$ value (eq. 1.15), which represents the pooled Q within groups;
(c) If the Q values calculated through a random effect model or a partitioning effect size variance model are still statistically significant (not homogeneous), then beyond the variance within the studies and the systematic variance between the studies, there is random variance that remains unmeasured. Therefore, the meta-analyst must use a mixed effect model. Fitting a mixed effect model to effect size data is similar to fitting a random effect model. For further details, please see Lipsey and Wilson (2001).
$$\vartheta_i^{*} = \vartheta_\theta + \vartheta_i$$ (eq. 1.12)
where $\vartheta_i^{*}$ is the total variance of the effect size statistic, $\vartheta_i$ is the variance associated with subject-level sampling error (variance within studies) and can be computed as defined in the equations above for the respective effect size, and $\vartheta_\theta$ is the estimate of the random variance (variance between studies). One possibility to calculate $\vartheta_\theta$ is by using the method of moments as defined in eq. 1.13:
$$\vartheta_\theta = \frac{Q - (k-1)}{\sum w_i - \dfrac{\sum w_i^2}{\sum w_i}}$$ (eq. 1.13)
where $Q$ is the value of the homogeneity test described above, $k$ is the number of effect sizes, and $w_i$ is the inverse variance weight for each effect size as described above.
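The random effect adjustment of eqs. 1.12 and 1.13 can be sketched as follows. The function names and inputs are hypothetical. Two points are assumptions not stated in the text above: the within-study variance of each effect size is recovered as the reciprocal of its inverse variance weight, and a negative method-of-moments estimate is set to zero, which is a common convention.

```python
def random_variance(effect_sizes, weights):
    """Method-of-moments estimate of the between-study variance (eq. 1.13)."""
    k = len(effect_sizes)
    sum_w = sum(weights)
    sum_w2 = sum(w ** 2 for w in weights)
    es_bar = sum(w * es for w, es in zip(weights, effect_sizes)) / sum_w
    q = sum(w * (es - es_bar) ** 2 for w, es in zip(weights, effect_sizes))
    v_theta = (q - (k - 1)) / (sum_w - sum_w2 / sum_w)
    return max(v_theta, 0.0)  # assumption: negative estimates are set to zero


def random_effect_weights(effect_sizes, weights):
    """New inverse variance weights based on the total variance (eq. 1.12)."""
    v_theta = random_variance(effect_sizes, weights)
    # Within-study variance is 1 / w_i; total variance adds v_theta (eq. 1.12).
    return [1.0 / (1.0 / w + v_theta) for w in weights]


# Hypothetical effect sizes and fixed effect weights, for illustration only.
es = [0.1, 0.5, 0.9]
w = [10.0, 10.0, 10.0]
print(random_variance(es, w))
print(random_effect_weights(es, w))
```

The new weights can then be used in place of the original ones when recomputing the weighted mean, its standard error, and its confidence interval.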
$$Q_B = \sum w_j \, \overline{ES}_j^{\,2} - \frac{\left(\sum w_j \, \overline{ES}_j\right)^2}{\sum w_j}$$ (eq. 1.14)
where $Q_B$ is the Q between groups, $\overline{ES}_j$ is the weighted mean effect size for each group, $w_j$ is the sum of the inverse variance weights within each group, and j equals 1, 2, 3, …, up to the total number of groups.
$$Q_w = \sum w_i \left(ES_i - \overline{ES}_j\right)^2$$ (eq. 1.15)
where $Q_w$ is the pooled Q within groups, $ES_i$ is the individual effect size, $\overline{ES}_j$ is the weighted mean effect size for each group, $w_i$ is the inverse variance weight for each effect size, i equals 1, 2, 3, …, up to the total number of effect sizes, and j equals 1, 2, 3, …, up to the total number of groups.
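The partitioning of Q into its between-group and within-group parts (eqs. 1.14 and 1.15) can be sketched as below. The function name, the group labels and the numerical inputs are hypothetical; each group is represented as a list of (effect size, weight) pairs.

```python
def partition_q(groups):
    """Return (Q_B, Q_w) for a dict mapping group name to (ES_i, w_i) pairs."""
    group_weights = []  # w_j: sum of the weights within each group
    group_means = []    # weighted mean effect size of each group
    q_within = 0.0
    for pairs in groups.values():
        w_j = sum(w for _, w in pairs)
        es_j = sum(w * es for es, w in pairs) / w_j
        group_weights.append(w_j)
        group_means.append(es_j)
        # eq. 1.15: pooled Q within groups
        q_within += sum(w * (es - es_j) ** 2 for es, w in pairs)
    # eq. 1.14: Q between groups
    sum_w = sum(group_weights)
    sum_w_es = sum(w * m for w, m in zip(group_weights, group_means))
    sum_w_es2 = sum(w * m ** 2 for w, m in zip(group_weights, group_means))
    q_between = sum_w_es2 - sum_w_es ** 2 / sum_w
    return q_between, q_within


# Hypothetical grouping of effect sizes, for illustration only.
groups = {
    "method A": [(0.2, 10.0), (0.4, 10.0)],
    "method B": [(0.6, 10.0), (0.8, 10.0)],
}
q_b, q_w = partition_q(groups)
print(q_b, q_w, q_b + q_w)
```

For this hypothetical input, the sum of the two parts equals the overall Q of eq. 1.11 computed on the four effect sizes together, which is a useful check on the grouping.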
iv) Additional statistical analysis – According to Lipsey and Wilson (2001), some additional statistical analyses may be needed, such as the analysis of statistically dependent effect sizes or the moderator analysis. As these analyses might not be of significant importance to LCA studies, they will not be discussed here. For further details, please see Lipsey and Wilson (2001).