Running head: CORRECTIONS IN META-ANALYSIS
Correcting for Reliability and Range-Restriction in Meta-Analysis
Frederick L. Oswald
and
Patrick D. Converse
Michigan State University
Frederick L. Oswald and Patrick D. Converse, Department of Psychology, Michigan State University. Symposium presented at the 20th Annual Conference of the Society for Industrial and Organizational Psychology, Los Angeles, CA. Please direct all correspondence to Fred Oswald, Psychology Building, Michigan State University, East Lansing, MI 48824-1116; e-mail:
For more than 25 years, meta-analysis has been used as a quantitative tool in organizational research for summarizing empirical findings across studies, complementing the narrative review by demonstrating whether validities across research conducted in a given domain show consistent levels (a lack of situational specificity) or at least are non-zero (validity generalization). Part of the process in reaching such conclusions is to correct observed effect sizes for statistical artifacts that systematically attenuate observed predictor-criterion correlations (see Hunter & Schmidt, 2004). Measurement unreliability is pervasive in psychological measures, so given a reasonable estimate of true-score variance for a measure, that value can be substituted for the observed variance, hopefully providing a more accurate estimate of effect in the population of interest. Range restriction is another effect attenuating observed correlations in personnel selection settings, so that an estimate of effect in the unrestricted population will be larger than in the restricted sample. Range restriction may be direct, where individuals in the restricted sample were selected top-down on the predictor measure, or it may be incidental, where selection occurs on a third variable that is correlated with the predictor and/or the criterion. Having a substantive understanding of the nature of error in one’s measures and the selection process that led to the observed effect size helps lead one toward making appropriate statistical corrections. Of course, these corrections can be informative when estimating population correlations in one’s own study – no study needs to wait for a meta-analysis to make such corrections.
Several articles have outlined the computational details for different methods of statistically correcting correlations and computing their standard errors (Bobko & Riecke, 1980; Hunter, Schmidt, & Le, 2004; Raju & Brand, 2003; Sackett & Yang, 2000). The present paper outlines still another method whose rationale is brief but justifies a compelling alternative to previous approaches. Then, several different correction approaches will be applied to examples that vary the typical values found in psychological research for measurement reliability, direct range restriction, and criterion-related validity.
Rationale
The proposed method for correcting correlations takes into account the fact that personnel selection studies usually have incumbent data on an organizational criterion of interest; they do not have criterion data for applicants who did not get hired. Additionally, because incumbents were selected on a predictor often correlated with these criteria, the criterion reliability in the incumbent sample typically underestimates the criterion reliability in the applicant population due to this incidental range restriction. In other words, selection researchers often have the predictor reliability in the applicant sample, but only have the restricted criterion reliability in the incumbent sample. An appropriate simultaneous correction for (a) direct range restriction on the predictor and (b) measurement unreliability requires the estimate for the unrestricted reliability coefficient for the criterion (Stauffer & Mendoza, 2001).
Stauffer and Mendoza (2001) address a related problem in terms of predictor reliability, when only the restricted reliability for the predictor is known. More importantly, they make the general argument that simultaneous correction requires both a range-restriction correction based on the observed (restricted) correlation and the unrestricted reliability coefficients. The following method explains how to estimate the unrestricted criterion reliability and then applies the simultaneous correction formula.
Method
An estimate of the unrestricted criterion reliability coefficient can be obtained in the following steps. First, Gulliksen provides an equation relating criterion reliability to validity under direct range restriction on the predictor (Gulliksen, 1950, Eq. 22, p. 140). He shows that, under the Classical Test Theory assumptions of (a) equal standard errors of measurement and (b) equal standard errors of the estimate regardless of the amount of range restriction on an applicant sample, the following ratio should equal a constant C:
$$\frac{1 - r_{yy}}{1 - r_{xy}^2} = C \tag{1}$$
where $r_{yy}$ and $r_{xy}$ are the range-restricted criterion reliability and validity coefficient, respectively. This constant must also hold when there is no range restriction, such that
$$\frac{1 - R_{yy}}{1 - R_{xy}^2} = C \tag{2}$$
where $R_{yy}$ and $R_{xy}$ are the unrestricted criterion reliability and validity coefficient, respectively. Then, setting Equations 1 and 2 equal to one another and rearranging, the formula for the unrestricted criterion reliability becomes
$$R_{yy} = 1 - \frac{(1 - r_{yy})(1 - R_{xy}^2)}{1 - r_{xy}^2}. \tag{3}$$
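This equation has two known values, the restricted criterion reliability $r_{yy}$ and the restricted validity $r_{xy}$, and two unknown values, their unrestricted counterparts $R_{yy}$ and $R_{xy}$. $R_{xy}$ can be estimated, however, using the traditional formula for direct range restriction on the predictor x (Gulliksen, 1950, Eq. 18, p. 137):
$$R_{xy} = \frac{k\, r_{xy}}{\sqrt{1 + (k^2 - 1)\, r_{xy}^2}} \tag{4}$$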
where all terms are defined as before, and k is the ratio of unrestricted to restricted standard deviations for x. Before we substitute results from this equation into Equation 3, first let us recall the formula for the correlation corrected for measurement unreliability:
$$\rho_{xy} = \frac{R_{xy}}{\sqrt{R_{xx}\, R_{yy}}} \tag{5}$$
where $R_{xy}$ is the validity coefficient corrected for range restriction, $R_{xx}$ is the estimate of predictor reliability, and $R_{yy}$ is the estimate of criterion reliability corrected for incidental range restriction. Again, Stauffer and Mendoza (2001) note that range restriction corrections are based on the nature of the range restriction, not on the type of reliability coefficient available. Therefore, $R_{xy}$ corrects the observed validity coefficient, $r_{xy}$, for range restriction, where the observed validity coefficient is not corrected for measurement unreliability.
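The three building blocks described so far (the range restriction correction of Equation 4, the unrestricted criterion reliability of Equation 3, and the unreliability correction of Equation 5) can be sketched in Python as follows. This is a minimal illustration rather than the authors' code. It assumes the standard direct range restriction formula, with k as the ratio of unrestricted to restricted predictor standard deviations; lowercase names denote restricted values and uppercase names unrestricted values; the function names and the input values are hypothetical.

```python
import math


def correct_range_restriction(r_xy: float, k: float) -> float:
    """Equation 4 (assumed standard form): direct range restriction correction.

    k is the ratio of unrestricted to restricted predictor standard deviations.
    """
    return k * r_xy / math.sqrt(1.0 + (k**2 - 1.0) * r_xy**2)


def unrestricted_criterion_reliability(r_yy: float, r_xy: float, R_xy: float) -> float:
    """Equation 3: criterion reliability estimated for the applicant population."""
    return 1.0 - (1.0 - r_yy) * (1.0 - R_xy**2) / (1.0 - r_xy**2)


def disattenuate(R_xy: float, R_xx: float, R_yy: float) -> float:
    """Equation 5: correct a validity coefficient for unreliability in x and y."""
    return R_xy / math.sqrt(R_xx * R_yy)


# Hypothetical inputs, chosen only for illustration.
r_xy, r_yy, R_xx, k = 0.2, 0.6, 0.8, 2.0

R_xy = correct_range_restriction(r_xy, k)                    # range-corrected validity
R_yy = unrestricted_criterion_reliability(r_yy, r_xy, R_xy)  # unrestricted criterion reliability
rho_xy = disattenuate(R_xy, R_xx, R_yy)                      # fully corrected validity
print(f"R_xy = {R_xy:.4f}, R_yy = {R_yy:.4f}, corrected validity = {rho_xy:.4f}")
```

Note that the estimated unrestricted criterion reliability exceeds the restricted value supplied as input, which is the direction the Rationale section leads one to expect.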
Therefore, Equation 4 substitutes for $R_{xy}$ in the numerator of Equation 5. For the denominator of Equation 5, $R_{xx}$ is usually available from applicant sample data, and $R_{yy}$ has been estimated in Equation 3, which together yield the following:
$$\rho_{xy} = \frac{k\, r_{xy} \big/ \sqrt{1 + (k^2 - 1)\, r_{xy}^2}}{\sqrt{R_{xx}\left[1 - \dfrac{(1 - r_{yy})(1 - R_{xy}^2)}{1 - r_{xy}^2}\right]}} \tag{6}$$
Equation 6, in turn, requires substituting Equation 4 for $R_{xy}$ in the denominator. Then with this substitution and a fair amount of algebraic rearrangement, Equation 6 becomes
$$\rho_{xy} = \frac{k\, r_{xy}}{\sqrt{R_{xx}\left[r_{yy} + (k^2 - 1)\, r_{xy}^2\right]}} \tag{7}$$
This equation allows one to estimate $\rho_{xy}$, the correlation corrected for range restriction and measurement error variance, based on the available estimates: (a) the unrestricted predictor reliability, $R_{xx}$; (b) the incidentally range-restricted criterion reliability, $r_{yy}$; and (c) the restricted validity coefficient, $r_{xy}$.
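A short Python sketch of the closed-form correction may help make the final formula concrete. It rests on two assumptions that go beyond the text. First, the closed form is written as k·rxy / sqrt(Rxx·[ryy + (k² − 1)·rxy²]), which is what the substitutions described above reduce to when the standard direct range restriction formula is used. Second, the text does not say how a selection ratio is converted to k, so the helper assumes a normally distributed predictor with strict top-down selection. Both function names are hypothetical.

```python
import math
from statistics import NormalDist


def k_from_selection_ratio(sr: float) -> float:
    """Ratio of unrestricted to restricted predictor SDs.

    Assumption (ours): the predictor is standard normal in the applicant
    population and the top `sr` proportion of applicants is selected.
    """
    std_normal = NormalDist()
    cut = std_normal.inv_cdf(1.0 - sr)          # cutoff score on the predictor
    lam = std_normal.pdf(cut) / sr              # mean of the selected group
    restricted_var = 1.0 + cut * lam - lam**2   # variance of the selected group
    return 1.0 / math.sqrt(restricted_var)


def proposed_correction(r_xy: float, k: float, R_xx: float, r_yy: float) -> float:
    """Closed-form simultaneous correction (Equation 7 as reconstructed here)."""
    return k * r_xy / math.sqrt(R_xx * (r_yy + (k**2 - 1.0) * r_xy**2))


# Inputs of Case 1 in Table 2: rxy = .1, SR = .2, Rxx = .7, ryy = .5
k = k_from_selection_ratio(0.2)
print(f"k = {k:.4f}, corrected validity = {proposed_correction(0.1, k, 0.7, 0.5):.6f}")
```

Under the normality assumption, the printed value should be close to the Case 1 entry reported in Table 2 (0.3492271), which suggests the table was generated in a similar way.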
Empirical Comparisons
Table 2 empirically compares methods applied to examples that combine different levels of the observed validity (rxy = .1, .2, .3), selection ratio (SR = 20%, 50%), observed (unrestricted) applicant predictor reliability (Rxx = .7, .9), and observed (incidentally restricted) incumbent criterion reliability (ryy = .5, .7). All factors are completely crossed yielding 24 combinations.
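The fully crossed design can be enumerated in a few lines of Python. This is only an illustrative sketch; the variable names are hypothetical, and the ordering (criterion reliability varying fastest, observed validity slowest) was chosen because it matches the order of the cases in Table 2.

```python
from itertools import product

observed_validity = (0.1, 0.2, 0.3)    # rxy
selection_ratio = (0.2, 0.5)           # SR
predictor_reliability = (0.7, 0.9)     # Rxx (unrestricted)
criterion_reliability = (0.5, 0.7)     # ryy (incidentally restricted)

# product() varies the last factor fastest and the first factor slowest.
design = list(product(observed_validity, selection_ratio,
                      predictor_reliability, criterion_reliability))

for case, (r_xy, sr, R_xx, r_yy) in enumerate(design, start=1):
    print(case, r_xy, sr, R_xx, r_yy)
```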
Two correction methods were compared (see Table 1 for a summary). The Proposed Method reflects the approach just described: the correction first applies the direct range restriction formula to the observed correlation, which is then corrected for unreliability using the unrestricted predictor reliability and the unrestricted criterion reliability, the latter of which is estimated from the incidentally restricted criterion reliability. The Rule-of-Thumb Method first corrects the range-restricted observed validity by the range-restricted criterion reliability, then corrects that value for range restriction and the unrestricted predictor reliability (e.g., see Raju, Burke, Normand, & Langlois, 1991).
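The three steps of the Rule-of-Thumb Method can be sketched as a single Python function. This is a minimal illustration under stated assumptions, not the authors' implementation: the range restriction step uses the standard direct-restriction formula with k as the ratio of unrestricted to restricted predictor standard deviations, and the function name and input values are hypothetical.

```python
import math


def rule_of_thumb_correction(r_xy: float, k: float, R_xx: float, r_yy: float) -> float:
    """Sequential correction: criterion unreliability, range restriction, predictor unreliability."""
    # Step 1: correct the restricted validity for restricted criterion unreliability.
    r_step1 = r_xy / math.sqrt(r_yy)
    # Step 2: correct that value for direct range restriction on the predictor.
    r_step2 = k * r_step1 / math.sqrt(1.0 + (k**2 - 1.0) * r_step1**2)
    # Step 3: correct for unrestricted predictor unreliability.
    return r_step2 / math.sqrt(R_xx)


# Hypothetical inputs, chosen only for illustration.
estimate = rule_of_thumb_correction(r_xy=0.2, k=2.0, R_xx=0.8, r_yy=0.6)
print(f"corrected validity = {estimate:.4f}")
```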
Table 1
Summary of Two Methods for Simultaneous Corrections to the Observed Correlation
Method / Range restriction is based on… / Predictor reliability coefficient is… / Criterion reliability coefficient is…
Method 1 – Proposed Method / Observed correlation / Observed and unrestricted / Estimated and unrestricted (corrected for incidental range restriction)
Method 2 – Rule-of-Thumb Method / Corrected correlation (corrected by the restricted criterion reliability) / Observed and unrestricted / Observed and incidentally range restricted (used in the range restriction correction)
Results and Discussion
Given the call by Stauffer and Mendoza (2001) to base range-restriction corrections on observed validities and then to correct those by unrestricted predictor and criterion reliabilities (the latter being estimated), we fully expected to find empirical differences between the Proposed Method, which is based on their recommendation, and the Rule-of-Thumb Method. However, the findings we obtained were exactly the same (see Table 2), which led us to suspect, and then confirm, that these two methods are completely equivalent algebraically. What this means is that the spirit of the message by Stauffer and Mendoza (2001) is correct and perhaps has gone largely unrecognized, but in the end we found out (the hard way) that following their call led to results that are no different from the rule-of-thumb approach that meta-analysts already adopt when correcting correlations individually for statistical artifacts.
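One way to see the algebraic equivalence is sketched below. The notation is an assumption made for this illustration: lowercase symbols are restricted values, uppercase symbols are unrestricted values, k is the ratio of unrestricted to restricted predictor standard deviations, $r_1$ and $r_2$ are the intermediate values of the Rule-of-Thumb Method, and the range restriction step is taken to be the standard direct-restriction formula.

```latex
\begin{align*}
\text{Rule of thumb:}\quad
r_1 &= \frac{r_{xy}}{\sqrt{r_{yy}}}, \qquad
r_2 = \frac{k\,r_1}{\sqrt{1 + (k^2-1)\,r_1^2}}
    = \frac{k\,r_{xy}}{\sqrt{r_{yy} + (k^2-1)\,r_{xy}^2}}, \\
\frac{r_2}{\sqrt{R_{xx}}}
    &= \frac{k\,r_{xy}}{\sqrt{R_{xx}\,\bigl[r_{yy} + (k^2-1)\,r_{xy}^2\bigr]}}. \\[6pt]
\text{Proposed:}\quad
R_{xy} &= \frac{k\,r_{xy}}{\sqrt{1 + (k^2-1)\,r_{xy}^2}}, \qquad
R_{yy} = 1 - \frac{(1-r_{yy})(1-R_{xy}^2)}{1-r_{xy}^2}
       = \frac{r_{yy} + (k^2-1)\,r_{xy}^2}{1 + (k^2-1)\,r_{xy}^2}, \\
\frac{R_{xy}}{\sqrt{R_{xx}\,R_{yy}}}
    &= \frac{k\,r_{xy}}{\sqrt{R_{xx}\,\bigl[r_{yy} + (k^2-1)\,r_{xy}^2\bigr]}}.
\end{align*}
```

The simplification of $R_{yy}$ uses the identity $1 - R_{xy}^2 = (1 - r_{xy}^2)/\bigl[1 + (k^2-1)\,r_{xy}^2\bigr]$, after which the factor $1 + (k^2-1)\,r_{xy}^2$ cancels between the numerator and denominator of the proposed correction.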
Table 2
A Comparison of the Proposed Method with the Rule-of-Thumb Method
Case / rxy / SR / Rxx / ryy / Corrected rxy (Proposed Method) / Corrected rxy (Rule-of-Thumb)
1 / 0.1 / 0.2 / 0.7 / 0.5 / 0.3492271 / 0.3492271
2 / 0.1 / 0.2 / 0.7 / 0.7 / 0.2980042 / 0.2980042
3 / 0.1 / 0.2 / 0.9 / 0.5 / 0.3079893 / 0.3079893
4 / 0.1 / 0.2 / 0.9 / 0.7 / 0.2628150 / 0.2628150
5 / 0.1 / 0.5 / 0.7 / 0.5 / 0.2756176 / 0.2756176
6 / 0.1 / 0.5 / 0.7 / 0.7 / 0.2340742 / 0.2340742
7 / 0.1 / 0.5 / 0.9 / 0.5 / 0.2430719 / 0.2430719
8 / 0.1 / 0.5 / 0.9 / 0.7 / 0.2064340 / 0.2064340
9 / 0.2 / 0.2 / 0.7 / 0.5 / 0.6375673 / 0.6375673
10 / 0.2 / 0.2 / 0.7 / 0.7 / 0.5568183 / 0.5568183
11 / 0.2 / 0.2 / 0.9 / 0.5 / 0.5622815 / 0.5622815
12 / 0.2 / 0.2 / 0.9 / 0.7 / 0.4910676 / 0.4910676
13 / 0.2 / 0.5 / 0.7 / 0.5 / 0.5252105 / 0.5252105
14 / 0.2 / 0.5 / 0.7 / 0.7 / 0.4518904 / 0.4518904
15 / 0.2 / 0.5 / 0.9 / 0.5 / 0.4631921 / 0.4631921
16 / 0.2 / 0.5 / 0.9 / 0.7 / 0.3985299 / 0.3985299
17 / 0.3 / 0.2 / 0.7 / 0.5 / 0.8459926 / 0.8459926
18 / 0.3 / 0.2 / 0.7 / 0.7 / 0.7586787 / 0.7586787
19 / 0.3 / 0.2 / 0.9 / 0.5 / 0.7460953 / 0.7460953
20 / 0.3 / 0.2 / 0.9 / 0.7 / 0.6690917 / 0.6690917
21 / 0.3 / 0.5 / 0.7 / 0.5 / 0.7334763 / 0.7334763
22 / 0.3 / 0.5 / 0.7 / 0.7 / 0.6422888 / 0.6422888
23 / 0.3 / 0.5 / 0.9 / 0.5 / 0.6468653 / 0.6468653
24 / 0.3 / 0.5 / 0.9 / 0.7 / 0.5664455 / 0.5664455
Note. rxy = observed validity coefficient; SR = selection ratio; Rxx = unrestricted predictor reliability; ryy = incidentally restricted criterion reliability.
This study’s focus was on correcting individual studies in a meta-analysis, considering the incidental range-restriction effect on criterion reliability. We did not consider cases where predictor reliability was only available for the sample of incumbents who were directly selected on the predictor; we assumed that predictor information would be available for all applicants. Our continuing work will investigate this, however, as this may in fact lead to results differing from the rule-of-thumb method (see Stauffer & Mendoza, 2001, p. 65, Eqs. 4 and 5).
Several boundary conditions apply to this work confirming the rule-of-thumb approach. First, we did not investigate corrections to correlations that use artifact distributions, an approach adopted when the set of studies in a meta-analysis does not report complete psychometric information (Hunter & Schmidt, 2004). Second, correcting for incidental range restriction on the predictor, rather than direct range restriction, is an important concern in cases where incumbents were selected explicitly on a variable correlated with the predictor of interest. We did not model that effect in this study because it has been thoroughly investigated in detailed simulations elsewhere (see Le, 2003).
In conclusion, we want to make the general point that researchers, before conducting a meta-analysis, should develop a deep familiarity with the measures and settings used in the research domain of interest. Additionally, researchers should be well aware that meta-analysis is subject to many of the threats to validity that individual studies face (Shadish, Cook, & Campbell, 2002). Given that, statistical corrections should not be applied mechanically (see Oswald & McCloy, 2003), but where they can be applied appropriately, they pay off by increasing the accuracy of the point estimates, a gain that also offsets the increase in the associated standard error (Bobko & Riecke, 1980). The present study showed that meta-analysts employing the traditional rule-of-thumb method, which uses an incidentally range-restricted criterion reliability coefficient (and an unrestricted predictor reliability coefficient), achieve the same results as the proposed method, which follows Stauffer and Mendoza’s (2001) recommendation by correcting for this incidental range restriction before applying a simultaneous correction to the observed validity.
References
Bobko, P., & Riecke, A. (1980). Large sample estimates for standard errors of functions of correlation coefficients. Applied Psychological Measurement, 4, 385-398.
Gulliksen, H. (1950). Theory of mental tests. New York: Wiley.
Hunter, J. E., & Schmidt, F. L. (2004). Methods of meta-analysis: Correcting error and bias in research findings. Thousand Oaks, CA: Sage.
Hunter, J. E., Schmidt, F. L., & Le, H. (2004). Implications of direct and indirect range restriction for meta-analysis methods and findings. Unpublished manuscript.
Le, H. A. (2003). Correcting for indirect range restriction in meta-analysis: Testing a new meta-analytic method. Unpublished doctoral dissertation, University of Iowa, Iowa City, IA.
Oswald, F. L., & McCloy, R. A. (2003). Meta-analysis and the art of the average. In K. R. Murphy (Ed.), Validity generalization: A critical review (pp. 311-338). Mahwah, NJ: Erlbaum.
Raju, N. S., & Brand, P. A. (2003). Determining the significance of correlations corrected for unreliability and range restriction. Applied Psychological Measurement, 27, 52-71.
Raju, N. S., Burke, M., Normand, J., & Langlois, G. M. (1991). A new meta-analytic approach. Journal of Applied Psychology, 76, 432-446.
Sackett, P. R., & Yang, H. (2000). Correction for range restriction: An expanded typology. Journal of Applied Psychology, 85, 112-118.
Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston: Houghton Mifflin.
Stauffer, J. M., & Mendoza, J. L. (2001). The proper sequence for correcting correlation coefficients for range restriction and reliability. Psychometrika, 66, 63-68.