Supplemental information
Study / Grouping of tests by Francis / Studies contain evidential value / Studies’ evidential value, if any, is inadequate. / Studies exhibit evidence of intense p-hacking / Average power of tests included in p-curve
Alter, & Oppenheimer (2009) / 0 / 0.7034 / 0.0526 / 0.2966 / < 5 %
Ashton-James, et al. (2009) / 1 / 0.0079 / 0.7207 / 0.9921 / 34%
Fast, N. J., & Chen, S. (2009) / 1 / 0.6507 / 0.0752 / 0.3493 / < 5 %
Fast, et al. (2009) / 1 / 0.7401 / 0.0278 / 0.2599 / < 5 %
Garcia, S. M., & Tor, A. (2009) / 1 / 0.3270 / 0.1790 / 0.6730 / 8%
González, J., & McLennan, C. T. (2009) / 1 / 0.1047 / 0.5306 / 0.8953 / 37%
Hahn, et al. (2009) / 0 / 0.0232 / 0.7870 / 0.9768 / 65%
Hart, W., & Albarracín, D. (2009) / 1 / 0.9099 / 0.0085 / 0.0901 / < 5 %
Janssen, N., & Caramazza, A. (2009) / 1 / 0.4195 / 0.3815 / 0.5805 / 10%
Jostmann, et al. (2009) / 0 / 0.8639 / 0.0218 / 0.1361 / < 5 %
Labroo, et al. (2009) / 1 / 0.0088 / 0.7993 / 0.9912 / 48%
Nordgren, et al. (2009) / 1 / 0.5414 / 0.1947 / 0.4586 / < 5 %
Wakslak, & Trope (2009) / 0 / 0.7784 / 0.0418 / 0.2216 / < 5 %
Zhou, et al. (2009) / 1 / 0.0002 / 0.9298 / 0.9998 / 34%
Balcetis, E., & Dunning, D. (2010) / 0 / 0.5854 / 0.0794 / 0.4146 / < 5 %
Bowles, H. R., & Gelfand, M. (2010) / 1 / 0.3118 / 0.1697 / 0.6882 / 21%
Damisch, et al. (2010) / 1 / 0.9312 / 0.0095 / 0.0688 / < 5 %
de Hevia, M. D., & Spelke, E. S. (2010) / 1 / 0.5251 / 0.1089 / 0.4749 / < 5 %
Ersner-Hershfield, H., et al. (2010) / 0 / 0.9126 / 0.0255 / 0.0874 / < 5 %
Gao, T., et al. (2010) / 1 / 0.0538 / 0.6505 / 0.9462 / 40%
Lammers, et al. (2010) / 1 / 0.7152 / 0.0766 / 0.2848 / < 5 %
Li, X., et al. (2010) / 1 / 0.1450 / 0.3054 / 0.8550 / 23%
Maddux, W. W., et al. (2010) / 1 / 0.0000 / 0.9998 / 1.0000 / 77%
McGraw, A. P., & Warren, C. (2010) / 1 / 0.1951 / 0.2445 / 0.8049 / < 5 %
Sackett, et al. (2010) / 1 / 0.2818 / 0.1445 / 0.7182 / 13%
Savani, et al. (2010) / 1 / 0.0807 / 0.5426 / 0.9193 / 18%
Senay, et al. (2010) / 1 / 0.9943 / 0.0001 / 0.0057 / < 5 %
West, et al. (2010) / 1 / 0.0146 / 0.7485 / 0.9854 / 57%
Evans, et al. (2011) / 0 / 0.0000 / 1.0000 / 1.0000 / 99%
Inesi, et al. (2011) / 1 / 0.1278 / 0.3234 / 0.8722 / 18%
Nordgren, et al. (2011) / 1 / 0.2687 / 0.2105 / 0.7313 / 17%
Savani, et al. (2011) / 1 / 0.3629 / 0.2314 / 0.6371 / < 5 %
Todd, et al. (2011) / 1 / 0.5286 / 0.0672 / 0.4714 / < 5 %
Tuk, et al. (2011) / 1 / 0.8974 / 0.0142 / 0.1026 / < 5 %
Anderson, et al. (2012) / 1 / 0.2597 / 0.2882 / 0.7403 / 10%
Bauer, et al. (2012) / 1 / 0.7018 / 0.0611 / 0.2982 / < 5 %
Birtel, et al. (2012) / 1 / 0.0017 / 0.9256 / 0.9983 / 70%
Converse, et al. (2012) / 0 / 0.8583 / 0.0109 / 0.1417 / < 5 %
Converse, & Fishbach (2012) / 1 / 0.1536 / 0.3603 / 0.8464 / 25%
Keysar, et al. (2012) / 1 / 0.0578 / 0.4925 / 0.9422 / 13%
Leung, et al. (2012) / 1 / 0.9160 / 0.0049 / 0.0840 / < 5 %
Rounding, et al. (2012) / 0 / 0.8157 / 0.0233 / 0.1843 / < 5 %
Savani, & Rattan (2012) / 0 / 0.4553 / 0.1085 / 0.5447 / 10%
van Boxtel, & Koch (2012) / 1 / 0.0000 / 0.9999 / 1.0000 / 95%
Table S1. A re-analysis of the original studies analyzed by Francis (2014).
Column 1: the studies reported by Francis (2014). Column 2: whether Francis could have chosen different significance tests to calculate the power of the reported experiments (0 = no, 1 = yes). Columns 3-5: p-values obtained from the p-curve analysis (Simonsohn 2013) for the hypotheses that the studies contain evidential value (column 3), that their evidential value, if any, is inadequate (column 4), and that they exhibit evidence of intense p-hacking (column 5). Column 6: the aggregate power as calculated by the p-curve analysis. We included only the significance tests selected by Francis. In some analyses, Francis included F-values from interaction terms as well as t-tests on the main effects. In those cases we took only the F-values when attenuations were predicted, or only the t-tests when reversals were predicted, as per the p-curve manual. Tests with p < 0.05 are highlighted.
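For readers who want to work with Table S1 programmatically, the following is a minimal sketch of how a row could be parsed and how the p < 0.05 criterion mentioned in the caption could be applied to columns 3-5. It assumes the rows are available as " / "-delimited text, as laid out above; the field names (`evidential_value`, `inadequate_evidence`, `intense_p_hacking`) and the helper functions are hypothetical labels introduced for illustration only and are not part of the p-curve software.

```python
ALPHA = 0.05  # significance threshold named in the caption

# Hypothetical labels for columns 3-5 of Table S1 (not taken from any library).
P_COLUMNS = ("evidential_value", "inadequate_evidence", "intense_p_hacking")


def parse_row(line):
    """Split one ' / '-delimited table row into named fields."""
    # Split from the right so punctuation inside the study label
    # cannot shift the five data columns.
    study, alt, p3, p4, p5, power = [f.strip() for f in line.rsplit(" / ", 5)]
    return {
        "study": study,
        "alternative_tests": int(alt),  # column 2: 0 = no, 1 = yes
        "p_values": dict(zip(P_COLUMNS, (float(p3), float(p4), float(p5)))),
        "power": power,  # kept as text: entries such as "< 5 %" are not numeric
    }


def significant_tests(row, alpha=ALPHA):
    """Return the labels of the column 3-5 tests with p below alpha."""
    return [name for name, p in row["p_values"].items() if p < alpha]


# Three rows copied from the table above.
rows = [
    "Ashton-James, et al. (2009) / 1 / 0.0079 / 0.7207 / 0.9921 / 34%",
    "Hart, W., & Albarracín, D. (2009) / 1 / 0.9099 / 0.0085 / 0.0901 / < 5 %",
    "Garcia, S. M., & Tor, A. (2009) / 1 / 0.3270 / 0.1790 / 0.6730 / 8%",
]

for line in rows:
    row = parse_row(line)
    print(row["study"], significant_tests(row))
```

Under these assumptions, the first row is flagged for the evidential-value test, the second for the inadequate-evidence test, and the third for none of the three.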