Multimodal MR Imaging Model to Predict Tumor Infiltration in Patients with Gliomas

APPENDIX (On-line Only)

APPENDIX 1. STATISTICAL METHODS

Overview: As previously indicated, the goal of our study was to evaluate whether data from a combination of advanced MR imaging techniques could predict tumor infiltration in patients with gliomas better than a single imaging technique. A three-step analytical modeling process was undertaken to achieve this goal.

Principal component analyses: Because of the small sample size and the large number of imaging variables, the first step in the model building process was to reduce the dimensionality of the imaging dataset without losing important information about each of the 12 imaging variables. This step was accomplished by way of a principal components analysis (PCA). PCA utilizes orthogonal projections of the multivariate data to produce a set of linearly independent principal composite scores (pc-composite scores), which together capture all of the information associated with the complete multivariate dataset. The number of orthogonal projections is equal to the number of variables in the multivariate dataset. Thus, for the study at hand we used a PCA to generate 12 pc-composite scores per nuclear density measurement. Each pc-composite score was derived as a linear combination of the values of the precontrast MPRAGE, T2, FLAIR, DWI, DTI, DSC, PWI, post-contrast MPRAGE, and axial T1 spin echo imaging variables associated with the nuclear density measurement.
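The PCA step described above can be illustrated with a minimal sketch. It assumes a covariance-based PCA (suggested by, but not confirmed by, the "non standardized" coefficients reported in Supplemental Table 1) and uses NumPy on synthetic data. The function name `pca_scores` and the simulated matrix `X` are hypothetical and are not part of the study.

```python
import numpy as np

def pca_scores(X):
    """Covariance-based PCA (an assumption; the appendix does not say
    whether the variables were standardized first).

    Returns (weights, scores, variances): the columns of `weights` are
    the orthogonal projection directions, `scores` holds one composite
    score per component for every row of X, and `variances` holds the
    variance captured by each component, largest first.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)              # remove each variable's mean
    cov = np.cov(centered, rowvar=False)       # p x p covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh: symmetric matrices
    order = np.argsort(eigvals)[::-1]          # sort by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centered @ eigvecs                # project onto the components
    return eigvecs, scores, eigvals

# Synthetic stand-in for 40 measurements of 12 imaging variables
# (NOT study data).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 12)) * np.linspace(1.0, 5.0, 12)
weights, scores, variances = pca_scores(X)
```

With 12 input variables the sketch returns 12 score columns, and the sum of `variances` equals the total variance of the original variables, which is the sense in which the full set of scores retains all of the information in the dataset.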

Supplemental Table 1. Principal component nonstandardized coefficients (i.e., the weights given to the individual predictors)

Predictor / PC1 / PC2 / PC3 / PC4 / PC5 / PC6 / PC7 / PC8 / PC9 / PC10 / PC11 / PC12
T1 / 0.05830 / 0.44323 / -0.86202 / 0.23875 / -0.00630 / -0.00117 / 0.00470 / -0.00062 / 0.00000 / 0.00000 / 0.00000 / 0.00000
fa / 0.00082 / -0.00090 / -0.00103 / 0.00421 / 0.01733 / -0.01466 / -0.17951 / 0.98348 / 0.00184 / 0.00022 / -0.00179 / 0.00006
fanum / 0.00000 / 0.00000 / 0.00000 / 0.00001 / 0.00000 / -0.00005 / -0.00057 / 0.00086 / 0.22320 / 0.37934 / 0.82340 / 0.35819
fadenom / 0.00000 / 0.00000 / 0.00000 / 0.00000 / 0.00000 / 0.00004 / -0.00016 / -0.00123 / 0.32903 / 0.77806 / -0.23945 / -0.47857
Mean Diff. / 0.00000 / 0.00000 / 0.00000 / 0.00000 / -0.00001 / 0.00006 / 0.00016 / -0.00117 / 0.08694 / 0.29929 / -0.51017 / 0.80162
T2 / -0.96708 / -0.19246 / -0.13901 / 0.09161 / 0.00199 / 0.00114 / -0.00069 / -0.00006 / 0.00000 / 0.00000 / 0.00000 / 0.00000
K2 / -0.00001 / -0.00001 / -0.00002 / -0.00007 / 0.00010 / 0.00004 / 0.00073 / 0.00187 / -0.91343 / 0.40144 / 0.06639 / -0.00857
rCBF / -0.00681 / 0.01967 / 0.01531 / 0.00064 / -0.04356 / 0.04308 / 0.98131 / 0.18056 / 0.00117 / 0.00004 / 0.00015 / 0.00002
rCBV corr. / -0.19794 / 0.57266 / 0.06273 / -0.78983 / -0.06910 / -0.00826 / -0.01643 / 0.00223 / 0.00004 / -0.00002 / -0.00001 / 0.00000
rCBV uncorr. / -0.14875 / 0.66192 / 0.48231 / 0.55080 / 0.05720 / 0.00823 / -0.01899 / -0.00548 / -0.00007 / 0.00002 / 0.00000 / 0.00000
rMTT / -0.00042 / -0.00030 / 0.01033 / 0.02728 / -0.19516 / -0.97970 / 0.03506 / -0.00487 / -0.00003 / 0.00005 / -0.00005 / 0.00000
TTP / 0.00336 / -0.00584 / 0.02636 / 0.08148 / -0.97551 / 0.19489 / -0.05400 / 0.00991 / -0.00012 / 0.00003 / -0.00001 / -0.00001
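As a worked illustration of how the weights in Supplemental Table 1 turn imaging values into a pc-composite score, the sketch below copies the PC10 and PC12 columns of the table and forms the weighted sum. The helper names (`dot`, `composite_score`), the optional centering argument, and the example imaging values are hypothetical. Whether the study centered the variables before weighting is not stated in the appendix.

```python
# Weights copied from the PC10 and PC12 columns of Supplemental Table 1,
# listed in the table's row order.
PREDICTORS = ["T1", "fa", "fanum", "fadenom", "Mean Diff.", "T2", "K2",
              "rCBF", "rCBV corr.", "rCBV uncorr.", "rMTT", "TTP"]
PC10 = [0.00000, 0.00022, 0.37934, 0.77806, 0.29929, 0.00000,
        0.40144, 0.00004, -0.00002, 0.00002, 0.00005, 0.00003]
PC12 = [0.00000, 0.00006, 0.35819, -0.47857, 0.80162, 0.00000,
        -0.00857, 0.00002, 0.00000, 0.00000, 0.00000, -0.00001]

def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(u * v for u, v in zip(a, b))

def composite_score(values, weights, centers=None):
    """Weighted sum of imaging values, optionally centered first.

    Centering is an assumption of this sketch, not a documented step.
    """
    if centers is None:
        centers = [0.0] * len(values)
    return sum(w * (v - c) for w, v, c in zip(weights, values, centers))

# Hypothetical imaging values for a single biopsy site (NOT study data).
example_values = [410.0, 0.21, 0.9, 1.4, 1.1, 620.0,
                  0.02, 1.3, 2.1, 2.4, 1.05, 1.1]
pc10_score = composite_score(example_values, PC10)
pc12_score = composite_score(example_values, PC12)

# The two weight vectors should be (approximately) orthonormal.
pc10_norm_sq = dot(PC10, PC10)
pc10_pc12_inner = dot(PC10, PC12)
```

Both columns have squared length close to 1 and an inner product close to 0, which is consistent with the orthogonal projections described in the text. PC10 and PC12 are dominated by the fanum, fadenom, Mean Diff., and K2 weights.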

Univariate analyses: The second step in the model building process was to identify the set of pc-composite scores that were linearly associated with nuclear density. This step was accomplished by conducting a set of univariate generalized estimating equation regression (GEER) analyses in which, for each of the 12 sets of pc-composite scores, nuclear density served as the GEER model response variable and the set of pc-composite scores served as the GEER model predictor variable. With regard to the GEER model specification, the Gaussian distribution was the assumed underlying distribution of the nuclear density measurements. Because each patient had multiple nuclear density measurements, the GEER model variance-covariance parameter estimates were derived by way of the Huber and White sandwich variance-covariance estimator to account for intra-patient measurement correlation in the hypothesis testing process. With regard to hypothesis testing, a p≤0.05 decision rule was utilized as the criterion for rejecting the null hypothesis that there was no linear association between the pc-composite score and the nuclear density measurement. The adequacy of each univariate model to predict nuclear density was assessed via the coefficient of determination (R2).

Supplemental Figure 1. Principal component coefficients for PC1-PC12.

Supplemental Table 2. Univariate generalized estimating equation analysis (modeling only marginal associations).

Predictor / Degrees of Freedom / Chi-Square / P-value / R2
PCS 1 / 1 / 0.16 / 0.688 / 0.01
PCS 2 / 1 / 1.36 / 0.244 / 0.04
PCS 3 / 1 / 0.68 / 0.408 / 0.01
PCS 4 / 1 / 0.78 / 0.376 / 0.01
PCS 5 / 1 / 0.03 / 0.854 / 0.00
PCS 6 / 1 / 5.31 / 0.021 / 0.05
PCS 7 / 1 / 1.16 / 0.281 / 0.05
PCS 8 / 1 / 0.56 / 0.453 / 0.02
PCS 9 / 1 / 0.00 / 0.981 / 0.00
PCS 10 / 1 / 5.75 / 0.016 / 0.08
PCS 11 / 1 / 0.06 / 0.811 / 0.00
PCS 12 / 1 / 3.97 / 0.046 / 0.19
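The univariate step can be sketched as follows. With a Gaussian response and an independence working correlation (an assumption here, since the appendix does not name the working correlation structure), the GEE point estimates coincide with ordinary least squares, and the Huber and White sandwich estimator supplies patient-clustered standard errors. The function names and the demonstration data are hypothetical. The 1-degree-of-freedom chi-square tail probability is computed with the standard identity P(χ² > c) = erfc(√(c/2)).

```python
import math
from collections import defaultdict

def chi2_1df_pvalue(chi2):
    """Upper-tail probability of a chi-square statistic with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2.0))

def univariate_gee(x, y, cluster):
    """Gaussian GEE with independence working correlation (assumed).

    Returns (slope, robust_se, wald_chi2, p_value, r2) for the model
    y = b0 + b1 * x with a cluster-robust sandwich variance.
    """
    n = len(y)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx
    # "Bread": inverse of X'X for the design matrix [1, x].
    bread = [[sxx / det, -sx / det], [-sx / det, n / det]]
    b0 = bread[0][0] * sy + bread[0][1] * sxy
    b1 = bread[1][0] * sy + bread[1][1] * sxy
    # "Meat": outer products of the per-cluster score sums.
    sums = defaultdict(lambda: [0.0, 0.0])
    for xi, yi, ci in zip(x, y, cluster):
        resid = yi - b0 - b1 * xi
        sums[ci][0] += resid
        sums[ci][1] += resid * xi
    meat = [[0.0, 0.0], [0.0, 0.0]]
    for u in sums.values():
        for i in range(2):
            for j in range(2):
                meat[i][j] += u[i] * u[j]
    # Slope variance: the [1][1] element of bread * meat * bread.
    row = bread[1]
    var_b1 = sum(row[i] * meat[i][j] * row[j]
                 for i in range(2) for j in range(2))
    chi2 = b1 * b1 / var_b1
    mean_y = sy / n
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return b1, math.sqrt(var_b1), chi2, chi2_1df_pvalue(chi2), 1.0 - sse / sst

# Hypothetical data: 4 patients with 3 measurements each (NOT study data).
demo_x = [0.1, 0.4, 0.5, 0.9, 1.2, 1.3, 1.8, 2.0, 2.4, 2.6, 2.9, 3.1]
demo_y = [1.0, 1.3, 1.1, 2.2, 2.0, 2.6, 3.1, 3.5, 3.4, 4.4, 4.1, 4.6]
demo_patient = ["p1"] * 3 + ["p2"] * 3 + ["p3"] * 3 + ["p4"] * 3
slope, robust_se, wald_chi2, p_value, r2 = univariate_gee(
    demo_x, demo_y, demo_patient)
```

Applied to the chi-square statistics in Supplemental Table 2, `chi2_1df_pvalue` gives values that round to the tabulated P-values for PCS 6, PCS 10, and PCS 12 (5.31, 5.75, and 3.97, respectively).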

Multivariate analyses: The third and final step in the model building process was to utilize the pc-composite scores that were found to be statistically associated with nuclear density in step 2 of the model building process to build a multivariate model to predict nuclear density. Two multivariate GEER models were examined. The first model included the pc-composite scores that were found to be associated with nuclear density in step 2 of the model building process, while the second model included linear and nonlinear restricted cubic spline functions of the same pc-composite scores. As in the univariate analyses, the Gaussian distribution was the assumed underlying distribution of the nuclear density measurements, and the GEER model variance-covariance parameter estimates were derived by way of the Huber and White sandwich variance-covariance estimator. With regard to hypothesis testing, the GEE modified version of the generalized Wald test was utilized to compare the two models. A p≤0.05 decision rule was utilized as the criterion for rejecting the null hypothesis that the predictive information gained by allowing for non-linear associations between the pc-composite scores and nuclear density was no greater than what would be expected by pure chance. The same hypothesis testing strategy was utilized to determine the composition of the pc-composite scores that were included in the final multivariate GEER model.
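The "linear and nonlinear restricted cubic spline functions" of a pc-composite score can be illustrated with a minimal sketch of a restricted cubic spline basis in its commonly used truncated-power form. The number and placement of knots used in the study are not reported, so the knots below are purely hypothetical, as is the function name `rcs_basis`. The optional scaling by the squared knot range is a convention assumed for this sketch.

```python
def rcs_basis(x, knots, normalize=True):
    """Restricted cubic spline basis evaluated at a single value x.

    Returns [x, s_1(x), ..., s_{k-2}(x)] for k knots: one linear term
    followed by k - 2 nonlinear terms constrained to be linear beyond
    the outermost knots.
    """
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("a restricted cubic spline needs at least 3 knots")

    def cube_plus(u):
        # Truncated cubic: u**3 when u is positive, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    span = t[-1] - t[-2]
    # Scaling by the squared knot range keeps terms on the scale of x.
    scale = (t[-1] - t[0]) ** 2 if normalize else 1.0
    terms = [float(x)]
    for j in range(k - 2):
        term = (cube_plus(x - t[j])
                - cube_plus(x - t[-2]) * (t[-1] - t[j]) / span
                + cube_plus(x - t[-1]) * (t[-2] - t[j]) / span)
        terms.append(term / scale)
    return terms

# Hypothetical knot locations on the scale of a pc-composite score.
demo_knots = [0.0, 1.0, 2.0, 3.0]
basis_inside = rcs_basis(0.5, demo_knots)    # between the first two knots
basis_below = rcs_basis(-1.0, demo_knots)    # below the first knot
```

In the second model, each retained pc-composite score would contribute its linear term plus these nonlinear terms as predictors. A joint Wald test of the nonlinear coefficients then addresses the null hypothesis of linearity described in the text.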

The final regression model is described by Equation 1 (Eq. 1), where E(Density |X) denotes the predicted nuclear density given the values of PC10 and PC12, given in Equations 2 and 3, respectively.

where

and

Model adequacy, with respect to the final model’s ability to predict nuclear density, was assessed based on a bias-corrected version of the multiple coefficient of determination (R2). The bias-corrected R2 was estimated via the bootstrap validation function “validate” of the HMISC library of SpotfireSplus version 8.3 (TIBCO Inc., Palo Alto, CA). The bias-corrected R2 essentially represents the predicted R2 after subtracting out the optimism in the observed value of R2 induced by the fact that the model parameter estimates were optimized to predict the observed values of the response variable.
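The idea of subtracting bootstrap-estimated optimism from the observed R2 can be sketched as follows. This is not the “validate” function itself. It is a simplified illustration for a one-predictor least-squares model that resamples individual observations, whereas resampling whole patients may be more appropriate for clustered data. All function names, the number of resamples, and the data are hypothetical.

```python
import random

def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def r_squared(x, y, intercept, slope):
    """R2 of a given line evaluated on the data (x, y)."""
    my = sum(y) / len(y)
    sse = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    sst = sum((b - my) ** 2 for b in y)
    return 1.0 - sse / sst

def optimism_corrected_r2(x, y, n_boot=200, seed=0):
    """Return (apparent R2, mean optimism, bias-corrected R2)."""
    rng = random.Random(seed)
    n = len(x)
    apparent = r_squared(x, y, *fit_line(x, y))
    optimism = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [x[i] for i in idx]
        by = [y[i] for i in idx]
        if len(set(bx)) < 2 or len(set(by)) < 2:
            continue  # skip degenerate resamples with no variation
        b0, b1 = fit_line(bx, by)
        # Optimism: fit on the resample, then compare its R2 there
        # with its R2 on the original data.
        optimism.append(r_squared(bx, by, b0, b1) - r_squared(x, y, b0, b1))
    mean_optimism = sum(optimism) / len(optimism)
    return apparent, mean_optimism, apparent - mean_optimism

# Hypothetical data (NOT study data).
demo_x = [0.1, 0.4, 0.5, 0.9, 1.2, 1.3, 1.8, 2.0, 2.4, 2.6, 2.9, 3.1]
demo_y = [1.4, 0.9, 1.8, 2.6, 1.7, 2.9, 2.8, 3.9, 3.0, 4.6, 3.8, 4.9]
apparent_r2, mean_optimism, corrected_r2 = optimism_corrected_r2(demo_x, demo_y)
```

The corrected value is the apparent R2 minus the average amount by which a model fitted to a resample outperforms itself when applied back to the original data.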

Statistical software: The principal components analysis and the univariate and multivariate generalized estimating equation modeling were conducted with the SpotfireSplus version 8.3 (TIBCO Inc., Palo Alto, CA) statistical package.