Recommendations to mitigate potential sources of error in preparation of biomass sorghum samples for compositional analyses used in industrial and forage applications
BioEnergy Research
Matthew B. Whitfield, Mari S. Chinn*, and Matthew W. Veal
*Corresponding Author: Biological and Agricultural Engineering, North Carolina State University, Campus Box 7625, Raleigh, NC, 27695-7625, USA; Ph: 919-515-6744; Fax: 919-515-6719; email:
Online Resource 1:
Covariance Structures of Stalk Position Effects
We treated stalk position as a repeated measure because we expected the variation in the measured characteristics to be correlated among the stalk positions from which the samples were taken. Based on previous studies, we expected the characteristics of interest to vary to some extent with stalk position. It is reasonable to expect that the conditions in one part of a given stalk are correlated with the conditions at another position on that stalk, and that this correlation weakens as the distance between the two positions increases. For this reason, we fitted the first-order autoregressive (AR1) and Toeplitz (TOEP) structures to the data; we also fitted the unstructured (UN) and compound symmetry (CS) structures to encompass the extremes of constraint. However, because any variation in maturity between individual stalks will affect the characteristics of some portions of the stalk (most likely the upper ones) more than others, we also fitted heterogeneous variants of these models (ARH1, TOEPH, and CSH), as well as a first-order antedependence (ANTE1) model[1].
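As a rough illustration of how these structures differ, the sketch below builds CS, AR1, and ARH1 covariance matrices using their standard textbook forms. The number of positions, the variances, and the correlation are placeholder values chosen for illustration; they are not estimates from this study.

```python
def cs_cov(n_positions, sigma2, rho):
    """Compound symmetry: one variance, one correlation shared by every pair."""
    return [[sigma2 if i == j else sigma2 * rho for j in range(n_positions)]
            for i in range(n_positions)]


def ar1_cov(n_positions, sigma2, rho):
    """AR(1): one variance; correlation decays as rho**|i - j| with separation."""
    return [[sigma2 * rho ** abs(i - j) for j in range(n_positions)]
            for i in range(n_positions)]


def arh1_cov(sigmas, rho):
    """ARH(1): same rho**|i - j| decay, but a separate std. dev. per position."""
    n = len(sigmas)
    return [[sigmas[i] * sigmas[j] * rho ** abs(i - j) for j in range(n)]
            for i in range(n)]


# Hypothetical values, for illustration only.
N_POSITIONS = 5
cs = cs_cov(N_POSITIONS, sigma2=1.0, rho=0.6)
ar1 = ar1_cov(N_POSITIONS, sigma2=1.0, rho=0.6)
arh1 = arh1_cov([1.0, 1.1, 1.2, 1.5, 2.0], rho=0.6)

for name, matrix in (("CS", cs), ("AR1", ar1), ("ARH1", arh1)):
    print(name)
    for row in matrix:
        print("  " + "  ".join(f"{value:6.3f}" for value in row))
```

Under CS every pair of positions shares the same covariance, whereas under AR1 and ARH1 the covariance falls off with the number of positions separating the two samples; ARH1 additionally lets the variance differ by position.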
The fit statistics resulting from these models for the pith ratio data are shown in Table OR1-1. Generally speaking, the heterogeneous models performed better than their homogeneous counterparts. Based on the AIC, AICC, and BIC, the leading models were the AR1, ARH1, and ANTE1, all of which reasonably account for the expected covariance structure. However, the individual fit statistics disagree slightly on the ranking of the ARH1 and the ANTE1: AIC and BIC marginally favor the ANTE1, whereas AICC marginally favors the ARH1. The -2 residual log likelihood also shows that the UN structure attains the best unpenalized fit; its poorer AIC, AICC, and BIC values result from the penalty these statistics apply for the number of parameters used.
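The effect of the parameter penalty can be checked directly against the pith ratio values in Table OR1-1. The sketch below assumes the usual form AIC = (-2 residual log likelihood) + 2q, where q is the number of covariance parameters; the parameter counts it prints are implied by the tabled values under that assumption and are not reported in the source.

```python
# -2 residual log likelihood and AIC for pith ratio, copied from Table OR1-1.
neg2_res_ll = {
    "CS": -520.9, "AR1": -556.1, "TOEP": -557.8, "CSH": -528.2,
    "ARH1": -565.8, "ANTE1": -572.2, "TOEPH": -567.3, "UN": -579.9,
}
aic = {
    "CS": -514.9, "AR1": -550.1, "TOEP": -545.8, "CSH": -514.2,
    "ARH1": -551.8, "ANTE1": -552.2, "TOEPH": -547.3, "UN": -547.9,
}

# Assumption: AIC = -2 Res LL + 2q, so q = (AIC - (-2 Res LL)) / 2.
implied_params = {
    model: round((aic[model] - neg2_res_ll[model]) / 2) for model in aic
}

best_unpenalized = min(neg2_res_ll, key=neg2_res_ll.get)
best_by_aic = min(aic, key=aic.get)

for model in aic:
    print(f"{model:6s} -2RLL={neg2_res_ll[model]:8.1f} "
          f"AIC={aic[model]:8.1f} implied q={implied_params[model]}")
print("Best unpenalized fit:", best_unpenalized)
print("Best AIC:", best_by_aic)
```

The UN structure has the smallest -2 residual log likelihood but also by far the largest implied parameter count, which is why it drops behind the more constrained structures once the penalty is applied.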
To gain more insight into the relationships between these models, we compared them using likelihood ratio tests[2]. The difference between the -2 residual log likelihoods of two nested models approximately follows a chi-square distribution with degrees of freedom equal to the difference in the number of parameters used by the models. A difference that is not significant indicates that there is no evidence that the more general model is needed. Because the models we compared are nested, they could be compared directly. The only result significant at the 5% level was found in comparing the ARH1 with the AR1 (p = 0.045). Opting for the ANTE1 on the basis of the 0.096 significance level might be justifiable, but we elected to use the more parsimonious ARH1 model.
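The likelihood ratio test described above can be sketched in a few lines of standard-library Python. The helper names `chi2_sf` and `likelihood_ratio_test` are illustrative and are not taken from the software used for the analysis; the chi-square upper-tail probability is computed from its closed-form series for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_k (x/2)^(k - 1/2) / Gamma(k + 1/2)
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        total += half ** (k - 0.5) / math.gamma(k + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def likelihood_ratio_test(neg2ll_reduced, neg2ll_general, df):
    """Return (chi-square statistic, p-value) for two nested models.

    df is the difference in the number of covariance parameters.
    """
    statistic = neg2ll_reduced - neg2ll_general
    return statistic, chi2_sf(statistic, df)


# Chi-square statistics and degrees of freedom as listed in Table OR1-1.
comparisons = [
    ("ARH1 versus Unstructured", 14.0575, 9),
    ("ARH1 versus AR1", 9.75295, 4),
    ("ARH1 versus ANTE1", 6.33601, 3),
]
for label, statistic, df in comparisons:
    print(f"{label}: chi-square={statistic}, DF={df}, p={chi2_sf(statistic, df):.5f}")

# The same test starting from the rounded -2 Res Log Likelihood values (AR1, ARH1).
stat, p_value = likelihood_ratio_test(-556.1, -565.8, df=4)
print(f"ARH1 versus AR1 from rounded table values: chi-square={stat:.1f}, p={p_value:.4f}")
```

Because the tabled -2 residual log likelihoods are rounded to one decimal place, statistics recomputed from them differ slightly from the listed chi-square values.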
Results and conclusions were similar in the loss on drying (LOD) comparison between the calculated and whole samples and between the pith and rind samples (Tables OR1-2 and OR1-3, respectively). In the case of the pith and rind moisture recovery comparison, there was little evidence suggesting the use of the ARH1 over the AR1; however, there was strong evidence favoring the use of the ANTE1 over either (Table OR1-4), so the ANTE1 was used for that analysis. Considering the strength of the likelihood ratio tests, it is tempting to speculate about the reason for the differences in the covariance matrices between the hygroscopicity analysis and the other analyses; however, the fact that the hygroscopicity test was based on a random, unbalanced subset of the other tests probably makes any firm conclusions impossible. It may be worth noting, simply, that the preference for the ANTE1 model implies that the correlation in the variation of the moisture reabsorption between samples from different stalk positions may depend more on the actual position of the samples (in addition to their distance from one another) than is the case with LOD. This could imply that the relationship between retained moisture after drying at 45°C and recovered moisture after drying at 105°C is not exact (e.g., hysteresis effects), which could be the result of irreversible processes occurring during the 105°C drying (such as caramelization of sugars) that proceed to different extents at different stalk positions.
Table OR1-1. Fit statistics and likelihood ratio tests for stalk position covariance models for pith ratio. Model acronyms: compound symmetry (CS), first-order autoregressive (AR1), Toeplitz (TOEP), first-order antedependence (ANTE1), and unstructured (UN). A trailing H (CSH, ARH1, TOEPH) denotes the heterogeneous variant of a model.
Stalk Position Covariance Fit Statistics for Pith Ratio
Description / CS / AR1 / TOEP / CSH / ARH1 / ANTE1 / TOEPH / UN
-2 Res Log Likelihood / -520.9 / -556.1 / -557.8 / -528.2 / -565.8 / -572.2 / -567.3 / -579.9
AIC (smaller is better) / -514.9 / -550.1 / -545.8 / -514.2 / -551.8 / -552.2 / -547.3 / -547.9
AICC (smaller is better) / -514.8 / -550.0 / -545.5 / -513.7 / -551.4 / -551.3 / -546.4 / -545.7
BIC (smaller is better) / -515.1 / -550.2 / -546.2 / -514.6 / -552.2 / -552.7 / -547.8 / -548.8
Likelihood Ratio Test: ARH1 versus Unstructured
Chi-Square / DF / PrChiSq
14.0575 / 9 / 0.12029
Likelihood Ratio Test: ARH1 versus AR1
Chi-Square / DF / PrChiSq
9.75295 / 4 / 0.044801
Likelihood Ratio Test: ARH1 versus ANTE1
Chi-Square / DF / PrChiSq
6.33601 / 3 / 0.096359
Table OR1-2. Fit statistics and likelihood ratio tests for stalk position covariance models for calculated versus whole loss on drying. Model acronyms: compound symmetry (CS), first-order autoregressive (AR1), Toeplitz (TOEP), first-order antedependence (ANTE1), and unstructured (UN). A trailing H (CSH, ARH1, TOEPH) denotes the heterogeneous variant of a model.
Stalk Position Covariance Fit Statistics for Calculated versus Whole Loss on Drying
Description / CS / AR1 / TOEP / CSH / ARH1 / ANTE1 / TOEPH / UN
-2 Res Log Likelihood / -2605.9 / -2656.6 / -2662.4 / -2618.9 / -2679.4 / -2680.5 / -2685.7 / -2695.6
AIC (smaller is better) / -2599.9 / -2650.6 / -2650.4 / -2604.9 / -2665.4 / -2660.5 / -2665.7 / -2663.6
AICC (smaller is better) / -2599.8 / -2650.6 / -2650.2 / -2604.7 / -2665.2 / -2660.1 / -2665.3 / -2662.5
BIC (smaller is better) / -2600.5 / -2651.3 / -2651.6 / -2606.4 / -2666.9 / -2662.6 / -2667.8 / -2666.9
Likelihood Ratio Test: ARH1 versus Unstructured
Chi-Square / DF / PrChiSq
16.1300 / 9 / 0.064215
Likelihood Ratio Test: ARH1 versus AR1
Chi-Square / DF / PrChiSq
22.8065 / 4 / 0.000138411
Likelihood Ratio Test: ARH1 versus TOEPH
Chi-Square / DF / PrChiSq
6.25797 / 3 / 0.099712
Likelihood Ratio Test: ARH1 versus ANTE1
Chi-Square / DF / PrChiSq
1.08785 / 3 / 0.78001
Table OR1-3. Fit statistics and likelihood ratio tests for stalk position covariance models for pith versus rind loss on drying. Model acronyms: compound symmetry (CS), first-order autoregressive (AR1), Toeplitz (TOEP), first-order antedependence (ANTE1), and unstructured (UN). A trailing H (CSH, ARH1, TOEPH) denotes the heterogeneous variant of a model.
Stalk Position Covariance Fit Statistics for Pith versus Rind Loss on Drying
Description / CS / AR1 / TOEP / CSH / ARH1 / ANTE1 / TOEPH / UN
-2 Res Log Likelihood / -2605.9 / -2656.6 / -2662.4 / -2618.9 / -2679.4 / -2680.5 / -2685.7 / -2695.6
AIC (smaller is better) / -2599.9 / -2650.6 / -2650.4 / -2604.9 / -2665.4 / -2660.5 / -2665.7 / -2663.6
AICC (smaller is better) / -2599.8 / -2650.6 / -2650.2 / -2604.7 / -2665.2 / -2660.1 / -2665.3 / -2662.5
BIC (smaller is better) / -2600.5 / -2651.3 / -2651.6 / -2606.4 / -2666.9 / -2662.6 / -2667.8 / -2666.9
Likelihood Ratio Test: ARH1 versus Unstructured
Chi-Square / DF / PrChiSq
16.1300 / 9 / 0.064215
Likelihood Ratio Test: ARH1 versus AR1
Chi-Square / DF / PrChiSq
22.8065 / 4 / 0.000138411
Likelihood Ratio Test: ARH1 versus TOEPH
Chi-Square / DF / PrChiSq
6.25797 / 3 / 0.099712
Likelihood Ratio Test: ARH1 versus ANTE1
Chi-Square / DF / PrChiSq
1.08785 / 3 / 0.78001
Table OR1-4. Fit statistics and likelihood ratio tests for stalk position covariance models for pith versus rind moisture recovery. Model acronyms: compound symmetry (CS), first-order autoregressive (AR1), Toeplitz (TOEP), first-order antedependence (ANTE1), and unstructured (UN). A trailing H (CSH, ARH1, TOEPH) denotes the heterogeneous variant of a model.
Stalk Position Covariance Fit Statistics for Pith versus Rind Moisture Recovery
Description / CS / AR1 / TOEP / CSH / ARH1 / ANTE1 / TOEPH / UN
-2 Res Log Likelihood / 1786.7 / 1683.3 / 1680.8 / 1777.1 / 1677.9 / 1659.3 / 1675.9 / 1650.6
AIC (smaller is better) / 1792.7 / 1689.3 / 1692.8 / 1791.1 / 1691.9 / 1679.3 / 1695.9 / 1682.6
AICC (smaller is better) / 1792.7 / 1689.4 / 1692.9 / 1791.4 / 1692.2 / 1679.7 / 1696.3 / 1683.7
BIC (smaller is better) / 1792.9 / 1689.6 / 1693.2 / 1791.7 / 1692.5 / 1680.1 / 1696.7 / 1683.8
Likelihood Ratio Test: ARH1 versus AR1
Chi-Square / DF / PrChiSq
5.40569 / 4 / 0.24814
Likelihood Ratio Test: ARH1 versus ANTE1
Chi-Square / DF / PrChiSq
18.6781 / 3 / 0.000318659
Likelihood Ratio Test: AR1 versus ANTE1
Chi-Square / DF / PrChiSq
24.0838 / 7 / 0.001101340
Likelihood Ratio Test: ANTE1 versus Unstructured
Chi-Square / DF / PrChiSq
8.68486 / 6 / 0.19209
References
1. Wolfinger RD (1996) Heterogeneous Variance: Covariance Structures for Repeated Measures. J Agric Biol Environ Stat 1:205–230. doi: 10.2307/1400366
2. 37107 - Comparing covariance structures in PROC MIXED. Accessed 5 Feb 2014