Hall et al. Supplementary Material
Raman Analysis
Raman scattering was excited using a frequency-doubled, diode-pumped Nd:YAG laser with an excitation wavelength of 532.09 nm. A laser power of approximately 5-8 mW was obtained by using a density filter with an attenuation factor of 4. The background signal of the carrier material was reduced by setting the pinhole of the confocal microscope to 250 µm and the slit size to 1000 µm. Spectra were acquired with a grating of 600 lines mm⁻¹ (blazed at 500 nm) and an accumulation time of 40 s to improve the signal-to-noise ratio. The laser beam was centered visually on each cell and focused manually using a 100x/0.90 air objective. The Raman signal was collected over a spectral range of 400-2030 cm⁻¹ and subsequently reduced to the interval 440-1780 cm⁻¹; the average spectral resolution achieved was 1.6 cm⁻¹. We recorded and preprocessed spectra using the automatic baseline correction and normalization algorithms provided with Labspec Raman MS software (Version 5.25.15, HORIBA Jobin-Yvon, UK).
Multivariate analysis
Peak heights were standardized (i.e. normalized by subtracting the mean peak height and dividing by the standard deviation) to remove effects of scale and give all peaks the same weight in the analyses. No multivariate outliers were identified by Mahalanobis distances and conservative Chi-square tests (Tabachnick and Fidell 2007). Standardized peak heights were input to a Principal Component Analysis (PCA) and a Canonical Discriminant Analysis (CDA). PCA allowed an unconstrained assessment of variation in the data, while the CDA added the view of a constrained ordination in a hypothesis-testing framework assessing multivariate differentiation between a priori specified groups. Due to heterogeneity of the variance-covariance matrices we used non-parametric permutational MANOVA (Anderson 2001) to support the CDA (see below). For the CDA, we collapsed the 4-level factor resource N:P into a 2-level factor by combining the 2 lower (0.5 and 5) and the 2 higher (50 and 500) resource treatments. This was supported by the results of the PCA, which suggested the formation of two resource N:P groups (upper two resource N:P levels vs. lower two) rather than a continuous response along a resource N:P gradient. The CDA was performed on a full factorial design defined by 2 species x 2 growth phases x 2 resource quantities x 2 resource qualities (as N:P levels). Leave-one-out cross-validation (Anderson and Willis 2003) was used to generate unbiased estimates of classification success and group distinctness in multivariate space. We computed classification success separately for species, growth phase, resource quantity and resource N:P level, and for the overall CDA and individual CDA axes, to assess variation in the best discriminators (i.e. the Raman peaks) among levels of the various factors. Significance values for the classification statistics were obtained by permutational analyses (1000 permutations of the raw data).
This allowed us to evaluate which peaks contributed to differences between each of the treatment groups. Since directions of differences in multivariate space are not necessarily identical to directions of greatest variation, CDA can uncover patterns with reference to relevant (and testable) hypotheses (see e.g. Anderson and Willis 2003).
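The standardization and unconstrained PCA steps described above can be illustrated with a minimal Python sketch. It is not the software used for the analyses: the function names, the synthetic data and the choice of a singular value decomposition to obtain the principal components are assumptions made for illustration only.

```python
import numpy as np

def standardize(peaks):
    """Z-score each peak (column): subtract its mean, divide by its standard deviation."""
    peaks = np.asarray(peaks, dtype=float)
    return (peaks - peaks.mean(axis=0)) / peaks.std(axis=0, ddof=1)

def pca(z):
    """PCA via SVD of the column-centred matrix.

    Returns the scores (cells x components), the loadings (peaks x components)
    and the proportion of variance explained by each component.
    """
    centred = z - z.mean(axis=0)
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    scores = u * s
    explained = s**2 / np.sum(s**2)
    return scores, vt.T, explained

# Hypothetical data: 40 cells x 3 peaks measured on very different scales.
rng = np.random.default_rng(0)
peaks = rng.normal(loc=10.0, scale=[1.0, 5.0, 0.2], size=(40, 3))

z = standardize(peaks)                 # every peak now has mean 0 and SD 1
scores, loadings, explained = pca(z)   # unconstrained ordination of the cells
```

After standardization every peak contributes equally to the total variance, so the loadings reflect correlation structure among peaks rather than differences in raw peak height.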
To assess the variation of macromolecular composition among cells from a given treatment, we computed Euclidean distances (Legendre and Legendre 1998) to a multivariate mean (the centroid) based on the 15 standardized Raman peaks. Permutational tests were then used to compare these measures of dispersion (i.e. multivariate variation) among treatments (Anderson 2006). Euclidean distances among cells also served as input for non-parametric permutational MANOVA (i.e. “PERMANOVA”), which compares between-group distances to within-group distances by computing pseudo-F-values and assessing their significance by permutation (Anderson 2001; McArdle and Anderson 2001). This approach avoids the classical constraints associated with parametric procedures (e.g. distributional requirements of MANOVA), and is still fully compatible with PCA and CDA, which preserve Euclidean distances among data points (Legendre and Legendre 2003).
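As an illustration of the two distance-based quantities described above, the following Python sketch computes distances to group centroids and a permutational pseudo-F test. It is a simplified one-factor version under stated assumptions, not the full factorial design used in the study; the function names and the synthetic data are hypothetical. For Euclidean distances, the sums of squared deviations from centroids used here are equivalent to sums of squared inter-point distances divided by group size.

```python
import numpy as np

def distances_to_centroid(z, groups):
    """Euclidean distance of each cell to the centroid of its own group."""
    z = np.asarray(z, dtype=float)
    groups = np.asarray(groups)
    d = np.empty(len(z))
    for g in np.unique(groups):
        mask = groups == g
        d[mask] = np.linalg.norm(z[mask] - z[mask].mean(axis=0), axis=1)
    return d

def pseudo_f(z, groups):
    """One-way pseudo-F: between-group over within-group sum of squares,
    each divided by its degrees of freedom."""
    z = np.asarray(z, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    n, a = len(z), len(labels)
    ss_total = np.sum((z - z.mean(axis=0)) ** 2)
    ss_within = sum(
        np.sum((z[groups == g] - z[groups == g].mean(axis=0)) ** 2) for g in labels
    )
    return ((ss_total - ss_within) / (a - 1)) / (ss_within / (n - a))

def permanova(z, groups, n_perm=1000, seed=0):
    """Observed pseudo-F and its permutational p-value (group labels shuffled)."""
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    observed = pseudo_f(z, groups)
    count = sum(pseudo_f(z, rng.permutation(groups)) >= observed for _ in range(n_perm))
    return observed, (count + 1) / (n_perm + 1)

# Hypothetical data: 2 treatments x 20 cells x 15 standardized peaks.
rng = np.random.default_rng(1)
z = np.vstack([rng.normal(0.0, 1.0, (20, 15)), rng.normal(3.0, 1.0, (20, 15))])
groups = np.repeat(["low", "high"], 20)

dispersion = distances_to_centroid(z, groups)
f_obs, p_value = permanova(z, groups)
```

The `dispersion` values are the per-cell measures of multivariate variation; comparing them among treatments would require a further permutational test on those distances, which is omitted here.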
In leave-one-out cross-validation, CDA is performed after a single observation is removed from the data. The "left out" observation is then classified to a group using the canonical equations determined by all other observations, and its classification is compared to its known group membership. This is repeated for all observations, which allows overall classification success to be calculated.
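The leave-one-out procedure can be sketched in Python as follows. As a stand-in for the canonical equations of the CDA, this sketch assigns the left-out observation to the group with the nearest centroid in Mahalanobis distance, using the pooled within-group covariance matrix (a linear discriminant rule with equal group weights). That classifier, the function names and the synthetic data are assumptions for illustration and are not taken from the original analysis.

```python
import numpy as np

def classify(train_x, train_y, x_new):
    """Assign x_new to the group whose centroid is nearest in Mahalanobis
    distance, based on the pooled within-group covariance of the training data."""
    labels, idx = np.unique(train_y, return_inverse=True)
    centroids = np.array([train_x[train_y == g].mean(axis=0) for g in labels])
    resid = train_x - centroids[idx]                      # deviations from own centroid
    pooled = resid.T @ resid / (len(train_x) - len(labels))
    inv = np.linalg.pinv(pooled)
    diff = centroids - x_new
    d2 = np.einsum("ij,jk,ik->i", diff, inv, diff)        # squared distance per group
    return labels[np.argmin(d2)]

def loo_classification_success(x, y):
    """Proportion of observations correctly classified when each one is left
    out in turn and classified from all remaining observations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    hits = 0
    for i in range(len(x)):
        keep = np.arange(len(x)) != i
        hits += int(classify(x[keep], y[keep], x[i]) == y[i])
    return hits / len(x)

# Hypothetical data: 2 species x 30 cells x 5 standardized peaks.
rng = np.random.default_rng(2)
x = np.vstack([rng.normal(0.0, 1.0, (30, 5)), rng.normal(4.0, 1.0, (30, 5))])
y = np.repeat(["species_A", "species_B"], 30)

success = loo_classification_success(x, y)
```

Because the left-out observation never contributes to the centroids or the covariance matrix used to classify it, the resulting success rate is not inflated by fitting and testing on the same data.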
References for supplementary information
Anderson, M. J. (2006). "Distance-based tests for homogeneity of multivariate dispersions." Biometrics 62: 245-253.
Anderson, M. J. (2001). "A new method for non-parametric multivariate analysis of variance." Austral Ecol. 26: 32-46.
Legendre, L. and P. Legendre (2003). Numerical Ecology. Amsterdam, Elsevier Science.
McArdle, B. H. and M. J. Anderson (2001). "Fitting multivariate models to community data: a comment on distance-based redundancy analysis." Ecology 82: 290-297.
Tabachnick, B. G. and L. S. Fidell (2007). Using Multivariate Statistics. Boston, Allyn and Bacon.
Supplementary Figure Legends
Figure S1 - Classification success following a leave-one-out cross-validation scheme when using individual axes of the canonical discriminant analysis to discriminate between species, growth phases, levels of resource quantity and levels of resource N:P. Only the first 7 CDA axes are shown; further axes, though significant up to axis 13, have negligible discriminatory power for single factors. While CDA axis 1 discriminates strongly between species, CDA axes 2 and 3 separate resource N:P levels and growth phases, respectively. No CDA axis specifically separates levels of resource quantity. Full symbols indicate significantly increased classification success as determined by permutational analyses; empty symbols indicate classification success not different from that expected under random group allocation.