ELECTRONIC SUPPLEMENTARY MATERIAL

Online Resource 1: Preparation of species distribution maps

We first created a geographic layer of points by choosing the centre of the estuaries and lagoons where mangroves were recorded in 2011. Species composition and forest patch size were added as attributes. Second, we generated polygon buffers around each point, with areas proportional to the forest patch size and a radius ranging from 1 km to 10 km. These buffers were used as an estimate of the surface covered by the mangrove forests. From the polygon layers, we then derived raster layers of the distribution of the mangrove forest and the three species at a spatial resolution of 30 arc seconds. Finally, we corrected for elevation by overlaying the raster layers of mangrove forest and species occurrences on a Digital Elevation Model (DEM, source: WorldClim, Hijmans et al., 2005) and selecting cells of the DEM with values < 20 m above mean sea level.
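The elevation correction in the final step can be sketched as follows. This is a minimal illustration only, assuming the occurrence raster and the DEM are already aligned on the same grid and held as nested lists; the function name and the toy values are hypothetical, and no-data handling is omitted.

```python
ELEVATION_LIMIT_M = 20.0  # threshold stated in the text (m above mean sea level)


def mask_by_elevation(occurrence, dem, limit=ELEVATION_LIMIT_M):
    """Keep an occurrence cell only where the DEM value is below `limit`.

    `occurrence` and `dem` are equally shaped nested lists (rows of cells).
    Cells at or above the limit are set to 0 (treated as not occupied).
    """
    return [
        [occ if elev < limit else 0 for occ, elev in zip(occ_row, dem_row)]
        for occ_row, dem_row in zip(occurrence, dem)
    ]


# Toy 2 x 2 example (hypothetical values).
occurrence = [[1, 1], [0, 1]]
dem = [[5.0, 25.0], [3.0, 19.9]]
masked = mask_by_elevation(occurrence, dem)
```

In this toy grid, the only occupied cell that is dropped is the one lying at 25 m.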

We derived absences from estuaries rather than from the whole coastline. This should prevent the modelling procedure from giving weight to absences at sites that are unsuitable for mangrove species because of wave action or coastal geomorphology. There are 199 estuaries along the east coast of South Africa from the border with Mozambique to Gamtoos, ±150 km south of the current mangrove latitudinal limit (Source: Consortium for Estuarine Research and Management of South Africa (CERM), www.nmmu.ac.za/cerm/images/Estuaries2/, compiled by Alan Whitfield in 2003, accessed on February 1st, 2011). To extract these estuaries, we followed an approach similar to the one used for the mangrove forests, except that we used the same buffer (radius of 1 km) for all estuaries. Finally, we combined the mangrove occurrence and absence raster layers to create a dataset for both calibration and projection of the SDMs. In total, our dataset contained 1106 pixels.

Online Resource 2: Environmental variable preparation and selection

We selected three variables from a set of 21 environmental variables: the 19 bioclim variables (Hijmans et al., 2005), growing degree days (GDD) and water balance (WBAL). GDD and WBAL were calculated as follows:

GDD:

(1) Monthly GDD = (Tmean – Tbase) * NoDays if Tmean > Tbase, else 0

where Tbase = 18°C and NoDays = number of days in that month

(2) GDD = sum (monthly GDD)

WBAL:

WBAL = sum (monthly P – monthly PET), where monthly PET = 58.93/12 * monthly Tmean
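As a worked illustration of the two formulas above, the calculation for a single grid cell could be sketched as follows. This is a minimal sketch, not the code used in the study: the function names are hypothetical, a non-leap year is assumed for the number of days per month, and PET is applied exactly as written above (no special treatment of negative temperatures).

```python
# Days per month for a non-leap year (assumption: leap years are ignored).
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
T_BASE = 18.0  # base temperature in degrees C, as stated in the text


def annual_gdd(monthly_tmean, t_base=T_BASE):
    """Sum of monthly GDD; months with Tmean <= t_base contribute 0."""
    total = 0.0
    for tmean, n_days in zip(monthly_tmean, DAYS_IN_MONTH):
        if tmean > t_base:
            total += (tmean - t_base) * n_days
    return total


def annual_wbal(monthly_precip, monthly_tmean):
    """Sum over months of P - PET, with monthly PET = 58.93 / 12 * Tmean."""
    return sum(
        p - 58.93 / 12.0 * t for p, t in zip(monthly_precip, monthly_tmean)
    )


# Hypothetical cell: constant 20 degrees C and 100 mm of rain every month.
gdd = annual_gdd([20.0] * 12)
wbal = annual_wbal([100.0] * 12, [20.0] * 12)
```

For this hypothetical cell, every day contributes 2 degree days, so the annual GDD is 2 x 365 = 730.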

The relationships between the 21 variables were first visually assessed with a PCA correlation circle (Fig. Online Resource 2a) and then quantitatively assessed with pair-wise correlations using Spearman’s rho correlation coefficient (Table Online Resource 2). Only predictors with a correlation lower than 0.7 were considered (Elith et al., 2006; Wisz & Guisan, 2009). Among the least correlated variables, we finally selected those with the most direct (sensu Austin & Smith, 1989; see also Guisan & Zimmermann, 2000) impact on mangrove species’ physiology, in particular variables that can potentially control their leading edge.
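The pair-wise screening step can be illustrated with the short sketch below. It is a plain-Python illustration under stated assumptions, not the procedure used in the study: the function names and the toy data are hypothetical, ties receive average ranks, and the 0.7 threshold is applied to the absolute value of rho (the text does not state how negative correlations were treated).

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def highly_correlated_pairs(table, threshold=0.7):
    """Return (name_a, name_b, rho) for every pair with |rho| > threshold."""
    flagged = []
    for a, b in combinations(sorted(table), 2):
        rho = spearman_rho(table[a], table[b])
        if abs(rho) > threshold:
            flagged.append((a, b, rho))
    return flagged


# Toy data (hypothetical): y increases monotonically with x, z does not.
toy = {"x": [1, 2, 3, 4, 5], "y": [1, 4, 9, 16, 25], "z": [3, 1, 4, 2, 5]}
flagged = highly_correlated_pairs(toy)
```

In the toy data, only the pair (x, y) exceeds the threshold; z has a rho of 0.5 with both x and y and would therefore be retained alongside either of them.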

The three predictor variables selected are (1) monthly minimum temperature of the coldest month (TMIN), (2) GDD and (3) WBAL. TMIN represents extreme low winter temperatures that can cause direct damage to mangrove trees by chilling or freezing (Sakai & Larcher, 1987). GDD is a proxy for the period during which the species can grow and complete its life cycle. WBAL is a proxy for relative humidity: low relative humidity causes water stress similar to that caused by high soil salinity. Correlation between these three variables is low (≤0.5), but each of them is highly correlated (>0.7) with some other variables: TMIN with precipitation (P) of the wettest month, P of the wettest quarter and P of the warmest quarter; GDD with annual mean temperature (T), diurnal T range, T seasonality, max. T of the warmest month, yearly T range, T of the wettest quarter, T of the warmest quarter and T of the coldest quarter; WBAL with annual P (Table Online Resource 2, Fig. Online Resource 2a). Thus, the three ecologically meaningful variables we selected cover a wide range of the variation in the climatological data of the study area.

Table Online Resource 2: Correlations between the 21 climate variables, derived from WorldClim (Hijmans et al., 2005). Bio1: Annual Mean Temperature, bio2: Mean Diurnal Range, bio3: Isothermality, bio4: Temperature Seasonality, bio5: Max Temperature of Warmest Month, bio6: Min Temperature of Coldest Month, bio7: Temperature Annual Range, bio8: Mean Temperature of Wettest Quarter, bio9: Mean Temperature of Driest Quarter, bio10: Mean Temperature of Warmest Quarter, bio11: Mean Temperature of Coldest Quarter, bio12: Annual Precipitation, bio13: Precipitation of Wettest Month, bio14: Precipitation of Driest Month, bio15: Precipitation Seasonality, bio16: Precipitation of Wettest Quarter, bio17: Precipitation of Driest Quarter, bio18: Precipitation of Warmest Quarter, bio19: Precipitation of Coldest Quarter, Gdd: growing degree days, wbal: water balance. Values marked in yellow indicate highly correlated pairs (>0.7).

Figure Online Resource 2b: Frequency distribution of daily mean temperature records (period 1960 to 2010) of East London (Nahoon estuary) and Durban. The long-term yearly mean temperature from WorldClim (red line) and the long-term yearly mean value from daily records (blue line) are shown for these two estuaries. The WorldClim dataset closely reflects the mean temperature conditions.


Online Resource 3: Technical details of the modelling techniques

In both GLMs and GAMs, an Akaike information criterion (AIC)-based stepwise procedure in both directions was used to select the most significant predictors (Akaike, 1974). Up to third-order polynomials were allowed for each predictor in GLMs and up to three degrees of freedom were allowed for the smoothing spline functions in GAMs. For GBMs, we used the algorithm as implemented in the R package “gbm: Generalized Boosted Regression Models” available on the R website (http://cran.r-project.org). GBMs were calibrated with a maximum number of trees set to 2000, a fivefold cross-validation procedure to select the optimal number of trees to be kept, and a maximum depth of variable interactions of seven.

Online Resource 4: Technical details of the area under the curve (AUC) evaluation

The AUC evaluation value varies from 0.5 for a model whose predictions are no better than random, to 1 for a model achieving perfect agreement with the observed data. The data-splitting procedure for calibration and evaluation was repeated 100 times and the evaluation metrics were averaged. Note that while model evaluation was carried out using the above-mentioned data-splitting procedure, the final models used for spatial projections were calibrated using 100% of the data, thereby taking advantage of all available data.
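The two reference values of the AUC and the averaging over repeated evaluations can be illustrated with the short sketch below. It is a plain-Python illustration, not the code used in the study: the function names and example values are hypothetical, and the AUC is computed in its rank-based form, as the proportion of presence-absence pairs in which the presence receives the higher prediction (ties count as one half).

```python
def auc(observed, predicted):
    """AUC as the share of (presence, absence) pairs ranked correctly.

    `observed` holds 1 (presence) or 0 (absence); `predicted` holds the
    model scores. Tied scores count as half a correctly ranked pair.
    """
    pos = [p for o, p in zip(observed, predicted) if o == 1]
    neg = [p for o, p in zip(observed, predicted) if o == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_and_sd(values):
    """Mean and sample standard deviation of the AUCs from repeated splits."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1) if n > 1 else 0.0
    return mean, var ** 0.5


# Hypothetical evaluation set with two presences and two absences.
observed = [1, 1, 0, 0]
perfect = auc(observed, [0.9, 0.8, 0.2, 0.1])        # all pairs ranked correctly
uninformative = auc(observed, [0.5, 0.5, 0.5, 0.5])  # every pair is a tie
```

Here the perfectly ranked predictions give an AUC of 1 and the constant predictions give 0.5. In a repeated data-splitting design, `mean_and_sd` would be applied to the list of AUC values obtained from the individual splits.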

Online Resource 5

Table Online Resource 5: Mean and standard deviation of the area under the curve (AUC) of a receiver–operating characteristic (ROC) plot, calculated by cross-validation for the three SDM techniques (GLM, GAM and GBM) and the two types of models (CLIM: climate variables only; +HP: climate models including the human perturbation variable). AMAR: A. marina, BGYM: B. gymnorrhiza, RMUC: R. mucronata, MANG: mangrove forest.

Online Resource 6

Table Online Resource 6: Predictions of models for all estuaries within the observed range. Observed versus predicted occurrences for the three SDM techniques (GLM, GAM and GBM) and the three types of models (CLIM: climate variables only; +HP: climate models including the human perturbation variable; +RMC: climate models including the geomorphic perturbation variable). AMAR: A. marina, BGYM: B. gymnorrhiza, RMUC: R. mucronata, MANG: mangrove forest.