Electronic supplement:

The distribution of pin-point plant cover data

The measurement of pin-point cover data is a binomial process in which a single pin in the pin-point frame either hits or does not hit a specific species. Furthermore, two important characteristics of the typical distribution of plant species have to be taken into account when describing the distribution of pin-point cover data: i) at some sites a specific plant species may be totally absent due to random extinction events and/or limited possibilities for the plant to colonise the habitat (MacArthur and Wilson, 1967; Rees et al., 2001; Leibold et al., 2004; Cordonnier et al., 2006), and the pin-point cover data will therefore be zero-inflated; ii) if a plant species is present at a site, its abundance generally displays an aggregated spatial pattern within the site due to, e.g., the size of the plant, clonal growth, and limited seed dispersal, and plant cover data will typically be over-dispersed relative to random expectations (Pacala and Levin, 1997; Herben et al., 2000; Stoll and Weiner, 2000). Consequently, the distribution of the pin-point plant cover was estimated using a zero-inflated generalised binomial distribution (Damgaard, 2008; Damgaard, 2009),

(S1),

where the parameters are explained in Table S1 and $(a)_b = \Gamma(a+b)/\Gamma(a)$ is the Pochhammer function. The mean is independent of the intra-class correlation parameter and is thus equal to the mean of the uncorrelated zero-inflated binomial distribution, $(1-p)\,n\,q$. The variance, however, does depend on the intra-class correlation parameter: if the correlation is positive, the variance of the number of hits is augmented relative to the binomial distribution.
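As a rough illustration, the sketch below evaluates a zero-inflated generalised binomial probability in Python. It assumes the Pólya-Eggenberger (beta-binomial) form with shape parameters q/delta and (1 - q)/delta. The name `delta` for the correlation parameter, this particular parameterisation, and the function names are assumptions made for the example and may differ from (S1).

```python
import math

def log_pochhammer(a, b):
    """Log of the Pochhammer function (a)_b = Gamma(a + b) / Gamma(a)."""
    return math.lgamma(a + b) - math.lgamma(a)

def generalised_binomial_pmf(y, n, q, delta):
    """P(Y = y) for n pins, cover 0 < q < 1 and correlation parameter delta >= 0.

    Assumed beta-binomial form with shapes q/delta and (1 - q)/delta;
    delta = 0 is treated as the ordinary binomial limit.
    """
    log_choose = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    if delta == 0:
        return math.exp(log_choose + y * math.log(q) + (n - y) * math.log(1 - q))
    log_p = (log_choose
             + log_pochhammer(q / delta, y)
             + log_pochhammer((1 - q) / delta, n - y)
             - log_pochhammer(1 / delta, n))
    return math.exp(log_p)

def zero_inflated_pmf(y, n, p, q, delta):
    """Mix a point mass at zero (weight p) with the generalised binomial."""
    extra_zero = p if y == 0 else 0.0
    return extra_zero + (1 - p) * generalised_binomial_pmf(y, n, q, delta)

# Hypothetical parameter values, chosen only for illustration
n, p, q, delta = 16, 0.3, 0.25, 0.2
probs = [zero_inflated_pmf(y, n, p, q, delta) for y in range(n + 1)]
mean = sum(y * pr for y, pr in enumerate(probs))
print("total probability:", sum(probs))
print("mean number of hits:", mean, "vs (1 - p) * n * q =", (1 - p) * n * q)
```

Under this assumed form the probabilities sum to one and the mean does not change with `delta`, in line with the properties stated above.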

Table S1. The parameters in the zero-inflated generalised binomial distribution.

Parameter / Description
y / Number of times a specific plant species is hit in a pin-point frame.
n / Number of grid points in the grid point frame (n = 16).
p / Probability that the plant species is absent from a site. The parameter is estimated as: (number of sampled plots of a specific community type at stations where the species has not been found) / (number of sampled plots of that community type).
q / Expected plant cover in a specific community type if the species is found at the site.
 / Intra-plot correlation of pin-point hits. For plant species with a large average cover or plant species that tend to be spatially aggregated, e.g. clonal plants, the hits within a pin-point frame are positively correlated. The hypothesis of no correlation (binomially distributed hits) may be tested in a likelihood ratio test by setting this parameter to 0.
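The likelihood ratio test of no correlation mentioned in Table S1 could be sketched as follows. This is a minimal example under several assumptions: the beta-binomial parameterisation with shapes q/delta and (1 - q)/delta, the name `delta`, a coarse grid search in place of a proper optimiser, and invented hit counts. The p-value uses the chi-square distribution with one degree of freedom, which is likely conservative because the null value lies on the boundary of the parameter space.

```python
import math

def log_gb(y, n, q, delta):
    """Log-probability of y hits out of n; delta = 0 gives the binomial."""
    lc = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    if delta == 0:
        return lc + y * math.log(q) + (n - y) * math.log(1 - q)
    a, b = q / delta, (1 - q) / delta
    return (lc + math.lgamma(a + y) - math.lgamma(a)
            + math.lgamma(b + n - y) - math.lgamma(b)
            - math.lgamma(a + b + n) + math.lgamma(a + b))

def log_lik(hits, n, q, delta):
    return sum(log_gb(y, n, q, delta) for y in hits)

def lr_test_no_correlation(hits, n):
    """Compare the binomial (delta = 0) with the generalised binomial."""
    qs = [i / 100 for i in range(1, 100)]       # grid for q in (0, 1)
    deltas = [j / 20 for j in range(0, 61)]     # grid for delta in [0, 3]
    ll_null = max(log_lik(hits, n, q, 0) for q in qs)
    ll_full = max(log_lik(hits, n, q, d) for q in qs for d in deltas)
    lr = 2 * (ll_full - ll_null)                # >= 0 because delta = 0 is on the grid
    p_value = math.erfc(math.sqrt(lr / 2))      # chi-square survival function, 1 df
    return lr, p_value

# Invented, strongly aggregated hit counts from 12 plots with 16 pins each
hits = [0, 0, 1, 16, 15, 0, 2, 14, 0, 16, 1, 0]
lr, p_value = lr_test_no_correlation(hits, 16)
print("likelihood ratio statistic:", lr, "p-value:", p_value)
```

With real data a numerical optimiser would replace the grid, and zero-inflation would be handled as in (S1).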

Regression of pin-point plant cover to an independent variable

The effect of an environmental gradient, x, on the cover of a specific plant species was modelled by fitting the probability q in (S1) with four different functions of x (Damgaard, 2006; Damgaard, 2008). The four response functions may be described as M1) constant, M2) monotonically increasing or decreasing, M3) monotonically increasing or decreasing sigmoid with a threshold level, or M4) with an optimum at an intermediate level of x.

The functional relationships of the sigmoid model (M3) and the optimum model (M4) are expressed using a generalised standardised sigmoid function that, unlike the commonly used sigmoid functions, returns values in the interval between zero and one and is therefore well-suited for modelling probabilities:

(S2),

where x0 is the point of inflection ,. Thus, x0 is the parameter that measures the level of the environmental gradient, where the largest change in h (x) occurs and can be interpreted as the threshold level of the environment. The function, h (x), has the property that and . If ad, h (x) is a strictly decreasing function, and if ad, h (x) is a strictly increasing function. If b = 0 or a = d, then h (x) = a. If then the curve is sigmoid and if , then the curve is either convex or concave (Damgaard, 2008).

With the use of the standardised sigmoid function (S2) the four different models may be expressed as:

M1.,

M2. (S3)

M3.

M4.

Various hypotheses on the parameters in the models can be tested using likelihood ratio tests, e.g. an a priori hypothesis on a possible threshold level of the studied environmental component that controls the presence of the specific species. Alternatively, the Bayesian joint posterior density of the parameters may be calculated using Markov chain Monte Carlo (MCMC) methods (Metropolis-Hastings algorithm; Carlin and Louis, 1996).
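To make the MCMC step concrete, here is a minimal random-walk Metropolis-Hastings sketch in Python. For brevity it samples only a single cover probability q from a plain binomial likelihood with a flat prior, which is a deliberate simplification of the models above. The data, step size, and function names are hypothetical.

```python
import math
import random

def log_posterior(q, hits, n):
    """Binomial log-likelihood with a flat prior on 0 < q < 1 (assumed)."""
    if not 0 < q < 1:
        return -math.inf
    total = sum(hits)
    return total * math.log(q) + (n * len(hits) - total) * math.log(1 - q)

def metropolis_hastings(hits, n, steps=20000, step_sd=0.05, seed=1):
    rng = random.Random(seed)
    q = 0.5                                     # arbitrary starting value
    current = log_posterior(q, hits, n)
    samples = []
    for _ in range(steps):
        proposal = q + rng.gauss(0, step_sd)    # symmetric random-walk proposal
        candidate = log_posterior(proposal, hits, n)
        # Accept with probability min(1, posterior ratio)
        if rng.random() < math.exp(min(0.0, candidate - current)):
            q, current = proposal, candidate
        samples.append(q)
    return samples

# Invented hit counts from six plots with 16 pins each
hits = [3, 5, 4, 2, 6, 4]
samples = metropolis_hastings(hits, 16)
kept = samples[2000:]                           # discard burn-in
print("posterior mean of q:", sum(kept) / len(kept))
```

For this toy problem the posterior is a Beta(25, 73) distribution, so the sampled mean should lie close to 25/98, about 0.255. This gives a simple check that the sampler behaves sensibly before it is applied to a model with several correlated parameters.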

A critical phase when selecting among the three complementary models is to decide on the number of free parameters in the models. Due to correlation among the parameters, the "effective number of free parameters" is generally less than the actual number of parameters (Spiegelhalter et al., 2002; Damgaard, 2006). Under certain assumptions, the effective number of free parameters may be estimated from the MCMC samples of each of the three models as the mean of the sampled deviances minus the deviance evaluated at the mean of the sampled parameters (Spiegelhalter et al., 2002).
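The calculation of the effective number of parameters can be sketched in a few lines of Python. The example below is an illustration only: it uses a one-parameter binomial model as a stand-in for the models above, draws exact posterior samples from a Beta distribution instead of running MCMC, and uses invented data and function names.

```python
import math
import random

def deviance(q, hits, n):
    """-2 x binomial log-likelihood (constant binomial coefficients dropped)."""
    total = sum(hits)
    return -2 * (total * math.log(q) + (n * len(hits) - total) * math.log(1 - q))

def effective_parameters(samples, hits, n):
    """Mean of the sampled deviances minus the deviance at the parameter mean."""
    mean_deviance = sum(deviance(q, hits, n) for q in samples) / len(samples)
    deviance_at_mean = deviance(sum(samples) / len(samples), hits, n)
    return mean_deviance - deviance_at_mean

# Invented hit counts from six plots with 16 pins each
hits = [3, 5, 4, 2, 6, 4]
total, trials = sum(hits), 16 * len(hits)

# Exact posterior draws under a flat prior: Beta(total + 1, trials - total + 1)
rng = random.Random(42)
samples = [rng.betavariate(total + 1, trials - total + 1) for _ in range(20000)]
print("effective number of parameters:", effective_parameters(samples, hits, 16))
```

For this single-parameter example the estimate should be close to one. In a model with correlated parameters it would typically fall below the nominal parameter count.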

The above models resemble a hierarchical system of models proposed by Huisman et al. (1993). These models, called HOF models after the three authors, also display different shapes depending on the model complexity, the most complex model being $y = M / [(1 + e^{a + bx})(1 + e^{c - dx})]$ in the notation of Huisman et al. (1993). The main difference is that the HOF models do not include the threshold level(s) as explicit parameters.

Trend analysis of pin-point plant cover

In order to separate the observed variance of the longitudinal pin-point plant cover data into sampling variance and the more interesting variance of the annual change in plant cover, the change in cover is analysed using a state-space model, where the unknown mean cover at each site is modelled by a latent variable (Clark, 2007; Bruus et al., 2010; Damgaard et al., 2011). The reasons for modelling the change in plant cover at the level of the site rather than at the level of the plots are that i) the pin-point data are hierarchical, with several plots within a site, and the different sites have separate histories (e.g. different management practices) and environmental conditions (e.g. climate and nitrogen deposition); ii) the data set is not complete and many plots are not sampled each year; iii) the position of the individual plots is known only to GPS accuracy (about 1 m).

A state-space model consists of a structural equation, which describes the processes that control the change in the unknown mean cover over time, and a measurement equation, which couples the observations to the unknown mean cover at the time of observation. The structural equation consists of a deterministic part that describes the average change in cover and a stochastic part that describes the variation in the annual change. The unknown cover at site i in year t is denoted $x_{i,t}$, and it is assumed that the average annual change in the logit-transformed cover, $\mathrm{logit}(x) = \log(x/(1-x))$, is constant and that the deviations from this average are independent and identically normally distributed:

,where (S4).

Since only sites where the species was found were included in the trend analysis, it is assumed in the measurement equation that the number of pin-point hits is independent and identically distributed according to the zero-inflated generalised binomial distribution (S1) with p = 0. In order to account for the expected auto-correlation among observations from the same plot, the probability of observing y hits is modelled using a function bounded between 0 and 1 together with an auto-correlation parameter.
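The structure of the state-space model can be illustrated with a small forward simulation in Python. This is a sketch under stated assumptions: the names `alpha` (mean annual change on the logit scale) and `sigma` (standard deviation of the annual change) are introduced here for illustration, and the measurement step is simplified to independent binomial hits, ignoring both the intra-plot correlation and the auto-correlation among years.

```python
import math
import random

def logit(x):
    return math.log(x / (1 - x))

def inv_logit(z):
    return 1 / (1 + math.exp(-z))

def simulate_site(x_start, alpha, sigma, years, n_plots, n_pins=16, seed=0):
    """Simulate latent cover and observed pin-point hits for one site."""
    rng = random.Random(seed)
    z = logit(x_start)
    cover, hits = [], []
    for _ in range(years):
        # Structural equation: constant mean change plus normal process noise
        z += alpha + rng.gauss(0, sigma)
        x = inv_logit(z)
        cover.append(x)
        # Measurement equation (simplified): binomial hits in each plot
        hits.append([sum(rng.random() < x for _ in range(n_pins))
                     for _ in range(n_plots)])
    return cover, hits

# Hypothetical site: 10 % initial cover, slowly increasing, 5 plots, 6 years
cover, hits = simulate_site(x_start=0.10, alpha=0.2, sigma=0.1,
                            years=6, n_plots=5)
for year, (x, h) in enumerate(zip(cover, hits), start=1):
    print(year, round(x, 3), h)
```

Fitting the model reverses this simulation: the latent cover values and the parameters of the structural equation are inferred jointly from the observed hits, which is what allows the sampling variance to be separated from the variance of the annual change.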

References

Bruus, M., K.E. Nielsen, C. Damgaard, B. Nygaard, J. Fredshavn, and R. Ejrnæs. 2010. Terrestrial naturtypes 2008. In NERI technical report No. 765. DMU, Aarhus University. (In Danish).

Carlin, B.P., and T.A. Louis. 1996. Bayes and empirical Bayes methods for data analysis. Chapman & Hall, London.

Clark, J.S. 2007. Models for ecological data. Princeton University Press, Princeton.

Cordonnier, T., B. Courbaud, and A. Franc. 2006. The effect of colonization and competition processes on the relation between disturbance and diversity in plant communities. Journal of Theoretical Biology 243: 1-12.

Damgaard, C. 2006. Modelling ecological absence-presence data along an environmental gradient: threshold levels of the environment. Environmental and Ecological Statistics 13: 229-236.

Damgaard, C. 2008. Modelling pin-point plant cover data along an environmental gradient. Ecological Modelling 214: 404-410.

Damgaard, C. 2009. On the distribution of plant abundance data. Ecological Informatics 4: 76-82.

Damgaard, C., B. Nygaard, R. Ejrnæs, and J. Kollmann. 2011. State-space modeling indicates rapid invasion of an alien shrub in coastal dunes. Journal of Coastal Research 27: 595-599.

Herben, T., H.J. During, and R. Law. 2000. Spatio-temporal patterns in grassland communities. The geometry of ecological interactions: Simplifying spatial complexity (eds U. Dieckmann, R. Law & J. A. J. Metz), pp. 48-64. Cambridge University Press, Cambridge.

Huisman, J., H. Olff, and L.F.M. Fresco. 1993. A hierarchical set of models for species response analysis. Journal of Vegetation Science 4: 37-46.

Leibold, M.A., M. Holyoak, N. Mouquet, P. Amarasekare, J.M. Chase, M.F. Hoopes, R.D. Holt, J.B. Shurin, et al. 2004. The metacommunity concept: a framework for multi-scale community ecology. Ecology Letters 7: 601-613.

MacArthur, R.H., and E.O. Wilson. 1967. The theory of island biogeography. Princeton University Press, Princeton.

Pacala, S., and S.A. Levin. 1997. Biologically generated spatial pattern and the coexistence of competing species. Spatial ecology. The role of space in population dynamics and interspecific interactions (eds D. Tilman & P. Kareiva). Princeton University Press, Princeton.

Rees, M., R. Condit, M. Crawley, S. Pacala, and D. Tilman. 2001. Long-term studies of vegetation dynamics. Science 293: 650-655.

Spiegelhalter, D.J., N.G. Best, B.P. Carlin, and A. van der Linde. 2002. Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society B 64: 583-639.

Stoll, P. and J. Weiner. 2000. A neighborhood view of interactions among individual plants. The geometry of ecological interactions (eds U. Dieckmann, R. Law & J. A. J. Metz), pp. 11-27. Cambridge University Press, Cambridge.