Appendix S1: Building an age-structured model

We use a discrete-time model that incorporates both epidemic and demographic transitions (Klepac et al. 2009; Klepac & Caswell 2010) by structuring the population into age classes, and epidemiological classes (‘maternally immune’ M, ‘susceptible' S, ‘infected' I, ‘recovered’ R, and ‘vaccinated’ V, taken to indicate successful vaccination and only applied to susceptible individuals). We can frame the joint processes of aging and epidemiology as a matrix of demographic and epidemiological transitions. We structure the population into age strata (1, 2, …., z; where z is the total number of age strata, here taken as z=22 with yearly strata up to age 12, and two-yearly to age 26, and then decadal time-step from 30 to 50), and epidemiological classes (‘maternally immune’ M, ‘susceptible' S, ‘infected' I, ‘recovered’ R, and ‘vaccinated’ V). Initially ignoring demographic transitions (survival and aging), within each age class a transitions between epidemiological categories occur according to:

where da is the probability of losing maternal immunity, φa is the probability of becoming infected, va is the probability of being vaccinated. The five rows and columns represent the M, S, I, R, and V categories, respectively, and the matrix contents capture transitions between them. The time-step is taken as the approximate generation time of rubella (the latent plus infection period of the infection). The infection probability φ (also called the force of infection, FOI) is a function of n(t), a vector describing the population at time t:

where z is the total number of age classes. The probability of becoming infected is explicitly

where βa,j,t is the rate of transmission between individuals in age classes a and j, and γ captures heterogeneities in mixing not directly modeled (Finkenstadt & Grenfell 2000; Bjørnstad et al. 2002) and the effects of discretization of the underlying continuous time process (Glass et al. 2003). Here we set values of γ =0.97, reflecting values obtained for measles in England and Wales (Bjørnstad et al. 2002). Discrete-time models that do not incorporate this exponent (i.e., γ=1) result in dynamics that are unrealistically prone to frequent extinction. Total population size appears as a denominator of number of infected individuals in each age class since previous experience with rubella indicates that transmission appears to scale in a frequency dependent manner (Metcalf et al. 2011b), see main text.

We construct a matrix A(n(t)) to project the entire population forwards via aging, mortality and infection dynamics, using the equations specified above:

where sa is the probability that an individual in age class a survives to the next time step, and ua is the rate of aging out of age class a. The dynamics of the population are then given by

where n(t) is a vector representing the population at time t, A is the matrix that describes all demographic and epidemiological transitions (see above), and B(t) is a vector representing the number of births at time t:

Initial conditions were taken as values corresponding to the stationary distribution of the model for each set of parameters (obtained by iteration).

Stochasticity and implicit spatial structure

We also extended the framework to be stochastic and to include an immigration rate according to,

where S[A(n(t))] is a vector resulting from the sum of stochastic draws from multinomial distributions defined according to each column of the matrix A defined as above and the number of individuals at t in each category, B is a vector with zeros corresponding to all but the first class, which contains a draw from a random Poisson distribution with mean set to the desired birth rate. Initial conditions were taken as values corresponding to the long-term equilibrium for each set of parameters (obtained by iteration).

Explicit spatial structure

To incorporate spatial structure explicitly, we extend the model by nesting matrices for every patch within a larger matrix framework. For simplicity, we do not track movement of all individuals in the population (likely to be of relatively small magnitude relative to the size of populations), but focus simply on infected individuals. For example, for a population comprised of three patches,

where Al(n(t)) indicates transitions specific to patch l (which might differ demographically, or in amplitude of transmission, etc). We also define Vt, a vector with length the number of rows of P(n(t)) that captures the probability that infected individuals arrive in each patch given the number of infected individuals in all other patches, and the degree of travel between patches. For each age class, in rows not corresponding to infected individuals, Vt contains zero; and in the row corresponding to infected individuals in patch j, Vt contains

(where ci,,j, and other parameters are as defined from eqn 6, main text; and division by the number of age classes z accounts for the fact that we are assuming even structure of immigration over age in the absence of more detailed information). This is repeated for each of the patches in sequence, i.e.,

The birth vector is likewise modified to

where B1,t indicates the numbers of births at time t in patch 1, etc. As above, population dynamics are then given by:

Stochastic dynamics can be captured as detailed above.

Calculating the burden of CRS

To calculate the CRS burden, we used a pattern of fertility over age (taken from http://www.un.org/esa/population/publications/WFD%202008/WP_WFD_2008/Data.html) to define the risk group, i.e., pregnant women in the first 16 weeks of pregnancy. We assume that the probability of CRS following infection during the first 16 weeks is c=0.65 (Vynnycky et al. 2003). Combining this value, c, with the numerically estimated cumulative risk of infection over the past 16 weeks, denoted, with fertility in each age class, fa, we used our model output to calculate the number of CRS cases expected in each time-step for both vaccination and unvaccinated populations. The number of CRS cases per live births over a chosen time-horizon T is calculated according to

,

where numbers of individuals in each epidemiological class are simulated according to methods defined above, and fa is adjusted by the birth rate such that

,

where the first sum is taken over a year, and b indicates the birth rate in thousands per year.

Parametrizing age and time-specific transmission

Since age-specific survival rates and birth rates are known, the key element that requires informing to define the model described above is . For this, we require estimates of seasonal variation in transmission, estimates of the age structure of transmission (i.e., the Who-Acquires-Infection-From-Whom, or WAIFW matrix), and estimates of the overall magnitude of transmission. For the first, we used seasonal fluctuations estimated from the TSIR, assuming for simplicity that seasonality affected all age classes in the same way. Since scaling the magnitude of transmission depends on the structure of contacts across age, we set the mean value of this seasonality to be 1, i.e., from the TSIR estimates of bs, the value that entered the model was

For the second element, we need to define a WAIFW matrix, or estimate of transmission between individuals in age class a and age j in each season, bs(a,j) . Theoretically, this quantity is related to average seasonal transmission according to

where St(a) is the number of susceptible individuals at time t in age class a, and It(j) is the number of infected individuals at time t in age class, and we assumed for simplicity that seasonality affected all age classes in the same way. To define the WAIFW, we took two approaches. First, we tried fitting a parametric form of contact patterns between ages following (Farrington & Whitaker 2005) to estimates of the FOI over age. The Farrington and Whitaker model requires estimation of 4 parameters (Appendix 2). To fit this to the observed age-profile of infection, we first establish the equilibrium force of infection, for every age class a, denoted . This is defined by:

where N(a) is the number of surviving individuals in the population at age a and S0(a) is the equilibrium proportion of susceptible individuals of age a,

These two equations can be solved by iteration (Farrington & Whitaker 2005) for any particular functional form of , allowing appropriate parameter values to be identified by maximizing the binomial likelihood of the observed proportion of remaining susceptibles across age given predictions from the force of infection equation. To speed up calculations, we assumed that the infection process occurs in discrete time, so that all integrals are replaced by sums. The equilibrium equations are solved by iteration until

was smaller than a chosen tolerance level, where m indicates the iteration number. The parametric form (Appendix 2) is rather flexible, allowing a broadly diagonal pattern of contact over age, with more or less spread across ages around this diagonal, and varying positioning of the peak of transmission. However, a disadvantage of this approach is that the complex pattern observed in natural populations of contact between parents and children (Mossong et al. 2008) in addition to contact between same aged individuals is hard to capture parametrically. Accordingly, we also defined an alternative WAIFW matrix following empirically estimated contact patterns from diary studies conducted in Europe (Mossong et al. 2008). Resulting matrices are characterized by a strongly diagonal structure (indicative of considerable mixing within an age class) combined with 'whiskers' of contact between individuals in their early twenties and children (Mossong et al. 2008). We used the entire European data-set, taking the average between the matrix and its mirror image to enforce symmetry in the contact matrix.

Having defined the structure of the WAIFW following one of the two options described above, we can calculate the basic reproductive ratio, R0 (the number of cases that would result from the introduction of a single infected individual into a completely susceptible population), as the dominant eigenvalue of the next generation matrix from the full model described above taken at the disease free equilibrium (Diekmann et al. 1990; Allen & van den Driessche 2008; Klepac & Caswell 2010). For the chosen model structure, R0 is undefined for g<1 so for its evaluation we set g=1. We can then multiply the chosen WAIFW by a scalar to obtain the desired R0 - and here we assumed that R0=6.8 following the estimate of the TSIR (see text). After these manipulations, we obtain a definition for ba,j,t for every time-step,

where indicates the appropriately scaled WAIFW.

Evaluating the model

We evaluated the model by comparison of incidence, age incidence, and observed extinctions. Since there was relatively little to allow distinction between the POLYMOD WAIFW and the smooth fitted WAIFW, and the POLYMOD WAIFW captured the age profile slightly better (Fig. S2), we chose to retain it (note that if the R0 is mis-estimated by the TSIR then this age-matching will be thrown off). All subsequent results and those in the main text are based on this WAIFW.

Under-reporting is always an issue with rubella incidence data, since the rash can be very mild. However, although this leads to an irregular appearance in observed rubella incidence through time relative to the simulations, the simulations and observed incidence are matched in terms of the age of infection and timing of epidemic outbreaks (Fig. S3, S4); and the relationship between the proportion of fade-outs in the observed and simulated time-series is broadly similar, if correction is made for using the reporting rate (Fig. 4).

Appendix S2: The fitted WAIFW

Taking x and y as the age of the susceptible and infected individual respectively, and defining and , the functional form of the contact matrix is

with

and

with , where reflects a gamma density with mean and shape to capture unimodality of contact rates within age; and reflects a beta density to captures change in contact with increasing or declining age. Combining age-incidence data with data on the overall age structured of the population, we fitted an overall scaling parameter k, and shape parameters m, s2 defined by and g defined by .


Figure S1: The relationship between log district population size and log coupling parameter. The dashed line is y=-13.4 +1.62x, p<0.001, r2=0.3; colours are the same as in Figure 1.


Figure S2: The top row shows the POLYMOD derived WAIFW (log scale), showing high transmission among school children (diagonal), and contacts between adults and children (off-diagonals); and, on the right, the observed age profile of incidence (solid line with quantiles shown as dotted lines) and the simulated age profile of incidence with this WAIFW (grey polygon). The bottom row shows the fitted WAIFW (on a log scale) with observed and simulated age profile of incidence, likewise (bottom row).


Figure S3: Comparison of observed (above) and simulated (below) age profile of incidence across the entire country of South Africa; red indicates low incidence, and white indicates the highest incidence. The timing of peaks and the average age profile is similar. The broader representation across ages and between years in the simulations can be attributed to under-reporting. Similar patterns are obtained whether a parametric fitted WAIFW or the POLYMOD WAIFW are used, so results in the main text are based around the POLYMOD WAIFW.


Figure S4: Comparison of observed (top) and simulated (bottom) total incidence in each of the 52 districts; red colours indicate districts with large population sizes, and yellow is districts with the smallest population size.