Annex 1. Online supplement: statistical analysis strategies

Following the guidelines by Singer & Willett (29), unconditional and conditional growth modeling was applied as follows.

Developing a Multilevel Model of Growth

We used individual growth modeling techniques to further analyze these longitudinal data. The multilevel model for change allowed us to address two research questions simultaneously: a level-1 question about how each child's growth parameters changed over time (within-person change), and a level-2 question about how this change varied across children (between-person change). The model therefore had a two-level hierarchical structure, with measurement occasions at level 1 nested within children at level 2. Hierarchical linear models (HLM) were used to summarize each person's response using a function that included an overall group effect plus a person-specific component (28).

Unconditional Mean Model (Model 1)

We first examined the unconditional means model (i.e., Y = β0i + error) to determine whether there was sufficient variability in the dependent variable across individuals. Model 1 in Table 3 estimated the mean growth parameters averaged across all children and time points. Significant variability in the unconditional means model suggests that examining predictors in the subsequent models is warranted.
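
As a rough illustration of what "sufficient variability" refers to, the sketch below splits the total variation in a toy long-format dataset into a between-child part and a within-child part. The records, child identifiers, and values are invented for illustration, and the simple sum-of-squares decomposition is a descriptive stand-in, not the fitted multilevel model reported in Table 3.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format records: (child_id, age_years, height_cm).
# One row per measurement occasion; values are invented for illustration.
records = [
    ("c1", 3.0, 94.0), ("c1", 4.0, 101.0), ("c1", 5.0, 108.0),
    ("c2", 3.0, 90.0), ("c2", 4.0, 97.0), ("c2", 5.0, 104.0),
    ("c3", 3.0, 98.0), ("c3", 4.0, 105.0), ("c3", 5.0, 112.0),
]

def decompose(records):
    """Split the total sum of squares into between- and within-child parts."""
    by_child = defaultdict(list)
    for child_id, _age, y in records:
        by_child[child_id].append(y)

    grand_mean = mean(y for _child, _age, y in records)
    # Between-child: how far each child's own mean lies from the grand mean.
    between = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in by_child.values())
    # Within-child: how far each occasion lies from that child's own mean.
    within = sum((y - mean(ys)) ** 2 for ys in by_child.values() for y in ys)
    return grand_mean, between, within

grand_mean, ss_between, ss_within = decompose(records)
print(f"grand mean = {grand_mean:.2f}")
print(f"between-child SS = {ss_between:.2f}, within-child SS = {ss_within:.2f}")
```

In this toy example most of the within-child variation is simply growth over time, which is the part the subsequent growth models are meant to describe.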

Unconditional Growth Curve Model (Models 2 & 3)

Next, the unconditional growth model (UGM) was tested to determine whether there was evidence of variability in the children’s height and weight over time. The unconditional linear growth model expresses the outcome variable as a linear function of time. Each child's score was described by an individual growth trajectory that depended on a unique set of parameters, as follows (equation 1):

$$Y_{it} = \beta_{0i} + \beta_{1i}\,\mathrm{Age}_{it} + r_{i0} + r_{i1}\,\mathrm{Age}_{it} + e_{it} \qquad (1)$$

This model expresses each child's weight or height as a function of his or her intercept at age 3 (β0i), his or her linear growth trajectory (β1i), and his or her random error as it varies by age (eit), i.e., the part of an individual’s growth parameters at time t not predicted by age. Model 2 thus directly represents individual change trajectories. Non-linearity in growth was accounted for by including a quadratic term for child age (age squared) in the final model (Model 3). This model is given as equation (2) below:

$$Y_{it} = \beta_{0i} + \beta_{1i}\,\mathrm{Age}_{it} + \beta_{2i}\,\mathrm{Age}_{it}^{2} + r_{i0} + r_{i1}\,\mathrm{Age}_{it} + e_{it} \qquad (2)$$

The unconditional growth models provide descriptive information about the nature of developmental change in height and weight that is characteristic of the sample on average, as well as the degree to which individuals vary with respect to each of the parameters.

Conditional growth model (Models 4-7)

Once the unconditional growth model had been established for our growth data, we fitted a conditional model to examine the effect of exposure to H. pylori (time-varying) on the growth trajectory. In these models, the fixed effects ascertain whether the predictors account for individual differences in each of the growth terms. The random effects assess the amount of outcome variability left unexplained by the model. As predictors are added, their impact can be evaluated in terms of how much additional variance is accounted for, i.e., the proportional reduction in unexplained variance compared with a simpler model.
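
The proportional reduction in unexplained variance can be computed directly from the variance components of two models. The following is a minimal sketch; the function name and the variance values are hypothetical and are not taken from the fitted models.

```python
def proportional_reduction(var_simple, var_complex):
    """Proportional reduction in a variance component when moving from a
    simpler model to a more complex one: (simple - complex) / simple."""
    if var_simple <= 0:
        raise ValueError("the simpler model's variance component must be positive")
    return (var_simple - var_complex) / var_simple

# Hypothetical residual variances, invented for illustration only.
residual_var_unconditional = 4.0
residual_var_conditional = 3.0

reduction = proportional_reduction(residual_var_unconditional, residual_var_conditional)
print(f"proportional reduction in unexplained variance = {reduction:.2f}")
```

The same calculation can be applied to any single variance component (residual, intercept, or slope), provided both values come from models fitted to the same data.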

We added H. pylori status (1 = positive, 0 = negative) as a covariate to predict initial levels of height and weight and to predict increases or decreases in height and weight from 3 years to 6.5 years of age. To assess growth differences between the H. pylori-positive and H. pylori-negative groups, we included an interaction term for age and H. pylori status and, in the quadratic model, an interaction term for age squared and H. pylori status. This model is given as equation (3) below:

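$$Y_{it} = \beta_{0i} + \beta_{1i}\,\mathrm{Age}_{it} + \beta_{2i}\,\mathrm{Age}_{it}^{2} + r_{i0} + r_{i1}\,\mathrm{Age}_{it} + \beta_{3i}\,(\text{H. pylori status}) + \beta_{4i}\,(\text{H. pylori status} \times \mathrm{Age}) + \beta_{5i}\,(\text{H. pylori status} \times \mathrm{Age}^{2}) + e_{it} \qquad (3)$$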
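
To make the role of the interaction terms concrete, equation (3) can be evaluated separately for the two values of the H. pylori indicator. The block below is a worked rearrangement of the fixed part only: the random effects and the residual are omitted for readability, and HP is a shorthand introduced here for H. pylori status.

```latex
\begin{aligned}
\mathrm{HP}=0:\quad & \beta_{0i} + \beta_{1i}\,\mathrm{Age}_{it} + \beta_{2i}\,\mathrm{Age}_{it}^{2} \\
\mathrm{HP}=1:\quad & (\beta_{0i}+\beta_{3i}) + (\beta_{1i}+\beta_{4i})\,\mathrm{Age}_{it} + (\beta_{2i}+\beta_{5i})\,\mathrm{Age}_{it}^{2}
\end{aligned}
```

Read this way, β3i is the difference in level at the age origin, while β4i and β5i are the differences in the linear and quadratic growth terms associated with a positive H. pylori status.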

All adjusted analyses controlled for possible confounding effects of demographic and socio-environmental factors that might be related to childhood growth.

Model fit

To determine the best model for predicting growth, we compared all models: the mean height and weight model (Y = β0 + error), the model of linear change in height and weight (Y = β0 + β1·age + error), and the model of quadratic change in height and weight (Y = β0 + β1·age + β2·age² + error), where random variation was permitted for the linear age term only. To assess model fit, we calculated a chi-squared value as the difference in −2 log likelihood between successive models (i.e., mean level v. linear, and linear v. quadratic). Compared with the mean level model, the addition of the linear term significantly improved model fit (p<0.001). The addition of a fixed quadratic term also significantly improved model fit compared with the linear model (p<0.001). Results of the model comparisons are summarized in Table 3 in terms of overall model fit (goodness of fit expressed as −2 log likelihood [−2LL] in “smaller-is-better” form), number of parameters, estimates of coefficients for fixed effects, and covariance estimates for random effects. These results suggested that the quadratic model provided the best fit for estimating patterns of growth.
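
The comparison of −2LL values described above amounts to a chi-squared difference (likelihood ratio) test. The sketch below shows the arithmetic using only the Python standard library. The −2LL values are invented for illustration and are not those in Table 3, and the function names are hypothetical. The calculation assumes the two models are nested and were fitted to the same data by full maximum likelihood.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with integer df >= 1.

    Uses the closed forms for df = 1 and df = 2 and the recurrence
    Q(x; k + 2) = Q(x; k) + (x/2)**(k/2) * exp(-x/2) / gamma(k/2 + 1).
    """
    if x < 0 or df < 1:
        raise ValueError("x must be >= 0 and df must be a positive integer")
    if df % 2 == 1:
        k, q = 1, math.erfc(math.sqrt(x / 2.0))
    else:
        k, q = 2, math.exp(-x / 2.0)
    while k < df:
        q += (x / 2.0) ** (k / 2.0) * math.exp(-x / 2.0) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

def deviance_test(neg2ll_reduced, neg2ll_full, extra_params):
    """Chi-squared statistic and p-value for two nested models.

    extra_params is the number of additional parameters in the fuller model.
    """
    statistic = neg2ll_reduced - neg2ll_full
    return statistic, chi2_sf(statistic, extra_params)

# Hypothetical -2LL values, invented for illustration (not from Table 3).
stat, p = deviance_test(neg2ll_reduced=5230.4, neg2ll_full=5198.1, extra_params=1)
print(f"chi-squared = {stat:.1f}, p = {p:.2g}")
```

A smaller −2LL in the fuller model gives a positive statistic; the degrees of freedom equal the number of parameters added between the two models.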

Prototypical plots

Finally, prototypical plots were used to interpret any interactions (29). The full equation resulting from the estimated model was written out, and the values of the predictors were substituted to obtain predicted scores for each combination of predictor values.
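
The substitution step can be sketched as follows for the quadratic conditional model in equation (3). The coefficient values are hypothetical, chosen only to show the mechanics, and age is assumed to be centred at 3 years so that the intercept refers to age 3. The resulting values are what would be drawn as one curve per H. pylori status.

```python
def predicted_outcome(age, hp_positive, coef):
    """Evaluate the fixed part of the quadratic conditional model (equation 3)."""
    hp = 1.0 if hp_positive else 0.0
    return (
        coef["intercept"]
        + coef["age"] * age
        + coef["age2"] * age ** 2
        + coef["hp"] * hp
        + coef["hp_x_age"] * hp * age
        + coef["hp_x_age2"] * hp * age ** 2
    )

# Hypothetical coefficients for height in cm, invented for illustration.
# Age is assumed to be centred at 3 years, so age = 0 corresponds to age 3.
coef = {"intercept": 95.0, "age": 7.0, "age2": -0.2,
        "hp": -0.5, "hp_x_age": -0.3, "hp_x_age2": 0.0}

ages = [0.0, 1.0, 2.0, 3.5]  # years since age 3, i.e. ages 3 to 6.5
prototypical = {
    status: [predicted_outcome(a, status, coef) for a in ages]
    for status in (False, True)
}
for status, values in prototypical.items():
    label = "H. pylori positive" if status else "H. pylori negative"
    print(label, [round(v, 2) for v in values])
```

Covariates from the adjusted models would enter the same way, each held at a chosen prototypical value such as its sample mean.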