Supporting Online Material s1

SUPPORTING ONLINE MATERIAL

ESTIMATING THE MULTIVARIATE MODEL IN SAS NLMIXED

This multivariate analysis is easily accomplished with the SAS procedure NLMIXED (SAS Institute Inc.: Cary, NC, 2011). Likelihoods, and , and link functions, and , are specified for each outcome, Y and M, respectively. Because the Y and M are conditionally independent, their log likelihoods can simply be added to determine the multivariate generalized linear mode log likelihood. This software can accommodate any user specified likelihoods and corresponding link functions. Since all parameters are now estimated from a single model, the direct and indirect effects can be simply obtained via Estimate statements, which estimate and test linear and nonlinear functions of the parameters. Moreover, complications arising from the use of different SAS procedures can be avoided, for example, use of restricted maximum likelihood (often used for procedures appropriate for continuous outcomes) versus maximum likelihood (often used for procedures appropriate for dichotomous and count outcomes) for computations. With the multivariate approach, the NLMIXED Estimate statement can test with the classical Wald test using an approximate delta method standard error, reproducing estimates from the classical separate univariate regression approach. However, one can now also test the null hypothesis thatwith a likelihood based test. Also, confidence ellipses can be generated using estimates of ,and their variances output from the multivariate model and input into a SAS MACRO implementing an IML routine.

ADDITIONAL DETAILS FOR DESCRIPTION OF THE CONFIDENCE ELLIPSE

For further ease of presentation, we temporarily assume that there is no interaction and construct a confidence ellipse for and . To construct a confidence ellipse for and appropriate when interaction is present, we will simply substitute for . According to Scheffé (1), assuming approximate normality of , an approximate 100 (1-α)% confidence ellipse for is provided by the set of points satisfyingwhere the inverse of the variance-covariance matrix of is , since and are conditionally independent for the proposed mediation analysis. F1-α,2,v is the 100(1-α) percentage point of an F distribution with 2 and v degrees of freedom. Since maximum likelihood estimates of V11 and V22 are used, a reasonable choice for v is v = n, the number of subjects, although one could also asymptotically justify in which case , the 100(1-α) percentage point of a chi-squared distribution with 2 degrees of freedom as suggested by Young et al.(2)

Abandoning the matrix notation, the confidence ellipse may also be written as the confidence circle where and . This inequality is quadratic in y, i.e., , where , , , and , generally with two solutions,. There is only one solution for y when , i.e. when . Then and . This is the projection of the ellipse on the axis. It can similarly be shown that is the projection of the ellipse on the axis. These two simultaneous projections are well known as Scheffé’s simultaneous confidence limits for and . The projections define a rectangle that circumscribes the confidence ellipse.

For a given value of x between, , and hence between , there are two solutions for y, , and hence two solutions for , and . Evaluation of and for a grid of ’s provides useful plot points for the ellipse.

Not only does the ellipse constrain and to be within their confidence limits, it also constrains functions of and , for example the product , to be within their confidence limits. For a given value of between and , . Multiplying both sides of the inequality by yields the following limits,

The evaluation of and for a grid of ’s provides plot points for the simultaneous confidence region of and induced by the ellipse (Figure 3b). The and across all between and are the projections of this confidence region on the axis. These are simultaneous confidence limits for the product .

SAS CODE for Confidence Ellipse macro for when mediator and outcome use identity links (also available at https://github.com/wagnerbd/Confidence-Ellipse-for-Mediation)

%macro confidence_ellipse(AdditionalEstimates, CovMatAddEst, alpha, outData);

proc iml worksize=600; reset fw=7;

start init;

use &AdditionalEstimates.;

read all var {Label} into Label; read all var {Estimate} into Estimate;

close &AdditionalEstimates.;

use &CovMatAddEst.; Cov=repeat(0,2,2);

read all var {Label} into Label;

read all var {Cov1} into Cov1;

read all var {Cov2} into Cov2;

Cov[1,1]=Cov1[1]; Cov[1,2]=Cov2[1];

Cov[2,1]=Cov1[2]; Cov[2,2]=Cov2[2];

close &CovMatAddEst.;

print Label Estimate Cov;

finish;

start ellipse;

F=Finv(1-alpha,2,df); Scheffe=sqrt(2*F); B1=repeat(0,1000,1);

Obs=repeat(0,1000,1);

L={1 0}; B1Hat=L*Estimate; SeB1Hat=sqrt(L*Cov*L`);

MinimumB1=B1Hat-Scheffe*SeB1Hat; MaximumB1=B1Hat+Scheffe*SeB1Hat;

L={0 1}; T2Hat=L*Estimate; SeT2Hat=sqrt(L*Cov*L`);

MinimumT2=T2Hat-Scheffe*SeT2Hat; MaximumT2=T2Hat+Scheffe*SeT2Hat;

i=1;

do Beta=MinimumB1 to MaximumB1 by SeB1Hat/10;

Obs[i]=i; B1[i]=Beta; i=i+1;

end;

k=i; Obs[k]=k; B1[k]=MaximumB1;

x=repeat(0,k,1);

miny=repeat(0,k,1); maxy=repeat(0,k,1);

minT2=repeat(0,k,1); maxT2=repeat(0,k,1);

minminT2=repeat(0,k,1); maxmaxT2=repeat(0,k,1);

minB1T2=repeat(0,k,1); maxB1T2=repeat(0,k,1);

do i=1 to k;

x[i]=(B1[i]-B1Hat)/sqrt(Cov[1,1]);

miny[i]=-sqrt((scheffe**2-x[i]**2)> 0);

maxy[i]=+sqrt((scheffe**2-x[i]**2) > 0);

minT2[i]=T2Hat+miny[i]*sqrt(cov[2,2]);

maxT2[i]=T2Hat+maxy[i]*sqrt(cov[2,2]);

minminT2[i]=MinimumT2; maxmaxT2[i]=MaximumT2;

if (B1[i]>=0) then do;

minB1T2[i]=B1[i]*minT2[i]; maxB1T2[i]=B1[i]*maxT2[i];

end;

if (B1[i]<0) then do;

maxB1T2[i]=B1[i]*minT2[i]; minB1T2[i]=B1[i]*maxT2[i];

end;

B1=B1[1:k]; Obs=Obs[1:k];

Product=B1hat*T2Hat; SeProduct=.;

MinProduct=min(minB1T2); MaxProduct=max(maxB1T2);

minminB1T2=repeat(minProduct,k,1); maxmaxB1T2=repeat(maxProduct,k,1);

create EllipseData var{ Obs B1 minT2 maxT2 minB1T2 maxB1T2 minminT2 maxmaxT2

minminB1T2 maxmaxB1T2}; append; close EllipseData ;

print alpha df F Scheffe, B1Hat SeB1Hat MinimumB1 MaximumB1, T2Hat SeT2Hat

MinimumT2 MaximumT2, Product SeProduct MinProduct MaxProduct;

finish;

run init; alpha=&alpha.; df=&n.; run ellipse;

proc means data=EllipseData noprint;

var b1;

output out=numobs n=numobs;

run;

data _NULL_;

set numobs;

call symput('k', numobs);

run;

%put &k;

title ' Simultaneous 95% Confidence Limits for B1, T2, and their Product Based on 95% Confidence Ellipse';

data ScheffeRectangle; set EllipseData;

k=&k.; /*k = max number of observations*/

T3=0;

if obs=1 then do;

B1=B1; T2=minminT2; B1T2=minminB1T2; output;

B1=B1; T2=maxmaxT2; B1T2=maxmaxB1T2; output;

end;

if obs=k then do;

B1=B1; T2=minminT2; B1T2=minminB1T2; output;

B1=B1; T2=maxmaxT2; B1T2=maxmaxB1T2; output;

end;

obs=k+1; B1=0; T4=minminT2; output;

obs=k+2; B1=0; T4=maxmaxT2; output;

run;

data ScheffeRectangle; set ScheffeRectangle;

if (obs=k+1 or obs=k+2) then do;

minminT2=.; maxmaxT2=.; minT2=.; maxT2=.; T2=.; T3=. ; minB1T2=. ;

maxB1T2=. ; minminB1T2=. ; maxmaxB1T2=. ;

end;

run;

proc sort data=EllipseData; by Obs;

proc sort data=ScheffeRectangle; by Obs;

data &outData.;

merge Ellipsedata ScheffeRectangle; by obs;

run;

%mend;

SAS CODE for Confidence Ellipse macro for when mediator and outcome use non- identity links (log, logit) (also available at https://github.com/wagnerbd/Confidence-Ellipse-for-Mediation)

%macro confidence_ellipseNIE (AdditionalEstimates, CovMatAddEst, alpha, outData, hY, hM, M_reg, A0, Astar);

proc iml worksize=600; reset fw=7;

start init;

use &AdditionalEstimates.;

read all var {Label} into Label; read all var {Estimate} into Estimate;

close &AdditionalEstimates.;

use &CovMatAddEst.; Cov=repeat(0,2,2);

read all var {Label} into Label;

read all var {Cov1} into Cov1;

read all var {Cov2} into Cov2;

Cov[1,1]=Cov1[1]; Cov[1,2]=Cov2[1];

Cov[2,1]=Cov1[2]; Cov[2,2]=Cov2[2];

close &CovMatAddEst.;

print Label Estimate Cov;

finish;

start ellipse;

F=Finv(1-alpha,2,df); Scheffe=sqrt(2*F); B1=repeat(0,1000,1);

Obs=repeat(0,1000,1);

L={1 0}; B1Hat=L*Estimate; SeB1Hat=sqrt(L*Cov*L`);

MinimumB1=B1Hat-Scheffe*SeB1Hat; MaximumB1=B1Hat+Scheffe*SeB1Hat;

L={0 1}; T2Hat=L*Estimate; SeT2Hat=sqrt(L*Cov*L`);

MinimumT2=T2Hat-Scheffe*SeT2Hat; MaximumT2=T2Hat+Scheffe*SeT2Hat;

i=1;

do Beta=MinimumB1 to MaximumB1 by SeB1Hat/10;

Obs[i]=i; B1[i]=Beta; i=i+1;

end;

k=i; Obs[k]=k; B1[k]=MaximumB1;

x=repeat(0,k,1);

miny=repeat(0,k,1); maxy=repeat(0,k,1);

minT2=repeat(0,k,1); maxT2=repeat(0,k,1);

minminT2=repeat(0,k,1); maxmaxT2=repeat(0,k,1);

minNIE=repeat(0,k,1); maxNIE=repeat(0,k,1);

do i=1 to k;

x[i]=(B1[i]-B1Hat)/sqrt(Cov[1,1]);

miny[i]=-sqrt((scheffe**2-x[i]**2) >0); maxy[i]=+sqrt((scheffe**2-x[i]**2)>0);

minT2[i]=T2Hat+miny[i]*sqrt(cov[2,2]);

maxT2[i]=T2Hat+maxy[i]*sqrt(cov[2,2]);

minminT2[i]=MinimumT2; maxmaxT2[i]=MaximumT2;

if &hM.='log' then do;

diff=exp(&M_reg.+B1[i]*&A0.) -exp(&M_reg.+B1[i]*&Astar.);

diffhat=exp(&M_reg.+B1hat*&A0.) -exp(&M_reg.+B1hat*&Astar.);

end;

if &hM.='logit' then do;

diff=(exp(&M_reg.+B1[i]*&A0.)/(1+exp(&M_reg.+B1[i]*&A0.))) -(exp(&M_reg.+B1[i]*&Astar.)/(1+exp(&M_reg.+B1[i]*&Astar.)));

diffhat=(exp(&M_reg.+B1hat*&Astar.)/(1+exp(&M_reg.+B1hat*&Astar.))) -(exp(&M_reg.+B1hat*&Astar.)/(1+exp(&M_reg.+B1hat*&Astar.)));

end;

if &hY.='log' then do;

if (B1[i]>=0) then do;

minNIE[i]=exp(minT2[i]*diff);

maxNIE[i]=exp(maxT2[i]*diff);

end;

if (B1[i]<0) then do;

maxNIE[i]=exp(minT2[i]*diff);

minNIE[i]=exp(maxT2[i]*diff);

end;

if &hY.='logit' then do;

if (B1[i]>=0) then do;

minNIE[i]=exp(minT2[i]*diff)/(1+exp(minT2[i]*diff));

maxNIE[i]=exp(maxT2[i]*diff)/(1+exp(maxT2[i]*diff));

end;

if (B1[i]<0) then do;

maxNIE[i]=exp(minT2[i]*diff)/(1+exp(minT2[i]*diff));

minNIE[i]=exp(maxT2[i]*diff)/(1+exp(maxT2[i]*diff));

end;

if &hY.='log' then do;

NIE=exp(T2Hat*diffhat);

end;

if &hY.='logit' then do;

NIE=exp(T2Hat*diffhat)/(1+exp(T2Hat*diffhat));

end;

B1=B1[1:k]; Obs=Obs[1:k];

SeNIE=.;

Min_NIE=min(minNIE); Max_NIE=max(maxNIE);

minminNIE=repeat(Min_NIE,k,1); maxmaxNIE=repeat(Max_NIE,k,1);

create EllipseData var{ Obs B1 minT2 maxT2 minNIE maxNIE minminT2 maxmaxT2};

append; close EllipseData ;

print alpha df F Scheffe, B1Hat SeB1Hat MinimumB1 MaximumB1, T2Hat SeT2Hat

MinimumT2 MaximumT2, NIE SeNIE Min_NIE Max_NIE;

finish;

run init; alpha=&alpha.; df=&n.; run ellipse;

proc means data=EllipseData noprint;

var b1;

output out=numobs n=numobs;

run;

data _NULL_;

set numobs;

call symput('k', numobs);

run;

%put &k;

title ' Simultaneous 95% Confidence Limits for B1, T2, and their Product Based on 95% Confidence Ellipse';

data ScheffeRectangle; set EllipseData;

k=&k.; /*k = max number of observations*/

T3=0;

if obs=1 then do;

B1=B1; T2=minminT2; output;

B1=B1; T2=maxmaxT2; output;

end;

if obs=k then do;

B1=B1; T2=minminT2; output;

B1=B1; T2=maxmaxT2; output;

end;

obs=k+1; B1=0; T4=minminT2; output;

obs=k+2; B1=0; T4=maxmaxT2; output;

run;

data ScheffeRectangle; set ScheffeRectangle;

if (obs=k+1 or obs=k+2) then do;

minminT2=.; maxmaxT2=.; minT2=.; maxT2=.; T2=.; T3=. ;

end;

run;

proc sort data=EllipseData; by Obs;

proc sort data=ScheffeRectangle; by Obs;

data &outData.;

merge Ellipsedata ScheffeRectangle; by obs;

run;

%mend;

SAS CODE for Analysis of Example 1 - All Variables Continuous

Example 1 Cystic Fibrosis Data for Bivariate Analysis

data EX1;

input id A M Y;

datalines;

269 0.81954 6.71 96.855

278 2.04060 3.33 108.206

279 1.88649 4.64 84.801

345 1.06952 4.42 88.234

348 1.81823 13.44 100.477

367 1.22011 9.75 99.782

368 2.51825 28.43 58.559

375 0.69897 7.13 99.614

377 1.49693 12.43 75.753

386 0.81954 11.78 81.443

404 1.79099 17.75 75.051

414 2.39898 15.93 70.918

527 0.69897 2.08 97.927

568 0.69897 7.87 108.146

570 0.76343 8.69 97.309

572 0.81954 10.38 97.371

601 1.98677 18.71 53.689

602 1.64542 12.18 87.626

640 1.02531 14.33 108.322

667 1.59550 12.45 77.981

672 2.18469 12.09 89.727

679 1.95328 10.08 108.437

694 1.97220 14.33 104.892

712 1.30535 5.17 96.333

761 0.95424 7.33 116.813

767 1.73719 14.17 75.609

902 0.69897 10.88 108.488

924 1.89098 21.92 65.775

;

run;

proc sort data=EX1; by id; run;

Bivariate Analysis of Example 1 Data

Most regression programs compute restricted maximum likelihood estimates MSE1 and MSE2 of the residual variances of the regressions of M on A and Y on A and M, respectively. Instead, NLMIXED computes maximum likelihood estimates S11 and S22 such that MSE1= n1 S11/(n1-2), and MSE2= n2S22/(n2-3), where n1 and n 2 are the numbers of observations in the first and second regressions, respectively. For degrees of freedom in proc NLMIXED statement and all estimate statements we suggest df=v=max(n1,n2)=number of subjects, if there are no missing observations. The number of subjects is denoted with the use of a macro variable &n which is used in both NLMIXED and the confidence ellipse macro.

proc freq data=EX1;

table id/ out=sub;

run;

proc means data=sub;

var id;

output out=numsub n=numsub;

run;

data _NULL_;

set numsub;

call symput('n', numsub);

run;

%put &n;

title ' Joint Distribution of M and Y given A';

proc nlmixed data=EX1 df=&n.;

parms t0=117 t1=-6.4 t2=-1.5 b0=3.1 b1=5.8 ss_m=21.5 ss_y=135.6;

bounds ss_m > 0, ss_y > 0;

* Conditional distribution of M given A;

mu_m =b0 + b1*A;

ll_m= -((m-mu_m)**2)/(2*ss_m)-0.5*log(ss_m);

* Conditional distribution of Y given M and A;

mu_y=t0 + t1*A + t2*M;

ll_y= -((y-mu_y)**2)/(2*ss_y)-0.5*log(ss_y);

* Add log likelihoods of M and Y;

ll_o= ll_m + ll_y;

* Define (a-astar);

delA=(1-0);

model Y ~general(ll_o);

estimate 'CDE=T1(a-astar)' T1*delA df=&n.;

estimate 'NDE=T1(a-astar)' T1*delA df=&n.;

estimate 'NIE=T2*B1(a-astar)' T2*B1*delA df=&n.;

estimate 'TE=NDE+NIE' (T1+T2*B1)*delA df=&n.;

estimate 'MSE1' (&n.+1)*ss_m/((&n.+1)-2) df=%eval(&n.+1);

estimate 'MSE2' &n.*ss_y/(&n.-3) df=&n.;

run;

Bivariate Analysis of Example 1 Data Reparameterized for likelihood based test

With the multivariate approach, it is now possible to test the null hypothesis that with a likelihood based test. The null model is obtained by the following NLMIXED code.

title ' Joint Distribution of Null model';

proc nlmixed data=EX1 df=&n.;

parms t0=117 t1=-6.4 b0=3.1 ss_m=21.5 ss_y=135.6;

bounds ss_m > 0, ss_y > 0;

* Conditional distribution of M given A;

mu_m =b0 ;

ll_m= -((m-mu_m)**2)/(2*ss_m)-0.5*log(ss_m);

* Conditional distribution of Y given M and A;

mu_y=t0 + t1*A;

ll_y= -((y-mu_y)**2)/(2*ss_y)-0.5*log(ss_y);

* Add log likelihoods of M and Y;

ll_o= ll_m + ll_y;

model Y ~general(ll_o);

run;

Generation of Confidence Ellipse and Simultaneous Confidence Limits for Example 1

For the confidence ellipses, output the estimates for and in NLMIXED using the ODS output statements to yield estimates of and for input into the IML routine. The ecov option must be specified to obtain the variance-covariance matrix.

title ' Joint Distribution of M and Y given A';

proc nlmixed data=EX1 df=&n. ecov;

parms t0=117 t1=-6.4 t2=-1.5 b0=3.1 b1=5.8 ss_m=21.5 ss_y=135.6;

bounds ss_m > 0, ss_y > 0;

* Conditional distribution of M given A;

mu_m =b0 + b1*A;

ll_m= -((m-mu_m)**2)/(2*ss_m)-0.5*log(ss_m);

* Conditional distribution of Y given M and A;

mu_y=t0 + t1*A + t2*M;

ll_y= -((y-mu_y)**2)/(2*ss_y)-0.5*log(ss_y);

* Add log likelihoods of M and Y;

ll_o= ll_m + ll_y;

* Define (a-astar);

delA=(1-0);

model Y ~general(ll_o);

estimate 'B1' b1 ;

estimate 'T2' t2 ;

ods output AdditionalEstimates = AdditionalEstimates

CovMatAddEst = CovMatAddEst;

run;

title ' Simultaneous 95% Confidence Limits for B1, T2, and their Product Based on 95% Confidence Ellipse';

%confidence_ellipse (AdditionalEstimates, CovMatAddEst, 0.05, Ellipse1);

proc sort data=Ellipse1; by B1;

proc gplot data=Ellipse1;

plot minminT2*B1=1 maxmaxT2*B1=1 minT2*B1=2 maxT2*B1=2 T2*B1=3 / overlay

vaxis=axis1 haxis=axis2;

axis1 label=(a=90 h=4 f=cgreek 'q' h=2 f=zapf '2') minor=none

value=(h=1.5);

axis2 label=(h=4 f=cgreek 'b' h=2 f=zapf '1') minor=none value=(h=1.5);

symbol1 i=j line=1 color=red w=2 ;

symbol2 i=j line=1 color=blue w=2 l=22;

symbol3 i=hilo line=1 color=red w=2;

run;

SAS CODE and OUTPUT for Analysis of Example 2 - Binary Exposure and Outcome Variables

Example 2 type-1 diabetes for Bivariate Analysis

title ' Model 1a. Joint Distribution of M and Y given A with A*M interaction and Confounder C';

title2 ' a*=0 for diabetic and m*=1.8823=mean M';

proc nlmixed data= Brown ;

parms B0= 1.92 B1=-0.854 B2=0.00788 S=1.18 T0= -3.86 T1=-0.377 T2= 0.505 T3=-0.350 T4=0.0993;

bounds S > 0;

* Conditional normal distribution of M on A;

if M=. then ll_m=0;

else do;

mu_m =b0 + b1*A +b2*C;

ll_m= -((M-mu_m)**2)/(2*S)-0.5*log(S); * ll_m must not be missing;

Z=M;

end;

* Conditional binary distribution of Y on A and M;

if Y=. then ll_y=0;

else do;

mu_y =T0+T1*A+T2*M+T3*A*M +T4*C;

p=exp(mu_y )/(1+exp(mu_y ));

ll_y =Y*log(p)+(1-Y)*log(1-p); * ll_y must not be missing;

Z=Y;

end;

* Add log likelihoods of M and Y;

ll_o= ll_m + ll_y;

* Define dela, a*, and m*;

astar=0;

astar2=1;

mstar=1.88;* m*=mean of M;

dela=astar2-astar;

cmean=42;

* Define counterfactuals;

CDE=(T1+T3*mstar)*dela;

PHI=T2+T3*astar;

PHI2=T2+T3*astar2;

* Define the inverse of the link function of the mediator evaluated at A=A0 and at A=astar;

hM_inv_astar=b0 + b1*astar +b2*cmean;

hM_inv_Astar2=b0 + b1*astar2 +b2*cmean;

NDE=(T1+T3*(hM_inv_Astar+T2*S))*delA;