Outline of Levin and Fox, Chapter 1 (2003)

Topic 10. Nonadditive Relationships

Topics

What is a nonadditive relationship (a.k.a. an ‘interaction effect’)?

Computing interactions in SPSS

Running the model and interpreting the results

What is a nonadditive relationship?

  • One assumption of ordinary least squares (OLS) regression is that the relationship between Y and X is linear and additive in the population
  • We have already discussed linearity and how to evaluate nonlinear relationships
  • ‘Additive’ means that the relationship between Y and X1 does not depend on any other variable. Here are some examples of additive and nonadditive relationships:
  • Additive: the gender gap in pay is the same regardless of race
  • Non-additive: the gender gap in pay varies by race (i.e., the gap between men’s and women’s income is smaller or larger for some racial groups)
  • Additive: the relationship between income and education is the same regardless of sex
  • Non-additive: the relationship between income and education varies by sex (i.e., there is a different relationship between income and education for men and women).
  • When a relationship is non-additive, we say that there is an ‘interaction effect’
  • You can detect interaction effects through data exploration
  • Also, theories and/or previous research may suggest the presence of interactions:
  • The ‘buffer hypothesis’ suggests that the relationship between traumatic events and mental health depends on the level of social support. In other words, there is a different slope describing the relationship between traumatic events and mental health for people with different levels of social support.

  • It is possible to estimate nonadditive relationships in OLS regression:
  • To test for interaction effects, you first have to compute an interaction term
  • To compute an interaction, you multiply the variables together:
  • compute interaction=x1*x2.
  • For interval-ratio level variables, it is best to use the centered version to compute the interaction; this will simplify the interpretation of coefficients
  • To estimate the model, you must include the original variables and the interaction term in the regression model (i.e., x1, x2, and x1*x2)
  • You can have interactions involving any combination of dummy and interval-ratio variables
  • We will focus on two-way interactions – that is, an interaction between two variables (other more complex interactions are also possible, such as three-way interactions)
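The centering and multiplication steps described above can be sketched outside SPSS as well. The short Python example below uses made-up values (the `educ` and `male` lists are hypothetical toy data, not the data analyzed in these notes) to show what a mean-centered variable and its product term look like case by case.

```python
from statistics import mean

# Hypothetical toy data (not from the course data set):
# years of education and a 0/1 dummy for male.
educ = [10, 12, 12, 14, 16, 20]
male = [0, 1, 0, 1, 0, 1]

# Mean-center the interval-ratio variable so that 0 means "average education".
educ_mean = mean(educ)
educ_centered = [x - educ_mean for x in educ]

# The interaction term is the case-by-case product of the two variables.
male_educ = [m * e for m, e in zip(male, educ_centered)]

print("mean education:", educ_mean)
print("centered education:", educ_centered)
print("male x centered education:", male_educ)
```

Notice that the product term is 0 for every woman (male=0), so it can only shift the education slope for men.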

Examples

A two-way interaction between two nominal variables (sex and race)

The questions guiding the analysis:

  • Is there a gender gap in pay and does it vary by race?
  • Is there a race gap in pay and does it vary by gender?
  • These questions are related; people generally focus on one of these (based on whichever is their variable of interest – for example, sex or race).

*Sex and Race.
compute male_white=male*race_white.
compute male_black=male*race_black.
compute male_hispanic=male*race_hispanic.
compute male_other=male*race_other.
freq vars=male_white male_black male_hispanic male_other.

REGRESSION
  /DESCRIPTIVES MEAN STDDEV CORR SIG N
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS CI(95) R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT income
  /METHOD=ENTER male race_white male_white.

Sex, race, and their interaction explain 6.8% of the variation in income.

The F Test: Is our model better than a model with no variables – that is, have we improved our ability to predict income by controlling for sex, race, and their interaction?

H0: Sex=Race=Interaction=0

H1: Not H0

Critical F=2.60 (d.f.1=3, d.f.2=5,132, alpha=.05)

Observed F=125.052

We reject the null hypothesis and conclude that we have significantly improved our ability to predict income by controlling for sex, race, and their interaction. The probability of obtaining an F value at least as large as the one we observed, assuming that the null hypothesis is correct, is less than our alpha of .05.
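As a rough check on the F test, the observed F can be approximated from R Square and the degrees of freedom using the standard formula F = (R²/k) / ((1 − R²)/d.f.2). The sketch below uses only the values reported above. Because R Square is rounded to three decimals, the result should come out close to, but not exactly equal to, the observed F of 125.052.

```python
# Values reported above for the sex-by-race model.
r_square = 0.068      # R Square (rounded in the output)
k = 3                 # number of predictors: male, race_white, male_white (d.f.1)
df_residual = 5132    # residual degrees of freedom (d.f.2)
critical_f = 2.60     # critical F at alpha = .05

# Overall F statistic implied by R Square.
f_approx = (r_square / k) / ((1 - r_square) / df_residual)

print("approximate F:", round(f_approx, 1))
print("reject H0:", f_approx > critical_f)
```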

Notice that the slope for the interaction term is statistically significant (t=4.592).

How can we interpret this interaction?

The significant interaction term suggests that the relationship between income and sex varies by race in the population (and also that the relationship between income and race varies by sex in the population).

You can make sense of this by calculating the predicted values:

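Male (0=female, 1=male); White (0=else, 1=White); White_Male (0=else, 1=White male). In the syntax above these are the variables male, race_white, and male_white. Each predicted value is the intercept plus the slope for Male, the slope for White, and the slope for White_Male, each multiplied by the value of its dummy variable (in that order).

  • White males: $41,970.15 = 18,232.36 + 9,841.78*1 + 2,971.03*1 + 10,924.98*1
  • White females: $21,203.39 = 18,232.36 + 9,841.78*0 + 2,971.03*1 + 10,924.98*0
  • Non-White males: $28,074.14 = 18,232.36 + 9,841.78*1 + 2,971.03*0 + 10,924.98*0
  • Non-White females: $18,232.36 = 18,232.36 + 9,841.78*0 + 2,971.03*0 + 10,924.98*0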
  • Notice that the gender gap in pay varies by race:
  • White: $20,766.76 = 41,970.15 – 21,203.39
  • Non-White: $9,841.78 = 28,074.14 – 18,232.36
  • The gender gap in pay is larger for those who are White
  • Notice that the race gap in pay varies by gender:
  • Male: $13,896.01 = 41,970.15 – 28,074.14
  • Female: $2,971.03 = 21,203.39 – 18,232.36
  • The race gap in pay is larger for men than women
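The predicted values and gaps listed above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, using the intercept and slopes reported above; the function name `predicted_income` is my own and is not part of the SPSS output.

```python
# Coefficients reported above for the sex-by-race model.
intercept = 18232.36
b_male = 9841.78          # slope for Male
b_white = 2971.03         # slope for White
b_male_white = 10924.98   # slope for the interaction (White_Male)

def predicted_income(male, white):
    """Predicted income for 0/1 values of the Male and White dummies."""
    return intercept + b_male * male + b_white * white + b_male_white * male * white

# Gender gap within each racial group.
for white in (1, 0):
    men = predicted_income(1, white)
    women = predicted_income(0, white)
    label = "White" if white else "Non-White"
    print(f"{label}: men {men:,.2f}, women {women:,.2f}, gender gap {men - women:,.2f}")

# Race gap within each sex.
for male in (1, 0):
    gap = predicted_income(male, 1) - predicted_income(male, 0)
    print(f"{'Male' if male else 'Female'}: race gap {gap:,.2f}")
```

The two gender gaps differ by exactly the interaction slope (10,924.98), and so do the two race gaps.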

A two-way interaction between one nominal and one interval-ratio variable (sex and education)

The questions guiding the analysis:

  • Is the relationship between income and education the same for men and women?
  • Is the gender gap in pay the same at all levels of education?

*Sex and Education.
compute male_educ=male*educ_centered.

REGRESSION
  /DESCRIPTIVES MEAN STDDEV CORR SIG N
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS CI(95) R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT income
  /METHOD=ENTER male educ_centered male_educ.

Sex, education, and their interaction explain 12.8 percent of the variation in income.

The F test suggests that our estimated model is better than a model with no variables.

  • The intercept is 20,698.87, which is the predicted income for women with average education (because women are the reference category on the male dummy variable and because education is mean-centered).
  • With a focus on education (this is easier to understand in my opinion):
  • The slope for education is 2,368.24; this is the education slope for the reference category on sex (i.e., women). For women, each additional year of education increases income by $2,368.24.
  • The slope for the interaction term is 1,562.69; this is the additional impact of education on income for men. For men, each additional year of education increases income by $3,930.93 (2,368.24+1,562.69)

  • With a focus on gender (this is more difficult to understand in my opinion):
  • The gender gap in pay is equal to 17,401.56 for those with average levels of education (because education is mean-centered) – that is, men earn $17,401.56 more than women, on average, when both the men and the women have average education
  • $17,401.56 is the gender pay gap for those with average education, but the interaction suggests that the size of the gender gap varies with education; there is a 1,562.69 change in the original 17,401.56 value for each year that we move away from the mean education
  • With each additional year of education past the mean education, the gender gap in pay increases by 1,562.69
  • With each year of education below the mean education, the gender gap in pay decreases by 1,562.69
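Both readings of the sex-by-education interaction can be checked with a short calculation. The Python sketch below uses the slopes reported above; the helper names `education_slope` and `gender_gap` are mine. The years-from-the-mean values (-2, 0, 2) are picked only for illustration.

```python
# Coefficients reported above for the sex-by-education model.
b_male = 17401.56       # gender gap at average education
b_educ = 2368.24        # education slope for women (the reference category)
b_male_educ = 1562.69   # slope for the interaction term

def education_slope(male):
    """Education slope for women (male=0) or men (male=1)."""
    return b_educ + b_male_educ * male

def gender_gap(educ_centered):
    """Gender gap in predicted income at a given distance from mean education."""
    return b_male + b_male_educ * educ_centered

# Education focus: one slope per sex.
print("education slope, women:", round(education_slope(0), 2))
print("education slope, men:", round(education_slope(1), 2))

# Gender focus: the gap changes with each year away from the mean.
for years_from_mean in (-2, 0, 2):
    print("gender gap at", years_from_mean, "years from the mean:",
          round(gender_gap(years_from_mean), 2))
```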

A two-way interaction between two interval-ratio variables (age and education)

The questions guiding the analysis:

  • Is the relationship between income and education the same for people of all ages?
  • Is the relationship between income and age the same for people of all education levels?

*Education and Age.
compute educ_age=educ_centered*age_centered.

REGRESSION
  /DESCRIPTIVES MEAN STDDEV CORR SIG N
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS CI(95) R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT income
  /METHOD=ENTER educ_centered age_centered educ_age.

  • The intercept is 30,391.938, which is the predicted income for a person of average education and age (because education and age are mean centered)
  • With a focus on education:
  • The effect for education for those of average age (i.e., when age_centered=0 because age is centered) is 3,598.17 – for those of average age (i.e., 56), each additional year of education increases income by $3,598.17
  • With each additional year of age past the mean age, the education slope decreases by $99.56
  • The mean age is about 56 and the standard deviation is about 5, so a person who is 66 is 2 standard deviations above the mean. The slope for 66 years of age=2,602.61
  • 2,602.61=3,598.17+(10*-99.556)
  • 3,598.17 is the slope when age_centered=0 (i.e., at the mean age because age is centered)
  • I used ‘10’ because 10 is 2 standard deviations above the mean on the centered age variable
  • I used ‘-99.556’ because this is the slope for the interaction
  • With each year of age lower than the mean age, the education slope increases by $99.56
  • A person who is 46 is 2 standard deviations below the mean age of 56. The slope for 46 years of age=4,593.73
  • 4,593.73=3,598.17+(-10*-99.556)
  • 3,598.17 is the slope when age_centered=0 (i.e., at the mean age because age is centered)
  • I used ‘-10’ because -10 is 2 standard deviations below the mean on the centered age variable
  • I used ‘-99.556’ because this is the slope for the interaction

  • With a focus on age:
  • The effect for age for those with average education (because education is centered) is 146.40 – for those of average education, each additional year of age increases income by $146.40
  • With each additional year of education past the mean education, the age slope decreases by $99.56
  • With each year of education below the mean education, the age slope increases by $99.56
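With two interval-ratio variables, each slope becomes a function of the other (centered) variable. The Python sketch below reproduces the slopes calculated above from the reported coefficients; the function names are my own, and the values passed in (10 years of age, 1 year of education) are only examples.

```python
# Coefficients reported above for the education-by-age model.
b_educ = 3598.17      # education slope at the mean age
b_age = 146.40        # age slope at the mean education
b_educ_age = -99.556  # slope for the interaction term

def education_slope(age_centered):
    """Education slope at a given number of years above (+) or below (-) the mean age."""
    return b_educ + b_educ_age * age_centered

def age_slope(educ_centered):
    """Age slope at a given number of years above (+) or below (-) the mean education."""
    return b_age + b_educ_age * educ_centered

# Education focus: 2 standard deviations (10 years) above and below the mean age.
print("education slope at age 66:", round(education_slope(10), 2))
print("education slope at age 46:", round(education_slope(-10), 2))

# Age focus: one year above and below the mean education.
print("age slope, 1 year above mean education:", round(age_slope(1), 2))
print("age slope, 1 year below mean education:", round(age_slope(-1), 2))
```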

One Final Multivariate Regression

Notice that I entered the interactions in block 2.

*Multivariate Regression.
REGRESSION
  /DESCRIPTIVES MEAN STDDEV CORR SIG N
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS CI(95) R ANOVA CHANGE
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT income
  /METHOD=ENTER male race_white educ_centered age_centered
  /METHOD=ENTER male_white male_educ educ_age.

The F Test: Is our model with interactions (Model 2 above) better than a model with no interactions (Model 1 above) – that is, have we improved our ability to predict income by controlling for the three interactions?

H0: Sex*RaceSex*Education =Education*Age=0

H1: Not H0

Critical F=2.60 (d.f.1=3, d.f.2=5,128, alpha=.05)

Observed F=11.925

We reject the null hypothesis and conclude that we have significantly improved our ability to predict income by controlling for the three interactions. The probability of obtaining an F value at least as large as the one we observed, assuming that the null hypothesis is correct, is less than our alpha of .05.

R Square: Sex, race, education, age, and the three interactions explain 13.2 percent of the variation in income.

From Model 2:

  • The Intercept: the predicted income for a non-White female with average education and of average age is $19,966.25

Sex and Race

How to Calculate the Predicted Incomes:

  • Non-White Female: Intercept
  • Non-White Male: Intercept + slope for ‘male’
  • White Female: Intercept + slope for ‘race_white’
  • White Male: Intercept + slope for ‘male’ + slope for ‘race_white’ + slope for ‘male_white’

Predicted Income

             Male         Female       Gender Gap
White        39,346.66    20,491.71    18,854.95
Non-White    34,675.76    19,966.25    14,709.51
Race Gap      4,670.90       525.46

The sex*race interaction is significant with a one-tailed test (t=1.743). Results suggest that the gender gap in pay varies by race – it is larger for those who are White ($18,854.95 compared to $14,709.51). Also, the race gap in pay varies by gender – it is larger for men ($4,670.90 compared to $525.46).
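The gaps in the table can be recomputed from the four predicted incomes. Given the calculation rules listed above, the difference between the two gender gaps (or, equivalently, between the two race gaps) is the amount contributed by the slope for ‘male_white’. The Python sketch below uses only the four predicted values from the table; the dictionary layout is my own.

```python
# Predicted incomes from the Model 2 table (average education and age).
predicted = {
    ("White", "Male"): 39346.66,
    ("White", "Female"): 20491.71,
    ("Non-White", "Male"): 34675.76,
    ("Non-White", "Female"): 19966.25,
}

# Gender gap within each racial group.
gender_gap = {
    race: predicted[(race, "Male")] - predicted[(race, "Female")]
    for race in ("White", "Non-White")
}

# Race gap within each sex.
race_gap = {
    sex: predicted[("White", sex)] - predicted[("Non-White", sex)]
    for sex in ("Male", "Female")
}

# Either difference of gaps recovers the interaction.
from_gender_gaps = gender_gap["White"] - gender_gap["Non-White"]
from_race_gaps = race_gap["Male"] - race_gap["Female"]

print({race: round(gap, 2) for race, gap in gender_gap.items()})
print({sex: round(gap, 2) for sex, gap in race_gap.items()})
print(round(from_gender_gaps, 2), round(from_race_gaps, 2))
```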

Sex and Education

With an Education Focus:

  • The education slope for women is $2,239.79; for women, each year of education increases income by $2,239.79
  • The education slope for men is $3,903.52 (2,239.79+1,663.73); for men, each year of education increases income by $3,903.52
  • Conclusion: women are not rewarded (i.e., with respect to wage income) as much as men for education
  • The graph would look similar to the one on page 5

Education and Age

Education focus:

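  • The education slope for those of average age is $2,239.79 (this is the slope for women, the reference category on sex, because Model 2 also includes the sex-by-education interaction); for those of average age, each additional year of education increases income by $2,239.79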
  • The education slope decreases by $135.36 with every additional year of age
  • Conclusion: older individuals are not rewarded (i.e., with respect to wage income) as much as younger individuals for education
  • The graph would look similar to the one on page 6
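Because Model 2 includes both the sex-by-education and the education-by-age interactions, the education slope depends on sex and age at the same time. The Python sketch below combines the three slopes reported above. It assumes that the education-by-age slope is -135.36 (the text says only that the education slope decreases by $135.36 per year of age), and the function name is my own.

```python
# Slopes reported above for Model 2.
b_educ = 2239.79       # education slope for women of average age
b_male_educ = 1663.73  # additional education slope for men
b_educ_age = -135.36   # change in the education slope per year of age (sign assumed from the text)

def education_slope(male, age_centered):
    """Education slope for a given sex (0=female, 1=male) and centered age."""
    return b_educ + b_male_educ * male + b_educ_age * age_centered

# Compare women and men at the mean age and 10 years above it.
for male in (0, 1):
    for age_centered in (0, 10):
        print("male =", male, "| years above mean age =", age_centered,
              "| education slope =", round(education_slope(male, age_centered), 2))
```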
