Exploratory Methods for Categorical/Nominal Variables

Bivariate Relationships

This assignment focuses on the graphical displays and numerical summaries used to examine the relationship between two categorical variables. The problems are similar to the examples shown in class. For help using JMP on this assignment, refer to the Bivariate Displays for Categorical Data tutorial, which is available in the Tutorials section of the course webpage.

1. SUICIDES IN THE FORMER WEST GERMANY

Data File: Suicide2.JMP in the Categorical JMP folder

Key Words: Contingency table, mosaic plot, correspondence analysis

Topics: Social science, medicine

The data for this example are taken from a study of suicides in the former West Germany in the years 1974 to 1977, reported by Van der Heijden and de Leeuw (1985). Nine methods of suicide were tabulated by sex and age category. The primary interest here is in the variation of suicide patterns by age and by sex. The data can be regarded as a two-way contingency table, with 34 gender/age categories and 9 methods of suicide.

The variables in this data set are:

  • Gender-Age - variable representing the gender and age of the person who committed suicide. The coding works as follows: m2025 = men 20-25 yrs of age, w5055 = women 50-55 yrs of age, etc.
  • Method - method of suicide (poison, cooking gas, toxic gas, hanging, drown, gun, knife, jump, other)
  • Freq - number of suicides for a given gender-age and method.
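For readers who want to see how such frequency data map onto a two-way table outside of JMP, the following is a minimal Python sketch. It assumes every Gender-Age code has the form of one letter followed by four digits (as in m2025 and w5055); other code forms are not handled. The function names parse_gender_age and build_table are hypothetical, and the counts are made up for illustration. They are not values from Suicide2.JMP.

```python
from collections import defaultdict


def parse_gender_age(code):
    """Split a code such as 'm2025' into (gender, low_age, high_age).

    Assumes one letter ('m' or 'w') followed by four digits.
    """
    gender = {"m": "men", "w": "women"}[code[0]]
    return gender, int(code[1:3]), int(code[3:5])


def build_table(records):
    """Collect (gender_age, method, freq) rows into nested dicts of counts."""
    table = defaultdict(lambda: defaultdict(int))
    for gender_age, method, freq in records:
        table[gender_age][method] += freq
    return table


# Made-up counts for illustration only; NOT taken from Suicide2.JMP.
records = [
    ("m2025", "poison", 10),
    ("m2025", "hanging", 20),
    ("w5055", "poison", 30),
    ("w5055", "hanging", 15),
]

table = build_table(records)
print(parse_gender_age("m2025"))
print(dict(table["w5055"]))
```

Each row of the resulting table corresponds to one gender-age category and each inner key to one method, which is the layout a contingency table analysis works from.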

The main questions of interest are:

  • Are gender-age and method of suicide independent?
  • If they are not independent, what is the nature of the relationship between these variables?
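The first question is the one addressed by the Pearson chi-square test for independence. As a rough sketch of what that test computes, the Python code below finds the expected counts under independence (row total times column total divided by the grand total), the chi-square statistic, and the degrees of freedom. The function name chi_square_independence is hypothetical and the 2 x 2 counts are made up for illustration; they are not from the course data. For the 34 x 9 table described above, the degrees of freedom would be (34 - 1)(9 - 1) = 264.

```python
def chi_square_independence(counts):
    """Return (statistic, df, expected) for a two-way table of counts.

    `counts` is a list of rows; every row and column total must be positive.
    """
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    n = sum(row_totals)

    # Expected count under independence: (row total * column total) / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Pearson statistic: sum over all cells of (observed - expected)^2 / expected
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(counts, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected


# Made-up 2 x 2 table for illustration only; NOT taken from Suicide2.JMP.
toy_counts = [[10, 20], [30, 15]]
stat, df, expected = chi_square_independence(toy_counts)
print(round(stat, 4), df)
print(expected)
```

A large statistic relative to its degrees of freedom is evidence against independence, but it says nothing about the nature of the relationship; that is what the second question asks about.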

Questions and Tasks

1. Using the Distribution option, examine the univariate displays for both gender-age and method of suicide. Briefly summarize what is displayed in each. (3 pts.)

2. Use the Fit Y by X option to examine the relationship between Gender-Age (X) and Method (Y). Briefly summarize what you learn from this analysis. Use only the mosaic plot and the summary of the Chi-square test for independence in your discussion. Focus on any apparent differences between the genders and any age trends that are evident. (3 pts.)
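When reading a mosaic plot, it can help to remember what the rectangles encode: the width of each X column is proportional to that category's share of the total, and the heights within a column are the conditional proportions of Y for that category. The Python sketch below computes these quantities for a small table. The function name mosaic_layout is hypothetical, the counts are made up for illustration (not from Suicide2.JMP), and every row is assumed to have a positive total.

```python
def mosaic_layout(counts, row_labels):
    """Return the column width and segment heights for each X category.

    width   = share of the grand total falling in that X category
    heights = conditional proportions of Y within that X category
    """
    n = sum(sum(row) for row in counts)
    layout = {}
    for label, row in zip(row_labels, counts):
        total = sum(row)
        layout[label] = {
            "width": total / n,
            "heights": [c / total for c in row],
        }
    return layout


# Made-up counts for illustration only; columns are (poison, hanging).
toy_counts = [[10, 20], [30, 15]]
layout = mosaic_layout(toy_counts, ["m2025", "w5055"])
for label, box in layout.items():
    print(label, round(box["width"], 3), [round(h, 3) for h in box["heights"]])
```

If X and Y were independent, the heights would be the same in every column, so differences in heights across columns are what to look for.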

3. Finally, use correspondence analysis to examine this relationship. To do this, select Correspondence Analysis from the pull-down menu at the top of the window. Discuss any gender differences in the method of suicide employed and any age trends that are apparent in these data. When considering age trends, it may be helpful to also look at the mosaic plot. This should be the most thorough summary of your analysis. (6 pts.)
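As background on what a correspondence analysis computes, here is a minimal Python sketch using NumPy. It takes the singular value decomposition of the standardized residuals from independence and scales the results to principal coordinates. This is one common convention, and the scaling JMP uses in its plot may differ. The function name correspondence_analysis is hypothetical and the 3 x 3 counts are made up for illustration; they are not from the course data.

```python
import numpy as np


def correspondence_analysis(counts):
    """Return (row_coords, col_coords, inertia) for a two-way table of counts.

    Coordinates are principal coordinates; inertia holds the squared
    singular values, one per dimension.
    """
    N = np.asarray(counts, dtype=float)
    P = N / N.sum()                 # correspondence matrix
    r = P.sum(axis=1)               # row masses
    c = P.sum(axis=0)               # column masses

    # Standardized residuals from the independence model r * c
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

    U, sv, Vt = np.linalg.svd(S, full_matrices=False)

    row_coords = (U * sv) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
    return row_coords, col_coords, sv ** 2


# Made-up 3 x 3 table for illustration only; NOT taken from Suicide2.JMP.
toy_counts = np.array([[20, 10, 5],
                       [10, 20, 10],
                       [5, 10, 30]])
row_coords, col_coords, inertia = correspondence_analysis(toy_counts)

# Share of total inertia explained by each dimension
print(np.round(inertia / inertia.sum(), 3))
# First two dimensions are what a 2-D correspondence plot would show
print(np.round(row_coords[:, :2], 3))
```

The total inertia equals the Pearson chi-square statistic divided by the grand total, so the plot can be read as a low-dimensional picture of the departure from independence: row and column categories that plot far from the origin in the same direction tend to occur together more often than independence would predict.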

2. EXAMINING JOB OPPORTUNITIES FOR WOMEN IN THE 1970's vs. 1990's

Data File: GenOcc.JMP in the Categorical JMP folder

Key Words: Contingency table, mosaic plot, correspondence analysis

Topics: Social science

The purpose of this analysis is to compare career opportunities for men and women in the early 1970's versus the early 1990's. One might expect to see much greater equality between genders in terms of occupation type and industry of employment during the later time period. Is that what these data show? All of the people sampled in both eras are exactly the same age and reported being employed. These data constitute a random sample of employed individuals from these two time periods.

Answer the question of interest for both occupation and industry of employment using any/all of the methods discussed in class. Summarize your findings for each in a separate paragraph. Include all supporting material with your write-up. (5 pts. each)

The variables in this data set are:

  • Gender-Era - indicator of gender and era of the respondent (Male70 = male from 1970's, Female70 = female from 1970's, Male90 = male from 1990's, Female90 = female from 1990's)
  • Indus - respondent's industry of employment (coded numerically as follows)
  1. Agriculture, Fishery, and Forestry
  2. Mining
  3. Construction
  4. Manufacturing
  5. Transportation, Communication and Other Public Utilities
  6. Wholesale and Retail Trade
  7. Finance, Insurance, and Real Estate
  8. Business and Repair Services
  9. Personal Services
  10. Entertainment and Recreation Services
  11. Professional and Related Services
  12. Public Administration
  13. Armed Forces
  14. Industry Not Reported (Deleted)
  • Occ - respondent's occupation (coded numerically as follows)
  1. Professional, Technical, and Kindred Workers
  2. Managers, Officials, and Proprietors except Farm
  3. Clerical and Kindred Workers
  4. Sales Workers
  5. Craftsmen, Foremen, and Kindred Workers
  6. Operative and Kindred Workers (includes mine laborers)
  7. Private Household Workers
  8. Service Workers except Private Household
  9. Farmers and Farm Managers
  10. Farm Laborers and Foremen
  11. Laborers, except Farm and Mine
  12. Members of the Armed Forces
  13. Occupation Not Reported (Deleted)
  14. Occupational Response - "I don't know" (Deleted)