Guide for using SPSS dataset: “STATES Crime”

by J. Spickard

This dataset allows students to analyze some of the social sources of crime rates, using data from the 50 U.S. states and the District of Columbia. The data were collected from various sources, mostly Census Bureau and other government records. They are up-to-date as of 2014. I have not provided full details, because I have not designed this dataset for original research. It simply does a fine job of teaching students how to work with aggregate data.

I have divided the 55 variables into three categories. The first category contains two variables: the state name and the region of the country it is in (Northeast, Midwest, South, or West). The second category contains 43 variables: demographic data, plus a few variables that various observers have suggested might be relevant to increasing or decreasing crime rates (economic inequality, teen births, voting rates, etc.). The third category (10 variables) contains data related to crimes, imprisonment, and law enforcement.

You can compare any two of the country’s four regions by using a t-test: use Region as the independent variable and variables from Categories 2 and 3 as dependent variables.[1]

You can also use correlations to explore the relationships between any two of the variables in Categories 2 and 3. Correlations can’t tell you anything about cause, but they can tell you which pairs of variables tend to be high or low in the same places, or in opposite places.

Category One: State/Regional Variables (categorical)

state name / region of the country (Northeast, Midwest, South, or West)


Category Two: Demographic Variables (interval-ratio)

total population / % pop under 18 / % pop over 65 / % pop in urban areas / % pop in rural areas
% pop white not latino / % pop latino / % pop black / % pop asian american / % pop native american
% pop with less than a high school education / % of population with at least a high school diploma / % of population with at least a college degree / % of population with a graduate degree / % of children enrolled in school
% ages 16-19 not working and not in school / public high school graduation rate (%) / % of high school freshmen not graduating after 4 years / % of high school graduates enrolling in college / per-pupil spending in public schools (k-12)
median earnings (2010 $) / % of children under 6 living in poverty / % of adults aged 65 & over living in poverty / % of households that are food-insecure / marginally-attached workers (per 10k working-age adults)
labor force participation rate (%, ages 16 to 64) / state per capita GDP ($) / 2013 state minimum wage ($ per hour) / state GINI coefficient (measure of inequality) / poverty rate (% below federal poverty threshold)
child poverty rate (% living in families below the poverty line) / state public assistance spending ($ per capita) / % of population using food stamps / 2013 unemployment rate (% ages 16 and over) / life expectancy at birth (years)
% of infants with low birth-weight / teen births (per 1,000 girls, age 15-19) / % of those 18 & older who smoke / % of adults reporting binge drinking in past 30 days / % of eligible voters who voted, 2012
# who could not vote due to felony convictions (per 100k voting-age population) / % of state legislators who are women / percent of women in state’s congressional delegation

Category Three: Crime Variables (interval-ratio)

# of violent crimes per 100,000 population / # of property crimes per 100,000 population / # of homicides per 100,000 population / % of homicides by firearms / # of suicides per 100,000 population (age-adjusted)
# of rapes per 100,000 population / incidents of child maltreatment, per 1,000 children / expenditures on state police ($ per resident) / # in jail, per 100,000 population / expenditures on state prisons ($ per prisoner)


The first activity is to examine each of these variables to see the mean, median, range, standard deviation, and other measures of central tendency and dispersion.

·  In SPSS, click menu item Analyze > Descriptive Statistics > Frequencies, choose your variable(s), then click the Statistics button and check the boxes for the statistics you want. Click Continue, then OK. Your information will be in the “Statistics” box on the Output screen. You’ll also see a (relatively useless) frequency table.[2]
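
If you want to see the arithmetic behind the numbers in the “Statistics” box, here is a minimal sketch in Python (not SPSS), using only the standard library. The values are made up for illustration and do not come from the STATES Crime dataset; the variable name `rates` is also just a placeholder.

```python
import statistics

# Made-up illustrative values, NOT taken from the STATES Crime dataset.
rates = [3.1, 4.7, 5.2, 2.0, 6.5, 4.7, 1.9]

summary = {
    "mean": statistics.mean(rates),          # arithmetic average
    "median": statistics.median(rates),      # middle value when sorted
    "range": max(rates) - min(rates),        # highest minus lowest
    "std_dev": statistics.stdev(rates),      # sample standard deviation (n - 1)
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The standard deviation here divides by n - 1, which is the same convention SPSS uses in its Frequencies output.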

The second activity involves finding correlations between the variables in Categories 2 and 3. For example, are there more crimes in states with more high-school dropouts? Are crime rates higher or lower in states that spend more money per pupil on their schools? Are people healthier in rich states than in poor states? How is economic inequality (measured by GINI) related to crime?

·  Use the “How to produce a Pearson Correlation in SPSS” and “How to make a scatterplot in SPSS” guides to explore your data.
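
To show what a Pearson correlation actually measures, here is a rough sketch of the calculation in Python (again, not SPSS). The function name `pearson_r`, the list names, and the five pairs of values are all invented for illustration; they are not from the dataset.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up values for five hypothetical states, NOT from the dataset.
dropout_pct = [8.0, 12.0, 15.0, 10.0, 20.0]
violent_rate = [250.0, 340.0, 410.0, 300.0, 520.0]

r = pearson_r(dropout_pct, violent_rate)
print(f"r = {r:.3f}")
```

A value of r near +1 means the two variables are high and low in the same places, a value near -1 means they are high and low in opposite places, and a value near 0 means there is little linear relationship. None of these tells you which variable, if either, causes the other.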

The third activity compares regions of the U.S. on any of these variables. Use a t-test with REGION as the independent (“grouping”) variable and one of the Category 2 or 3 variables as the dependent (“test”) variable. Is the murder rate higher in the South or in the West? How about suicide?

·  See the guide “How to run an independent-sample t-test in SPSS” for details.
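
For the curious, the t statistic that SPSS reports can be sketched in a few lines of Python. This is only an illustration of the arithmetic under stated assumptions: the function name `independent_t` is hypothetical, and the two lists are made-up rates for two imaginary regions, not values from the dataset. It computes both versions that appear in SPSS output (equal variances assumed and equal variances not assumed), but it does not compute the p-value, which SPSS supplies for you.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Return (pooled_t, welch_t) for two independent samples."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1)
    var_b = statistics.variance(group_b)

    # Equal variances assumed: pool the two variances.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    pooled_t = mean_diff / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

    # Equal variances not assumed (Welch): use each group's own variance.
    welch_t = mean_diff / math.sqrt(var_a / n_a + var_b / n_b)
    return pooled_t, welch_t

# Made-up homicide rates for two hypothetical regions, NOT from the dataset.
south = [6.1, 5.4, 7.2, 6.8, 5.9]
west = [4.0, 3.5, 5.1, 4.4, 3.8]

pooled_t, welch_t = independent_t(south, west)
print(f"t (equal variances assumed):     {pooled_t:.2f}")
print(f"t (equal variances not assumed): {welch_t:.2f}")
```

When the two groups are the same size, as in this made-up example, the two versions of t come out the same. They differ when the groups have different sizes and different spreads, which is common with regions that contain different numbers of states.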


[1] For instructions, see the guide “How to run an independent-sample t-test in SPSS”. You can also compare all four regions using Analyze > Compare Means > Means.

[2] You can create a bar graph in SPSS, but it’s a lot of work. It’s easier to export the dataset to Excel and create better graphs there.