SPSS Caliper Matching Program

SPSS MATCH WITH CALIPERS

Be aware up front that there is a reason almost all of the articles you read used something other than SPSS for propensity score matching with calipers. The first thing you need to know is that this is done using the Python plug-in for integration technologies, which must be installed AFTER Python is installed IN THE CORRECT DIRECTORY EXPECTED BY SPSS.

(The way I made sure of this personally was to download the SPSS plug-in, start the installation and, when it asked if Python 2.6.4 was in the c:\Python26 directory, download that exact version of Python into that exact directory.)

If you don’t have the plug-in installed, you can download the Python plug-in for IBM® SPSS® Statistics from IBM.

If you don’t have the exact version of Python, you can download it from here.

Once you have Python and the plug-in installed, you can run the first program, which runs a logistic regression and creates three data sets. The data set test.sav is your original data set with the propensity scores added, along with a variable treatm, which is 1 if the record is in the smaller group (to be matched) and 0 if it is in the larger group. There is also a variable newid, which is just a number from 1 to the number of records in the data set. The data set test1.sav contains the subjects in the treatm = 1 group and test0.sav contains the subjects in the treatm = 0 group.

This program also does a DESCRIPTIVES step that gives you the standard deviation of the propensity score. Unlike SAS, SPSS has no option to output the logit (the logit option you see in the menu is the logit residual, which is not the same thing). For the FUZZ factor, I used .25 times the standard deviation of the propensity score. I have no better rationale for this than that it is a small number on the same metric as the propensity score itself, which is what is being used for the matching.
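
The arithmetic for the FUZZ factor can be sketched in a few lines of standalone Python 3, run outside SPSS. This is not part of the programs described here. The function name fuzz_factor and the scores are made up for illustration; in practice you would take the standard deviation that DESCRIPTIVES reports for PRE_1 and multiply it by hand.

```python
import statistics


def fuzz_factor(propensity_scores, multiplier=0.25):
    """Return multiplier times the sample standard deviation of the scores.

    Uses the n - 1 (sample) standard deviation.
    """
    return multiplier * statistics.stdev(propensity_scores)


# Hypothetical propensity scores, for illustration only.
scores = [0.10, 0.20, 0.30, 0.40, 0.50]

print(round(fuzz_factor(scores), 4))        # .25 times the standard deviation
print(round(fuzz_factor(scores, 0.1), 4))   # a tighter caliper, .1 times the SD
```

A smaller multiplier gives a tighter caliper, so fewer cases will find a match but the matches will be closer.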

If you want to do what I did, you can run the program in two parts. The prepare.sps program ends here and gives you the standard deviation of the propensity score, which you can then multiply by .25, .1 or whatever you like to get your FUZZ factor. Then you can run the fuzzyex3.sps program using that factor.

If you know what factor you want, you can just open the finalcaliper.sps program, enter everything (directory, dependent, independent, fuzz factor) and run everything all at once. The finalcaliper program is just the prepare.sps and fuzzyex3.sps programs in the same file.

The SPSS/Python FUZZY module first looks for an exact match and then, if no exact match is found, selects a case within the fuzz factor given. In this respect, it will produce a less optimal match than SAS. The reason is that SAS has some features that are not available in SPSS, so doing the match exactly the same way as in SAS is not feasible.
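
To make the two-pass idea concrete, here is a minimal sketch in standalone Python 3. It is not the FUZZY module's own code. The function name caliper_match, the rule of taking the first eligible supplier, and matching without replacement are all assumptions made for illustration, and the scores are made up.

```python
def caliper_match(demanders, suppliers, fuzz):
    """Match each demander to at most one unused supplier.

    demanders, suppliers: dicts mapping case id -> propensity score.
    Returns a dict mapping demander id -> supplier id, or None if no
    supplier is within the fuzz factor.
    """
    available = dict(suppliers)  # copy so the caller's dict is untouched
    matches = {}
    for d_id, d_score in demanders.items():
        # First pass: look for a supplier with exactly the same score.
        chosen = next(
            (s_id for s_id, s in available.items() if s == d_score), None
        )
        if chosen is None:
            # Second pass: take the first supplier within the fuzz factor.
            chosen = next(
                (s_id for s_id, s in available.items()
                 if abs(s - d_score) <= fuzz),
                None,
            )
        matches[d_id] = chosen
        if chosen is not None:
            del available[chosen]  # assumed: each supplier is used only once
    return matches


# Hypothetical ids and propensity scores.
demanders = {1: 0.30, 2: 0.52, 3: 0.90}
suppliers = {10: 0.31, 11: 0.30, 12: 0.55, 13: 0.10}

print(caliper_match(demanders, suppliers, fuzz=0.05))
# prints {1: 11, 2: 12, 3: None}
```

Case 1 gets the exact match (11) even though case 10 is also within the fuzz factor. Case 3 has no supplier within .05, so it stays unmatched. Because the second pass accepts any supplier inside the caliper rather than the closest one, the result can be less optimal than a nearest-neighbor match.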

HOW TO USE IT

Once you have the Python plug-in correctly installed (no small task), the program should not be that difficult to run.

How to use this program

1. Open SPSS

2. Open the syntax file 'finalcaliper.sps'

3. Change the path in quotes below to be where your data are stored. The example below is for a Mac.

/* Change file path here and only here */

DEFINE !pathd() '/Volumes/Mystuff/SaveHere/' !ENDDEFINE.

The code will work equally well on a PC. On Windows it will look like this.

DEFINE !pathd() 'c:\users\me\Documents\SaveHere\' !ENDDEFINE.

In either case, be sure you have that slash at the end.

4. Change the input file in the quotes below to be your input file. The example below is for a Mac.

/* This is the data set with all of my original data */

DEFINE !readin() '/Volumes/Otherplace/HereIs/inputfile.sav' !ENDDEFINE.

On Windows it will look like this.

DEFINE !readin() 'c:\users\me\Documents\SaveHere\input.sav' !ENDDEFINE.

5. The program assumes your dependent variable is named treatm and coded 0 = control (larger) group and 1 = cases (treatment) group. If that is the case, you don’t need the first COMPUTE statement. If the dependent variable is coded 0,1 but not named treatm, you need to rename it. If it isn’t coded 0 or 1, you need to recode it. Here is an example, if you want to use syntax, for a file where the larger group was actually the cases and the variable was not named treatm. If you feel more comfortable using the GUI, you can also use the menus to compute a new variable, or just change the variable name in the variable view. Just be sure you save the changed file before running this program. Note that you also need a variable named newid. That is created by the second COMPUTE statement.

GET
FILE= !readin .
* COMPUTE treatm=1 - city_of_injury . <------Delete this statement or modify it.

COMPUTE newid = $CASENUM.

EXECUTE.

6. In the LOGISTIC REGRESSION step change “depend” to be the name of your dependent variable and the variables after ENTER to be the names of your independent variables. Don’t change anything else.

LOGISTIC REGRESSION VARIABLES depend
/METHOD=ENTER V1 v2 othervar morevar1 morevar2
/SAVE=PRED
/CRITERIA=PIN(.05) POUT(.10) ITERATE(20) CUT(.5).
SAVE OUTFILE=!pathd + "test.sav" .

7. This step calls the Python fuzzy match module. The only thing you need to change is the FUZZ value.

GET
FILE= !pathd +'test0.sav'.
dataset name supplier.

FUZZY DEMANDERDS=demander SUPPLIERDS=supplier
BY= PRE_1 SUPPLIERID=newid
NEWDEMANDERIDVARS=matchedcaseid FUZZ = .0089
DS3 = matches .
EXECUTE.

8. That’s it. Select all, click the green RUN triangle and your program will run. It takes a few minutes. At the end, you will have a data set named test4.sav. You can do any tests for significant differences between means to verify that the match worked. When you close SPSS you will be asked if you want to save test4.sav. Click YES.
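
As a complement to significance tests, one common descriptive check of a match is the standardized mean difference on each covariate: the difference in group means divided by a pooled standard deviation. This is a different technique from the significance tests mentioned above, not a replacement for them. The sketch below is standalone Python 3, run outside SPSS. The function name and the ages are hypothetical, and it assumes you have exported a covariate for the matched treatm = 1 and treatm = 0 groups.

```python
import statistics


def standardized_mean_difference(treated, control):
    """Difference in means divided by the pooled standard deviation.

    The pooled SD here is the square root of the average of the two
    sample variances.
    """
    pooled_sd = (
        (statistics.variance(treated) + statistics.variance(control)) / 2
    ) ** 0.5
    return (statistics.mean(treated) - statistics.mean(control)) / pooled_sd


# Hypothetical ages for matched cases and controls.
treated_age = [34, 41, 29, 50, 38]
control_age = [33, 42, 30, 49, 39]

smd = standardized_mean_difference(treated_age, control_age)
print(round(smd, 3))
```

A value close to zero suggests the two groups are similar on that covariate after matching. How close is close enough is a judgment call that this sketch does not make for you.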

COMPLETE FINALCALIPER.SPS PROGRAM

NOTE: Word adds its own formatting characters to files. Also, SPSS syntax requires specific spacing for some commands. Do not copy and paste the code from this Word document. Use the SPSS-formatted file propensityLEV.sps instead.

Highlighted items (directory, input file, dependent variable, independent variables, fuzz factor) need to be specified

DEFINE !pathd() 'C:\Users\AnnMaria\Documents\LearSPSS\calipers\' !ENDDEFINE.
DEFINE !readin() 'C:\Users\AnnMaria\Documents\LearSPSS\foodinc_1.sav' !ENDDEFINE.

GET
FILE= !readin .
COMPUTE treatm=1 - city_of_injury.
COMPUTE newid = $CASENUM.
EXECUTE.

LOGISTIC REGRESSION VARIABLES depend
/METHOD=ENTER independents
/SAVE=PRED LRESID
/CRITERIA=PIN(.05) POUT(.10) ITERATE(20) CUT(.5).
SAVE OUTFILE=!pathd + "test.sav" .

DESCRIPTIVES VARIABLES= PRE_1
/STATISTICS=STDDEV.
EXECUTE.

GET
FILE= !pathd +'test.sav'.
SELECT IF (treatm=1 ).
SAVE OUTFILE=!pathd + 'test1.sav' / KEEP = newid treatm PRE_1 .
EXECUTE.

GET
FILE= !pathd +'test.sav'.
SELECT IF (treatm=0 ).
SAVE OUTFILE=!pathd + 'test0.sav' / KEEP = newid treatm PRE_1.
EXECUTE.

GET
FILE= !pathd +'test.sav'.
USE ALL .
SAVE OUTFILE=!pathd + 'test.sav' .
EXECUTE.

* Program to create match.

BEGIN PROGRAM PYTHON.
import spss
print 'Hello, World. '
import FUZZY
print FUZZY
END PROGRAM.

* The section above checks to see that Python is installed properly and that the FUZZY module is installed.

* Python is CASE-SENSITIVE, typing 'fuzzy' will not work.

GET
FILE= !pathd +'test1.sav'.
dataset name demander.
EXECUTE.

GET
FILE= !pathd +'test0.sav'.
dataset name supplier.

FUZZY DEMANDERDS=demander SUPPLIERDS=supplier
BY= PRE_1 SUPPLIERID=newid
NEWDEMANDERIDVARS=matchedcaseid FUZZ = .0089
DS3 = matches .
EXECUTE.

* This is the FUZZY module that does the match.

DATASET ACTIVATE demander.
SAVE OUTFILE= !pathd + 'matches.sav'
/COMPRESSED.
EXECUTE.

* Make data set .

GET
FILE= !pathd + 'matches.sav' .
SELECT IF (newid > 0).
SAVE OUTFILE= !pathd + 'test3.sav'
/DROP = newid
/COMPRESSED.
EXECUTE.

GET
FILE= !pathd + 'test3.sav' .
RENAME VARIABLES matchedcaseid = newid.
SAVE OUTFILE= !pathd + 'test3.sav' .
EXECUTE.

ADD FILES FILE= !pathd + 'test3.sav'
/FILE = !pathd + 'matches.sav' .
SAVE OUTFILE= !pathd + 'test4.sav' .
EXECUTE.

GET
FILE= !pathd + 'test4.sav' .
SORT CASES BY newid(A).
SAVE OUTFILE= !pathd + 'test4.sav'
/COMPRESSED.
EXECUTE.

GET
FILE=!readin .
COMPUTE newid = $CASENUM .
SORT CASES BY newid(A).
SAVE OUTFILE= !readin
/COMPRESSED.
EXECUTE.

MATCH FILES FILE= !pathd + 'test4.sav'
/TABLE= !readin /IN=test
/BY newid.
SAVE OUTFILE= !pathd + 'test4.sav' .
EXECUTE.

GET
FILE= !pathd + 'test4.sav'.
SELECT IF (test = 1).
SELECT IF (MISSING(matchedcaseid) ~= 1 or City_of_injury = 0).
EXECUTE.

SAVE OUTFILE= !pathd + 'test4.sav' .
EXECUTE.