Christopher Knapp
University of Akron, Fall 2011
Statistical Data Management
Homework #6

Problem Statement

In this homework assignment, I am expecting you to be able to merge data sets. Please refer to Assignment #2 for an explanation of the data set that has 42,694 customers who have purchased an item in the last 12 months. Unfortunately, this time the data was split into the following two data sets:

Data Set #1:

  • Data set download Location: apps on 'Samba Server (R:)' > FridlineStatistical Data Management > Assignment #6Email Test1.csv

Data Set #2:

  • Data set download Location: apps on 'Samba Server (R:)' > FridlineStatistical Data Management > Assignment #6Email Test2.csv

There is a separate data set that describes the e-mail campaign the customer received:

  • 1/2 were randomly chosen to receive an e-mail campaign featuring Men’s merchandise.
  • 1/2 were randomly chosen to receive an e-mail campaign featuring Women’s merchandise.

Observe the following dataset that describes the type of email received associated with each customer:

Data Set #3:

  • Data set download Location: apps on 'Samba Server (R:)' > Fridline > Statistical Data Management > Assignment #6 > Email Indicator.csv

a) Please merge both the Email Test1 and Email Test2 data sets into one file.

b) Merge the variable information from the EmailIndicator dataset to the dataset in part #1.

c) Which e-mail campaign performed the best, the Men’s version, or the Women’s version? Why? Did the campaigns perform different when measured across different metrics, like Visitors, Conversion, and Total Spend?

Note:

  • In SAS, please provide your program and all supporting output.
  • In SPSS, please paste your syntax and all supporting output.
  • In Microsoft Excel, explain how you achieved this merging task. Mention any formulas that you used to complete this task. Also, please provide some supporting evidence of which campaign performed better using Excel features.

Contents

Merging in SAS

Page 1EmailTest1 and EmailTest2

Page 2Add EmailIndicator

Page 3SAS Code

Merging in SPSS

Page 4EmailTest1 and EmailTest2

Page 5Add EmailIndicator

Page 6SPSS Code

Merging in Excel

Page 7EmailTest1 and EmailTest2

Page 8Add EmailIndicator

Performance Analysis

Page 9Men’s Versus Women’s Version

Assignment 6Merging in SAS Page | 1

Merging EmailTest1 and EmailTest2 into SAS

The following table displays the first step in the merging process. Notice there are 42694 observations in this table. The SAS code can be found in the third header of this section, labeled “SAS Code”.

Add EmailIndicator to SAS Data

The following table displays the second step in the merging process. Notice there are 42694 observations in this table. The SAS code can be found in the third header of this section, labeled “SAS Code”.

SAS Code

libnamemylib'\\uanet.edu\ZIPSpace\C\crk32\Classes\F11 Statistical Data Management\Assignment 6\mylib';

data mylib.EmailTest1;

infile'R:\Fridline\Statistical Data Management\Assignment #6\Email Test1.csv'dsddlm=','firstobs=2;

input id recencydollars_spentwomenszip_codenew_customer channel visit conversion spend;

run;

data mylib.EmailTest2;

infile'R:\Fridline\Statistical Data Management\Assignment #6\Email Test2.csv'dsddlm=','firstobs=2;

input id recencydollars_spentwomenszip_codenew_customer channel visit conversion spend;

run;

datamylib.mergedObservations;

set mylib.emailtest1 mylib.emailtest2;

run;

datamylib.EmailIndicator;

infile'R:\Fridline\Statistical Data Management\Assignment #6\Email Indicator.csv'dsddlm=','firstobs=2;

length segment $15;

input id segment$;

run;

procsortdata=mylib.mergedObservations;

by id;

run;

procsortdata=mylib.EmailIndicator;

by id;

run;

datamylib.mergedObservationsAndColumns;

mergemylib.MergedObservationsmylib.EmailIndicator;

by ID;

run;

Assignment 6Merging in SPSS Page | 1

Merging EmailTest1 and EmailTest2 into SPSS

The result of the merge is displayed below – notice the total of 42694 entries. The code for this is under the third header of this section labeled “SPSS Code”.

Add EmailIndicator to SPSS Data

After completing the first step in the merging process, the variable Segment can be added. Notice the addition below. The code is in the next header of this section labeled “SPSS Code”.

SPSS Code

GET DATA

/TYPE=TXT

/FILE="R:\Fridline\Statistical Data Management\Assignment #6\Email Test1.csv"

/DELCASE=LINE

/DELIMITERS=","

/ARRANGEMENT=DELIMITED

/FIRSTCASE=2

/IMPORTCASE=ALL

/VARIABLES=

ID F5.0

Recency F2.0

Dollars_Spent F7.2

Mens F1.0

Womens F1.0

Zip_code F1.0

New_Customer F1.0

Channel F1.0

Visit F1.0

Conversion F1.0

Spend F6.2.

CACHE.

EXECUTE.

DATASET NAME DataSet1 WINDOW=FRONT.

GET DATA

/TYPE=TXT

/FILE="R:\Fridline\Statistical Data Management\Assignment #6\Email Test2.csv"

/DELCASE=LINE

/DELIMITERS=","

/ARRANGEMENT=DELIMITED

/FIRSTCASE=2

/IMPORTCASE=ALL

/VARIABLES=

ID F5.0

Recency F2.0

Dollars_Spent F7.2

Mens F1.0

Womens F1.0

Zip_code F1.0

New_Customer F1.0

Channel F1.0

Visit F1.0

Conversion F1.0

Spend F1.0.

CACHE.

EXECUTE.

DATASET NAME DataSet2 WINDOW=FRONT.

ADD FILES /FILE=*

/FILE='DataSet1'.

EXECUTE.

GET DATA

/TYPE=TXT

/FILE="R:\Fridline\Statistical Data Management\Assignment #6\Email Indicator.csv"

/DELCASE=LINE

/DELIMITERS=","

/ARRANGEMENT=DELIMITED

/FIRSTCASE=2

/IMPORTCASE=ALL

/VARIABLES=

ID F5.0

Segment A13.

CACHE.

EXECUTE.

DATASET NAME DataSet1 WINDOW=FRONT.

DATASET ACTIVATE DataSet2.

SORT CASES BY ID(A).

DATASET ACTIVATE DataSet1.

SORT CASES BY ID(A).

DATASET ACTIVATE DataSet2.

MATCH FILES /FILE=*

/FILE='DataSet1'

/BY ID.

EXECUTE.

Assignment 6Merging in Excel Page | 1

Merging EmailTest1 and EmailTest2 into Excel

For the first merging step, I just copied cells A2:K20695 from EmailTest2 and pasted into cell A22002 in EmailTest1. The resulting paste is shown below:

I sorted this file and saved it as comb1.csv.


Add EmailIndicator to Excel

Next, I opened and sorted the EmailIndicator dataset.

Recognizing that some ID values in comb1 may not be in EmailIndicator (and some in EmailIndicator may not be in comb1), I used a vlookup statement to fill the new Segment attribute in comb1. In the screen shot below, the vlookup statement is displayed for cell L2, and the Segment attribute is filled.

The last step is to copy/paste as plain text for column L so that the data itself can be saved and not the vlookup formula. This will prevent several problems that could occur in later analysis.

Assignment 6Performance Analysis Page | 1

Men’s Versus Women’s Versions

Although the directions claim that half of the recipients were sent the Men’s ad and half were sent the Women’s ad, this is not a true statement. From the excel sheet below (utilizing the countiffunction), 80 more people received the Men’s add than received Women’s add.

Therefore my analysis (through a basic pivot table) relied on averages and not sums. (Note that the average value of an indicator variable is equal to the proportion of thosehaving a positive indicator value).

From the pivot table above, recipients of the Men’s advertisement had a higher proportion visited, proportion converted, and dollar spent. Therefore the Men’s catalog was more successful.