STAT6250 Data Entry II Dr. Fan

Reading assignment: Chapters 3, 4

Reading Data Values Separated by Delimiters

SAS Syntax:

DATA dataname;

INFILE 'the location of the data file' DSD DLM='delimiter';

INPUT var1 var2 … vark;

Newvar = a function of the original variables;

RUN;

Note: DSD does three things:

1) It changes the default delimiter from a blank to a comma

2) It treats two consecutive delimiters as a missing value

3) It strips the quotation marks (' or ") from quoted values and treats any delimiter inside the quotes as part of the value
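The three effects of DSD can be seen in a small sketch. The data set name dsd_demo and the two data lines are made up for illustration, and the LENGTH statement is added only so that the longer name is not truncated to the default 8 characters.

```sas
/* Illustrative sketch: dsd_demo and its data lines are hypothetical */
data dsd_demo;
    infile datalines dsd;            /* DSD makes the comma the default delimiter */
    length Name $ 20;                /* room for a name longer than 8 characters */
    input Name $ Code $ Days;
    datalines;
"Tomato, Roma",,75
Carrot,C4,65
;
run;

proc print data=dsd_demo;
run;
```

In the first record, the comma inside the quotes should stay part of Name, the quotes themselves should be dropped, and the two consecutive commas should leave Code missing.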

Note: The delimiter can also be given as its hexadecimal value in quotes, followed immediately by x; for example, the hexadecimal value of TAB is 09, so a tab delimiter is written DLM='09'x.
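As a rough sketch of DLM= with a delimiter other than a comma or a tab, the step below reads values separated by a vertical bar. The data set name pipe_demo and the data lines are hypothetical. On ASCII systems the vertical bar has hexadecimal value 7C, so DLM='7C'x should behave the same as DLM='|' there.

```sas
/* Illustrative sketch: pipe_demo and its data lines are hypothetical */
data pipe_demo;
    infile datalines dlm='|';        /* on ASCII systems, dlm='7C'x is equivalent */
    input Name $ Code $ Days;
    datalines;
Tomato|R12|75
Carrot|C4|65
;
run;

proc print data=pipe_demo;
run;
```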

Example: Read in veggies data

/* read in CSV (comma-separated values) files */

data veg1;

infile 'C:\st6250\datafiles\veggies_comma.txt' dsd;

input Name $ Code $ Days Number Price;

run;

/* read in TAB-separated values files */

data veg2;

infile 'C:\st6250\datafiles\veggies_tab.txt' dsd dlm='09'x;

input Name $ Code $ Days Number Price;

run;

Reading Fixed Column Data

SAS Syntax:

INPUT @starting_column variable_name informat_and_size;

If the delimiter is either a blank or a comma, we can use

INPUT variable_name : informat_and_size;

If the delimiter is neither a blank nor a comma, declare the informats in an INFORMAT statement (Section 3.13) first.

Example:

filename fixed url "http://www.sci.csueastbay.edu/~sfan/SubPages/CSUteach/st6250/datafiles/veggies_fixed.txt";

/** naive approach: column input, does not always work **/

data veg3;

infile fixed;

input name $1-8

code $10-16

days 18-19

number 21-24

price 26-28;

run;

/** method 1 **/

data veg3;

*infile 'C:\Documents and Settings\ss152s21\Desktop\veggies_fixed.txt';

infile fixed;

input @1 Name $8.

@10 Code $7.

@18 Days 2.

@21 Number 4.

@26 Price 3.;

run;

/** method 2: for blank or comma delimiters only **/

data veg3;

infile fixed;

input Name: $8.

Code: $7.

Days: 2.

Number: 4.

Price: 3.;

run;

/** method 3: use informat **/

data veg3;

infile fixed;

informat Name $8.

Code $7.

Days 2.

Number 4.

Price 3.;

input name code days number price;

run;

proc print data=veg3;

run;

Exercise: Read in veggies_large.txt using each of the three methods, and name the resulting data set veg5.

Other Tips

Example: Reading names, addresses, and so on (Program 3-13)

/* incorrect way */

data list;

input code $3. name $20. dob mmddyy10. salary;

datalines;

001 Christopher Mullens 11/12/1955 $45,200

002 Michelle Kwo 9/12/1955 $78,123

003 Roger W. McDonald 1/1/1960 $107,200

;

run;

proc print data=list;

run;

Exercise: fix it!

Modifying Data Sets: The SET Statement

SAS Syntax:

DATA dataname_after_modification;

SET dataname_to_be_modified;

Your modification here

RUN;

Example:

/* adding more variables to a dataset */

data veg6;

set veg5;

costperseed=price/number;

run;
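Other kinds of modification follow the same DATA/SET pattern. The sketch below is self-contained: veg_demo and veg_quick are hypothetical data sets with made-up values, and the cutoff of 80 days is arbitrary. It keeps only some rows, adds a variable, and drops one.

```sas
/* Illustrative sketch: veg_demo, veg_quick, and the values are hypothetical */
data veg_demo;
    input Name $ Days Number Price;
    datalines;
Tomato 75 50 2.50
Carrot 65 1500 1.75
Corn 90 100 3.25
;
run;

data veg_quick;
    set veg_demo;                    /* start from the existing data set */
    if Days < 80;                    /* subsetting IF: keep only these rows */
    CostPerSeed = Price / Number;    /* add a new variable */
    drop Days;                       /* leave Days out of the new data set */
run;

proc print data=veg_quick;
run;
```

The original data set veg_demo is left unchanged; only veg_quick reflects the modifications.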

Saving Your SAS Data

SAS Syntax:

/* create library of datasets */

LIBNAME libraryname 'directory to store your library';

DATA libraryname.dataname;

data entry here;

RUN;

/* read the datasets in a library */

PROC CONTENTS DATA=libraryname.dataname VARNUM;

RUN;

PROC PRINT DATA=libraryname.dataname;

RUN;

Note: Each time you open a new SAS session, you must submit the following line again before you can access the data sets in the library:

LIBNAME libraryname 'directory to store your library';

Example:

/* creating permanent sas data sets */

libname stat6250 'C:\CSU\SAS';

data stat6250.veggies;

infile 'C:\st6250\datafiles\veggies_large.txt';

input Name : $8.

Code : $7.

DOA : mmddyy10.

Days : 2.

Number : 4.

Price : dollar4.;

/* label/format all special data variables */

label DOA='date of germination';

format doa mmddyy10.

price dollar7.2;

run;

proc contents data=stat6250.veggies varnum;

run;

proc print data=stat6250.veggies;

run;

/* use a data set from an existing library (e.g., in a new session) */

libname stat6250 'C:\CSU\SAS';

proc means data=stat6250.veggies;

var price;

run;

Exercise: Do Chapter 4 problem 3
