STAT6250 Data Entry II Dr. Fan
Reading assignment: Chapters 3, 4
Reading Data Values Separated by Delimiters
SAS Syntax:
DATA dataname;
INFILE 'the location of the data file' DSD DLM='delimiter';
INPUT var1 var2 … vark;
Newvar = a function of the original variables;
RUN;
Note: DSD does three things:
1) It changes the default delimiter from a blank to a comma.
2) It treats two consecutive delimiters as a missing value.
3) It removes quotation marks (' or ") from character values and treats a delimiter inside a quoted string as part of the value.
Note: The delimiter can be given as its hexadecimal value in quotes, followed immediately by x; for example, a TAB (hexadecimal 09) is written '09'x.
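The three DSD behaviors can be seen in a small sketch. The data lines below are made up for illustration and are not from the course files; the data set name dsd_demo is likewise hypothetical.

```sas
/* Hypothetical inline data to illustrate DSD; not from the course files */
data dsd_demo;
   infile datalines dsd;          /* DSD: the default delimiter becomes a comma */
   input Name :$20. Code $ Days;  /* :$20. lets Name hold more than 8 characters */
   datalines;
"Tomato, Cherry",A12,70
Carrot,,65
;
run;

proc print data=dsd_demo;
run;
```

In the first record the comma inside the quotes should stay part of Name and the quotes themselves should be dropped; in the second record the two consecutive commas should leave Code missing.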
Example: Read in veggies data
/* read in CSV (comma-separated values) files */
data veg1;
infile 'C:\st6250\datafiles\veggies_comma.txt' dsd;
input Name $ Code $ Days Number Price;
run;
/* read in TAB-separated values files */
data veg2;
infile 'C:\st6250\datafiles\veggies_tab.txt' dsd dlm='09'x;
input Name $ Code $ Days Number Price;
run;
Reading Fixed Column Data
SAS Syntax:
INPUT @starting_column variable_name informat_and_size;
If the delimiter is either a blank or a comma, we can use
INPUT variable_name : informat_and_size;
If the delimiter is neither a blank nor a comma, define an INFORMAT (Section 3.13) first.
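As a rough sketch of the last case, the step below declares the informats first and then reads pipe-delimited values with plain list input. The pipe delimiter, the data lines, and the data set name veg_pipe are assumptions chosen for illustration only.

```sas
/* Hypothetical pipe-delimited records, made up for illustration */
data veg_pipe;
   infile datalines dlm='|';      /* delimiter is neither a blank nor a comma */
   informat Name   $8.
            Code   $7.
            Days   2.
            Number 4.
            Price  4.;
   input Name Code Days Number Price;   /* list input uses the informats above */
   datalines;
Tomato|T1234|70|100|2.50
Carrot|C5678|65|250|1.75
;
run;

proc print data=veg_pipe;
run;
```

Declaring the informats in a separate statement keeps the INPUT statement short, which may be convenient when the same variables are read in several steps.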
Example:
filename fixed url "http://www.sci.csueastbay.edu/~sfan/SubPages/CSUteach/st6250/datafiles/veggies_fixed.txt";
/** naive approach; does not always work **/
data veg3;
infile fixed;
input name $1-8
code $10-16
days 18-19
number 21-24
price 26-28;
run;
/** method 1 **/
data veg3;
*infile 'C:\Documents and Settings\ss152s21\Desktop\veggies_fixed.txt';
infile fixed;
input @1 Name $8.
@10 Code $7.
@18 Days 2.
@21 Number 4.
@26 Price 3.;
run;
/** method 2: for blank or comma delimiters only **/
data veg3;
infile fixed;
input Name: $8.
Code: $7.
Days: 2.
Number: 4.
Price: 3.;
run;
/** method 3: use informat **/
data veg3;
infile fixed;
informat Name $8.
Code $7.
Days 2.
Number 4.
Price 3.;
input name code days number price;
run;
proc print data=veg3;
run;
Exercise: Read in veggies_large.txt using each of the three methods, and name the resulting data set veg5.
Other Tips
Example: Reading names, address and so on (Program 3-13)
/* incorrect way */
data list;
input code $3. name $20. dob mmddyy10. salary;
datalines;
001 Christopher Mullens 11/12/1955 $45,200
002 Michelle Kwo 9/12/1955 $78,123
003 Roger W. McDonald 1/1/1960 $107,200
;
run;
proc print data=list;
run;
Exercise: fix it!
Modifying Data Sets: The SET Statement
SAS Syntax:
DATA dataname_after_modification;
SET dataname_to_be_modified;
Your modification here
RUN;
Example:
/* adding more variables to a dataset */
data veg6;
set veg5;
costperseed=price/number;
run;
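Beyond adding a variable, the same DATA/SET pattern can also drop variables and keep only some rows. The following is a minimal, self-contained sketch; the data set names seeds and seeds_cheap, the data values, and the 0.05 cutoff are all hypothetical.

```sas
/* Hypothetical seed data, made up so the sketch is self-contained */
data seeds;
   input Name $ Number Price;
   datalines;
Tomato 100 2.50
Carrot 250 1.75
Pepper 40 3.20
;
run;

/* Copy seeds, add a computed variable, and keep only some rows */
data seeds_cheap;
   set seeds;
   CostPerSeed = Price / Number;   /* new variable */
   if CostPerSeed < 0.05;          /* subsetting IF: keeps rows where the condition is true */
   drop Price;                     /* leave Price out of the new data set */
run;

proc print data=seeds_cheap;
run;
```

The original data set seeds is left unchanged; only seeds_cheap reflects the modifications.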
Saving Your SAS Data
SAS Syntax:
/* create library of datasets */
LIBNAME libraryname 'directory to store your library';
DATA libraryname.dataname;
data entry here;
RUN;
/* read the datasets in a library */
PROC CONTENTS DATA=libraryname.dataname VARNUM;
RUN;
PROC PRINT DATA=libraryname.dataname;
RUN;
Note: When you open a new SAS session, you must submit the following line again before you can access the data sets in the library:
LIBNAME libraryname 'directory to store your library';
Example:
/* creating permanent sas data sets */
libname stat6250 'C:\CSU\SAS';
data stat6250.veggies;
infile 'C:\st6250\datafiles\veggies_large.txt';
input Name : $8.
Code : $7.
DOA : mmddyy10.
Days : 2.
Number : 4.
Price : dollar4.;
/* label/format all special data variables */
label DOA='date of germination';
format doa mmddyy10.
price dollar7.2;
run;
proc contents data=stat6250.veggies varnum;
run;
proc print data=stat6250.veggies;
run;
/* read a library */
libname stat6250 'C:\CSU\SAS';
proc means data=stat6250.veggies;
var price;
run;
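To check what a library currently holds, one option is to list its members with PROC CONTENTS. This sketch assumes the folder C:\CSU\SAS from the example above exists on your machine.

```sas
/* Assumes the folder C:\CSU\SAS exists, as in the example above */
libname stat6250 'C:\CSU\SAS';

/* _ALL_ refers to every data set in the library;
   NODS suppresses the per-data-set details and prints only the directory */
proc contents data=stat6250._all_ nods;
run;
```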
Exercise: Do Chapter 4 problem 3