-
Description
Data with various delimited formats can be read from an external data source
for your analysis. This is a very common situation considering that most of
time we prepare raw data in Excel before various analysis.
Typical data formats (created from your Excel datasets) have
a delimiter (such as a blank, comma, or tab, or user-defined delimiter,
etc.) that separates columns of data values.
Following example describes how to read such various delimited data formats
into your SAS source. Example SAS source is pretty much self-explanatory.
Before importing your data into SAS, there are several preps that you need to do.
- To use a direct-Excel-read-In approach from SAS, your Excel data
should have column label in the 1st row, and column labels should
*not* contain any space. If you have any space in your column label, concatenate,
i.e., a column label "Max Strength," you would concatenate it as "MaxStrength" in Exel.
- Equally, the first label column should start from 1st column -- no leading blank column(s).
- The same manner, the row contaning your column labels should start from 1st row
-- no leading blank row(s).
- Create a folder in your USB thumb drive, let's say, "expdesign", etc.
Then put both SAS source and Excel file together in to the folder "expdesign".
- Connect MoVE.odu.edu by using VMWare client.
- Once connected to MoVE, click and run 'General Lab desktop" to run SAS
- Once the 'General Lab desktop" is ready, at the top of screen,
"Options" and "connect USB drive." - now you should be able to
read/write directly from your USB Thumb drive.
- Start SAS, then first select "Tools/Option/Current working path", and
change the current path to your USB Thmb drive folder, i.e., "X:\ expdesign"
(where X is your USB Thumb drive)
- All set, "File/Open" and open your "xyz.sas" from your USB Thmb drive folder,
i.e., "X:\ expdesign", and "Run"
OPTIONS LINESIZE=92;
/* ----Direct Import from Excel file (*.xlsx) --------------------------- */
PROC IMPORT DATAFILE="lf_BioP.xlsx" /* Excel file to be read & imported into SAS */
DBMS=xlsx REPLACE /* REPLACE will replace the SAS dataset each time you import */
OUT=BioPConc; /* Import data from Excel, then put into a dataset called 'BioPConc' */
SHEET="SAS"; /* only read data from a worksheet "SAS" in "lf_BoiP.xlsx" Excel file */
GETNAMES=Yes; /* use first row labels for SAS variable names as-is in this SAS analysis */
RUN;
PROC PRINT DATA= BioPConc; /* Verify Excel data import went through correctly */
TITLE1 '================================================';
TITLE2 'Native Excel xlsx data Import Example';
TITLE3 'see [lf_BioP.xlsx] datafile below for data structure ';
TITLE4 '================================================';
RUN;
lf_BioP.xlsx -- Native Excel file format
(in 'SAS' worksheet. Column labels in the first row become SAS variable names)
/* After imnporting from Excel file, you now have eight (8) variables in SAS, */
/* */
/* Ldate */
/* n */
/* Mixing_Speed_rpm */
/* TSS_Mainstream */
/* TS_SidestreamSBPR */
/* Influent_OP */
/* Effluent_OP */
/* Diff_OP */
/* */
/* that can be analyzed. */
/* delimiter : type of delimiter */
/* comma delimiter = ',' */
/* tab delimiter = '09'x */
/* ! delimiter = '!' */
/* any delimiter, user-defined */
/* delimiter = 'user-defined' */
/* ----Comma Delimiter (=csv) --------------------------- */
data wq_ase_csv;
/* define character display length of variables to 10 char.s long */
length Substrate $ 10 ;
length Reactor $ 10 ;
infile 'wq_ase.csv' delimiter = ',' ;
input Substrate $ Reactor $ Temperature Salinity DO TSS TKN TP Ecoli FEnterococcus @@;
run;
proc print data=wq_ase_csv;
TITLE1 '================================================';
TITLE2 'Excel CSV data (comma delimited) Import Example';
TITLE3 'see [wq_ase.csv] datafile for data structure ';
TITLE4 '================================================';
run;
/* ----Tab Delimiter (Excel data dump, i.e., copy&paste and save )----------- */
/* ---- [09x] is the hexadecimal code for <Tab> character */
data wq_ase_tab;
length Substrate $ 10 ;
length Reactor $ 10 ;
infile 'wq_ase.txt' delimiter = '09'x ;
input Substrate $ Reactor $ Temperature Salinity DO TSS TKN TP Ecoli FEnterococcus @@;
run;
proc print data=wq_ase_tab;
TITLE1 '================================================';
TITLE2 'Excel data dump (Tab delimited) Import Example';
TITLE3 'see [wq_ase.txt] datafile for data structure ';
TITLE4 '================================================';
run;
/* ----User-defined Delimiter--------------------------- */
/* ----In this example, using [ ! ] as an user-defined delimiter */
data wq_ase_xyz;
length Substrate $ 10 ;
length Reactor $ 10 ;
infile 'wq_ase.xyz' delimiter = '!' ;
input Substrate $ Reactor $ Temperature Salinity DO TSS TKN TP Ecoli FEnterococcus @@;
run;
proc print data=wq_ase_xyz;
TITLE1 '================================================';
TITLE2 'User-defined delimiter Import Example';
TITLE3 'see [wq_ase.xyz] datafile for data structure ';
TITLE4 '================================================';
run;
/* ----User-defined Delimiter--------------------------- */
/* ----In this example, using [ _|_ ] as an user-defined delimiter */
data wq_ase_udf;
length Substrate $ 10 ;
length Reactor $ 10 ;
infile 'wq_ase.udf' delimiter = '_!_' ;
input Substrate $ Reactor $ Temperature Salinity DO TSS TKN TP Ecoli FEnterococcus @@;
run;
proc print data=wq_ase_udf;
TITLE1 '================================================';
TITLE2 'User-defined delimiter Import Example';
TITLE3 'see [wq_ase.udf] datafile for data structure ';
TITLE4 '================================================';
run;
/* ----Binary Encoded SAS data format -- *.sas7bdat---------------- */
/* ----In this example, stream-out to text via Proc Print -------- */
data new;
set 'wq_ase.sas7bdat';
run;
proc print data='wq_ase.sas7bdat';
TITLE1 '================================================';
TITLE2 'Stream-out to ASCII Text via Proc Print';
TITLE3 'with variable names ';
TITLE4 '================================================';
run;
/* ----Direct Read from a Website Address -- WEBDAV method ----------------- */
DATA wb8997;
FILENAME foo webdav
'http://asellus.cee.odu.edu/sas/.data/western_branch_wq_89-97.dat';
INFILE foo;
INPUT PrePost $ Month $ Day $ Year $ Station $ Depth
TotFe TotMn NH3 TotP @@;
run;
|
wq_ase.csv -- CSV data (comma delimited)
CVS,CVS1,11.01,17.92,12,7.0,1.1,0.04,21.8,25
CVS,CVS2,11.01,17.82,12,7.0,1.1,0.04,21.8,25
CVS,CVS3,11.01,17.92,12,7.0,1.1,0.04,21.8,25
CVS,CVS4,11.01,17.82,12,7.0,1.1,0.04,21.8,25
CVS,CVS5,11.01,17.92,12,7.0,1.1,0.04,21.8,25
CVS,CVS6,11.01,17.82,12,7.0,1.1,0.04,21.8,25
GL,GL1,11.01,17.92,12,7.0,1.1,0.04,21.8,25
GL,GL2,11.01,17.82,12,7.0,1.1,0.04,21.8,25
.
.
.
|
wq_ase.txt -- Excel data dump (Tab delimited)
CVS CVS1 11.01 17.92 12 7.0 1.1 0.04 21.8 25
CVS CVS2 11.01 17.82 12 7.0 1.1 0.04 21.8 25
CVS CVS3 11.01 17.92 12 7.0 1.1 0.04 21.8 25
CVS CVS4 11.01 17.82 12 7.0 1.1 0.04 21.8 25
CVS CVS5 11.01 17.92 12 7.0 1.1 0.04 21.8 25
CVS CVS6 11.01 17.82 12 7.0 1.1 0.04 21.8 25
GL GL1 11.01 17.92 12 7.0 1.1 0.04 21.8 25
GL GL2 11.01 17.82 12 7.0 1.1 0.04 21.8 25
.
.
.
|
wq_ase.xyz -- User-defined delimiter, using "!" as a delimiter
CVS!CVS1!11.01!17.92!12!7.0!1.1!0.04!21.8!25
CVS!CVS2!11.01!17.82!12!7.0!1.1!0.04!21.8!25
CVS!CVS3!11.01!17.92!12!7.0!1.1!0.04!21.8!25
CVS!CVS4!11.01!17.82!12!7.0!1.1!0.04!21.8!25
CVS!CVS5!11.01!17.92!12!7.0!1.1!0.04!21.8!25
CVS!CVS6!11.01!17.82!12!7.0!1.1!0.04!21.8!25
GL!GL1!11.01!17.92!12!7.0!1.1!0.04!21.8!25
GL!GL2!11.01!17.82!12!7.0!1.1!0.04!21.8!25
.
.
.
|
wq_ase.udf -- Another user-defined delimiter, using "_|_" as a delimiter
CVS_!_CVS1_!_11.01_!_17.92_!_12_!_7.0_!_1.1_!_0.04_!_21.8_!_25
CVS_!_CVS2_!_11.01_!_17.82_!_12_!_7.0_!_1.1_!_0.04_!_21.8_!_25
CVS_!_CVS3_!_11.01_!_17.92_!_12_!_7.0_!_1.1_!_0.04_!_21.8_!_25
CVS_!_CVS4_!_11.01_!_17.82_!_12_!_7.0_!_1.1_!_0.04_!_21.8_!_25
CVS_!_CVS5_!_11.01_!_17.92_!_12_!_7.0_!_1.1_!_0.04_!_21.8_!_25
CVS_!_CVS6_!_11.01_!_17.82_!_12_!_7.0_!_1.1_!_0.04_!_21.8_!_25
GL_!_GL1_!_11.01_!_17.92_!_12_!_7.0_!_1.1_!_0.04_!_21.8_!_25
GL_!_GL2_!_11.01_!_17.82_!_12_!_7.0_!_1.1_!_0.04_!_21.8_!_25
.
.
.
|
SAS User Guide (SUG) for Procedures (PROC) used in the Source
|
-
|