Description
The three principles of Experimental Design are Replication,
Randomization and Blocking.
Blocking (=Subgrouping) (sub-)categorizes the trigger var.(s)
conditions of interest. Blocking is an essential strategy for
your analysis (of ExpDesign outcomes = CTs of dep. var.) when you
have continuous or dispersed trigger conditions.
In such a case, without Blocking, you will have 'n' block
levels, each consisting of only one trigger condition, and
consequently only one value/CT of dep. var. per trigger condition,
which makes block level comparison literally meaningless.
The most common way to Block is to subgroup your treatment or block
levels into a number of equal-interval subgroups based on their range, max. &
min. in general, or into a number of arbitrary subgroups (with different intervals)
aimed at evaluating particular trigger condition(s).
Let's consider the following example:
OPTIONS LINESIZE=92;
/* ----Direct Import from Excel file (*.xlsx) --------------------------- */
PROC IMPORT DATAFILE="lf_BioP.xlsx" /* Excel file to be read & imported into SAS */
DBMS=xlsx REPLACE /* REPLACE will replace the SAS dataset each time you import */
OUT=BioPConc; /* Import data from Excel, then put into a dataset called 'BioPConc' */
SHEET="SAS"; /* only read data from a worksheet "SAS" in "lf_BioP.xlsx" Excel file */
GETNAMES=Yes; /* use first row labels for SAS variable names as-is in this SAS analysis */
RUN;
PROC PRINT DATA= BioPConc; /* Verify Excel data import went through correctly */
TITLE1 '================================================';
TITLE2 'Native Excel xlsx data Import Example';
TITLE3 'see [lf_BioP.xlsx] datafile below for data structure ';
TITLE4 '================================================';
RUN;
lf_BioP.xlsx -- Native Excel file format
(in 'SAS' worksheet. Column labels in the first row become SAS variable names)
/* After importing from Excel file, you now have eight (8) variables in SAS, */
/* */
/* Ldate */
/* n */
/* Mixing_Speed_rpm */
/* TSS_Mainstream */
/* TS_SidestreamSBPR */
/* Influent_OP */
/* Effluent_OP */
/* Diff_OP */
/* */
/* that can be analyzed. */
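Before defining any Blocking, it may be worth confirming that the imported variables have the names and types you expect, since the range comparisons used for Blocking only make sense for numeric variables. A minimal sketch, assuming the Dataset "BioPConc" was created by the PROC IMPORT step above:

```sas
/* Assumption: BioPConc exists in the WORK library after PROC IMPORT */
PROC CONTENTS DATA=BioPConc VARNUM; /* VARNUM lists variables in column order */
RUN;
```

If a concentration column shows up as a character variable, the Excel column likely contains non-numeric cells that need to be cleaned first.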
Diff_OP, the difference in Orthophosphate concentration (=Influent_OP-Effluent_OP),
is the dep. var. with which you evaluate the system response under four ind. var.s
(or controls or trigger conditions) applied to the system/process:
- Mixing_Speed_rpm
- TSS_Mainstream
- TS_SidestreamSBPR
- Influent_OP
Mixing_Speed_rpm, a treatment level, consists of three
subcategories of 150, 50 and 60 rpm; it is thus already subgrouped and
there is no need to apply Blocking.
The remaining three ind. var.s -- TSS_Mainstream, TS_SidestreamSBPR and
Influent_OP -- are continuous and dispersed, so you need
to define Blockings (=subgrouping) for them so that you can
evaluate the dep. var., Diff_OP, under Blocking (=subgrouping) of
those three ind. var.s, instead of evaluating system response
under as-is continuous and dispersed values (and ending up
unable to compare the resultant CTs of dep. var. at all).
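One way to see which ind. var.s are already subgrouped and which are dispersed is to count their distinct values. The sketch below is only an illustration, assuming the Dataset "BioPConc" from the import step; it uses the NLEVELS option of PROC FREQ and suppresses the individual frequency tables:

```sas
/* Count distinct levels per ind. var.; NOPRINT suppresses the full tables */
PROC FREQ DATA=BioPConc NLEVELS;
  TABLES Mixing_Speed_rpm TSS_Mainstream TS_SidestreamSBPR Influent_OP / NOPRINT;
RUN;
```

A variable whose number of levels is close to the number of observations is a candidate for Blocking, while one with only a few levels (such as Mixing_Speed_rpm here) can be used as-is.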
Let's use the range, max. & min. approach in this example for
those three continuous and dispersed ind. var.s:
- TSS_Mainstream -- Range(3510), Max(4820), Min(1310)
- TS_SidestreamSBPR -- Range(30510), Max(33850), Min(3340)
- Influent_OP -- Range(2.65), Max(3.87), Min(1.22)
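The range, max. & min. values listed above can be obtained directly in SAS. A minimal sketch, assuming the imported Dataset "BioPConc"; the statistic keywords are standard PROC MEANS options, and NMISS is added only to show how many missing values each variable has:

```sas
/* Summary statistics used to choose the Blocking intervals */
PROC MEANS DATA=BioPConc N NMISS MIN MAX RANGE MAXDEC=2;
  VAR TSS_Mainstream TS_SidestreamSBPR Influent_OP;
RUN;
```

Dividing each range by the number of subgroups you want gives a starting interval width, which you can then round to convenient boundaries.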
Reflecting the range, max. & min., you can Block (Subgroup)
those ind. var.s and save the new Blocking variables into
a new Dataset: start with the initial Dataset "BioPConc" read from the
Excel data file, define and add the Blockings, save the changes into
a new Dataset "BioP_Analysis", and use it for your analysis.
The following SAS code is self-explanatory.
Keep in mind that the resolution of Blocking (=subgrouping) should be carefully considered for your analysis;
the number of subgroups depends on your analysis goal.
/* Classify & assign range variables based on ind. var. data values */
/* These range variables become the Blockings for treatment & block levels */
/* under which CTs of dep. var. are compared */
DATA BioP_Analysis; /* Define a new Dataset name for adding */
/* Blocking range variables */
set BioPConc; /* define which source Dataset will be used
for Blocking. This initial Dataset was
read from Excel data file */
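length TSSclass TSsubgroup $ 24; /* declare lengths first; otherwise the first
assignment ('N/A') fixes the length at 3 and
truncates every longer label */
if missing(TSS_Mainstream) then TSSclass = 'N/A'; /* numeric missing value check */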
/* TSSclass is a new Blocking range variable defined for
TSS_Mainstream ind. var. We will use TSSclass subgroups to compare
resultant CTs of dep. var. under different TSS_Mainstream ind. var.
conditions */
else if TSS_Mainstream < 1500 then TSSclass = '<1500 mg/L';
else if TSS_Mainstream <= 2000 then TSSclass = '>=1500 & <=2000 mg/L';
else if TSS_Mainstream <= 2500 then TSSclass = '>2000 & <=2500 mg/L';
else if TSS_Mainstream <= 3000 then TSSclass = '>2500 & <=3000 mg/L';
else if TSS_Mainstream <= 3500 then TSSclass = '>3000 & <=3500 mg/L';
else if TSS_Mainstream <= 4000 then TSSclass = '>3500 & <=4000 mg/L';
else if TSS_Mainstream <= 4500 then TSSclass = '>4000 & <=4500 mg/L';
else if TSS_Mainstream <= 5000 then TSSclass = '>4500 & <=5000 mg/L';
else if TSS_Mainstream <= 5500 then TSSclass = '>5000 & <=5500 mg/L';
if missing(TS_SidestreamSBPR) then TSsubgroup = 'N/A'; /* numeric missing value check */
/* TSsubgroup is a new Blocking range variable defined for
TS_SidestreamSBPR ind. var. We will use TSsubgroup subgroups to compare
resultant CTs of dep. var. under different TS_SidestreamSBPR ind. var.
conditions */
else if TS_SidestreamSBPR < 5000 then TSsubgroup = '<5000 mg/L';
else if TS_SidestreamSBPR <= 10000 then TSsubgroup = '>=5000 & <=10000 mg/L';
else if TS_SidestreamSBPR <= 15000 then TSsubgroup = '>10000 & <=15000 mg/L';
else if TS_SidestreamSBPR <= 20000 then TSsubgroup = '>15000 & <=20000 mg/L';
else if TS_SidestreamSBPR <= 25000 then TSsubgroup = '>20000 & <=25000 mg/L';
else if TS_SidestreamSBPR <= 30000 then TSsubgroup = '>25000 & <=30000 mg/L';
else if TS_SidestreamSBPR <= 35000 then TSsubgroup = '>30000 & <=35000 mg/L';
else if TS_SidestreamSBPR <= 40000 then TSsubgroup = '>35000 & <=40000 mg/L';
else if TS_SidestreamSBPR <= 45000 then TSsubgroup = '>40000 & <=45000 mg/L';
else if TS_SidestreamSBPR <= 50000 then TSsubgroup = '>45000 & <=50000 mg/L';
else if TS_SidestreamSBPR <= 55000 then TSsubgroup = '>50000 & <=55000 mg/L';
/* [ Repeat Blocking/Subgrouping for Influent_OP ] */
/* IOPsubgroup is the Blocking range variable for
Influent_OP ind. var. */
/* No OUT= statement is needed in a DATA step: the DATA statement
above already names BioP_Analysis as the Dataset the changes
are saved into. This new Dataset is
the one you will use in your analysis */
RUN;
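The Influent_OP Blocking is left as a placeholder in the DATA step above. As an illustration only, it could be filled in along the following lines. The 0.5 interval width, the breakpoints and the mg/L unit in the labels are assumptions chosen to cover the Min(1.22) to Max(3.87) range; they are not values given in the source, so adjust them to your analysis goal:

```sas
/* Hypothetical sketch: add IOPsubgroup to the existing BioP_Analysis Dataset */
DATA BioP_Analysis;
  set BioP_Analysis;
  length IOPsubgroup $ 24;   /* declare the length before the first assignment */
  if missing(Influent_OP) then IOPsubgroup = 'N/A';
  else if Influent_OP <  1.5 then IOPsubgroup = '<1.5 mg/L';
  else if Influent_OP <= 2.0 then IOPsubgroup = '>=1.5 & <=2.0 mg/L';
  else if Influent_OP <= 2.5 then IOPsubgroup = '>2.0 & <=2.5 mg/L';
  else if Influent_OP <= 3.0 then IOPsubgroup = '>2.5 & <=3.0 mg/L';
  else if Influent_OP <= 3.5 then IOPsubgroup = '>3.0 & <=3.5 mg/L';
  else if Influent_OP <= 4.0 then IOPsubgroup = '>3.5 & <=4.0 mg/L';
  else IOPsubgroup = '>4.0 mg/L';   /* catch-all so no value is left unlabeled */
RUN;
```

The same statements could equally be placed inside the original DATA step in place of the placeholder; a separate step is used here only so that the sketch stands on its own.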
PROC GLM DATA=BioP_Analysis; /* use new Dataset, BioP_Analysis, for analysis */
/* Treatment & Block to be evaluated under */
CLASS Mixing_Speed_rpm TSSclass TSsubgroup IOPsubgroup;
/* RCB model expansion with Blocking range variables added to ind. var.s */
MODEL Diff_OP = Mixing_Speed_rpm TSS_Mainstream TS_SidestreamSBPR
TSSclass TSsubgroup IOPsubgroup;
MEANS /* MMC */
Mixing_Speed_rpm TSSclass TSsubgroup IOPsubgroup
/ Duncan;
/* Compare CTs of dep. var., Diff_OP under Blockings(=Subgroupings) */
RUN;
QUIT; /* PROC GLM is interactive; QUIT ends the procedure */
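Because the resolution of Blocking determines how many observations fall into each subgroup, it may help to check the block sizes when interpreting the comparisons above. A minimal sketch, assuming the Dataset "BioP_Analysis" contains all three Blocking variables (including IOPsubgroup):

```sas
/* Number of observations per Blocking subgroup */
PROC FREQ DATA=BioP_Analysis;
  TABLES TSSclass TSsubgroup IOPsubgroup / NOCUM;
RUN;
```

Subgroups with only one or two observations suggest the intervals are too narrow, and merging adjacent subgroups may give more meaningful comparisons.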
SAS User Guide (SUG) for Procedures (PROC) used in the Source