'Life is after all a recursive summation, indeed     Let's do some Statistics!

Department of Civil and Environmental Engineering
Frank Batten College of Engineering and Technology
Old Dominion University
Norfolk, Virginia 23529-0241, USA
Tel) (757) 683-3753
Fax) (757) 683-5354


	
Blocking (=Subgrouping) Example
 
Description

Three principles of Experimental Design are Replication, Randomization and Blocking.

Blocking (=Subgrouping) (sub-)categorizes the conditions of the trigger var.(s) of interest. Blocking is an essential strategy for your analysis (of ExpDesign outcomes = CTs of dep. var.) when you have continuous or dispersed trigger conditions.

In such a case, without Blocking, you will have 'n' block levels, each consisting of only one trigger condition and, subsequently, only one value/CT of the dep. var. per trigger condition, which makes block level comparison literally meaningless.

The most common way to Block is to subgroup your treatment or block levels into a number of equal-interval subgroups based on their range, max. & min. in general, or into a number of arbitrary subgroups (with different intervals) targeted at evaluating particular trigger condition(s).

Let's consider the following example:


OPTIONS LINESIZE=92;

/* ----Direct Import from Excel file (*.xlsx) --------------------------- */

PROC IMPORT DATAFILE="lf_BioP.xlsx"  /* Excel file to be read & imported into SAS */ 
         DBMS=xlsx REPLACE /* REPLACE will replace the SAS dataset each time you import */
         OUT=BioPConc;  /* Import data from Excel, then put into a dataset called 'BioPConc' */
     SHEET="SAS";       /* only read data from a worksheet "SAS" in "lf_BioP.xlsx" Excel file */
     GETNAMES=Yes;     /* use first row labels for SAS variable names as-is in this SAS analysis */
RUN;

PROC PRINT DATA= BioPConc;  /* Verify Excel data import went through correctly */
TITLE1 '================================================';
TITLE2 'Native Excel xlsx data Import Example';
TITLE3 'see [lf_BioP.xlsx] datafile below for data structure     ';
TITLE4 '================================================';
RUN;


lf_BioP.xlsx -- Native Excel file format 
(in 'SAS' worksheet. Column labels in the first row become SAS variable names)

/* After importing from the Excel file, you now have eight (8) variables in SAS, */
/*                                                  */
/*      Ldate                                       */
/*      n                                           */
/*      Mixing_Speed_rpm                            */
/*      TSS_Mainstream                              */
/*      TS_SidestreamSBPR                           */
/*      Influent_OP                                 */
/*      Effluent_OP                                 */
/*      Diff_OP                                     */
/*                                                  */
/* that can be analyzed.                            */

Diff_OP, the difference in orthophosphate concentration (=Influent_OP-Effluent_OP), is the dep. var.: the system response that you are evaluating under four ind. var.s (or controls or trigger conditions) applied to the system/process:

  • Mixing_Speed_rpm
  • TSS_Mainstream
  • TS_SidestreamSBPR
  • Influent_OP

where Mixing_Speed_rpm, a treatment level, consists of three subcategories (150, 50 and 60 rpm). It is thus already subgrouped, and there is no need to apply Blocking.

The remaining three ind. var.s -- TSS_Mainstream, TS_SidestreamSBPR and Influent_OP -- are continuous and dispersed, so you need to define Blockings (=subgroupings) for them. You can then evaluate the dep. var., Diff_OP, under the Blocking (=subgrouping) of those three ind. var.s, instead of evaluating the system response under the as-is continuous and dispersed values (and ending up unable to compare the resultant CTs of the dep. var. at all).

Let's use the range, max. & min. approach in this example for those three continuous and dispersed ind. var.s:

  • TSS_Mainstream -- Range(3510), Max(4820), Min(1310)
  • TS_SidestreamSBPR -- Range(30510), Max(33850), Min(3340)
  • Influent_OP -- Range(2.65), Max(3.87), Min(1.22)
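
The range, max. & min. listed above can be obtained directly from the imported Dataset. A minimal sketch, assuming the three ind. var.s were imported as numeric variables into "BioPConc" (NMISS is added here only to show how many values would fall into an 'N/A' subgroup):

```sas
/* Summary statistics used to choose Blocking (=subgrouping) intervals */
PROC MEANS DATA=BioPConc N NMISS MIN MAX RANGE;
     VAR TSS_Mainstream TS_SidestreamSBPR Influent_OP;
RUN;
```

For example, TSS_Mainstream gives Range = 4820 - 1310 = 3510, which is the basis for choosing the subgroup interval.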

Reflecting the range, max. & min., you can Block (=subgroup) those ind. var.s and save the new Blocking variables into a new Dataset: start with the initial Dataset "BioPConc" read from the Excel data file, define and add the Blockings, save the changes into a new Dataset "BioP_Analysis" and use it for your analysis. The following SAS code is self-explanatory.

Keep in mind that the resolution of Blocking (=subgrouping) should be considered carefully: the number of subgroups depends on your analysis goal.

/* Classify & assign range variables based on ind. var. data values */
/* These range variables become Blockings for treatment & block levels */
/* under which CTs of dep. var. are compared */

DATA BioP_Analysis; /* Define a new Dataset name for adding */
                    /* Blocking range variables */ 

        set BioPConc; /* define which source Dataset will be used 
                      for Blocking.  This initial Dataset was 
                      read from Excel data file */

        length TSSclass $ 24; /* set the length first; otherwise 'N/A' fixes it
                                 at 3 characters and longer labels are truncated */

        if missing(TSS_Mainstream) then TSSclass = 'N/A';
             /* TSSclass is a new Blocking range variable defined for 
                TSS_Mainstream ind. var.  We will use TSSclass subgroups to compare 
                resultant CTs of dep. var. under different TSS_Mainstream ind. var.
                conditions */

             else if TSS_Mainstream < 1500 then TSSclass = '<1500 mg/L';
             else if TSS_Mainstream <= 2000 then TSSclass = '>=1500 & <=2000 mg/L';
             else if TSS_Mainstream <= 2500 then TSSclass = '>2000 & <=2500 mg/L';
             else if TSS_Mainstream <= 3000 then TSSclass = '>2500 & <=3000 mg/L';
             else if TSS_Mainstream <= 3500 then TSSclass = '>3000 & <=3500 mg/L';
             else if TSS_Mainstream <= 4000 then TSSclass = '>3500 & <=4000 mg/L';
             else if TSS_Mainstream <= 4500 then TSSclass = '>4000 & <=4500 mg/L';
             else if TSS_Mainstream <= 5000 then TSSclass = '>4500 & <=5000 mg/L';
             else if TSS_Mainstream <= 5500 then TSSclass = '>5000 & <=5500 mg/L';

        length TSsubgroup $ 24; /* set the length first; otherwise 'N/A' fixes it
                                   at 3 characters and longer labels are truncated */

        if missing(TS_SidestreamSBPR) then TSsubgroup = 'N/A';
             /* TSsubgroup is a new Blocking range variable defined for
                TS_SidestreamSBPR ind. var.  We will use TSsubgroup subgroups to compare
                resultant CTs of dep. var. under different TS_SidestreamSBPR ind. var.
                conditions */

             else if TS_SidestreamSBPR < 5000 then TSsubgroup = '<5000 mg/L';
             else if TS_SidestreamSBPR <= 10000 then TSsubgroup = '>=5000 & <=10000 mg/L';
             else if TS_SidestreamSBPR <= 15000 then TSsubgroup = '>10000 & <=15000 mg/L';
             else if TS_SidestreamSBPR <= 20000 then TSsubgroup = '>15000 & <=20000 mg/L';
             else if TS_SidestreamSBPR <= 25000 then TSsubgroup = '>20000 & <=25000 mg/L';
             else if TS_SidestreamSBPR <= 30000 then TSsubgroup = '>25000 & <=30000 mg/L';
             else if TS_SidestreamSBPR <= 35000 then TSsubgroup = '>30000 & <=35000 mg/L';
             else if TS_SidestreamSBPR <= 40000 then TSsubgroup = '>35000 & <=40000 mg/L';
             else if TS_SidestreamSBPR <= 45000 then TSsubgroup = '>40000 & <=45000 mg/L';
             else if TS_SidestreamSBPR <= 50000 then TSsubgroup = '>45000 & <=50000 mg/L';
             else if TS_SidestreamSBPR <= 55000 then TSsubgroup = '>50000 & <=55000 mg/L';

        /* [ Repeat Blocking/Subgrouping for Influent_OP ] */

             /* IOPsubgroup is Blocking range variable for
                Influent_OP ind. var */


        /* No OUT= statement is needed here: the DATA statement above already
           names BioP_Analysis as the Dataset the changes are saved into.
           This new Dataset is the one you will use in your analysis */
RUN;
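
The Influent_OP Blocking is left as an exercise above. One possible way to complete it is sketched below as a separate DATA step that adds IOPsubgroup to "BioP_Analysis". The 0.5 mg/L cut-points and the subgroup labels are assumptions chosen only to cover the Min(1.22) to Max(3.87) range; they are not the author's values, so adjust them to your analysis goal. The PROC FREQ step is an added check (also an assumption, not part of the original source) that shows how many observations fall into each subgroup.

```sas
/* Hypothetical cut-points (0.5 mg/L steps), for illustration only */
DATA BioP_Analysis;
        set BioP_Analysis;          /* add to the Dataset built above */
        length IOPsubgroup $ 22;    /* long enough for every label below */

        if missing(Influent_OP) then IOPsubgroup = 'N/A';
             else if Influent_OP <  1.5 then IOPsubgroup = '<1.5 mg/L';
             else if Influent_OP <= 2.0 then IOPsubgroup = '>=1.5 & <=2.0 mg/L';
             else if Influent_OP <= 2.5 then IOPsubgroup = '>2.0 & <=2.5 mg/L';
             else if Influent_OP <= 3.0 then IOPsubgroup = '>2.5 & <=3.0 mg/L';
             else if Influent_OP <= 3.5 then IOPsubgroup = '>3.0 & <=3.5 mg/L';
             else if Influent_OP <= 4.0 then IOPsubgroup = '>3.5 & <=4.0 mg/L';
RUN;

/* Count observations per subgroup before comparing CTs of dep. var. */
PROC FREQ DATA=BioP_Analysis;
     TABLES TSSclass TSsubgroup IOPsubgroup / MISSING;
RUN;
```

A subgroup with only one or two observations suggests the Blocking resolution is too fine for a meaningful block level comparison.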


PROC GLM DATA=BioP_Analysis;  /* use new Dataset, BioP_Analysis, for analysis */

/* Treatment & Block levels under which the dep. var. is evaluated */

     CLASS Mixing_Speed_rpm TSSclass TSsubgroup IOPsubgroup;


/* RCB model expansion with Blocking range variables added to ind. var.s */

     MODEL Diff_OP = Mixing_Speed_rpm TSS_Mainstream TS_SidestreamSBPR
                     TSSclass TSsubgroup IOPsubgroup;

     MEANS  /* MMC */
            Mixing_Speed_rpm TSSclass TSsubgroup IOPsubgroup
            / Duncan;

            /* Compare CTs of dep. var., Diff_OP under Blockings(=Subgroupings) */

RUN;
QUIT;  /* PROC GLM is interactive; QUIT ends the procedure */
