SAS Session 3

Data Analysis Using SAS for Windows

Instructor: Professor Peggy Ng

March 2000

Notes (3)

Linear Models

The General Linear Model (GLM) procedure in SAS is one of the most widely used procedures in the SAS/STAT library. Statistical models such as linear regression, analysis of variance, analysis of covariance, and multivariate analysis of variance (MANOVA) can be estimated using this procedure. There are other procedures available in SAS for linear statistical modelling, depending on the type of measurement of the independent variables. For linear models in which the observations on the dependent variable are independent and normally distributed, the following procedures are used.

Type of          Number of     Procedure
predictor        predictors

Categorical      1             TTEST (2 levels)
                               ANOVA (balanced design)
                               GLM (general)

Categorical      2 or more     ANOVA (balanced design)
                               GLM (general)

Continuous       1             REG (simple regression)
                               GLM (general)

Continuous       2 or more     REG (multiple regression)
                               GLM (general)

Categorical +                  REG (must code design variables for
Continuous                          the factors/categorical variables)
                               GLM (general)
                               STEPWISE (stepwise regression)

When modelling for prediction and/or partial effects of parameters, especially with many continuous independent variables, I personally prefer procedure REG. REG provides many diagnostic statistics to test linear model assumptions. Otherwise, I usually use GLM for its ease of use, because there is no need to create design variables for factors (by default, each level of a factor is compared with the highest level), and for its natural setup for repeated measures ANOVA.

LINEAR CORRELATION could be examined through:
(Procedures: PLOT, CORR, REG, STEPWISE)

1. Scatterplot:

    proc plot;
      plot y*x y*x=z; * value of 'z' is used as the plotting symbol;
    run;
    * example: tree.sas;

2. Calculation of Pearson product moment correlation coefficient:

    proc corr nosimple;
      var x y;
    run;

3. Partial correlation:

    proc corr nosimple;
      var x y;
      partial z; * corr between x and y, partialling out z;
    run;

Proc corr can be used to produce Cronbach's coefficient alpha, a measure of internal consistency reliability:

    proc corr alpha nomiss;
      var q1 -- q20;
    run;

4. Regression:

    proc reg;
      model y=x;
      output out=resid p=yhat r=res;
      * This will produce results on fitting a model regressing 'y' on 'x'.
        The predicted values 'yhat' and the residuals of fit 'res' are saved
        in the data set named work.resid;
    run;

* The syntax for multiple regression is the same. List all the independent variables in the model after the '=' in the model statement. Example: reg.sas

5. Proc stepwise could be used for stepwise regression.

    proc stepwise;
      model y=w x z / stepwise include=3 details;
      * The 'include=3' option forces Proc Stepwise to include all three
        variables (w x z) in the regression model. Stepwise provides the
        partial R-square for each predictor, which is sometimes desired.
        Do not use proc stepwise to select a model mindlessly;
    run;
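In SAS releases where PROC STEPWISE is not available as a separate procedure, a similar analysis can be requested through the MODEL statement of PROC REG. The following is only a sketch under that assumption; it reuses the variable names y, w, x, and z from the example above and is not part of the original course files.

```sas
/* Sketch: stepwise selection through PROC REG.                    */
/* include=3 keeps the first three regressors (w x z) in every     */
/* model, mirroring the PROC STEPWISE example in these notes.      */
proc reg;
  model y = w x z / selection=stepwise include=3 details;
run;
```

With all three predictors forced into the model there is nothing left to select, so this run is mainly useful for the partial R-square summary, as in the notes.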

The hypothetical data used in this session include:

1. Gender of respondents: f, m
2. 2 Treatment Groups: drug ('viagra', in hope of improving memory), placebo
3. Response variables, m=meaning and r=rhyming (test on memory), are measured at 3 time points after the administration of drugs/placebo has started: m1, m2, m3; r1, r2, r3

There are two BETWEEN subjects factors, each two levels:
    gender: f m
    group: drug placebo

There are two WITHIN subject factors:
    time (3 levels): 1 2 3
    type (2 levels): m r

The data is in h:\Fac\Peggyng\Sascours\glm.dat

    drug m 5 4 6 1 2 6
    drug f 4 5 4 8 9 9
    placebo m 5 6 8 5 4 4
    placebo f 4 6 5 4 3 4
    drug m 5 4 3 6 8 9
    drug f 6 7 5 4 3 7
    drug f 5 6 8 6 5 8
    drug m 4 8 6 4 3 8
    placebo f 6 4 3 2 4 3
    placebo f 5 4 5 3 2 4
    placebo m 6 7 8 5 4 4
    placebo m 7 6 8 4 5 3
    drug f 8 6 5 4 3 7
    drug m 8 5 4 4 3 8
    drug f 9 7 5 7 5 8
    drug m 6 7 5 7 5 8
    placebo f 7 4 3 6 5 6
    placebo m 6 4 2 2 5 2
    placebo f 9 8 6 5 3 1
    placebo m 8 7 6 5 4 3
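The notes do not show the data step that reads this file, so the following is a minimal sketch rather than the original course code. It assumes the columns appear in the order group, gender, m1 m2 m3, r1 r2 r3 and are separated by blanks; the data set name memory is hypothetical. It also creates the design variables (grp1, female, gp1_f) that the PROC REG example in these notes expects.

```sas
/* Sketch: read glm.dat with list input.                           */
/* Assumed column order: group gender m1 m2 m3 r1 r2 r3.           */
/* The data set name "memory" is hypothetical.                     */
data memory;
  infile 'h:\Fac\Peggyng\Sascours\glm.dat';
  length group $ 7 gender $ 1;
  input group $ gender $ m1 m2 m3 r1 r2 r3;

  /* Design (0/1) variables needed by PROC REG */
  grp1   = (group = 'drug');     /* 1 = drug, 0 = placebo          */
  female = (gender = 'f');       /* 1 = female, 0 = male           */
  gp1_f  = grp1 * female;        /* group-by-gender interaction    */
run;

/* Quick check of the first few observations */
proc print data=memory (obs=5);
run;
```

Because the procedure calls in these notes carry no data= option, they operate on the most recently created data set, so running this step first makes memory the input to the analyses.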

    /* 1-way ANOVA with dependent variable m3 */
    proc glm;
      class group;
      model m3=group;
      means group;
    run;

    /* Factorial (2-way in this example) ANOVA with dependent variable m3 */
    proc glm;
      class gender group;
      model m3 = gender|group;
      lsmeans gender group / pdiff stderr;
      lsmeans gender*group / slice=group;
      /* The first lsmeans statement causes SAS to test pairwise comparisons
         of the adjusted main effects. By default these are unadjusted
         t tests; add adjust=tukey to the options to request Tukey's
         multiple comparison procedure. The second lsmeans statement
         requests SAS to test the effect of 'gender' at each level of
         'group'. */
      /* The lsmeans statement produces least squares adjusted means for
         effects. The MEANS are a function of sample sizes, but the least
         squares means are not. */
    run;

    /* Regression */
    proc glm;
      class gender group;
      model r3=r1 gender group gender*group;
    run;

    proc reg;
      model r3=r1 grp1 female gp1_f;
      /* We must have defined design variables for the categorical
         variables in the data set before running this step.

           grp1=(group='drug');
           female=(gender='f');
           gp1_f=grp1*female;
      */
    run;

    /* REPEATED MEASURES ANOVA */
    proc glm;
      class group gender;
      model m1-m3=group|gender / nouni;
      repeated time 3 contrast(3) / printe short summary;
    run;
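The data description lists two within-subject factors (type and time), while the step above analyses only time for the m scores. As a hedged sketch, not taken from the original course files, both within-subject factors could be declared in a single REPEATED statement. It assumes the dependent variables are listed in the order m1 m2 m3 r1 r2 r3, so that type changes slowest and time changes fastest.

```sas
/* Sketch: repeated measures ANOVA with two within-subject factors.   */
/* In a REPEATED statement with several factors, the levels of the    */
/* last factor listed (time) vary fastest across the dependent        */
/* variables, so the variable order must be m1 m2 m3 r1 r2 r3.        */
proc glm;
  class group gender;
  model m1-m3 r1-r3 = group|gender / nouni;
  repeated type 2, time 3 / summary;
run;
```

Listing the variables in a different order would silently assign the wrong factor levels to the measurements, which is why the order assumption is stated explicitly.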

    /* MANOVA */
    proc glm;
      class group gender;
      model m1-m3=group|gender;
      manova h=group|gender;
    run;