Fun With Structural Equation Modelling in Psychological Research Jeremy Miles IBS, Derby University

• Structural Equation Modelling

• Analysis of Moment Structures

• Covariance Structure Analysis

• Analysis of Linear Structural Relationships (LISREL)

• Covariance Structure Models

• Path Analysis

Normal Statistics

• Modelling process
– What is the best model to describe a set of data?
– Mean, sd, median, correlation, factor structure, t-value

Data → Model

SEM

• Modelling process
– Could this model have led to the data that I have?

Model → Data

• Theory driven process
– Theory is specified as a model

• Alternative theories can be tested
– Specified as models

Theory A → Data ← Theory B

Ooohh, SEM Is Hard

• It was. Now it’s not

• Jöreskog and Sörbom developed LISREL
– Matrices: xy
– Variables: X Y
– Intercepts:

The Joy of Path Diagrams

• Variable: drawn as a box
• Causal Arrow: single-headed, pointing from cause to effect
• Correlational Arrow: double-headed

Doing “Normal” Statistics

• Correlation: x ↔ y
• T-Test: x → y
• One-way ANOVA (dummy coding): x1, x2, x3 → y
• Two-way ANOVA (dummy coding): x1, x2, x1 * x2 → y
• Regression: x, x, x → y
• MANOVA: x1, x2 → y1, y2, y3
• ANCOVA: x, z → y
• etc.
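One way to see why a t-test can be drawn as a path from x to y: the sketch below, using made-up scores and hypothetical helper names, computes the usual pooled-variance t statistic and the t statistic for the slope of y regressed on a dummy-coded x. Under these assumptions the two should agree to rounding error.

```python
import math

def pooled_t(group0, group1):
    """Independent-samples t statistic (pooled variance), group1 minus group0."""
    n0, n1 = len(group0), len(group1)
    m0, m1 = sum(group0) / n0, sum(group1) / n1
    ss0 = sum((v - m0) ** 2 for v in group0)
    ss1 = sum((v - m1) ** 2 for v in group1)
    pooled_var = (ss0 + ss1) / (n0 + n1 - 2)
    return (m1 - m0) / math.sqrt(pooled_var * (1 / n0 + 1 / n1))

def slope_t(x, y):
    """t statistic for the slope in a simple regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    residual_ss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    se_slope = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope / se_slope

# Made-up scores for two groups (illustrative only, not from the slides).
control = [4.0, 5.0, 6.0, 5.0, 7.0]
treated = [6.0, 8.0, 7.0, 9.0, 8.0]

x = [0] * len(control) + [1] * len(treated)   # dummy-coded group membership
y = control + treated

t_from_ttest = pooled_t(control, treated)
t_from_path = slope_t(x, y)
print(round(t_from_ttest, 4), round(t_from_path, 4))
```

The same idea extends to the ANOVA diagrams, where each dummy variable gets its own arrow into y.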

Identification

• Often thought of as being a very sticky issue

• Is a fairly sticky issue

• The extent to which we are able to estimate everything we want to estimate

x = 4
Unknown: x

x = 4, y = 7
Unknown: x, y

x + y = 4, x - y = 1
Unknown: x, y

x + y = 4
Unknown: x, y

Things We Know = Things We Want to Know

x = 4

x + y = 4, x - y = 2

Just identified

Can never be wrong

“Normal” statistics are just identified

Things We Know < Things We Want to Know

x + y = 7

Not identified

Can never be solved

Things We Know > Things We Want to Know

x + y = 4, x - y = 2, 2x - y = 3

Over-identified

Can be wrong

SEM models are over-identified
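The counting rule can be sketched in a few lines. The function name `classify` and the tuple layout for the equations are illustrative choices, not anything from the slides. The sketch labels the three-equation example, solves the first two equations, and then checks whether the spare third equation agrees, which is what "can be wrong" amounts to.

```python
def classify(n_known, n_unknown):
    """Label a model by comparing pieces of information with unknowns."""
    if n_known < n_unknown:
        return "not identified"
    if n_known == n_unknown:
        return "just identified"
    return "over-identified"

# Each equation is (coefficient of x, coefficient of y, right-hand side).
equations = [(1, 1, 4),    # x + y = 4
             (1, -1, 2),   # x - y = 2
             (2, -1, 3)]   # 2x - y = 3

status = classify(n_known=len(equations), n_unknown=2)

# Solve the first two equations exactly (Cramer's rule) ...
(a1, b1, c1), (a2, b2, c2) = equations[0], equations[1]
det = a1 * b2 - a2 * b1
x = (c1 * b2 - c2 * b1) / det
y = (a1 * c2 - a2 * c1) / det

# ... then ask whether the third, "spare" equation agrees with that solution.
a3, b3, c3 = equations[2]
misfit = a3 * x + b3 * y - c3
print(status, x, y, misfit)
```

A non-zero misfit means no pair of values satisfies all three equations. With a just-identified system there is no spare equation, so there is nothing left over to disagree.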

Identification

• We have information
– (Correlations, means, variances)

• “Normal” statistics
– Use all of the information to estimate the parameters of the model
– Just identified

• All parameters estimated

• Model cannot be wrong

Over-identification

• SEM
– Over-identified
– The model can be wrong

• If a model is a theory
– Enables the testing of theories

Parameter Identification

x - 2 = y

x + 2 = y

• Should be identified according to our previous rules
– it’s not though

• There is model identification
– there is not parameter identification
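A rough way to see why counting is not enough: rewrite the two equations as x - y = 2 and x - y = -2 and look at the determinant of the coefficients. This is a sketch with hypothetical names. A zero determinant means the equations do not pin down a unique pair of values, even though there are as many equations as unknowns.

```python
def determinant_2x2(rows):
    """Determinant of a 2 x 2 coefficient matrix given as two (a, b) rows."""
    (a1, b1), (a2, b2) = rows
    return a1 * b2 - a2 * b1

# x - 2 = y  ->  1*x - 1*y = 2
# x + 2 = y  ->  1*x - 1*y = -2
coefficients = [(1, -1), (1, -1)]

n_equations, n_unknowns = len(coefficients), 2
passes_counting_rule = n_equations >= n_unknowns
has_unique_solution = determinant_2x2(coefficients) != 0
print(passes_counting_rule, has_unique_solution)
```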

Sampling Variation and χ²

• Equations and numbers
– Easy to determine if it’s correct

• Sample data may vary from the model
– Even if the model is correct in the population

• Use the χ² test to measure difference between the data and the model
– Some difference is OK
– Too much difference is not OK

Simple Over-identification

x ↔ y (correlation estimated): estimate 1 parameter, just-identified

x, y (no arrow): estimate 0 parameters, over-identified

Example 1

• r_ab = 0.3, N = 100

• Estimate = 0.3, SE = 0.105, C.R. = 2.859
• The correlation is significantly different from 0

a ↔ b

• Model

• Tests the hypothesis that the correlation in the population is equal to zero
– The sample correlation will never be exactly zero, because of sampling variation
– The χ² tells us if the difference from zero is significant

a b

Example 2

• Test the model

• Force the value to be zero
– Input parameters = 1
– Parameters estimated = 0

• The model is now over-identified and can therefore be wrong

a b

• The program gives a χ² statistic

• The significance of difference between the data and the model
– Distributed with df = known (input) parameters - estimated parameters

• χ² = 9.337, df = 1 - 0 = 1, p = 0.002

• So what? A correlation of 0.3 is significant?
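The slides do not say how the program arrives at its statistic. As a minimal sketch, assume maximum likelihood estimation with the test statistic taken as (N - 1) times the ML discrepancy. For two variables with free variances and the covariance fixed at zero, that discrepancy reduces to -ln(1 - r²). The function names are illustrative. With r = 0.3 and N = 100 this should come out at about 9.34 with p near 0.002, in line with the values reported above.

```python
import math

def ml_chi_square_zero_correlation(r, n):
    """Chi-square for a two-variable model whose only covariance is fixed at 0.

    Assumes ML estimation with both variances free, so the discrepancy
    reduces to -ln(1 - r^2); the statistic is (n - 1) times that value.
    """
    discrepancy = -math.log(1.0 - r ** 2)
    return (n - 1) * discrepancy

def p_value_df1(chi_square):
    """Upper-tail probability of a chi-square variate with 1 df."""
    return math.erfc(math.sqrt(chi_square / 2.0))

chi_square = ml_chi_square_zero_correlation(r=0.3, n=100)
df = 1 - 0   # one known correlation, no estimated parameters
p = p_value_df1(chi_square)
print(round(chi_square, 3), df, round(p, 3))
```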

Hardly a Revelation

• No. We have tested a correlation for significance, something which is much more easily done in other ways

• But
– We have introduced a very flexible technique
– Can be used in a range of other ways

Testing Other Than Zero

• Estimated parameters usually tested against zero
– Reasonable?

• Model testing allows us to test against other values

• Example 3

a ↔ b, fixed at 0.15

• χ² = 2.3, n.s.
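A sketch of how such a test might be computed by hand, under assumptions the slides do not state: ML estimation, the data from Example 1 (r = 0.3, N = 100) analysed as a correlation matrix, the covariance fixed at 0.15, and both variances free. For simplicity the search runs over a single common variance. The helper names are hypothetical. Under these assumptions the result should land close to the χ² = 2.3 reported for Example 3.

```python
import math

def discrepancy(v, r, fixed_cov):
    """ML discrepancy for sample correlation r against a model with both
    variances equal to v and the covariance fixed at fixed_cov."""
    det_model = v * v - fixed_cov ** 2
    det_sample = 1.0 - r ** 2
    trace_term = (2.0 * v - 2.0 * r * fixed_cov) / det_model
    return math.log(det_model) + trace_term - math.log(det_sample) - 2.0

def minimise(f, low, high, tolerance=1e-10):
    """Ternary search for the minimum of a function with a single dip."""
    while high - low > tolerance:
        third = (high - low) / 3.0
        left, right = low + third, high - third
        if f(left) < f(right):
            high = right
        else:
            low = left
    return (low + high) / 2.0

r, n, fixed_cov = 0.3, 100, 0.15
best_v = minimise(lambda v: discrepancy(v, r, fixed_cov), 0.5, 2.0)
chi_square = (n - 1) * discrepancy(best_v, r, fixed_cov)
p = math.erfc(math.sqrt(chi_square / 2.0))   # upper tail, 1 df
print(round(best_v, 3), round(chi_square, 2), round(p, 3))
```

Setting `fixed_cov` to 0 turns this into the zero-correlation test of Example 2.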

Example 4: Comparing correlations

• 4 variables
– mothers' sensitivity
– mothers' parental bonding
– fathers' sensitivity
– fathers' parental bonding

• Does the correlation differ between mothers and fathers?

Path diagram: M S, M PB, F S and F PB, all intercorrelated (values shown: 0.5, 0.3, 0.1, 0.1, 0.2, 0.2)

• Example 4a
– analyse with all parameters free
– 0 df, model is correct

• Example 4b
– fix FS-FPB and MS-MPB to be equal
– See if that model can account for the data

Path diagram: M S, M PB, F S and F PB, with the MS-MPB and FS-FPB correlations both labelled “dave”

χ² = 1.82, df = 1, p = 0.177

dave = 0.41 (s.e. 0.08)

Latent Variables

• The true power of SEM comes from latent variable modelling

• Variables in psychology are rarely (never?) measured directly
– the effects of the variable are measured
– Intelligence, self-esteem, depression
– Reaction time, diagnostic skill

Measuring a Latent Variable

• Latent variables are drawn as ellipses
– hypothesised causal relationship with measured variables

• Measured variable has two causes
– latent variable
– “other stuff”
• random error

Latent → Measured

x = t + e

• Reliability is:
– the proportion of variance in x that is accounted for by t
– equivalently, the squared correlation between x and t

True Score → Measured ← Error
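A small simulation can make x = t + e concrete. The standard deviations below are assumed values chosen so that the reliability is 0.8, and the helper functions are written out by hand to keep the sketch self-contained. Both printed quantities should be close to 0.8, though not identical to each other, because the sample covariance between t and e is not exactly zero.

```python
import math
import random

def variance(values):
    """Sample variance with the usual n - 1 denominator."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

def correlation(a, b):
    """Pearson correlation between two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    ss_a = sum((p - mean_a) ** 2 for p in a)
    ss_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(ss_a * ss_b)

rng = random.Random(1)          # fixed seed so the run is repeatable
n = 20000
true_sd, error_sd = 2.0, 1.0    # assumed values: reliability = 4 / (4 + 1) = 0.8

t = [rng.gauss(0.0, true_sd) for _ in range(n)]
e = [rng.gauss(0.0, error_sd) for _ in range(n)]
x = [ti + ei for ti, ei in zip(t, e)]      # x = t + e

reliability = variance(t) / variance(x)    # proportion of variance due to t
r_xt = correlation(x, t)
print(round(reliability, 2), round(r_xt ** 2, 2))
```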

Identification and Latent Variables

• 1 measured variable
– not (even close to) identified

• 4 measured variables
– 6 known, 4 estimated
• model is identified

• Need four measured variables to over-identify the model (three give a just-identified model)

• Need to identify the variance of the latent variable
– fix to 1
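The counting behind these figures can be sketched as follows. It assumes a single latent variable with its variance fixed to 1, one loading per measured variable, and only the correlations among the measured variables being analysed. The function name is illustrative.

```python
def one_factor_df(n_measured):
    """Known correlations minus estimated loadings for a one-factor model.

    Assumes the latent variance is fixed to 1 and that only the
    correlations among the measured variables are analysed.
    """
    known = n_measured * (n_measured - 1) // 2   # distinct correlations
    estimated = n_measured                       # one loading per variable
    return known, estimated, known - estimated

for k in (1, 2, 3, 4):
    known, estimated, df = one_factor_df(k)
    if df < 0:
        status = "not identified"
    elif df == 0:
        status = "just identified"
    else:
        status = "over-identified"
    print(k, known, estimated, df, status)
```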

Why oh why oh why?

• Why bother with all these tricky latent variables?

• 2 reasons
– unidimensional scale construction
– attenuation correction

Unidimensionality

• Correlation matrix

1.00
0.68 1.00
0.73 0.63 1.00
0.68 0.63 0.69 1.00

• χ² = 3.65, df = 2, p = 0.16
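The reported p-value can be checked without statistical tables: for exactly 2 degrees of freedom the upper tail of the chi-square distribution is exp(-χ²/2). A minimal sketch follows. The function name is illustrative and the shortcut applies only to df = 2.

```python
import math

def p_value_df2(chi_square):
    """Upper-tail probability of a chi-square variate with exactly 2 df."""
    return math.exp(-chi_square / 2.0)

p = p_value_df2(3.65)
print(round(p, 2))
```

A p-value this far above 0.05 means the one-factor model is not rejected, which is the sense in which the scale is treated as unidimensional.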

Attenuation Correction

• Why bother?
– Gets accurate measure of correlation between true scores

• Why bother?
– theories in psychology are ordinal
– attenuation can only lower relationships
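For the two-variable case, the classical correction (Spearman's disattenuation formula, a standard result not shown on the slides) divides the observed correlation by the square root of the product of the two reliabilities. The numbers below are illustrative only.

```python
import math

def disattenuate(r_observed, reliability_x, reliability_y):
    """Spearman's correction: estimated correlation between true scores."""
    return r_observed / math.sqrt(reliability_x * reliability_y)

# Illustrative numbers only, not taken from the slides.
r_true = disattenuate(r_observed=0.30, reliability_x=0.70, reliability_y=0.80)
print(round(r_true, 3))
```

With reliabilities below 1 the corrected value is always larger in magnitude than the observed one, which is the sense in which attenuation can only lower a bivariate relationship.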

The Multivariate Case

• Much more complex and unpredictable

Path diagram: x1, x2, y1 and y2, with paths labelled a, b, c, d and e

Some More Models

• Multiple Trait Multiple Method Models (MTMM)

• Temporal Stability

• Multiple Indicator Multiple Cause (MIMIC)

MTMM

• Multiple Trait
– more than one measure

• Multiple Method
– using more than one technique

• Variance in measured score comes from true score, random error variance, and systematic error variance associated with the shared methods

What?

• Example 6 (From Wothke, 1996)
– Three traits
• Getting along with others (G)
• Dedication (D)
• Apply learning (L)
– Three methods
• Peer nomination (PN)
• Peer Checklist (PC)
• Supervisor ratings (SC)

Matrix

       g.pn   d.pn   l.pn   g.pc   d.pc   l.pc   g.sc   d.sc   l.sc
g.pn   1
d.pn   .524   1
l.pn   .241   .403   1
g.pc   .071   .102  -.018   1
d.pc   .022   .096   .018   .435   1
l.pc   .076   .102   .100   .342   .347   1
g.sc   .136   .132   .061   .243   .203   .100   1
d.sc  -.028   .168   .135   .093   .209   .042   .461   1
l.sc  -.054   .162   .252   .053   .108   .108   .294   .280   1

Analysis

Path diagram: method factors pn (g.pn, l.pn, d.pn), pc (g.pc, l.pc, d.pc) and sc (g.sc, l.sc, d.sc); trait factors g, l and d

Temporal Stability

• Usually
– sum the items
– correlate them

• BUT
– items may not be unidimensional
– relationship will be attenuated due to measurement error
– relationship will be inflated, due to correlated error
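The last two points pull in opposite directions, which a little arithmetic can illustrate. The sketch below rests on simplifying assumptions that are not in the slides: every item has the same standardised loading, errors are uncorrelated except between the same item at the two occasions, and all the numbers are made up.

```python
def summed_score_correlation(k, loading, stability, error_cov=0.0):
    """Test-retest correlation of two summed scales under simple assumptions.

    k          number of items per occasion
    loading    standardised loading of every item on its latent variable
    stability  correlation between the two latent variables
    error_cov  covariance between the errors of the same item at both times
    """
    true_part = (k * loading) ** 2           # variance of a sum due to the latent
    error_part = k * (1.0 - loading ** 2)    # variance of a sum due to error
    covariance = true_part * stability + k * error_cov
    return covariance / (true_part + error_part)

latent_stability = 0.8   # illustrative value, not from the slides
attenuated = summed_score_correlation(5, 0.6, latent_stability)
inflated = summed_score_correlation(5, 0.6, latent_stability, error_cov=0.2)
print(round(attenuated, 3), round(inflated, 3))
```

The first figure falls well short of the latent stability because of measurement error. The second is pushed back up by the correlated errors, so the summed-score correlation mixes two biases that a latent variable model can separate.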

Path diagram: L1 measured by X1.1, X2.1, X3.1, X4.1, X5.1; L2 measured by X1.2, X2.2, X3.2, X4.2, X5.2

• Corrects for attenuation
• But correlated errors may be a problem

• Added correlated errors

• Example 7b

Path diagram: L1 and L2 with the same measured variables, with correlated errors added

MIMIC Model

• “Conventional wisdom” in psychological measurement is that a latent variable is the cause of the measured variables

• Assumption is made (implicitly) in many types of measurement
– Bollen and Lennox (1991)
– not necessarily the case

Value of a Car

• Causes
– type, size, age, rustiness
– no reason they should, or should not, be correlated

• Effects
– assessment of value by people who know

Level of Depression

• Questionnaire items
– causes or effects?
• been feeling unhappy and depressed?
• been having restless and disturbed nights?
• found everything getting 'on top' of you?

• MIMIC

Example 8: MIMIC

Path diagram: causes c1, c2, c3; latent variables L1, LY1, LY2; measured variables y1 to y8

Concluding remarks

• Given a taster
– some may be too simple?

• Much more to say
– no time to say it

• See further reading (Books and WWW)

Further Info

• SEMNET - email list
– semnet@bama.ua.edu (messages)
– listserv@bama.ua.edu (leave)
– http://www.gsu.edu/~mkteer/semfaq.html

• the semnet FAQ

Books

• See web page http://ibs.derby.ac.uk/~jeremym/fun/fun/index.htm

References

• See web page http://ibs.derby.ac.uk/~jeremym/fun/fun/index.htm