type1&type2errrors

8/10/2019 type1&type2errrors

1/24

Lecture 2: Correlations

- read: Chapter 4

- practical: Chapter 4


2/24

last week

- mean

- variance

- standard deviation

- standard error

- inferential test statistics = variance explained by the model / variance not explained by the model

$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$$

$$\sigma_{\bar{X}} = \frac{s}{\sqrt{N}}$$
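As a quick recap of last week's quantities, here is a minimal sketch that computes all four from a list of scores. The function name `describe` and the scores themselves are made up for illustration; only the formulas (dividing by N - 1, and s over the square root of N) come from the slide.

```python
import math

def describe(scores):
    """Return mean, variance, standard deviation and standard error."""
    n = len(scores)
    mean = sum(scores) / n
    # Sum of squared deviations from the mean, divided by N - 1.
    variance = sum((x - mean) ** 2 for x in scores) / (n - 1)
    sd = math.sqrt(variance)
    # Standard error of the mean: s / sqrt(N).
    se = sd / math.sqrt(n)
    return mean, variance, sd, se

# Hypothetical scores, for illustration only.
scores = [1, 3, 4, 3, 2]
mean, variance, sd, se = describe(scores)
print(f"mean={mean:.2f} variance={variance:.2f} sd={sd:.2f} se={se:.2f}")
```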


3/24

the mistakes we can make

- we think we've accounted for more systematic variance than unsystematic

  - i.e. there's a statistically significant effect

- but there isn't: a TYPE I error

if our criterion is p
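One way to get a feel for Type I errors is to simulate many studies in which there is genuinely no effect and count how often the test comes out "significant" anyway. The sketch below assumes a criterion of .05 (the value quoted for the power calculation in this lecture), correlates two unrelated random variables, and gets p from the Fisher z approximation. The sample size, number of simulated studies and seed are arbitrary choices, not values from the slides.

```python
import math
import random
from statistics import NormalDist

def pearson_r(x, y):
    """Pearson's r from sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def p_value(r, n):
    """Two-sided p for 'no correlation', via the Fisher z approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rate(alpha=0.05, n=30, experiments=2000, seed=1):
    """Proportion of null studies that reach significance (assumed settings)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(experiments):
        # Two unrelated variables: any 'effect' we find is a Type I error.
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(n)]
        if p_value(pearson_r(x, y), n) < alpha:
            hits += 1
    return hits / experiments

rate = false_positive_rate()
print(f"'significant' results with no true effect: {rate:.3f}")
```

The proportion should land close to the criterion itself, which is the point: the criterion is the Type I error rate we have agreed to tolerate.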


4/24

minimising mistakes

- effect size

- how close are the predictions of the model to the observed outcomes?

- you can correlate the predicted vs. the observed

  - a small effect: r = .1

  - a medium effect: r = .3

  - a large effect: r = .5

- and so we calculate how much of the variance we have explained (and how good our model is!)

  - you'll have to wait until next week, on correlation...


5/24

minimising mistakes

- statistical power

  - the power of a test is the probability that a given test will find an effect assuming that one exists in the population

  - power = 1 - p(Type II error)

- Cohen suggested we aim for an 80% chance of detecting an effect if one genuinely exists

- to calculate power

  - select α (.05), find effect size (r), enter no. of participants

  - or, instead, calculate no. of participants given anticipated effect size, α, and Cohen's .8 power criterion

    - for a small effect (r = .1), N = 783

    - for a medium effect (r = .3), N = 85

    - for a large effect (r = .5), N = 28
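The second route (N from effect size, α and power) can be approximated in a few lines. This is a sketch using the Fisher z approximation for a two-tailed test, which is an assumption on my part and not necessarily how the figures above were obtained. It gives the same N for the small and medium effects and a slightly larger one (about 30) for the large effect. The function name `n_for_power` is hypothetical.

```python
import math
from statistics import NormalDist

def n_for_power(r, alpha=0.05, power=0.80):
    """Approximate N needed to detect a correlation r (two-tailed test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    # Fisher z: atanh(r) has a standard error of about 1 / sqrt(N - 3).
    n = ((z_alpha + z_power) / math.atanh(r)) ** 2 + 3
    return math.ceil(n)

for r in (0.1, 0.3, 0.5):
    print(f"r = {r}: N is roughly {n_for_power(r)}")
```

Note how quickly N grows as the anticipated effect shrinks: the required sample scales roughly with 1 / r².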


6/24

correlations...

- something a little more powerful than the mean

- we assume that there is a linear relationship between two variables (the linear model: fitting a straight line to our data)


7/24

scatterplots


8/24

step 1: covariance

- when one variable deviates from its mean, the other variable deviates from its mean in a similar way

  - does variance in one variable predict variance in the other?


9/24


10/24

how do we calculate it?

- covariance:

$$\mathrm{cov}(x, y) = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{N - 1}$$

- but, the more observations the larger the number... so we standardize it

- cf. z-scores

$$z = \frac{X - \bar{X}}{s}$$
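The covariance formula translates almost directly into code. The sketch below uses made-up data (adverts shown and packets eaten for five people; the numbers are illustrative, not from the lecture) and also shows that the result depends on the units of measurement, which is one reason a standardized version is useful.

```python
def covariance(x, y):
    """Sample covariance: sum of cross-product deviations over N - 1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross_products = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return cross_products / (n - 1)

# Hypothetical data, for illustration only.
adverts = [5, 4, 4, 6, 8]
packets = [8, 9, 10, 13, 15]

cov = covariance(adverts, packets)
# Same data, but adverts expressed in units ten times smaller.
cov_rescaled = covariance([a * 10 for a in adverts], packets)
print(f"covariance: {cov:.2f}, after rescaling adverts: {cov_rescaled:.2f}")
```

The relationship between the two variables has not changed, yet the covariance is ten times larger after rescaling.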


11/24

Step 2:

- covariance:

$$\mathrm{cov}(x, y) = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{N - 1}$$

- but, the more observations the larger the number... so we standardize it

- Pearson's r

$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{(N - 1)\, s_x s_y}$$
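Dividing the covariance by the two standard deviations gives Pearson's r. A minimal sketch follows, again with made-up adverts/packets numbers (not from the lecture). It also checks that r, unlike the covariance, is unaffected by a change of units.

```python
import math

def pearson_r(x, y):
    """Covariance divided by the product of the two standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    s_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    s_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (s_x * s_y)

# Hypothetical data, for illustration only.
adverts = [5, 4, 4, 6, 8]
packets = [8, 9, 10, 13, 15]

r = pearson_r(adverts, packets)
r_rescaled = pearson_r([a * 10 for a in adverts], packets)
print(f"r = {r:.3f}, after rescaling adverts: {r_rescaled:.3f}")
```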


12/24

nice things about correlations

- the equation is not unlike that for variance

$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{(N - 1)\, s_x s_y}$$

$$s^2 = \frac{\sum (x_i - \bar{x})(x_i - \bar{x})}{N - 1}$$

- the equation forces a result between -1 and +1: +1 (they covary perfectly and in the same way), 0 (there is no covariance at all), -1 (they covary perfectly but in the opposite way).

- r² is a measure of how much variability in one variable can be explained by variability in the other.


13/24

r-squared...

- if I know the variance in number of adverts shown, I can predict x% of the variance in packets eaten.

- for each unit of variance in adverts shown, we get x units of variance in packets eaten

$$r^2 = \left( \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{(N - 1)\, s_x s_y} \right)^2$$
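To see why r² can be read as "variance explained", the sketch below computes it two ways: by squaring Pearson's r, and by fitting a least-squares line and comparing the leftover (residual) variation with the total variation in the outcome. The two agree for a straight-line model. The data and function names are hypothetical.

```python
def r_squared_from_r(x, y):
    """Square of Pearson's r, from sums of squares and cross-products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

def r_squared_from_fit(x, y):
    """Proportion of variation in y accounted for by a least-squares line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * a for a in x]
    ss_residual = sum((b - p) ** 2 for b, p in zip(y, predicted))
    ss_total = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_residual / ss_total

# Hypothetical data, for illustration only.
adverts = [5, 4, 4, 6, 8]
packets = [8, 9, 10, 13, 15]

via_r = r_squared_from_r(adverts, packets)
via_fit = r_squared_from_fit(adverts, packets)
print(f"r squared = {via_r:.3f} (from r), {via_fit:.3f} (from the fitted line)")
```

With these numbers, roughly three quarters of the variance in packets eaten is accounted for by adverts shown.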


14/24

correlation: a summary

- the correlation is a measure of the strength of the relationship between one variable and another.

- hence its use in calculating effect size and power

- Pearson's r is calculated when both variables are on continuous (interval) scales.


15/24

correlation and causality

- correlating 2 variables may miss an important relationship with a 3rd unmeasured variable

- what causes what?

- correlations do not imply causality!

  - high anxiety correlates with lower exam performance

  - does a state of anxiety cause worse marks? not necessarily

  - high anxiety correlates with having done less revision

  - less revision correlates with lower exam performance


16/24

different types of correlation I

- Pearson's r is for parametric data:

  - both normally distributed, on interval scales

  - or... if one variable has just two categories: the t-test!

- Spearman's ρ (rho, r_s)

  - non-normal (e.g. ordinal, such as grades)

  - works by ranking the data, and then running Pearson's r on the ranked data

- Kendall's τ (tau)

  - for small datasets, many tied ranks

  - possibly better than Spearman's...
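The "rank, then run Pearson's r" recipe for Spearman's rho can be sketched as follows. Tied values get the average of the ranks they span, which is a common convention and an assumption here. The revision-hours and grade data are made up for illustration.

```python
import math

def ranks(values):
    """Rank from 1 to N; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson_r(x, y):
    """Pearson's r from sums of squares and cross-products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranked data."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical data: hours of revision and an ordinal grade (1 = lowest).
hours = [2, 5, 1, 8, 4, 7]
grade = [1, 3, 1, 4, 2, 4]
rho = spearman_rho(hours, grade)
print(f"Spearman's rho = {rho:.3f}")
```

Because only the ordering matters, any strictly increasing transformation of a variable (cubing the hours, say) leaves rho unchanged.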


17/24

different types of correlation II

- biserial correlation

  - when one variable is dichotomous, but there is an underlying continuum (e.g. pass/fail on an exam)

- point-biserial correlation

  - when one variable is dichotomous, and it is a true dichotomy (e.g. gender)
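The point-biserial correlation is simply Pearson's r with the dichotomous variable coded as two numbers (0 and 1 here; which group gets which code is arbitrary and only flips the sign). The sketch below uses made-up scores and also checks the link to the t-test: converting r to t gives the same value as an independent-samples t-test on the two groups. The biserial correlation needs a further adjustment and is not covered by this sketch.

```python
import math

def pearson_r(x, y):
    """Pearson's r from sums of squares and cross-products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def independent_t(group_a, group_b):
    """Independent-samples t with a pooled variance estimate."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = sum(group_a) / n_a, sum(group_b) / n_b
    ss_a = sum((v - mean_a) ** 2 for v in group_a)
    ss_b = sum((v - mean_b) ** 2 for v in group_b)
    pooled = (ss_a + ss_b) / (n_a + n_b - 2)
    return (mean_b - mean_a) / math.sqrt(pooled * (1 / n_a + 1 / n_b))

# Hypothetical data: group membership coded 0/1 and a continuous score.
group = [0, 0, 0, 0, 1, 1, 1, 1]
score = [4, 5, 6, 5, 7, 8, 6, 9]

r_pb = pearson_r(group, score)
n = len(score)
# Standard conversion from r to t with N - 2 degrees of freedom.
t_from_r = r_pb * math.sqrt(n - 2) / math.sqrt(1 - r_pb ** 2)
t_direct = independent_t(
    [s for g, s in zip(group, score) if g == 0],
    [s for g, s in zip(group, score) if g == 1],
)
print(f"r_pb = {r_pb:.3f}, t from r = {t_from_r:.3f}, t-test = {t_direct:.3f}")
```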


18/24

bivariate vs. partial correlation

- bivariate correlations tell you how much variance is shared (and typically it is calculated between two variables).

- partial correlations tell you how much of the unshared variance is actually shared with a third variable (more or less..)
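A small simulation makes the difference concrete. The sketch below uses the standard first-order partial correlation formula, which is not given in the lecture, on simulated data in which revision drives both anxiety and exam performance and there is no direct link between the two. All variable names, sample sizes and effect sizes are made up for illustration.

```python
import math
import random

def pearson_r(x, y):
    """Pearson's r from sums of squares and cross-products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_r(x, y, z):
    """Correlation between x and y with z partialled out of both."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Simulated (hypothetical) data: revision influences both other variables.
rng = random.Random(0)
n = 2000
revision = [rng.gauss(0, 1) for _ in range(n)]
anxiety = [-rev + rng.gauss(0, 1) for rev in revision]
exam = [rev + rng.gauss(0, 1) for rev in revision]

r_bivariate = pearson_r(anxiety, exam)
r_partial = partial_r(anxiety, exam, revision)
print(f"anxiety vs exam: r = {r_bivariate:.2f}")
print(f"anxiety vs exam, controlling for revision: r = {r_partial:.2f}")
```

In this made-up world the sizeable negative bivariate correlation all but disappears once revision is controlled for. Real data need not behave so neatly.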


19/24


20/24


21/24


22/24

a graphical account of partial correlation


23/24

a graphical account of partial correlation

- the (bivariate) correlation is a little like fitting a line to the data points (= simple regression)

- each point's distance from the line (the residual) is the error relative to the model, i.e. it's variance that cannot be explained

- a 3rd variable (e.g. age) might correlate with (i.e. predict) some of that variance

[figure; labels: age 20, 30, 40]


24/24

finally

- you can ignore the distinction between partial and semi-partial correlations (see Howell, Statistical Methods for Psychology, if you are interested!)

- next week: regression (incl. multiple regression)
