    Lecture 2: Correlations

• read: Chapter 4
• practical: Chapter 4


    last week

• mean
• variance
• standard deviation
• standard error

• inferential test statistic = variance explained by the model / variance not explained by the model

$s = \sqrt{\dfrac{\sum (X - \bar{X})^2}{N - 1}}$

$\sigma_{\bar{X}} = \dfrac{s}{\sqrt{N}}$
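A minimal sketch of those two formulas in Python (assuming numpy and some made-up scores, purely for illustration):

import numpy as np

x = np.array([8, 12, 9, 11, 10, 14, 7, 9])   # hypothetical scores

s = x.std(ddof=1)            # sample standard deviation: sqrt(sum((X - mean)^2) / (N - 1))
se = s / np.sqrt(len(x))     # standard error of the mean: s / sqrt(N)
print(s, se)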


    the mistakes we can make

• we think we've accounted for more systematic variance than unsystematic, i.e. there's a statistically significant effect

• but there isn't: a Type I error

• if our criterion is p < .05, we will make this mistake on about 5% of occasions when no real effect exists (see the simulation sketched below)

• the opposite mistake, concluding there is no effect when one genuinely exists, is a Type II error
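A small simulation (not from the lecture; assumes numpy and scipy, with made-up null data) showing that a p < .05 criterion produces roughly 5% "significant" results when no effect exists, i.e. Type I errors:

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha = 0.05
n_experiments = 10_000
false_positives = 0

for _ in range(n_experiments):
    # both groups come from the SAME population, so any "significant" result is a Type I error
    a = rng.normal(0, 1, size=30)
    b = rng.normal(0, 1, size=30)
    if stats.ttest_ind(a, b).pvalue < alpha:
        false_positives += 1

print(false_positives / n_experiments)   # close to alpha, i.e. about 0.05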


    minimising mistakes

• effect size

• how close are the predictions of the model to the observed outcomes?

• you can correlate the predicted vs. the observed
  a small effect: r = .1
  a medium effect: r = .3
  a large effect: r = .5

• and so we calculate how much of the variance we have explained (and how good our model is!)

you'll have to wait until next week, on correlation...


    minimising mistakes

• statistical power

  the power of a test is the probability that a given test will find an effect, assuming that one exists in the population

  power = 1 - p(Type II error)

• Cohen suggested we aim for an 80% chance of detecting an effect if one genuinely exists

• to calculate power:
  select α (.05), find the effect size (r), enter the no. of participants

• or, instead, calculate the no. of participants needed given the anticipated effect size, α, and Cohen's .8 power criterion (a sketch follows below):
  for a small effect (r = .1), N = 783
  for a medium effect (r = .3), N = 85
  for a large effect (r = .5), N = 28
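Those sample sizes can be reproduced, approximately, with the Fisher-z approximation for the power of a correlation test. A sketch assuming scipy (not the lecture's own calculation, which presumably used power tables or software):

import math
from scipy.stats import norm

def n_for_correlation(r, alpha=0.05, power=0.80):
    # approximate N needed to detect a population correlation r (two-tailed),
    # using the Fisher z transformation of r
    z_alpha = norm.ppf(1 - alpha / 2)
    z_power = norm.ppf(power)
    return math.ceil(((z_alpha + z_power) / math.atanh(r)) ** 2 + 3)

for r in (0.1, 0.3, 0.5):
    print(r, n_for_correlation(r))   # prints 783, 85, 30 - close to the slide's 783 / 85 / 28

The small discrepancy for the large effect reflects the approximation; exact tables (Cohen's) give 28.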


    correlations...

• something a little more powerful than the mean

• we assume that there is a linear relationship between two variables (the linear model: fitting a straight line to our data)


    scatterplots


    step 1: covariance

• when one variable deviates from its mean, the other variable deviates from its mean in a similar way

  does variance in one variable predict variance in the other?


    how do we calculate it?

• covariance (see the numeric check below):

  $\mathrm{cov}(x, y) = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{N - 1}$

• but, the more observations, the larger the number... so we standardize it

• cf. z-scores: $z = \dfrac{X - \bar{X}}{s}$
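A quick numeric check of that covariance formula, with made-up adverts/packets numbers (assuming numpy):

import numpy as np

x = np.array([5, 4, 4, 6, 8])        # e.g. number of adverts watched (hypothetical)
y = np.array([8, 9, 10, 13, 15])     # e.g. packets eaten (hypothetical)

cov_by_hand = np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)
print(cov_by_hand)                   # 4.25
print(np.cov(x, y, ddof=1)[0, 1])    # numpy's covariance agrees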


step 2: Pearson's r

• covariance:

  $\dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{N - 1}$

• but, the more observations, the larger the number... so we standardize it (see the sketch below):

  $r = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{(N - 1)\, s_x s_y}$
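The standardisation step, continuing the same made-up data (scipy assumed for the cross-check):

import numpy as np
from scipy import stats

x = np.array([5, 4, 4, 6, 8])
y = np.array([8, 9, 10, 13, 15])

cov = np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)
r = cov / (x.std(ddof=1) * y.std(ddof=1))   # divide the covariance by both standard deviations
r_scipy, p = stats.pearsonr(x, y)
print(r, r_scipy)                            # the two values match (~0.87)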


    nice things about correlations

• the equation is not unlike that for variance:

  $\dfrac{\sum (x_i - \bar{x})(x_i - \bar{x})}{N - 1}$   vs.   $\dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{(N - 1)\, s_x s_y}$

• the equation forces a result between +1 (they covary perfectly and in the same way), 0 (there is no covariance at all), and -1 (they covary perfectly but in the opposite way)

• r² is a measure of how much variability in one variable can be explained by variability in the other


    r-squared...

• if I know the variance in number of adverts shown, I can predict x% of the variance in packets eaten (a worked sketch follows below)

• for each unit of variance in adverts shown, we get x units of variance in packets eaten

  $r^2 = \left(\dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{(N - 1)\, s_x s_y}\right)^2$
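One way to see r² as "variance explained", sketched with the same made-up numbers (the line fit via numpy's polyfit is my addition, not part of the lecture):

import numpy as np
from scipy import stats

x = np.array([5, 4, 4, 6, 8])
y = np.array([8, 9, 10, 13, 15])

r, _ = stats.pearsonr(x, y)

slope, intercept = np.polyfit(x, y, 1)       # fit the straight line (the linear model)
residuals = y - (slope * x + intercept)      # error left over after the model

explained = 1 - residuals.var(ddof=1) / y.var(ddof=1)
print(r ** 2, explained)                     # both ~0.76: the proportion of shared variance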


    correlation: a summary

• the correlation is a measure of the strength of the relationship between one variable and another

• hence its use in calculating effect size and power

• Pearson's r is calculated when both variables are on continuous (interval) scales


    correlation and causality

• correlating 2 variables may miss an important relationship with a 3rd, unmeasured variable

• what causes what?

• correlations do not imply causality!
  high anxiety correlates with lower exam performance
  does a state of anxiety cause worse marks? NO:
  high anxiety correlates with having done less revision
  less revision correlates with lower exam performance


    different types of correlation I

• Pearson's r is for parametric data:
  both normally distributed, on interval scales

• or... if one variable has just two categories: the t-test!

• Spearman's ρ (rho, r_s)
  non-normal (e.g. ordinal, such as grades)
  works by ranking the data, and then running Pearson's r on the ranked data (see the sketch below)

• Kendall's τ (tau)
  for small datasets, many tied ranks
  possibly better than Spearman's...
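A sketch of that ranking idea with random made-up data (assuming scipy): ranking both variables and running Pearson's r on the ranks gives Spearman's rho.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(size=20)
y = x ** 3 + rng.normal(scale=0.5, size=20)   # monotonic-ish but not linear

rho_by_hand, _ = stats.pearsonr(stats.rankdata(x), stats.rankdata(y))
rho, _ = stats.spearmanr(x, y)
print(rho_by_hand, rho)                        # identical values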


    different types of correlation II

• biserial correlation
  when one variable is dichotomous, but there is an underlying continuum (e.g. pass/fail on an exam)

• point-biserial correlation
  when one variable is dichotomous, and it is a true dichotomy (e.g. gender); see the sketch below
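For the point-biserial case, a minimal made-up example (assuming scipy) showing it is just Pearson's r with the dichotomy coded 0/1:

import numpy as np
from scipy import stats

group = np.array([0, 0, 0, 0, 1, 1, 1, 1])                          # a true dichotomy, coded 0/1
score = np.array([12.0, 15.0, 11.0, 14.0, 18.0, 17.0, 20.0, 16.0])  # a continuous score

r_pb, p = stats.pointbiserialr(group, score)
r, _ = stats.pearsonr(group, score)
print(r_pb, r)                                                       # the same value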


    bivariate vs. partial correlation

• bivariate correlations tell you how much variance is shared (and typically it is calculated between two variables)

• partial correlations tell you how much of the unshared variance is actually shared with a third variable (more or less...)


a graphical account of partial correlation

• the (bivariate) correlation is a little like fitting a line to the data points (= simple regression)

• each point's distance from the line (the residual) is the error relative to the model, i.e. its variance that cannot be explained

• a 3rd variable (e.g. age) might correlate with (i.e. predict) some of that variance (a sketch of this follows below)

[figure illustrating this, annotated with age values 20, 30, 40]
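That residual idea is also how a partial correlation can be computed by hand. A sketch with simulated data (the variable names, including "age", and the effect sizes are made up for illustration; assumes numpy/scipy):

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 200
age = rng.normal(size=n)                      # the 3rd variable
x = 0.6 * age + rng.normal(size=n)            # both x and y are partly driven by age
y = 0.6 * age + rng.normal(size=n)

# regress age out of each variable and keep the residuals (the unexplained variance)
res_x = x - np.polyval(np.polyfit(age, x, 1), age)
res_y = y - np.polyval(np.polyfit(age, y, 1), age)

r_bivariate, _ = stats.pearsonr(x, y)         # inflated by the shared influence of age
r_partial, _ = stats.pearsonr(res_x, res_y)   # partial correlation, controlling for age
print(r_bivariate, r_partial)                 # the partial r should sit near zero, unlike the bivariate r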


    finally

• you can ignore the distinction between partial and semi-partial correlations (see Howell, Statistical Methods for Psychology, if you are interested!)

• next week: regression (incl. multiple regression)