ams572_ch14


  • 7/30/2019 ams572_ch14

    1/133

    What is NONPARAMETRIC Statistics?

    Normality does not hold for all data. Similarly, some data may not follow any particular fixed distribution, such as the Binomial or Poisson. Such data are called non-parametric or distribution-free, and we use nonparametric tests for these populations.


    When do we use NONPARAMETRIC Statistics?

    The population distribution is highly skewed or very heavy-tailed, so the median is a better measure of the center than the mean.

    The sample size is small (usually less than 30) and the data are not normal (which we can check using SAS or other statistical programs).


    14.1.1 Sign Test and Confidence Interval

    Sign test for a Single Sample

    We want to test, at a significance level α, whether the true median is above a certain known value.


    14.1.1 Sign Test and Confidence Interval

    Example:THERMOSTAT DATA:

    Perform the sign test to determine if the median setting is different from the design setting of 200°F.

    202.2 203.4

    200.5 202.5

    206.3 198.0

    203.7 200.8

    201.3 199.0


    14.1.1 Sign Test and Confidence Interval

    STEP 1: Find the sign of each observation by comparing it with 200:

    202.2 > 200    203.4 > 200
    200.5 > 200    202.5 > 200
    206.3 > 200    198.0 < 200
    203.7 > 200    200.8 > 200
    201.3 > 200    199.0 < 200

    STEP 2: The hypotheses are H0: median = 200 vs. Ha: median ≠ 200. Count the signs:

    s+ = #{xi > 200} = 8,  s− = #{xi < 200} = 2


    14.1.1 Sign Test and Confidence Interval

    What do we do if there is a Tie (xi = 200)?

    1) We can break the tie at random, putting it with either s+ or s−. For a large sample this makes little difference, but the result may vary significantly for a small sample.
    2) We can contribute 1/2 toward each of s+ and s−. However, we cannot calculate the p-value using fractional counts, so we should not do this.
    3) We can exclude the ties. This reduces the sample size and hence the power of the test, but for a large sample it is not a big deal.


    14.1.1 Sign Test and Confidence Interval

    STEP 3: Why Binomial?

    Each observation independently falls above or below the hypothesized median, so S+ counts "successes" in n trials. Both S+ and S− are binomially distributed, with success probabilities p and 1 − p respectively:

    S+ ~ Bin(n, p)  AND  S− ~ Bin(n, 1 − p),  where p = P(Xi > 200).

    Here S+ ~ Bin(10, p) with observed s+ = 8, and S− ~ Bin(10, 1 − p) with observed s− = 2; note s+ + s− = n and P(S+ = s) = C(n, s) p^s (1 − p)^(n−s).


    14.1.1 Sign Test and Confidence Interval

    STEP 4: Since S+ and S− have the same null distribution, we can denote a common r.v. S ~ Bin(n, 1/2).

    When H0 is true, 200 is the true median, so p = 1/2: the number of observations above the median equals (in expectation) the number below the median. Consequently 1 − p = 1/2 too, and

    S+ ~ Bin(n, 1/2)  and  S− ~ Bin(n, 1/2);  here S ~ Bin(10, 1/2).


    14.1.1 Sign Test and Confidence Interval

    Now we can calculate the p-value using the binomial distribution:

    P-value = P(S+ ≥ 8) = Σ (i = 8 to 10) C(10, i) (1/2)^10 = .055

    alternatively,

    P-value = P(S− ≤ 2) = Σ (i = 0 to 2) C(10, i) (1/2)^10 = .055


    14.1.1 Sign Test and Confidence Interval

    STEP 5: We compare our p-value with the significance level:

    P-value = .055. At α = .05, P-value = .055 > .05, so we fail to reject the null hypothesis. (The two-sided p-value, 2 × .055 = .11, leads to the same conclusion.)
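As a quick check, the five steps above can be reproduced with the Python standard library, using the thermostat data:

```python
from math import comb

# Thermostat settings (deg F) and the hypothesized median (design setting)
data = [202.2, 203.4, 200.5, 202.5, 206.3, 198.0, 203.7, 200.8, 201.3, 199.0]
mu0 = 200.0

s_plus = sum(x > mu0 for x in data)    # observations above the design setting
s_minus = sum(x < mu0 for x in data)   # observations below it
n = s_plus + s_minus                   # ties (x == mu0) would be excluded

# Upper-tail probability P(S+ >= s_plus) under S+ ~ Bin(n, 1/2)
p_upper = sum(comb(n, i) for i in range(s_plus, n + 1)) / 2 ** n
print(s_plus, s_minus, round(p_upper, 3))  # 8 2 0.055
```

By symmetry, the lower-tail probability P(S− ≤ 2) gives the same .055; doubling it yields the two-sided p-value.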


    14.1.1 Sign Test and Confidence Interval

    For large samples (n ≥ 20) we can also use a normal approximation:

    E(S+) = E(S−) = n/2  and  Var(S+) = Var(S−) = n/4

    z = (s+ − n/2 − 1/2) / √(n/4)

    where the 1/2 is the continuity correction. We reject the null hypothesis if z ≥ zα. Equivalently, after rearranging, reject if s+ ≥ n/2 + 1/2 + zα √(n/4) = b(n, α).


    14.1.1 Sign Test for matched pairs

    Sign test for Matched Pairs: when observations (xi, yi) are matched, work with the differences di = xi − yi. Then:
    - S+ = the number of positive differences
    - S− = the number of negative differences

    Note: the magnitude of the differences is not used, only the signs.

    When the pairs are exchangeable under H0, P(Xi > Yi) = P(Yi > Xi).


    14.1.1 Sign Test for matched pairs

    No.  Method A  Method B  Diff      No.  Method A  Method B  Diff
    i    xi        yi        di        i    xi        yi        di
    1    6.3       5.2       1.1       14   7.7       7.4       0.3
    2    6.3       6.6      -0.3       15   7.4       7.4       0
    3    3.5       2.3       1.2       16   5.6       4.9       0.7
    4    5.1       4.4       0.7       17   6.3       5.4       0.9
    5    5.5       4.1       1.4       18   8.4       8.4       0
    6    7.7       6.4       1.3       19   5.6       5.1       0.5
    7    6.3       5.7       0.6       20   4.8       4.4       0.4
    8    2.8       2.3       0.5       21   4.3       4.3       0
    9    3.4       3.2       0.2       22   4.2       4.1       0.1
    10   5.7       5.2       0.5       23   3.3       2.2       1.1
    11   5.6       4.9       0.7       24   3.8       4.0      -0.2
    12   6.2       6.1       0.1       25   5.7       5.8      -0.1
    13   6.6       6.3       0.3       26   4.1       4.0       0.1


    14.1.1 Sign Test for matched pairs

    Note that for the matched-pairs test all tied entries (xi = yi) are disregarded. Then n = 23, since xi = yi for i = 15, 18, 21.

    S+ = 20, S− = 3

    Using z = (s+ − n/2 − 1/2) / √(n/4):


    14.1.1 Sign Test for matched pairs

    z = (20 − 23/2 − 1/2) / √(23/4) = 3.336

    Two-sided p-value: 2(1 − Φ(3.336)) = 0.0008

    This indicates a significant difference between Method A and Method B.
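The matched-pairs computation can be reproduced directly; the data are the 26 (xi, yi) pairs from the table, and math.erf supplies the normal c.d.f.:

```python
from math import sqrt, erf

xs = [6.3, 6.3, 3.5, 5.1, 5.5, 7.7, 6.3, 2.8, 3.4, 5.7, 5.6, 6.2, 6.6,
      7.7, 7.4, 5.6, 6.3, 8.4, 5.6, 4.8, 4.3, 4.2, 3.3, 3.8, 5.7, 4.1]
ys = [5.2, 6.6, 2.3, 4.4, 4.1, 6.4, 5.7, 2.3, 3.2, 5.2, 4.9, 6.1, 6.3,
      7.4, 7.4, 4.9, 5.4, 8.4, 5.1, 4.4, 4.3, 4.1, 2.2, 4.0, 5.8, 4.0]

diffs = [x - y for x, y in zip(xs, ys) if x != y]   # drop the 3 tied pairs
s_plus = sum(d > 0 for d in diffs)
s_minus = sum(d < 0 for d in diffs)
n = len(diffs)

z = (s_plus - n / 2 - 0.5) / sqrt(n / 4)            # continuity-corrected
phi = 0.5 * (1 + erf(z / sqrt(2)))                  # standard normal c.d.f.
p_two_sided = 2 * (1 - phi)
print(s_plus, s_minus, round(z, 3))  # 20 3 3.336
```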


    14.1.2 Wilcoxon Signed Rank Test

    Who is Frank Wilcoxon?

    Born September 2, 1892, to American parents in County Cork, Ireland, Frank Wilcoxon grew up in Catskill, New York, although he received part of his education in England. In 1917 he graduated from Pennsylvania Military College with a B.S. He then received his M.S. in Chemistry in 1921 from Rutgers University, and in 1924 a PhD in Physical Chemistry from Cornell University. In 1945 he published a paper containing the two tests he is most remembered for: the Wilcoxon signed-rank test and the Wilcoxon rank-sum test. His interest in statistics is credited to R.A. Fisher's text, Statistical Methods for Research Workers (1925).

    Over the course of his career Wilcoxon published 70 papers.


    14.1.2 Wilcoxon Signed Rank Test

    Alternative method to the Sign Test

    The Wilcoxon Signed Rank Test

    Improves on the Sign Test: unlike the sign test, the Wilcoxon Signed Rank Test looks not only at whether each xi falls above or below the hypothesized median, but also at the magnitudes of the differences, through their ranks.


    14.1.2 Wilcoxon Signed Rank Test

    Note: the Wilcoxon Signed Rank Test assumes that the observed population distribution is symmetric. (Symmetry is not required for the Sign Test.)


    14.1.2 Wilcoxon Signed Rank Test

    Step 1: Rank-order the differences di in terms of their absolute values.

    Step 2: w+ = sum of the ranks ri of the positive differences; w− = sum of the ranks ri of the negative differences.

    If we assume no ties, then

    w+ + w− = r1 + r2 + … + rn = 1 + 2 + 3 + … + n = n(n + 1)/2


    14.1.2 Wilcoxon Signed Rank Test

    Step 3: Reject H0 if w+ is large or if w− is small!!


    14.1.2 Wilcoxon Signed Rank Test

    The size of w+, w− needed to reject H0 at level α is determined using the distributions of the corresponding r.v.'s W+, W− when H0 is true. Since the null distributions are identical and symmetric, the common r.v. is denoted by W.

    p-value = P(W ≥ w+) = P(W ≤ w−)

    Reject H0 if the p-value is ≤ α.


    14.1.2 Wilcoxon Signed Rank Test

    Let Zi = 1 if the i-th rank corresponds to a positive sign, and Zi = 0 if it corresponds to a negative sign. Then

    W+ = Σ (i = 1 to n) i·Zi,  where Zi ~ Bernoulli(p) with p = P(di > 0) = 1/2 under H0.

    E(W+) = E(Σ i·Zi)
          = E(1·Z1 + 2·Z2 + … + n·Zn)
          = 1·E(Z1) + 2·E(Z2) + … + n·E(Zn)    [E(Z1) = E(Z2) = … = E(Zn)]
          = (1 + 2 + 3 + … + n) E(Z1)
          = n(n + 1)/2 · p


    14.1.2 Wilcoxon Signed Rank Test

    Var(W+) = Var(Σ i·Zi)
            = Var(1·Z1 + 2·Z2 + … + n·Zn)
            = Var(1·Z1) + Var(2·Z2) + … + Var(n·Zn)    [the Zi are independent]
            = 1²·Var(Z1) + 2²·Var(Z2) + … + n²·Var(Zn)
            = (1² + 2² + … + n²) Var(Z1)
            = n(n + 1)(2n + 1)/6 · p(1 − p)


    14.1.2 Wilcoxon Signed Rank Test

    Then a z-test is based on the statistic:

    z = (w+ − n(n + 1)/4 − 1/2) / √( n(n + 1)(2n + 1)/24 )

    For H0: median = μ0 vs. Ha: median > μ0, reject H0 if z ≥ zα.


    For H0: median = μ0 vs. Ha: median < μ0, reject H0 if z ≤ −zα.

    For H0: median = μ0 vs. Ha: median ≠ μ0, reject H0 if (1) z ≥ z(α/2) or (2) z ≤ −z(α/2).

    The two-sided p-value is 2 P(W ≥ w_max) = 2 P(W ≤ w_min).

    14.1.2 Wilcoxon Signed Rank Test


    14.1.2 Summary

    Signed Rank Test vs. Sign Test

    The signed rank test weighs each signed difference by its rank. If the positive differences are greater in magnitude than the negative differences, they get higher ranks, resulting in a larger value of w+. This improves the power of the signed rank test. But it also affects the type I error if the population distribution is NOT symmetric.

    The sign test only counts the number of positive and negative differences.


    YOU WOULDN'T WANT THIS TO HAPPEN!


    14.1.2 Summary

    Sign Rank Test vs. Sign Test

    And the winner is… the Signed Rank Test (the preferred test).


    14.1.2 Summary

    I pity the fool that messes with the Wilcoxon Signed Rank Test!!!


    No.  A    B    Diff  Rank      No.  A    B    Diff  Rank
    i    Xi   Yi   Di              i    Xi   Yi   Di
    1    6.3  5.2  1.1   19.5      14   7.7  7.4  0.3   8
    2    6.3  6.6  -0.3  8         15   7.4  7.4  0     -
    3    3.5  2.3  1.2   21        16   5.6  4.9  0.7   16
    4    5.1  4.4  0.7   16        17   6.3  5.4  0.9   18
    5    5.5  4.1  1.4   23        18   8.4  8.4  0     -
    6    7.7  6.4  1.3   22        19   5.6  5.1  0.5   12
    7    6.3  5.7  0.6   14        20   4.8  4.4  0.4   10
    8    2.8  2.3  0.5   12        21   4.3  4.3  0     -
    9    3.4  3.2  0.2   5.5       22   4.2  4.1  0.1   2.5
    10   5.7  5.2  0.5   12        23   3.3  2.2  1.1   19.5
    11   5.6  4.9  0.7   16        24   3.8  4.0  -0.2  5.5
    12   6.2  6.1  0.1   2.5       25   5.7  5.8  -0.1  2.5
    13   6.6  6.3  0.3   8         26   4.1  4.0  0.1   2.5

    14.1.2 Wilcoxon Signed Rank Test


    14.1.2 Wilcoxon Signed Rank Test

    w− = 8 + 5.5 + 2.5 = 16, so w+ = 23(24)/2 − 16 = 260.

    z = (260 − 23(24)/4 − 1/2) / √( 23(24)(47)/24 ) = 3.695

    Two-sided p-value: 2(1 − Φ(3.695)) = 0.0002
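A short script reproduces w+, w−, and z, including the midrank assignment (differences are rounded to one decimal, as in the table, so tied |di|'s compare equal):

```python
from math import sqrt

xs = [6.3, 6.3, 3.5, 5.1, 5.5, 7.7, 6.3, 2.8, 3.4, 5.7, 5.6, 6.2, 6.6,
      7.7, 7.4, 5.6, 6.3, 8.4, 5.6, 4.8, 4.3, 4.2, 3.3, 3.8, 5.7, 4.1]
ys = [5.2, 6.6, 2.3, 4.4, 4.1, 6.4, 5.7, 2.3, 3.2, 5.2, 4.9, 6.1, 6.3,
      7.4, 7.4, 4.9, 5.4, 8.4, 5.1, 4.4, 4.3, 4.1, 2.2, 4.0, 5.8, 4.0]

d = [round(x - y, 1) for x, y in zip(xs, ys)]
d = [di for di in d if di != 0]                     # drop zero differences
n = len(d)

abs_sorted = sorted(abs(di) for di in d)

def midrank(v):
    # average of the 1-based positions occupied by the tied group containing v
    lo = abs_sorted.index(v)
    hi = lo + abs_sorted.count(v) - 1
    return (lo + 1 + hi + 1) / 2

w_plus = sum(midrank(abs(di)) for di in d if di > 0)
w_minus = sum(midrank(abs(di)) for di in d if di < 0)

z = (w_plus - n * (n + 1) / 4 - 0.5) / sqrt(n * (n + 1) * (2 * n + 1) / 24)
print(w_plus, w_minus, round(z, 3))  # 260.0 16.0 3.695
```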


    14.1.2 Wilcoxon Signed Rank Test

    If di = 0, the observation is dropped and only the nonzero differences are retained.

    When several |di|'s are tied for the same rank, a new rank, called the midrank, is assigned to each of them.


    14.1.2 Wilcoxon Signed Rank Test

    No.  A    B    Diff  Rank      No.  A    B    Diff  Rank
    i    xi   yi   di    ri        i    xi   yi   di    ri
    15   7.4  7.4  0     -         8    2.8  2.3  0.5   12
    18   8.4  8.4  0     -         10   5.7  5.2  0.5   12
    21   4.3  4.3  0     -         19   5.6  5.1  0.5   12
    12   6.2  6.1  0.1   2.5       7    6.3  5.7  0.6   14
    22   4.2  4.1  0.1   2.5       4    5.1  4.4  0.7   16
    25   5.7  5.8  -0.1  2.5       11   5.6  4.9  0.7   16
    26   4.1  4.0  0.1   2.5       16   5.6  4.9  0.7   16
    9    3.4  3.2  0.2   5.5       17   6.3  5.4  0.9   18
    24   3.8  4.0  -0.2  5.5       1    6.3  5.2  1.1   19.5
    2    6.3  6.6  -0.3  8         23   3.3  2.2  1.1   19.5
    13   6.6  6.3  0.3   8         3    3.5  2.3  1.2   21
    14   7.7  7.4  0.3   8         6    7.7  6.4  1.3   22
    20   4.8  4.4  0.4   10        5    5.5  4.1  1.4   23


    In the new table we see that for i = 12, 22, 25, 26 we have |di| = 0.1, i.e. the four smallest |d|'s are tied at 0.1.

    Their shared midrank is (1 + 2 + 3 + 4)/4 = 10/4 = 2.5, so the ranks of these four differences are not 1, 2, 3, 4 but rather 2.5.


    14.2 Inferences for independent samples

    1. Wilcoxon rank sum test
    Assumption: there are no ties in the two samples.
    Hypothesis: H0: F1 = F2 vs. Ha: F1 ≠ F2 (a location shift).

    Step 1: Rank all N = n1 + n2 observations together.
    Step 2: Sum the ranks of the two samples separately (w1 = sum of the ranks of the x's, w2 = sum of the ranks of the y's).
    Step 3: Reject the null hypothesis if w1 is large or if w2 is small.

    Problem: the distributions of W1, W2 are not the same when n1 ≠ n2.


    14.2.1 Wilcoxon-Mann-Whitney Test

    1. Mann-Whitney test

    Step 1: Compare each xi with each yj (u1 = # of pairs with xi > yj, u2 = # of pairs with xi < yj).
    Step 2: Reject H0 if u1 is large or u2 is small.

    Relation to the rank sum statistics: u1 = w1 − n1(n1 + 1)/2 and u2 = w2 − n2(n2 + 1)/2.

    P-value: P(U ≥ u1) = P(U ≤ u2).

    For large samples, we approximate U by a normal distribution with

    E(U) = n1n2/2  and  Var(U) = n1n2(N + 1)/12,  where N = n1 + n2.

    Rejection rule: if z = (u1 − E(U) − 1/2) / √Var(U) ≥ zα, reject H0 (the 1/2 is the continuity correction).


    14.2.1 Wilcoxon-Mann-Whitney Test

    Example: Failure Times of Capacitors

    Table 1: Times to Failure
    Control group:  5.2  8.5  9.8  12.3  17.1  17.9  23.7  29.8
    Stressed group: 1.1  2.3  3.2  6.3  7.0  7.2  9.1  15.2  18.3  21.1

    Table 2: Ranks of Times to Failure
    Control group:  4  8  10  11  13  14  17  18
    Stressed group: 1  2  3  5  6  7  9  12  15  16

    Let F1 be the c.d.f. of the control group and F2 the c.d.f. of the stressed group.

    H0: F1 = F2 vs. Ha: F1 ≠ F2

    Test statistics: w1 = 95, w2 = 76, u1 = 59, u2 = 21. P-value = .051 from Table A.11. Compare with the large-sample normal approximation:

    z = (59 − (8)(10)/2 − 1/2) / √( (8)(10)(19)/12 ) ≈ 1.64,  P-value ≈ .05
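The whole test fits in a few lines of Python; u1 is a brute-force count over all n1·n2 pairs:

```python
from math import sqrt

control = [5.2, 8.5, 9.8, 12.3, 17.1, 17.9, 23.7, 29.8]
stressed = [1.1, 2.3, 3.2, 6.3, 7.0, 7.2, 9.1, 15.2, 18.3, 21.1]
n1, n2 = len(control), len(stressed)

u1 = sum(x > y for x in control for y in stressed)  # pairs with x_i > y_j
u2 = n1 * n2 - u1                                   # no ties in these data
w1 = u1 + n1 * (n1 + 1) // 2                        # recover the rank sum

# Large-sample normal approximation with continuity correction
z = (u1 - n1 * n2 / 2 - 0.5) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
print(u1, u2, w1, round(z, 2))  # 59 21 95 1.64
```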


    14.2.1 Wilcoxon-Mann-Whitney Test

    Null Distribution of the Wilcoxon-Mann-Whitney Test Statistic

    Assumption: under H0, all N = n1 + n2 observations come from the common distribution F1 = F2.

    All possible orderings of these observations, with n1 coming from F1 and n2 coming from F2, are equally likely.


    14.2.1 Wilcoxon-Mann-Whitney Test

    Example: Find the null distribution of W1 and U1 when n1 = 2 and n2 = 2.

    Ranks 1 2 3 4    w1  u1        Null dist'n of W1 and U1:
    x x y y          3   0         w1  u1  p
    x y x y          4   1         3   0   1/6
    x y y x          5   2         4   1   1/6
    y x x y          5   2         5   2   2/6
    y x y x          6   3         6   3   1/6
    y y x x          7   4         7   4   1/6
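Because every assignment of ranks to the two x's is equally likely under H0, the null distribution can be enumerated directly:

```python
from itertools import combinations

n1 = n2 = 2
N = n1 + n2

dist = {}
for x_ranks in combinations(range(1, N + 1), n1):  # ranks taken by the x's
    w1 = sum(x_ranks)                              # rank sum of sample 1
    u1 = w1 - n1 * (n1 + 1) // 2                   # Mann-Whitney statistic
    dist[(w1, u1)] = dist.get((w1, u1), 0) + 1

# Each of the C(4, 2) = 6 orderings has probability 1/6 under H0
for (w1, u1), count in sorted(dist.items()):
    print(w1, u1, f"{count}/6")
```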


    14.2.2 Wilcoxon-Mann-Whitney Confidence Interval

    Assume the distributions F1 and F2 belong to a location parameter family with location parameters θ1 and θ2, respectively:

    F1(x) = F(x − θ1)  and  F2(y) = F(y − θ2),

    where F is a common unknown distribution function and θ1, θ2 are the respective population medians.


    14.2.2 Wilcoxon-Mann-Whitney Confidence Interval

    A CI for θ1 − θ2 can be obtained by inverting the Mann-Whitney test. The procedure is as follows:

    Step 1: Calculate all N = n1n2 differences dij = xi − yj (1 ≤ i ≤ n1, 1 ≤ j ≤ n2) and order them:

    d(1) ≤ d(2) ≤ … ≤ d(N),

    where d(i) is the i-th ordered value of the differences.

    Step 2: Let u = u(n1, n2, α/2) be the lower α/2 critical point of the null distribution of the U statistic. Then a 100(1 − α)% CI for θ1 − θ2 is given by

    [ d(u+1), d(N−u) ]


    14.2.2 Wilcoxon-Mann-Whitney Confidence Interval

    Example

    Find a 95% CI for the difference between the median failure times of the control group and the thermally stressed group of capacitors, using the data from Example 14.7.

    n1 = 8, n2 = 10, N = n1n2 = (8)(10) = 80

    The lower 2.2% critical point of the distribution of U is 17; by symmetry the upper 2.2% critical point is 80 − 17 = 63.

    Setting α/2 = 0.022, 1 − α = 1 − 0.044 = 0.956, so we get a 95.6% CI for the difference between the median failure times:

    [ d(18), d(63) ],

    where the d(i) are the ordered values of the differences dij = xi − yj. The differences are calculated in an array form in Table 14.7; counting to the 18th ordered difference from the lower and from the upper end gives

    [ d(18), d(63) ] = [−1.2, 14.7]
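The interval can be checked by listing all 80 ordered differences; with these data the 18th smallest ordered difference comes out as −1.2 (= 17.1 − 18.3) and the 63rd as 14.7:

```python
control = [5.2, 8.5, 9.8, 12.3, 17.1, 17.9, 23.7, 29.8]
stressed = [1.1, 2.3, 3.2, 6.3, 7.0, 7.2, 9.1, 15.2, 18.3, 21.1]

# All N = n1*n2 = 80 pairwise differences x_i - y_j, in ascending order
diffs = sorted(round(x - y, 1) for x in control for y in stressed)

# Lower 2.2% critical point of U is u = 17 (Table A.11), upper is 80 - 17 = 63,
# so the 95.6% CI is [d_(u+1), d_(N-u)] = [d_(18), d_(63)] in 1-based order
lo, hi = diffs[17], diffs[62]
print(lo, hi)  # -1.2 14.7
```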


    14.2.2 Wilcoxon-Mann-Whitney Confidence Interval

    Table A.11 (pg. 684)

    n1  n2  u1 = upper critical point (80 − u1 = lower critical point)  Upper-tail probability
    8   10  59  (80 − 59 = 21)                                          0.051
    8   10  62  (80 − 62 = 18)                                          0.027
    8   10  63  (80 − 63 = 17)                                          0.022
    8   10  66  (80 − 66 = 14)                                          0.010
    8   10  68  (80 − 68 = 12)                                          0.006


    14.3 Inferences for Several Independent Samples

    One-way layout experiment (Completely Randomized Design)

    - Compares a ≥ 2 treatments.
    - The available experimental units are randomly assigned to the treatments.
    - The number of experimental units in the different treatment groups does not have to be the same.
    - The data are classified according to the level of a single treatment factor.


    Treatment:       1       2       …    a
    Observations:    x11     x21     …    xa1
                     x12     x22     …    xa2
                     ⋮       ⋮            ⋮
                     x1n1    x2n2    …    xana
    Sample median:   θ̂1      θ̂2      …    θ̂a
    Sample SD:       s1      s2      …    sa

    14.3 Inferences for Several Independent Samples

    Examples of one-way layout experiments:
    - Comparing the effectiveness of different pills on migraine.
    - Comparing the durability of different tires.
    - etc.


    14.3 Inferences for Several Independent Samples

    Assumptions

    1. The data on each treatment form a random sample from a continuous c.d.f. Fi.
    2. The random samples are independent.
    3. Fi(y) = F(y − θi), where θi is the location parameter of Fi; here θi = median of Fi.


    14.3 Inferences for Several Independent Samples

    Hypotheses

    H0: F1 = F2 = … = Fa  vs.  Ha: Fi < Fj for some i ≠ j.

    Under the location model this can be changed to

    H0: θ1 = θ2 = … = θa  vs.  Ha: θi > θj for some i ≠ j.

    Can we say that all the Fi's are the same?


    14.3.1 Kruskal Wallis Test

    STEP 1: Rank all N = Σ ni observations in ascending order; assign midranks in case of ties. Let rij = rank(yij). Then

    Σ rij = 1 + 2 + … + N = N(N + 1)/2,  so the average rank is E[r] = (N + 1)/2.

    STEP 2: Calculate the rank sums ri = Σ (j = 1 to ni) rij and averages r̄i = ri / ni, i = 1, 2, …, a.


    14.3.1 Kruskal Wallis Test

    STEP 3: Calculate the Kruskal-Wallis test statistic

    kw = 12/(N(N + 1)) Σ (i = 1 to a) ni ( r̄i − (N + 1)/2 )²
       = 12/(N(N + 1)) Σ (i = 1 to a) ri²/ni − 3(N + 1)

    STEP 4: Reject H0 for large values of kw. If the ni are large, kw follows a chi-square distribution with a − 1 degrees of freedom under H0.


    14.3.1 Kruskal Wallis Test

    Example: NRMA, the world's biggest car insurance company, has decided to test the durability of tires from 4 major companies.


    14.3.1 Kruskal Wallis Test

    Example: average test scores of tires from the 4 major companies.

    Brand 1: 14.59  23.44  25.43  18.15  20.82  14.06  14.26
    Brand 2: 20.27  26.84  14.71  22.34  19.49  24.92  20.20
    Brand 3: 27.82  24.92  28.68  23.32  32.85  33.90  23.42
    Brand 4: 33.16  26.93  30.43  36.43  37.04  29.76  33.88

    Ranks of the average test scores (the two 24.92's share midrank 14.5):

    Brand 1: 3   13    16  5   9   1     2     rank sum = 49
    Brand 2: 8   17    4   10  6   14.5  7     rank sum = 66.5
    Brand 3: 19  14.5  20  11  23  26    12    rank sum = 125.5
    Brand 4: 24  18    22  27  28  21    25    rank sum = 165


    Example :

    14.3.1 Kruskal Wallis Test

    kw = 12/(N(N + 1)) Σ ri²/ni − 3(N + 1)
       = 12/(28·29) [ (49)²/7 + (66.5)²/7 + (125.5)²/7 + (165)²/7 ] − 3(29)
       = 18.134
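The rank sums and kw can be verified in Python; midranks handle the tie between the two 24.92 scores:

```python
groups = [
    [14.59, 23.44, 25.43, 18.15, 20.82, 14.06, 14.26],  # brand 1
    [20.27, 26.84, 14.71, 22.34, 19.49, 24.92, 20.20],  # brand 2
    [27.82, 24.92, 28.68, 23.32, 32.85, 33.90, 23.42],  # brand 3
    [33.16, 26.93, 30.43, 36.43, 37.04, 29.76, 33.88],  # brand 4
]
allv = sorted(v for g in groups for v in g)
N = len(allv)

def midrank(v):
    lo = allv.index(v)                   # first 0-based position of v
    hi = lo + allv.count(v) - 1          # last position of its tied group
    return (lo + 1 + hi + 1) / 2         # average of the 1-based positions

rank_sums = [sum(midrank(v) for v in g) for g in groups]
kw = (12 / (N * (N + 1))
      * sum(r * r / len(g) for r, g in zip(rank_sums, groups))
      - 3 * (N + 1))
print(rank_sums, round(kw, 3))  # [49.0, 66.5, 125.5, 165.0] 18.134
```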


    Example :

    14.3.1 Kruskal Wallis Test

    kw = 18.134 > χ²(3, .005) = 12.84, so H0 is rejected: the tire brands differ.


    14.3.2 Pairwise Comparisons

    Comparing 2 groups among the a treatments: under H0,

    E(R̄i − R̄j) = 0  and  Var(R̄i − R̄j) = N(N + 1)/12 (1/ni + 1/nj)

    For large ni's, R̄i − R̄j is approximately normally distributed, so

    zij = (r̄i − r̄j) / √( N(N + 1)/12 (1/ni + 1/nj) )


    14.3.2 Pairwise Comparisons

    To control the type I familywise error rate at level α, the |zij| statistics should be referred to the appropriate Studentized range distribution, as in the Tukey method (Chapter 12).


    14.3.2 Pairwise Comparisons

    Declare treatments i and j different if

    |zij| > q(a, ∞, α)/√2,  or equivalently  |r̄i − r̄j| > ( q(a, ∞, α)/√2 ) √( N(N + 1)/12 (1/ni + 1/nj) )

    Here the number of treatment groups compared is a and the degrees of freedom are ∞ (assumption: the samples are large). Compare with the critical constant q(a, ∞, α) of the Studentized range distribution.


    14.3.2 Pairwise Comparisons

    Example: use the ranks of the average test scores from the Kruskal-Wallis example (rank sums 49, 66.5, 125.5, 165; rank averages r̄ = 7, 9.5, 17.93, 23.57).


    Example :

    14.3.2 Pairwise Comparison

    Let α be .05. Then

    ( q(4, ∞, .05)/√2 ) √( (28)(29)/12 (1/7 + 1/7) ) = (3.63/√2) √(19.33) = 11.29

    |r̄1 − r̄4| = 16.57 > 11.29 and |r̄2 − r̄4| = 14.07 > 11.29, so brands 1 and 2 differ significantly from brand 4.

    We differ from GOODYEAR!!!
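A sketch of the comparison; the constant 3.633 is the α = .05 Studentized range critical point q(4, ∞, .05) taken from a standard table:

```python
from itertools import combinations
from math import sqrt

rank_avg = {1: 49 / 7, 2: 66.5 / 7, 3: 125.5 / 7, 4: 165 / 7}  # brand -> r-bar
N, n = 28, 7
q = 3.633                     # q(4, inf, .05) from a Studentized range table

cutoff = q / sqrt(2) * sqrt(N * (N + 1) / 12 * (1 / n + 1 / n))
significant = [(i, j) for i, j in combinations(rank_avg, 2)
               if abs(rank_avg[i] - rank_avg[j]) > cutoff]
print(round(cutoff, 1), significant)  # 11.3 [(1, 4), (2, 4)]
```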


    14.4 Inferences for Several Matched Samples

    Randomized block design: a ≥ 2 treatment groups and b ≥ 2 blocks.

    The Friedman test is a distribution-free rank-based test for comparing the treatments in the randomized block design.

    Hypotheses: H0: F1j = F2j = … = Faj  vs.  Ha: Fij < Fkj for some i ≠ k.

    Under a location model this can be changed to H0: θ1 = θ2 = … = θa vs. Ha: θi > θk for some i ≠ k.


    14.4.1 Friedman Test

    STEP 1: Rank the observations yij separately within each block j, from 1 to a, assigning midranks in case of ties. Let rij = rank(yij) within block j.

    STEP 2: Calculate the rank sums ri = Σ (j = 1 to b) rij, i = 1, 2, …, a.


    14.4.1 Friedman Test

    STEP 3: Calculate the Friedman test statistic

    fr = 12/(ab(a + 1)) Σ (i = 1 to a) ( ri − b(a + 1)/2 )²
       = 12/(ab(a + 1)) Σ (i = 1 to a) ri² − 3b(a + 1)

    STEP 4: Reject H0 for large values of fr. If b is large, fr follows a chi-square distribution with a − 1 degrees of freedom under H0.


    14.4.1 Friedman Test

    Example: Drip Loss in Meat Loaves

    Oven      Batch 1        Batch 2        Batch 3        Rank
    Position  Loss  Rank     Loss  Rank     Loss  Rank     Sum
    1         7.33  8        8.11  8        8.06  7        23
    2         3.22  1        3.72  1        4.28  1        3
    3         3.28  2.5      5.11  4        4.56  2        8.5
    4         6.44  7        5.78  6        8.61  8        21
    5         3.83  4        6.50  7        7.72  5        16
    6         3.28  2.5      5.11  4        5.56  3        9.5
    7         5.06  6        5.11  4        7.83  6        16
    8         4.44  5        4.28  2        6.33  4        11


    14.4.1 Friedman Test

    Example: the Friedman test statistic equals

    fr = 12/(ab(a + 1)) Σ ri² − 3b(a + 1)
       = 12/(8·3·9) [ 23² + 3² + 8.5² + 21² + 16² + 9.5² + 16² + 11² ] − 3·3·9
       = 17.583 > χ²(7, .025) = 16.012

    so there are significant differences between the oven positions. However, the number of blocks is only 3, so the large-sample chi-square approximation may not be accurate.
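The within-block ranking and the statistic can be checked as follows:

```python
# Drip loss for a = 8 oven positions (rows) across b = 3 batches (columns)
data = [
    [7.33, 8.11, 8.06],
    [3.22, 3.72, 4.28],
    [3.28, 5.11, 4.56],
    [6.44, 5.78, 8.61],
    [3.83, 6.50, 7.72],
    [3.28, 5.11, 5.56],
    [5.06, 5.11, 7.83],
    [4.44, 4.28, 6.33],
]
a, b = len(data), len(data[0])

rank_sums = [0.0] * a
for j in range(b):                       # rank separately within each batch
    col = sorted(data[i][j] for i in range(a))
    for i in range(a):
        lo = col.index(data[i][j])
        hi = lo + col.count(data[i][j]) - 1
        rank_sums[i] += (lo + 1 + hi + 1) / 2   # midrank within the batch

fr = 12 / (a * b * (a + 1)) * sum(r * r for r in rank_sums) - 3 * b * (a + 1)
print(rank_sums, round(fr, 3))  # [23.0, 3.0, 8.5, 21.0, 16.0, 9.5, 16.0, 11.0] 17.583
```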


    14.4.2 Pairwise Comparisons

    Comparing 2 groups among the a treatments: under H0,

    E(R̄i − R̄j) = 0  and  Var(R̄i − R̄j) = a(a + 1)/(6b)

    As in the case of the Kruskal-Wallis test, treatments i and j can be declared different at significance level α if

    |r̄i − r̄j| > ( q(a, ∞, α)/√2 ) √( a(a + 1)/(6b) )


    14.5 Rank Correlation Methods

    What is Correlation?

    Correlation indicates the strength and direction of a linear relationship between two random variables.

    In general statistical usage, correlation refers to the departure of two variables from independence.

    Correlation does not imply causation.


    14.5.1 Spearman's Rank Correlation Coefficient

    Charles Edward Spearman

    Born September 10, 1863; died September 7, 1945 (aged 81).

    An English psychologist known for his work in statistics, as a pioneer of factor analysis, and for Spearman's rank correlation coefficient.

    BTW, he looks like Sean Connery.


    14.5.1 Spearman's Rank Correlation Coefficient

    What are we correlating? A 19-country study of yearly alcohol consumption from wine (X) and yearly heart disease deaths per 100,000 (Y). Below, ui = rank(xi), vi = rank(yi), and di = ui − vi.

  • 7/30/2019 ams572_ch14

    70/133

    No.  Country      Alcohol  Deaths  ui    vi    di
    1    Australia    2.5      211     11    12.5  -1.5
    2    Austria      3.9      167     15    6.5   8.5
    3    Belgium      2.9      131     13.5  5     8.5
    4    Canada       2.4      191     10    9     1
    5    Denmark      2.9      220     13.5  14    -0.5
    6    Finland      0.8      297     3     18    -15
    7    France       9.1      71      19    1     18
    8    Iceland      0.8      211     3     12.5  -9.5
    9    Ireland      0.7      300     1     19    -18
    10   Italy        7.9      107     18    3     15
    11   Netherlands  1.8      167     8     6.5   1.5
    12   New Zealand  1.9      266     9     16    -7
    13   Norway       0.8      227     3     15    -12
    14   Spain        6.5      86      17    2     15
    15   Sweden       1.6      207     7     11    -4
    16   Switzerland  5.8      115     16    4     12
    17   UK           1.3      285     6     17    -11
    18   US           1.2      199     5     10    -5
    19   W. Germany   2.7      172     12    8     4


    14.5.1 Spearman's Rank Correlation Coefficient

    Spearman's Rank Correlation Coefficient

    A nonparametric (distribution-free) rank statistic proposed in 1904 as a measure of the strength of the association between two variables.

    The Spearman rank correlation coefficient gives a measure of monotone association; it is used when the distribution of the data makes Pearson's correlation coefficient undesirable.


    14.5.1 Spearman's Rank Correlation Coefficient

    Relevant Formulas

    rs is the Pearson correlation coefficient applied to the ranks:

    rs = Σ (ui − ū)(vi − v̄) / √( Σ (ui − ū)² · Σ (vi − v̄)² )

    If there are no ties (so the ranks are distinct integers), this reduces to

    rs = 1 − 6 Σ di² / ( n(n² − 1) ),  where di = ui − vi.


    14.5.1 Spearman's Rank Correlation Coefficient

    Example: from the previous data we calculate

    rs = 1 − 6 Σ di² / ( n(n² − 1) ) = 1 − (6)(2081.5) / ( (19)(360) ) = −0.826

    (the association is negative: higher wine consumption goes with fewer heart disease deaths).
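This can be recomputed from the raw data with the standard library; note that summing the tabled di² gives 2078.5 rather than the slide's 2081.5, so the shortcut formula returns rs ≈ −0.823, agreeing with −0.826 up to that small discrepancy:

```python
alcohol = [2.5, 3.9, 2.9, 2.4, 2.9, 0.8, 9.1, 0.8, 0.7, 7.9,
           1.8, 1.9, 0.8, 6.5, 1.6, 5.8, 1.3, 1.2, 2.7]
deaths = [211, 167, 131, 191, 220, 297, 71, 211, 300, 107,
          167, 266, 227, 86, 207, 115, 285, 199, 172]
n = len(alcohol)

def midranks(values):
    srt = sorted(values)
    out = []
    for v in values:
        lo = srt.index(v)
        hi = lo + srt.count(v) - 1
        out.append((lo + 1 + hi + 1) / 2)   # average of tied 1-based positions
    return out

u, v = midranks(alcohol), midranks(deaths)
d2 = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
r_s = 1 - 6 * d2 / (n * (n * n - 1))
print(d2, round(r_s, 3))  # 2078.5 -0.823
```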


    14.5.1 Spearman's Rank Correlation Coefficient

    Hypothesis Testing Using Spearman

    H0: X and Y are independent  vs.  Ha: X and Y are positively (or, as here, negatively) associated.


    14.5.1 Spearman's Rank Correlation Coefficient

    For large n (> 10), Rs is approximately normally distributed with mean E(Rs) = 0 and variance Var(Rs) = 1/(n − 1), giving the test statistic

    z = rs √(n − 1)


    14.5.1 Spearman's Rank Correlation Coefficient

    Example: from the previous data we calculate

    z = rs √(n − 1) = −0.826 √18 = −3.504

    P-value = 0.0004


    14.5.2 Kendall's Rank Correlation Coefficient

    Maurice Kendall: born September 6, 1907; died March 29, 1983 (aged 75).

    Kendall was born in Kettering, Northamptonshire. He studied mathematics at St. John's College, Cambridge, where he played cricket and chess. After graduating as a Mathematics Wrangler in 1929, he joined the British Civil Service in the Ministry of Agriculture; in this position he became increasingly interested in using statistics.

    He developed his rank correlation coefficient in 1938.


    14.5.2 Kendall's Rank Correlation Coefficient

    Consider a pair of bivariate random variables (Xi, Yi) and (Xj, Yj).

    Concordant: (Xi − Xj)(Yi − Yj) > 0, which implies

    Xi > Xj AND Yi > Yj,  or  Xi < Xj AND Yi < Yj.


    14.5.2 Kendall's Rank Correlation Coefficient

    Discordant: (Xi − Xj)(Yi − Yj) < 0, which implies

    Xi > Xj AND Yi < Yj,  or  Xi < Xj AND Yi > Yj.


    14.5.2 Kendall's Rank Correlation Coefficient

    Tied pair: (Xi − Xj)(Yi − Yj) = 0, which implies

    Xi = Xj  OR  Yi = Yj  (or both).

    14.5.2 Kendall's Rank Correlation Coefficient

    Relevant Formulas

    πc = P(concordant) = P( (Xi − Xj)(Yi − Yj) > 0 )
    πd = P(discordant) = P( (Xi − Xj)(Yi − Yj) < 0 )

    τ = πc − πd,  with −1 ≤ τ ≤ 1.

    14.5.2 Kendall's Rank Correlation Coefficient

    Relevant Formula

    Nc

    = # of Concordant PairsNd = # of Discordant Pairs

    c d N N

    N

    14.5.2 Kendall's Rank Correlation Coefficient

    Formula Continued

    If there are ties, the formula is modified. Suppose there are g groups of tied Xi's with aj tied observations in the j-th group, and h groups of tied Yi's with bj tied observations in the j-th group. Let

    Tx = Σ (j = 1 to g) aj(aj − 1)/2  and  Ty = Σ (j = 1 to h) bj(bj − 1)/2.

    Then

    τ̂ = (Nc − Nd) / √( (N − Tx)(N − Ty) )

    14.5.2 Kendall's Rank Correlation Coefficient

    Formula Explanation: five pairs of observations (x, y) = (1,3), (1,4), (1,5), (2,5), (3,4).

    There is g = 1 group of a1 = 3 tied x's equal to 1, and there are h = 2 groups of tied y's: group 1 has b1 = 2 tied y's equal to 4, and group 2 has b2 = 2 tied y's equal to 5.

    14.5.2 Kendall's Rank Correlation Coefficient

    Formula Example continued

    3 23 1 4

    2 2 xT

    2 2 1 1 22 2 y

    T
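This pair counting can be sketched in Python (an illustration added here, not the slides' own code):

```python
from math import sqrt

def kendall_tau_b(xs, ys):
    """Count concordant (Nc), discordant (Nd) and tied pairs, and return
    Kendall's tie-corrected correlation (Nc - Nd)/sqrt((N - Tx)(N - Ty))."""
    n = len(xs)
    nc = nd = tx = ty = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = xs[i] - xs[j], ys[i] - ys[j]
            if dx * dy > 0:
                nc += 1
            elif dx * dy < 0:
                nd += 1
            # counting tied pairs directly reproduces Tx = sum of a_j(a_j - 1)/2
            if dx == 0:
                tx += 1
            if dy == 0:
                ty += 1
    N = n * (n - 1) // 2
    return nc, nd, tx, ty, (nc - nd) / sqrt((N - tx) * (N - ty))

# the five pairs above: Tx = 3 and Ty = 2, as computed by hand
nc, nd, tx, ty, tau = kendall_tau_b([1, 1, 1, 2, 3], [3, 4, 5, 5, 4])
print(nc, nd, tx, ty, round(tau, 3))
```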

Data

     i   Country         X_i    Y_i    N_ci   N_di   N_ti
     1   Ireland         0.7    300      0     18      0
     2   Iceland         0.8    211      3     11      3
     3   Norway          0.8    227      2     13      1
     4   Finland         0.8    297      0     15      0
     5   US              1.2    199      5      9      0
     6   UK              1.3    285      0     13      0
     7   Sweden          1.6    207      3      9      0
     8   Netherlands     1.8    167      5      5      1
     9   New Zealand     1.9    266      0     10      0
    10   Canada          2.4    191      2      7      0
    11   Australia       2.5    211      1      7      0
    12   West Germany    2.7    172      1      6      0
    13   Belgium         2.9    131      2      4      0
    14   Denmark         2.9    220      0      5      0
    15   Austria         3.9    167      0      4      0
    16   Switzerland     5.8    115      0      3      0
    17   Spain           6.5     86      1      1      0
    18   Italy           7.9    107      0      1      0
    19   France          9.1     71      0      0      0

    Totals: N_c = 25, N_d = 141, N_t = 5

Testing Example

With n = 19 countries, N = 19(18)/2 = 171 pairs. The X's have one group of 3 ties (0.8) and one group of 2 ties (2.9), so T_x = 3 + 1 = 4; the Y's have two groups of 2 ties (211 and 167), so T_y = 2. Hence

    τ̂ = (N_c - N_d) / sqrt((N - T_x)(N - T_y)) = (25 - 141) / sqrt((171 - 4)(171 - 2)) = -0.690.

Hypothesis Testing

    H_0: τ = 0    vs.    H_a: τ ≠ 0

Under H_0,

    E(τ̂) = 0,    Var(τ̂) = 2(2n + 5) / (9n(n - 1)),

so the large-sample test statistic is

    z = τ̂ · sqrt(9n(n - 1) / (2(2n + 5))).

Testing Example

    τ̂ = (25 - 141) / sqrt((171 - 4)(171 - 2)) = -0.690

    z = -0.690 · sqrt((9)(19)(18) / (2(43))) = -4.128

Two-sided P-value < 0.0001.
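These values can be checked numerically with a small Python sketch (illustrative, not from the slides):

```python
from math import sqrt

n, Nc, Nd, Tx, Ty = 19, 25, 141, 4, 2
N = n * (n - 1) // 2                 # 171 distinct pairs
tau_hat = (Nc - Nd) / sqrt((N - Tx) * (N - Ty))
# normal approximation under H0: z = tau_hat * sqrt(9n(n-1) / (2(2n+5)))
z = tau_hat * sqrt(9 * n * (n - 1) / (2 * (2 * n + 5)))
print(round(tau_hat, 3), round(z, 2))
```

Carrying the unrounded τ̂ through gives z ≈ -4.13; the value -4.128 comes from plugging in the rounded -0.690.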

14.5.3 Kendall's Coefficient of Concordance

Kendall's Coefficient of Concordance

Q: Why do we need Kendall's coefficient of concordance?
A: It is a measure of association between several matched samples.

Q: Why not use Kendall's rank correlation coefficient instead?
A: Because it only works for two samples.

How can you apply this to real life?

A common and interesting example: a taste-testing experiment used four tasters to rank eight recipes, with the following results. Are the tasters in agreement? Let's find out!

              Taster rank
    Recipe    1    2    3    4    Sum
      1       5    4    5    4     18
      2       7    5    7    5     24
      3       1    2    1    3      7
      4       3    3    2    1      9
      5       4    6    4    6     20
      6       2    1    3    2      8
      7       8    7    8    8     31
      8       6    8    6    7     27

How does it work?

It is closely related to Friedman's test statistic (mentioned in 14.4). The a treatments are the candidates (recipes), the b blocks are the judges (tasters), and each judge ranks the a candidates.

Kendall's Coefficient of Concordance

The discrepancy of the actual rank sums r_i from their common value b(a + 1)/2 under perfect disagreement,

    d = Σ_{i=1}^{a} (r_i - b(a + 1)/2)^2,

is a measure of agreement between the judges.

The maximum value of this measure is attained when there is perfect agreement, in which case the ordered rank sums are b, 2b, ..., ab. It is given by

    d_max = Σ_{i=1}^{a} (ib - b(a + 1)/2)^2 = b^2 a(a^2 - 1) / 12.

Kendall's w statistic estimates the variance of the rank sums r_i divided by the maximum possible value this variance can take, which occurs when all judges are in agreement. Hence

    w = d / d_max = (12 / (b^2 a(a^2 - 1))) Σ_{i=1}^{a} (r_i - b(a + 1)/2)^2,

with 0 ≤ w ≤ 1.

What relationship do w and fr, Friedman's statistic, have?

    fr = b(a - 1) w

Does Kendall's w statistic relate to Spearman's rank correlation coefficient? Only when there are b = 2 judges:

    r_s = 2w - 1

Q: How can we perform statistical tests? What distribution does it follow?

To test w for statistical significance, refer fr = b(a - 1)w to the chi-square distribution with (a - 1) degrees of freedom.

To find out whether or not the tasters are in agreement, we calculate Kendall's coefficient of concordance. Friedman's statistic is fr = 24.667, so

    w = fr / (b(a - 1)) = 24.667 / ((4)(7)) = 0.881.

Comparing fr = 24.667 with the critical value χ²_{7,.05} = 14.067: since fr exceeds it, we conclude that the tasters agree.
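The taster example can be reproduced with a short Python sketch (illustrative, not the slides' own code):

```python
# ranks given to each recipe by the four tasters (table above)
ranks = [
    [5, 4, 5, 4], [7, 5, 7, 5], [1, 2, 1, 3], [3, 3, 2, 1],
    [4, 6, 4, 6], [2, 1, 3, 2], [8, 7, 8, 8], [6, 8, 6, 7],
]
a, b = len(ranks), len(ranks[0])          # a = 8 recipes, b = 4 tasters
r = [sum(row) for row in ranks]           # rank sums 18, 24, 7, 9, 20, 8, 31, 27
d = sum((ri - b * (a + 1) / 2) ** 2 for ri in r)
d_max = b ** 2 * a * (a ** 2 - 1) / 12
w = d / d_max                             # Kendall's coefficient of concordance
fr = b * (a - 1) * w                      # Friedman statistic, chi-square with a-1 df
print(d, round(w, 3), round(fr, 3))
```

Here d = 592 and d_max = 672, giving w = 0.881 and fr = 24.667, matching the hand computation.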

    14.6.1 Permutation Tests


Permutation Test

1) General Idea

A permutation test is a type of statistical significance test in which the reference distribution is obtained by calculating all possible values of the test statistic under rearrangements of the labels on the observed data points. Confidence intervals can also be derived from such tests.

2) Inventor

The theory evolved from the works of R. A. Fisher and E. J. G. Pitman in the 1930s.

Major Theory & Derivation

The permutation test finds a p-value as the proportion of regroupings that would lead to a test statistic as extreme as the one observed. We will consider the permutation test based on sample averages, although one could compute and compare other test statistics.

We have two samples that we wish to compare. The hypotheses are:

H0: differences between the two samples are due to chance.
Ha: sample 2 tends to have higher values than sample 1, not due simply to chance.
Ha: sample 2 tends to have smaller values than sample 1, not due simply to chance.
Ha: there are differences between the two samples, not due simply to chance.

To see whether the observed difference d from our data supports H0 or one of the selected alternatives, carry out the steps of a permutation test.
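The procedure based on the difference of sample averages can be sketched in Python (a minimal two-sided illustration, not the slides' own code):

```python
from itertools import combinations

def permutation_test(sample1, sample2):
    """Two-sided permutation test on the difference of sample averages:
    relabel the pooled data into two groups of the original sizes in every
    possible way, and return the proportion of regroupings whose absolute
    mean difference is at least as extreme as the observed one."""
    pooled = sample1 + sample2
    n, n1 = len(pooled), len(sample1)
    observed = abs(sum(sample1) / n1 - sum(sample2) / (n - n1))
    extreme = total = 0
    for idx in combinations(range(n), n1):
        g1 = [pooled[i] for i in idx]
        g2 = [pooled[i] for i in range(n) if i not in idx]
        diff = abs(sum(g1) / n1 - sum(g2) / (n - n1))
        total += 1
        if diff >= observed - 1e-12:      # tolerate floating-point ties
            extreme += 1
    return extreme / total

p = permutation_test([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(p)  # 0.1: only the two most extreme of the 20 regroupings qualify
```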

Ms. Merry Huilin Ma

14.6.2 Bootstrap Method

1) General Idea

Bootstrapping is a statistical method for estimating the sampling distribution of an estimator by sampling with replacement from the original sample, most often with the purpose of deriving robust estimates of standard errors and confidence intervals of a population parameter such as a mean, median, proportion, odds ratio, correlation coefficient, or regression coefficient.

2) Inventor

Bradley Efron (1938-present). His work has spanned both theoretical and applied topics, including empirical Bayes analysis, applications of differential geometry to statistical inference, the analysis of survival data, and inference for microarray gene expression data.
Homepage: http://stat.stanford.edu/~brad/
E-mail: [email protected]

3) Major Theory and Derivation

Consider the case where a random sample of size n is drawn from an unspecified probability distribution. The basic steps in the bootstrap procedure are as follows.
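Those steps can be sketched in Python; the snippet below (an illustration, not the slides' own code) bootstraps the standard error of the sample mean for the thermostat data of Section 14.1:

```python
import random

def bootstrap_se(data, estimator, n_boot=2000, seed=42):
    """Basic bootstrap: draw n_boot resamples of the data with replacement,
    apply the estimator to each, and report the standard deviation of the
    bootstrap replicates as the estimated standard error."""
    rng = random.Random(seed)
    n = len(data)
    reps = []
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        reps.append(estimator(resample))
    mean_rep = sum(reps) / n_boot
    return (sum((x - mean_rep) ** 2 for x in reps) / (n_boot - 1)) ** 0.5

data = [202.2, 203.4, 200.5, 202.5, 206.3, 198.0, 203.7, 200.8, 201.3, 199.0]
se = bootstrap_se(data, lambda xs: sum(xs) / len(xs))
print(round(se, 2))  # close to the usual s/sqrt(n) standard error of the mean
```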

    14.6.3 Jackknife Method


1) General Idea

The jackknife is a statistical method for estimating and compensating for bias and for deriving robust estimates of standard errors and confidence intervals. Jackknifed statistics are created by systematically dropping out subsets of data one at a time and assessing the resulting variation in the studied parameter.

2) Inventor

The jackknife was introduced by Maurice Quenouille (1949) for bias estimation, and was extended and named by John W. Tukey (1958), who proposed its use for standard errors and confidence intervals.

3) Major Theory and Derivation

We now briefly describe how to obtain the standard deviation of a generic estimator using the jackknife method. For simplicity we consider the average estimator. Let us consider the variables

    X̄_(i) = (1 / (n - 1)) Σ_{j ≠ i} X_j,

where X̄ is the full-sample average and X̄_(i) is the sample average of the data set after deleting the i-th point. Then we can define the average of the X̄_(i):

    X̄_(.) = (1/n) Σ_{i=1}^{n} X̄_(i).

The jackknife estimate of the standard deviation is then defined as

    SE_jack = sqrt( ((n - 1)/n) Σ_{i=1}^{n} (X̄_(i) - X̄_(.))^2 ).
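These formulas can be sketched in Python (an illustration using the thermostat data of Section 14.1; for the sample mean, the jackknife standard error reduces exactly to s/sqrt(n)):

```python
def jackknife_se(data):
    """Jackknife standard error of the sample mean: leave one observation out
    at a time, recompute the average X-bar_(i), average those into X-bar_(.),
    and combine them with the (n-1)/n factor from the definition."""
    n = len(data)
    total = sum(data)
    loo_means = [(total - x) / (n - 1) for x in data]   # X-bar_(i)
    grand = sum(loo_means) / n                          # X-bar_(.)
    return ((n - 1) / n * sum((m - grand) ** 2 for m in loo_means)) ** 0.5

data = [202.2, 203.4, 200.5, 202.5, 206.3, 198.0, 203.7, 200.8, 201.3, 199.0]
print(round(jackknife_se(data), 3))
```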

SAS program

%macro _SASTASK_DROPDS(dsname);
  %IF %SYSFUNC(EXIST(&dsname)) %THEN %DO;
    DROP TABLE &dsname;
  %END;
  %IF %SYSFUNC(EXIST(&dsname, VIEW)) %THEN %DO;
    DROP VIEW &dsname;
  %END;
%mend _SASTASK_DROPDS;

%LET _EGCHARTWIDTH=0;
%LET _EGCHARTHEIGHT=0;

PROC SQL;
  %_SASTASK_DROPDS(WORK.SORTTempTableSorted);
QUIT;

PROC SQL;
  CREATE VIEW WORK.SORTTempTableSorted AS
    SELECT ScoreChange FROM MIHIR.AMS572;
QUIT;

TITLE;
TITLE1 "Distribution analysis of: ScoreChange";
TITLE2 "Wilcoxon Signed Rank Test";
ODS EXCLUDE CIBASIC BASICMEASURES EXTREMEOBS MODES MOMENTS QUANTILES;
PROC UNIVARIATE DATA=WORK.SORTTempTableSorted MU0=0;
  VAR ScoreChange;
  HISTOGRAM / NOPLOT;
RUN; QUIT;

PROC SQL;
  %_SASTASK_DROPDS(WORK.SORTTempTableSorted);
QUIT;


Distribution analysis of: ScoreChange (Wilcoxon Signed Rank Test)

The UNIVARIATE Procedure
Variable: ScoreChange (Change in Test Scores)

Tests for Location: Mu0=0

Test          Statistic         p Value
Student's t   t    -0.80079     Pr > |t|     0.4402
Sign          M    -1           Pr >= |M|    0.7744
Signed Rank   S    -8.5         Pr >= |S|    0.5278

SAS program

/** Kruskal-Wallis Test and Wilcoxon-Mann-Whitney Test **/
%macro _SASTASK_DROPDS(dsname);
  %IF %SYSFUNC(EXIST(&dsname)) %THEN %DO;
    DROP TABLE &dsname;
  %END;
  %IF %SYSFUNC(EXIST(&dsname, VIEW)) %THEN %DO;
    DROP VIEW &dsname;
  %END;
%mend _SASTASK_DROPDS;

%LET _EGCHARTWIDTH=0;
%LET _EGCHARTHEIGHT=0;

PROC SQL;
  %_SASTASK_DROPDS(WORK.TMP0TempTableInput);
QUIT;

PROC SQL;
  CREATE VIEW WORK.TMP0TempTableInput AS
    SELECT PreTest, Gender FROM MIHIR.AMS572;
QUIT;

TITLE;
TITLE1 "Nonparametric One-Way ANOVA";
PROC NPAR1WAY DATA=WORK.TMP0TempTableInput WILCOXON;
  VAR PreTest;
  CLASS Gender;
RUN; QUIT;

PROC SQL;
  %_SASTASK_DROPDS(WORK.TMP0TempTableInput);
QUIT;

Nonparametric One-Way ANOVA

The NPAR1WAY Procedure


Wilcoxon Scores (Rank Sums) for Variable PreTest
Classified by Variable Gender

Gender   N   Sum of Scores   Expected Under H0   Std Dev Under H0   Mean Score
F        7        40.0             45.50              6.146877        5.714286
M        5        38.0             32.50              6.146877        7.600000
Average scores were used for ties.

Wilcoxon Two-Sample Test
Statistic                 38.0000
Normal Approximation
  Z                        0.8134
  One-Sided Pr > Z         0.2080
  Two-Sided Pr > |Z|       0.4160
t Approximation
  One-Sided Pr > Z         0.2166
  Two-Sided Pr > |Z|       0.4332
Z includes a continuity correction of 0.5.

Kruskal-Wallis Test
Chi-Square                 0.8006
DF                         1
Pr > Chi-Square            0.3709

SAS program

/** Wilcoxon Signed Rank Test **/
%macro _SASTASK_DROPDS(dsname);
  %IF %SYSFUNC(EXIST(&dsname)) %THEN %DO;
    DROP TABLE &dsname;
  %END;
  %IF %SYSFUNC(EXIST(&dsname, VIEW)) %THEN %DO;
    DROP VIEW &dsname;
  %END;
%mend _SASTASK_DROPDS;

%LET _EGCHARTWIDTH=0;
%LET _EGCHARTHEIGHT=0;

PROC SQL;
  %_SASTASK_DROPDS(WORK.SORTTempTableSorted);
QUIT;

PROC SQL;
  CREATE VIEW WORK.SORTTempTableSorted AS
    SELECT ScoreChange FROM MIHIR.AMS572;
QUIT;

TITLE;
TITLE1 "Distribution analysis of: ScoreChange";
TITLE2 "Wilcoxon Signed Rank Test";
ODS EXCLUDE CIBASIC BASICMEASURES EXTREMEOBS MODES MOMENTS QUANTILES;
PROC UNIVARIATE DATA=WORK.SORTTempTableSorted MU0=0;
  VAR ScoreChange;
  HISTOGRAM / NOPLOT;
RUN; QUIT;

PROC SQL;
  %_SASTASK_DROPDS(WORK.SORTTempTableSorted);
QUIT;

SAS program

/** Friedman Test **/
%macro _SASTASK_DROPDS(dsname);
  %IF %SYSFUNC(EXIST(&dsname)) %THEN %DO;
    DROP TABLE &dsname;
  %END;
  %IF %SYSFUNC(EXIST(&dsname, VIEW)) %THEN %DO;
    DROP VIEW &dsname;
  %END;
%mend _SASTASK_DROPDS;

%LET _EGCHARTWIDTH=0;
%LET _EGCHARTHEIGHT=0;

PROC SQL;
  %_SASTASK_DROPDS(WORK.SORTTempTableSorted);
QUIT;

PROC SQL;
  CREATE VIEW WORK.SORTTempTableSorted AS
    SELECT Emotion, Subject, SkinResponse FROM WORK.HYPNOSIS1493;
QUIT;

TITLE;
TITLE1 "Table Analysis";
TITLE2 "Results";
PROC FREQ DATA=WORK.SORTTempTableSorted ORDER=INTERNAL;
  TABLES Subject * Emotion * SkinResponse /
    NOROW NOPERCENT NOCUM CMH SCORES=RANK ALPHA=0.05;
RUN; QUIT;

PROC SQL;
  %_SASTASK_DROPDS(WORK.SORTTempTableSorted);
QUIT;

Table Analysis: Results

The FREQ Procedure
Summary Statistics for Emotion by SkinResponse, Controlling for Subject

Cochran-Mantel-Haenszel Statistics (Based on Rank Scores)

Statistic   Alternative Hypothesis     DF   Value    Prob
1           Nonzero Correlation         1   0.2400   0.6242
2           Row Mean Scores Differ      3   6.4500   0.0917
3           General Association        84   .        .

At least 1 statistic not computed--singular covariance matrix.
Total Sample Size = 32

SAS program

/* Spearman correlation */
%macro _SASTASK_DROPDS(dsname);
  %IF %SYSFUNC(EXIST(&dsname)) %THEN %DO;
    DROP TABLE &dsname;
  %END;
  %IF %SYSFUNC(EXIST(&dsname, VIEW)) %THEN %DO;
    DROP VIEW &dsname;
  %END;
%mend _SASTASK_DROPDS;

%LET _EGCHARTWIDTH=0;
%LET _EGCHARTHEIGHT=0;

PROC SQL;
  %_SASTASK_DROPDS(WORK.SORTTempTableSorted);
QUIT;

PROC SQL;
  CREATE VIEW WORK.SORTTempTableSorted AS
    SELECT Arts, Economics FROM WORK.WESTERNRATES5171;
QUIT;

TITLE1 "Correlation Analysis";

/* Spearman method */
PROC CORR DATA=WORK.SORTTempTableSorted SPEARMAN VARDEF=DF NOSIMPLE NOPROB;
  VAR Arts;
  WITH Economics;
RUN;

/* Kendall method */
PROC CORR DATA=WORK.SORTTempTableSorted KENDALL VARDEF=DF NOSIMPLE NOPROB;
  VAR Arts;
  WITH Economics;
RUN; QUIT;

PROC SQL;
  %_SASTASK_DROPDS(WORK.SORTTempTableSorted);
QUIT;

Correlation Analysis

The CORR Procedure

1 With Variables: Economics
1 Variables: Arts

Spearman Correlation Coefficients, N = 52

             Arts
Economics    0.27926

Kendall Tau-b Correlation Coefficients, N = 52

             Arts
Economics    0.18854

(Closing photo-caption slides:)

"What happened to his eyes!!!!!"  "I don't really believe in peace."
"Buddies."  "Statistics is funny!"  "How?"
"They are going to kill me. HELP!"
"Are you still taking the picture?"
"Is it safe to look at the camera?"  "I don't know, but I am looking."
"We love statistics."  "Losers!"
