
Chapter 6 Ranksumtest


12. Nonparametric test based on ranks

Jin-Xin Zhang (张晋昕)

School of Public Health

Sun Yat-sen University

2009.11.02

A large-scale survey reported that the mean pulse rate of healthy males is 72 bpm. A physician randomly selected 25 healthy males in a mountainous area and measured their pulses, obtaining a sample mean of 75.2 bpm and a standard deviation of 6.5 bpm. Can one conclude that the mean pulse rate of healthy males in the mountainous area is higher than that of the general population (μ > μ0)?

$$t = \frac{\bar{x} - \mu_0}{S/\sqrt{n}} = \frac{75.2 - 72.0}{6.5/\sqrt{25}} = 2.462$$
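The t statistic for this example can be checked numerically. The following is a minimal Python sketch that uses only the figures quoted in the example. The critical value 1.711 (one-sided, α = 0.05, 24 degrees of freedom) comes from a standard t table and is an assumption, since the slides do not state it; the variable names are illustrative.

```python
import math

# Figures quoted in the example.
n, mean, sd, mu0 = 25, 75.2, 6.5, 72.0

# Assumed: one-sided 5% critical value of Student's t with n - 1 = 24
# degrees of freedom, read from a standard t table (not given in the slides).
T_CRIT = 1.711

t = (mean - mu0) / (sd / math.sqrt(n))
print(f"t = {t:.3f}, df = {n - 1}")
print("reject H0" if t > T_CRIT else "do not reject H0")
```

Under these assumptions the statistic exceeds the critical value, which agrees with P < 0.05.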

[Figure: t distribution with the supporting area in the center and a rejection area in each tail; P < 0.05.]

Parametric Test

The methods of hypothesis testing we have learnt so far:

(1) Assume that the variable follows a normal distribution;

(2) Test whether the means (parameters) are equal or not under that assumption.

Therefore, they are called parametric tests.

Non-parametric tests (distribution-free tests)

No assumption is made about the form of the distribution.

The chi-square test (Chapter 6) is one kind of non-parametric test.

Rank sum tests: another kind of non-parametric test, based on the ranks of the data.

Non-parametric tests can be used in the following situations:

a. The distribution of the data is unknown;

b. The distribution of the data is skewed;

c. The data are ranked or imprecise;

d. A quick, brief analysis is needed (for example, in a pilot study).

Non-parametric tests are suitable for a variety of data:

Measurement, enumeration, or ordinal data;

Normally distributed or not;

Symmetric or not.

However, if the data are suitable for parametric tests, a non-parametric test used in their place will have slightly lower power.

12.1 Wilcoxon’s signed rank sum test (matched pairs)

Example 12-1 To study the difference in intelligence between twin brothers, the intelligence scores of 12 pairs of twin brothers were measured. The results are listed in Table 12.2.

Table 12.2 Intelligence scores of 12 pairs of twin brothers

| Pair No. i (1) | Senior x_i (2) | Junior y_i (3) | Difference d_i = y_i − x_i (4) | Rank R_i of the absolute difference (5) | Rank with sign (6) |
|---|---|---|---|---|---|
| 1 | 86 | 88 | 2 | 3 | 3 |
| 2 | 71 | 77 | 6 | 7 | 7 |
| 3 | 77 | 76 | -1 | 1.5 | -1.5 |
| 4 | 68 | 64 | -4 | 4 | -4 |
| 5 | 91 | 96 | 5 | 5.5 | 5.5 |
| 6 | 72 | 72 | 0 | - | - |
| 7 | 77 | 65 | -12 | 10 | -10 |
| 8 | 91 | 90 | -1 | 1.5 | -1.5 |
| 9 | 70 | 65 | -5 | 5.5 | -5.5 |
| 10 | 71 | 80 | 9 | 9 | 9 |
| 11 | 88 | 81 | -7 | 8 | -8 |
| 12 | 87 | 72 | -15 | 11 | -11 |

T+ = 24.5; T− = 41.5

Steps:

(1) Hypotheses:

H0: The median of the differences is 0

H1: The median of the differences is not 0

α = 0.05.

(2) Compute the differences.

(3) Rank the absolute differences (omitting zeros) and attach the sign of each difference to its rank.

(4) Compute the rank sums and the statistic T = min{positive sum, negative sum}. Here T = min{24.5, 41.5} = 24.5, with n = 11 non-zero differences.

(5) P-value and conclusion.

From Table 10, the critical range for n = 11 is 10-56. T = 24.5 lies inside it, so P > 0.05 and H0 is not rejected. Conclusion: there is no evidence that the intelligence scores of senior and junior twins differ.
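The steps above can be traced in a few lines of code. This is a minimal Python sketch on the data of Table 12.2, not a general-purpose implementation. The helper `midranks` is a hypothetical name introduced here, and the critical range 10-56 is the table value quoted in the slides rather than something the code derives.

```python
def midranks(values):
    """Rank values from 1, giving tied values the mean of the ranks they occupy."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

senior = [86, 71, 77, 68, 91, 72, 77, 91, 70, 71, 88, 87]
junior = [88, 77, 76, 64, 96, 72, 65, 90, 65, 80, 81, 72]

# Differences as tabulated (junior minus senior); zero differences are dropped.
diffs = [y - x for x, y in zip(senior, junior) if y != x]
ranks = midranks([abs(d) for d in diffs])

# Sum the ranks separately for positive and negative differences.
t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
T = min(t_plus, t_minus)

# Critical range for n = 11 at alpha = 0.05, as quoted from the table in the slides.
LOWER, UPPER = 10, 56
print(f"n = {len(diffs)}, T+ = {t_plus}, T- = {t_minus}, T = {T}")
print("do not reject H0" if LOWER < T < UPPER else "reject H0")
```

Running it reproduces T+ = 24.5 and T− = 41.5 from the table.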

12.2 Wilcoxon’s rank sum test for two samples

Used for two independent samples when the variable is not normally distributed, or when it is uncertain whether the variable follows a normal distribution.

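Table 12.4 Survival time of cats and rabbits without oxygen

| Cats: survival time | Cats: rank | Rabbits: survival time | Rabbits: rank |
|---|---|---|---|
| 25 | 9.5 | 15 | 1.5 |
| 34 | 15 | 15 | 1.5 |
| 44 | 17 | 16 | 3 |
| 46 | 18.5 | 17 | 4 |
| 46 | 18.5 | 19 | 5 |
| | | 21 | 6.5 |
| | | 21 | 6.5 |
| | | 23 | 8 |
| | | 25 | 9.5 |
| | | 27 | 11 |
| | | 28 | 12.5 |
| | | 28 | 12.5 |
| | | 30 | 14 |
| | | 35 | 16 |
| n1 = 5 | R1 = 78.5 | n2 = 14 | R2 = 111.5 |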

(1) Hypotheses:

H0: The distributions of the two populations are the same

H1: The distributions of the two populations are not the same

α = 0.05

(2) Rank all the observations of the two samples together. Tied values each receive the mean of the ranks they occupy. For example, "25" appears in both samples and occupies ranks 9 and 10, so each gets (9+10)/2 = 9.5.

(3) Take the rank sum of the smaller sample as the statistic: T = T1 = 78.5.

(4) P-value and conclusion (Table 11):

With n1 = 5 and n2 − n1 = 9, the critical range is T0.05 = 28~72. T = 78.5 is outside this range, so P < 0.05. The difference in survival time between the two animals is statistically significant.
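The pooled ranking with mean ranks for ties is easy to get wrong by hand, so a small check may help. The sketch below is a minimal Python illustration on the data of Table 12.4. The helper `midranks` is a hypothetical name introduced here, and the critical range 28~72 is the table value quoted in the slides, not computed by the code.

```python
def midranks(values):
    """Rank values from 1, giving tied values the mean of the ranks they occupy."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

cats = [25, 34, 44, 46, 46]
rabbits = [15, 15, 16, 17, 19, 21, 21, 23, 25, 27, 28, 28, 30, 35]

# Rank both samples together, then split the ranks back by sample.
ranks = midranks(cats + rabbits)
t1 = sum(ranks[:len(cats)])   # rank sum of the smaller sample (cats)
t2 = sum(ranks[len(cats):])   # rank sum of the larger sample (rabbits)

# Critical range for n1 = 5, n2 - n1 = 9 at alpha = 0.05, as quoted in the slides.
LOWER, UPPER = 28, 72
print(f"T1 = {t1}, T2 = {t2}")
print("reject H0" if t1 < LOWER or t1 > UPPER else "do not reject H0")
```

A quick consistency check is that the two rank sums add up to N(N+1)/2 = 190 for N = 19 observations.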

12.3.1 Kruskal-Wallis’ H test for comparing more than 2 samples

Example 12.3 Fourteen newborn infants were grouped into 4 categories according to their mothers’ smoking habits.

A: smoking more than 20 cigarettes per day;

B: smoking less than 20 cigarettes per day;

C: ex-smoker;

D: never smoking.

Their weights are listed in Table 12.7.

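Table 12.7 The weights of newborn infants grouped by their mothers’ smoking habit

| | Weight x_ij: A | B | C | D | Rank r_ij: A | B | C | D |
|---|---|---|---|---|---|---|---|---|
| | 2.7 | 2.9 | 3.3 | 3.5 | 3 | 4 | 7 | 11 |
| | 2.4 | 3.2 | 3.6 | 3.6 | 2 | 5.5 | 12.5 | 12.5 |
| | 2.2 | 3.2 | 3.4 | 3.7 | 1 | 5.5 | 9 | 14 |
| | 3.4 | | 3.4 | | 9 | | 9 | |
| n_i | 4 | 3 | 4 | 3 | | | | |
| R_i | | | | | 15 | 15 | 37.5 | 37.5 |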

(1) Hypotheses:

H0: The distributions of the four populations are all the same

H1: The distributions of the four populations are not all the same

α = 0.05

(2) Rank all the observations of the four samples together

(ties are handled in the same way as before)

(3) Compute the rank sum of each sample:

R1 = R2 = 15, R3 = R4 = 37.5

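(4) Statistic H

If there are no ties:

$$H = \frac{12}{N(N+1)} \sum_i \frac{R_i^2}{n_i} - 3(N+1)$$

If there are ties:

$$H_C = \frac{H}{C}, \qquad C = 1 - \frac{\sum_j (t_j^3 - t_j)}{N^3 - N}$$

$t_j$: number of individuals in the j-th tie

For the data in Table 12.7:

$$H = \frac{12}{14(14+1)}\left(\frac{15^2}{4} + \frac{15^2}{3} + \frac{37.5^2}{4} + \frac{37.5^2}{3}\right) - 3(14+1) = 9.375$$

$$C = 1 - \frac{(2^3-2) + (3^3-3) + (2^3-2)}{14^3 - 14} = 0.9868$$

$$H_C = \frac{9.375}{0.9868} = 9.500$$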

(5) P-value and conclusion

Compare H with the critical value of H (C7), or with $\chi^2_{\alpha,\nu}$, where $\nu = k - 1$ and k is the number of samples.

For the data in Table 12.7: $\chi^2_{0.05,3} = 7.815 < H_C = 9.500$, so P < 0.05.

Conclusion: The weights are not all at an equal level.
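The H statistic and its tie correction can be recomputed from the raw weights. The sketch below is a minimal Python illustration on the data of Table 12.7. The helper `midranks` and the variable names are hypothetical, introduced only for this example, and the critical value 7.815 is the chi-square value quoted in the slides.

```python
from collections import Counter

def midranks(values):
    """Rank values from 1, giving tied values the mean of the ranks they occupy."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

groups = {
    "A": [2.7, 2.4, 2.2, 3.4],
    "B": [2.9, 3.2, 3.2],
    "C": [3.3, 3.6, 3.4, 3.4],
    "D": [3.5, 3.6, 3.7],
}

# Rank all observations together, then sum the ranks within each group.
pooled = [x for g in groups.values() for x in g]
ranks = midranks(pooled)
N = len(pooled)
rank_sums, start = {}, 0
for name, g in groups.items():
    rank_sums[name] = sum(ranks[start:start + len(g)])
    start += len(g)

# H without tie correction, the correction factor C, and the corrected H_C.
H = 12 / (N * (N + 1)) * sum(rank_sums[k] ** 2 / len(groups[k]) for k in groups) - 3 * (N + 1)
ties = [t for t in Counter(pooled).values() if t > 1]
C = 1 - sum(t ** 3 - t for t in ties) / (N ** 3 - N)
H_c = H / C

CHI2_CRIT = 7.815  # chi-square critical value for alpha = 0.05, 3 df, as quoted in the slides
print(f"H = {H:.3f}, C = {C:.4f}, H_C = {H_c:.3f}")
print("reject H0" if H_c > CHI2_CRIT else "do not reject H0")
```

The tie sizes found by the code are 2, 3 and 2 (for the weights 3.2, 3.4 and 3.6), which match the terms in the correction factor.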

12.3.2 Friedman test for data from a randomized block design

Example 12.4 Riboflavin was measured in 3 samples of cabbage under four test conditions (A, B, C and D). The results are listed in Table 12.9. The question is whether the results differ among the test conditions.

Table 12.9 Riboflavin in cabbages (μg/g) under four test conditions, with within-sample ranks in parentheses

| Sample | A | B | C | D |
|---|---|---|---|---|
| 1 | 27.2 (2) | 24.6 (1) | 39.5 (4) | 38.6 (3) |
| 2 | 23.2 (1) | 24.2 (2) | 43.1 (4) | 39.5 (3) |
| 3 | 24.8 (2) | 22.2 (1) | 45.2 (4) | 33.0 (3) |
| R_j | 5 | 4 | 12 | 9 |

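$$\chi^2 = \frac{\sum_{j=1}^{k}\left[R_j - b(k+1)/2\right]^2}{bk(k+1)/12} = \frac{12}{bk(k+1)}\sum_{j=1}^{k} R_j^2 - 3b(k+1)$$

where b is the number of blocks (samples) and k is the number of treatments (test conditions).

$$\chi^2 = \frac{12}{(3)(4)(4+1)}\left(5^2 + 4^2 + 12^2 + 9^2\right) - 3(3)(4+1) = 8.2$$

Since $\chi^2_{0.05,3} = 7.815 < 8.2$, P < 0.05 and H0 is rejected.

We conclude that the readings may differ in level among the test conditions.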
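The Friedman statistic ranks the readings within each block and then compares the column rank sums. The sketch below is a minimal Python illustration on the data of Table 12.9. The helper `midranks` and the variable names are hypothetical, and the critical value 7.815 is the one quoted in the slides.

```python
def midranks(values):
    """Rank values from 1, giving tied values the mean of the ranks they occupy."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

# Rows are blocks (cabbage samples); columns are test conditions A, B, C, D.
rows = [
    [27.2, 24.6, 39.5, 38.6],
    [23.2, 24.2, 43.1, 39.5],
    [24.8, 22.2, 45.2, 33.0],
]
b, k = len(rows), len(rows[0])

# Rank within each block, then sum the ranks for each condition.
rank_rows = [midranks(r) for r in rows]
R = [sum(col) for col in zip(*rank_rows)]

chi2 = 12 / (b * k * (k + 1)) * sum(r ** 2 for r in R) - 3 * b * (k + 1)

CHI2_CRIT = 7.815  # chi-square critical value for alpha = 0.05, k - 1 = 3 df
print(f"R = {R}, chi-square = {chi2:.1f}")
print("reject H0" if chi2 > CHI2_CRIT else "do not reject H0")
```

Ranking inside each block removes block-to-block differences, which is why only the condition rank sums enter the statistic.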

12.3.3 Multiple comparison of mean ranks

When the comparison among the four groups shows a significant difference, multiple comparisons are needed to identify which groups differ. Z tests for pair-wise comparison can be used.

H0: The locations of populations A and B are the same

H1: The locations of populations A and B are different

α = 0.05

$$Z_{ij} = \frac{\bar{R}_i - \bar{R}_j}{\sigma_{\bar{R}_i - \bar{R}_j}}, \qquad \sigma_{\bar{R}_i - \bar{R}_j} = \sqrt{\frac{N(N+1)}{12}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$

From Table 12.7 (the weights and ranks of newborn infants grouped by their mothers’ smoking habit), the mean ranks of the four groups are:

$$\bar{R}_1 = 15/4 = 3.75, \quad \bar{R}_2 = 15/3 = 5.00, \quad \bar{R}_3 = 37.5/4 = 9.38, \quad \bar{R}_4 = 37.5/3 = 12.50$$

(1) Hypotheses:

H0: the two population distributions of this pair have the same location

H1: the two population distributions of this pair have different locations

α = 0.05.

(2) Calculate the Z value:

$$Z_{1,4} = \frac{\bar{R}_1 - \bar{R}_4}{\sqrt{\frac{N(N+1)}{12}\left(\frac{1}{n_1} + \frac{1}{n_4}\right)}} = \frac{3.75 - 12.50}{\sqrt{\frac{14(14+1)}{12}\left(\frac{1}{4} + \frac{1}{3}\right)}} = -2.74$$

(3) Decide the P value:

The adjusted significance level for c = 3 comparisons is $\alpha^* = 0.05/c = 0.05/3 = 0.0167$, with critical value $Z_{0.0167} = 2.12$.

$$|Z| = 2.74 > 2.12, \quad P < 0.0167$$

The weights in the first group are at a different level from those in the fourth group. Since $\bar{R}_1 < \bar{R}_4$, mothers who smoke may have babies with lower weights.

Conclusion: Smoking may lead to lower weights in newborns.
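The pair-wise Z statistic can be recomputed from the rank sums and group sizes alone. The sketch below is a minimal Python illustration for groups 1 and 4. The function name `z_stat` is hypothetical, and both the adjusted level 0.05/3 and the critical value 2.12 are taken from the slides rather than derived here.

```python
import math

N = 14                                     # total number of infants
n = {1: 4, 2: 3, 3: 4, 4: 3}               # group sizes (A, B, C, D)
rank_sum = {1: 15, 2: 15, 3: 37.5, 4: 37.5}
mean_rank = {g: rank_sum[g] / n[g] for g in n}

def z_stat(i, j):
    """Z statistic for the difference in mean ranks between groups i and j."""
    se = math.sqrt(N * (N + 1) / 12 * (1 / n[i] + 1 / n[j]))
    return (mean_rank[i] - mean_rank[j]) / se

ALPHA_ADJ = 0.05 / 3   # adjusted significance level used in the slides
Z_CRIT = 2.12          # critical value quoted in the slides for ALPHA_ADJ

z14 = z_stat(1, 4)
print(f"Z(1,4) = {z14:.2f}, adjusted alpha = {ALPHA_ADJ:.4f}")
print("groups differ" if abs(z14) > Z_CRIT else "no evidence of a difference")
```

The same function can be called for any other pair, for example `z_stat(2, 4)`, with the result compared against the same critical value.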

The End