statlearn.uos.ac.kr/.../Ch9_10_slide.pdf · 2020-05-13

Resampling Applications & Permutation Tests

박창이

Department of Statistics, University of Seoul

Outline

Resampling Applications
  Jackknife-after-Bootstrap
  Resampling for Regression Models
    Resampling cases
    Resampling errors

Permutation Tests
  Permutation distribution
  Approximate permutation test
  Two-sample Tests for Equal Distributions

Jackknife-after-Bootstrap

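If one is interested in the variance of a bootstrap estimate (for example, the bootstrap standard error $\widehat{se}_B(\hat\theta)$), one may use the jackknife.

Let $J(i)$ be the set of indices of the bootstrap samples that do not contain $x_i$, and let $B(i)$ denote the number of bootstrap samples that do not contain $x_i$. The jackknife estimate of the s.e. of $\widehat{se}_B(\hat\theta)$ is

$$\widehat{se}_{jack}\big(\widehat{se}_{B(1)}, \ldots, \widehat{se}_{B(n)}\big),$$

where

$$\widehat{se}_{B(i)} = \sqrt{\frac{1}{B(i)} \sum_{j \in J(i)} \big[\hat\theta_{(j)} - \bar\theta_{(J(i))}\big]^2}
\quad\text{and}\quad
\bar\theta_{(J(i))} = \frac{1}{B(i)} \sum_{j \in J(i)} \hat\theta_{(j)}$$

is the sample mean of the estimates from the leave-$x_i$-out samples, that is, the bootstrap samples that do not contain $x_i$.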
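The computation above can be sketched in Python with NumPy. This is a minimal illustration, not the lecture's own code: the function name `jackknife_after_bootstrap`, the argument `stat`, and the default `B` are assumptions made here, and the overall bootstrap s.e. uses the usual $B-1$ divisor, which the slide does not state.

```python
import numpy as np

def jackknife_after_bootstrap(x, stat, B=2000, seed=0):
    """Return (se_boot, se_jack): the bootstrap s.e. of stat(x) and the
    jackknife-after-bootstrap s.e. of that bootstrap s.e. (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    n = len(x)
    # Row j holds the indices that make up bootstrap sample j.
    idx = rng.integers(0, n, size=(B, n))
    theta = np.array([stat(x[row]) for row in idx])
    se_boot = theta.std(ddof=1)  # assumption: B-1 divisor for the overall s.e.

    se_i = np.empty(n)
    for i in range(n):
        # J(i): bootstrap samples that do not contain x_i; B(i) = omit.sum().
        omit = ~(idx == i).any(axis=1)
        # Divisor B(i), matching the formula for se_B(i).
        se_i[i] = theta[omit].std(ddof=0)

    # Ordinary jackknife s.e. applied to se_B(1), ..., se_B(n).
    se_jack = np.sqrt((n - 1) / n * np.sum((se_i - se_i.mean()) ** 2))
    return se_boot, se_jack
```

No extra resampling is needed: each $\widehat{se}_{B(i)}$ reuses the replicates already drawn. The sketch assumes $B$ is large enough that every $B(i)$ is positive.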

Resampling Cases for Regression Models

Regression model:

$$Y_j = \beta_0 + \beta_1 x_j + \varepsilon_j, \quad j = 1, \ldots, n,$$

where the $\varepsilon_j$'s are iid with $E(\varepsilon_j) = 0$ and $Var(\varepsilon_j) = \sigma^2$.

Observed data: $(x_j, y_j)$, $j = 1, \ldots, n$.

Algorithm

For $r = 1$ to $R$:

1. Randomly draw $\{i_1^*, \ldots, i_n^*\}$ from $\{1, \ldots, n\}$ with replacement.

2. Set $x_j^* = x_{i_j^*}$ and $y_j^* = y_{i_j^*}$, $j = 1, \ldots, n$.

3. Fit OLS to $(x_j^*, y_j^*)$, $j = 1, \ldots, n$, to get the estimates $\hat\beta_0^{(r)*}$ and $\hat\beta_1^{(r)*}$ and the residual s.e. $s_r^*$.
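The case-resampling algorithm above can be sketched as follows. This is an illustrative sketch under assumptions: the function name `bootstrap_cases` is hypothetical, `np.polyfit` stands in for the OLS fit, and the residual s.e. uses $n-2$ degrees of freedom, which the slide does not specify.

```python
import numpy as np

def bootstrap_cases(x, y, R=1000, seed=0):
    """Resample (x, y) pairs; return an R x 3 array of (b0*, b1*, s*)."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    out = np.empty((R, 3))
    for r in range(R):
        i = rng.integers(0, n, size=n)        # step 1: indices drawn with replacement
        xs, ys = x[i], y[i]                   # step 2: resampled cases
        b1, b0 = np.polyfit(xs, ys, 1)        # step 3: OLS fit (polyfit returns slope first)
        resid = ys - (b0 + b1 * xs)
        s = np.sqrt(resid @ resid / (n - 2))  # residual s.e., assuming n - 2 d.f.
        out[r] = b0, b1, s
    return out
```

The column standard deviations of the returned array, for example `out[:, 1].std(ddof=1)`, give bootstrap s.e. estimates for the corresponding quantities.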

Resampling Errors for Regression Models I

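1. Fit OLS to the data to obtain $\hat\beta = (\hat\beta_0, \ldots, \hat\beta_p)^T$, $\hat\sigma_\varepsilon^2$, and the fitted values $\hat y_1, \ldots, \hat y_n$.

2. Compute the raw residuals $e_i = y_i - \hat y_i$ and the modified residuals $e_i' = e_i / \sqrt{1 - h_{ii}}$, $i = 1, \ldots, n$, where, for simple linear regression,

$$h_{jk} = \frac{1}{n} + \frac{(x_j - \bar x)(x_k - \bar x)}{SS_x}$$

is the $(j, k)$ element of the hat matrix and $SS_x = \sum_{i=1}^{n} (x_i - \bar x)^2$.

3. Center the modified residuals: $r_i = e_i' - \bar e'$.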

Resampling Errors for Regression Models II

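4. For $k = 1, \ldots, R$:

   1. For $j = 1$ to $n$:

      1. Set $x_j^* = x_j$.

      2. Randomly sample $\varepsilon_j^{*(k)}$ with replacement from $\{r_1, \ldots, r_n\}$.

      3. Set $y_j^{*(k)} = \hat y_j + \varepsilon_j^{*(k)}$, where $\hat y_j$ is the fitted value from the original fit.

   2. Fit OLS to $(x_j^*, y_j^{*(k)})$, $j = 1, \ldots, n$, to get the estimates $\hat\beta^{(k)*}$ and the residual s.e. $s^{(k)*}$.

   3. Return $(\hat\beta^{(k)*}, s^{(k)*})$.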
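The residual-resampling steps can be sketched in Python for the simple linear regression case. This is a sketch under assumptions rather than a reference implementation: the name `bootstrap_errors` is hypothetical, the leverage formula is the simple-regression one, and the residual s.e. assumes $n-2$ degrees of freedom.

```python
import numpy as np

def bootstrap_errors(x, y, R=1000, seed=0):
    """Resample centered modified residuals; return an R x 3 array of (b0*, b1*, s*)."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    X = np.column_stack([np.ones(n), x])          # design matrix, fixed across replicates
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # step 1: OLS fit on the original data
    fitted = X @ beta
    e = y - fitted                                # step 2: raw residuals
    h = 1 / n + (x - x.mean()) ** 2 / np.sum((x - x.mean()) ** 2)  # leverages h_ii
    e_mod = e / np.sqrt(1 - h)                    # modified residuals
    r = e_mod - e_mod.mean()                      # step 3: centered residuals
    out = np.empty((R, 3))
    for k in range(R):                            # step 4
        eps = rng.choice(r, size=n, replace=True)
        y_star = fitted + eps                     # fitted value plus resampled error
        b, *_ = np.linalg.lstsq(X, y_star, rcond=None)
        res = y_star - X @ b
        out[k] = b[0], b[1], np.sqrt(res @ res / (n - 2))
    return out
```

Unlike case resampling, the design points stay fixed here, so the sketch is appropriate only when the model form and the iid error assumption are taken as correct.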

Permutation Tests

Permutation tests are based on resampling, but the samples are drawn without replacement.

They are used for nonparametric tests of equal distributions, independence, association, location, common scale, etc.

Permutation Distribution I

$X_1, \ldots, X_n$ and $Y_1, \ldots, Y_m$ are independent random samples from $F_X$ and $F_Y$.

$Z = X \cup Y$: the pooled sample, s.t. $Z_i = X_i$ if $1 \le i \le n$ and $Z_i = Y_{i-n}$ if $n + 1 \le i \le n + m = N$.

$Z^* = (X^*, Y^*)$: a partition of $Z$, where $X^*$ has $n$ elements and $Y^*$ has $m$ elements. There is a permutation $\pi$ of $\nu$ s.t. $Z_i^* = Z_{\pi(i)}$, where $\nu = \{1, \ldots, n, n + 1, \ldots, n + m\} = \{1, \ldots, N\}$.

Under $H_0 : F_X = F_Y$, a randomly selected $Z^*$ has probability

$$\frac{1}{\binom{N}{n}} = \frac{n!\, m!}{N!}.$$
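As a quick check of this probability, take the hypothetical sizes $n = 3$ and $m = 2$ (values chosen here only for illustration), so that $N = 5$:

```latex
\frac{1}{\binom{5}{3}} = \frac{1}{10},
\qquad
\frac{n!\, m!}{N!} = \frac{3!\, 2!}{5!} = \frac{6 \cdot 2}{120} = \frac{1}{10}.
```

Each of the 10 ways of splitting the pooled sample into groups of sizes 3 and 2 is therefore equally likely under $H_0$.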

Permutation Distribution II

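For $\hat\theta(X, Y) = \hat\theta(Z, \nu)$, the permutation distribution of $\hat\theta^*$ is the distribution of

$$\{\hat\theta^*\} = \Big\{\hat\theta(Z, \pi_j(\nu)),\ j = 1, \ldots, \tbinom{N}{n}\Big\} = \big\{\hat\theta^{(j)} : \pi_j(\nu) \text{ is a permutation of } \nu\big\}.$$

The permutation test rejects the null hypothesis if $\hat\theta$ is large relative to the distribution

$$F_{\theta^*}(t) = P(\hat\theta^* \le t) = \binom{N}{n}^{-1} \sum_{j=1}^{\binom{N}{n}} I\big(\hat\theta^{(j)} \le t\big).$$

The achieved significance level is

$$P(\hat\theta^* \ge \hat\theta) = \binom{N}{n}^{-1} \sum_{j=1}^{\binom{N}{n}} I\big(\hat\theta^{(j)} \ge \hat\theta\big).$$

If the sample size is large, the exact permutation test is computationally too expensive, because all $\binom{N}{n}$ partitions must be evaluated. In that case, we may use an approximate permutation test.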
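For very small samples, the achieved significance level can be computed by enumerating every partition. The sketch below is illustrative only: the function name `exact_permutation_test` and the `stat(x, y)` calling convention are assumptions made here, and large values of the statistic are taken to support $H_1$.

```python
from itertools import combinations
from math import comb

import numpy as np

def exact_permutation_test(x, y, stat):
    """Exact ASL: the fraction of all C(N, n) partitions whose statistic
    is at least as large as the observed one (hypothetical helper)."""
    x, y = np.asarray(x), np.asarray(y)
    z = np.concatenate([x, y])                 # pooled sample Z
    n, N = len(x), len(z)
    theta_hat = stat(x, y)                     # observed statistic
    count = 0
    for idx in combinations(range(N), n):      # every choice of n indices for X*
        mask = np.zeros(N, dtype=bool)
        mask[list(idx)] = True
        if stat(z[mask], z[~mask]) >= theta_hat:
            count += 1
    return count / comb(N, n)

# Usage with a hypothetical statistic: absolute difference in sample means.
p = exact_permutation_test([1, 2, 3], [4, 5, 6],
                           lambda a, b: abs(a.mean() - b.mean()))
```

The loop runs $\binom{N}{n}$ times, which already exceeds $10^{11}$ for $n = m = 20$, so this approach is practical only for small $N$.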

Approximate Permutation Test

1. Compute the observed test statistic $\hat\theta(X, Y) = \hat\theta(Z, \nu)$.

2. For $b = 1, \ldots, B$:

   1. Generate a random permutation $\pi_b = \pi(\nu)$.

   2. Compute $\hat\theta^{(b)} = \hat\theta^*(Z, \pi_b)$.

3. If large values of $\hat\theta$ support $H_1$, compute the empirical p-value by

$$\hat p = \frac{1 + \sum_{b=1}^{B} I\big(\hat\theta^{(b)} \ge \hat\theta\big)}{B + 1}.$$

   For a lower-tail or two-tail test, $\hat p$ is computed similarly.

4. Reject $H_0$ at significance level $\alpha$ if $\hat p \le \alpha$.
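The four steps above can be sketched as follows for the upper-tail case. The function name `approx_permutation_test`, the `stat(x, y)` convention, and the default `B = 999` are assumptions for illustration and do not come from the slides.

```python
import numpy as np

def approx_permutation_test(x, y, stat, B=999, seed=0):
    """Return (theta_hat, p_hat) for an upper-tail approximate permutation test."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x), np.asarray(y)
    z = np.concatenate([x, y])                 # pooled sample Z
    n = len(x)
    theta_hat = stat(x, y)                     # step 1: observed statistic
    theta_b = np.empty(B)
    for b in range(B):                         # step 2
        zp = rng.permutation(z)                # random permutation of the pooled sample
        theta_b[b] = stat(zp[:n], zp[n:])      # first n values form X*, the rest Y*
    # Step 3: empirical p-value; the +1 terms count the observed statistic itself.
    p_hat = (1 + np.sum(theta_b >= theta_hat)) / (B + 1)
    return theta_hat, p_hat
```

Step 4 is then the comparison `p_hat <= alpha`. Because of the `+1` terms, the smallest attainable p-value is $1/(B+1)$, so $B$ should be chosen large enough for the intended $\alpha$.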

Two-sample Tests for Equal Distributions I

$X = (X_1, \ldots, X_n)$ and $Y = (Y_1, \ldots, Y_m)$ are independent random samples from $F$ and $G$.

$H_0 : F = G$ vs $H_1 : F \ne G$.

Under $H_0$, $X$, $Y$, and $Z = X \cup Y$ are random samples from $F$. Also, under $H_0$, any subset $X^*$ of size $n$ from $Z$ and its complement $Y^*$ are random samples from $F$.

$\hat\theta$: a two-sample statistic measuring the distance between $F$ and $G$. Large values of $\hat\theta$ support $H_1$.

Under $H_0$, all values of $\hat\theta^* = \hat\theta(X^*, Y^*)$ are equally likely.

Two-sample Tests for Equal Distributions II

Examples of test statistics

Kolmogorov-Smirnov statistic

$$D = \sup_{1 \le i \le n+m} |F_n(z_i) - G_m(z_i)|,$$

where $F_n$ and $G_m$ are the ecdf's of $x_1, \ldots, x_n$ and $y_1, \ldots, y_m$. Note that $0 \le D \le 1$.

Cramér-von Mises statistic

$$W_2 = \frac{mn}{(m + n)^2} \left[ \sum_{i=1}^{n} \big(F_n(x_i) - G_m(x_i)\big)^2 + \sum_{j=1}^{m} \big(F_n(y_j) - G_m(y_j)\big)^2 \right].$$
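Both statistics can be computed directly from the two ecdf's. The sketch below is a minimal illustration; the helper names `ecdf`, `ks_stat`, and `cvm_stat` are chosen here and are not part of the slides.

```python
import numpy as np

def ecdf(sample, t):
    """Empirical cdf of `sample` evaluated at the points `t`."""
    s = np.sort(np.asarray(sample, float))
    # Counts of sample values <= each t, divided by the sample size.
    return np.searchsorted(s, np.asarray(t, float), side="right") / len(s)

def ks_stat(x, y):
    """Two-sample Kolmogorov-Smirnov statistic D over the pooled points."""
    z = np.concatenate([x, y])
    return np.max(np.abs(ecdf(x, z) - ecdf(y, z)))

def cvm_stat(x, y):
    """Two-sample Cramér-von Mises statistic W2 as defined above."""
    n, m = len(x), len(y)
    sx = np.sum((ecdf(x, x) - ecdf(y, x)) ** 2)   # sum over the x_i
    sy = np.sum((ecdf(x, y) - ecdf(y, y)) ** 2)   # sum over the y_j
    return m * n / (m + n) ** 2 * (sx + sy)
```

Either function takes two samples and returns a scalar for which large values support $H_1$, so it can serve as the test statistic in a permutation test for equal distributions.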
