
Statistical Inference


Statistics Support Team, Samsung Biomedical Research Institute

Seonwoo Kim (김선우)


Statistical Inference

• Inferring the characteristics of a population from statistical methods and data (a sample)

• Estimation

• Testing


Population vs Sample

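• Population
- The entire set of subjects about which information is sought
- The population is defined differently depending on the information of interest, so it is important to state clearly what you want to know
- Specify the time point, region, and so on

• Sample
- A subset of the population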


Target (Study) population

• The group that we wish to study

• The sample is selected from the study population


Parameter vs Statistic

• Parameter
- A quantitative characteristic of the population
- Population mean, population standard deviation, population proportion, …

• Statistic
- A quantitative characteristic computed from the sample in order to estimate a parameter
- Sample mean, sample standard deviation, sample proportion, …


• Example 6.8
- We wish to characterize the distribution of birthweight of all liveborn infants born in the US in 1988.
- Parameters of interest: mean and SD of birthweight

• Sample
- Statistics: mean and SD of birthweight from the sample


Random sample

• A selection of some members of the population such that each member is independently chosen and has a known non-zero probability of being selected.

• A simple random sample
- A random sample in which each group member has the same probability of being selected.
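A minimal Python sketch of drawing a simple random sample. The study population of 1,000 patient IDs, the sample size, and the function name `simple_random_sample` are all hypothetical and chosen only for illustration. The sketch samples without replacement, so every member has the same chance of ending up in the sample.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; each member is equally likely to be selected."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(list(population), n)


# Hypothetical study population: patient IDs 1..1000 (illustration only).
patient_ids = range(1, 1001)
sample = simple_random_sample(patient_ids, n=10, seed=42)
print(sample)
```

Fixing the seed is useful for documenting how a sample was drawn; omit it for a fresh draw each time.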


Randomized Clinical Trial (RCT)

• A type of research design for comparing different treatments, in which treatments are assigned to patients by some random mechanism (randomization).

• Randomization
- The treatment assigned to one individual is independent of the treatments assigned to other individuals.
- The types of patients assigned to different treatment modalities will be similar if the sample sizes are large. If sample sizes are small, patient characteristics may not be comparable across treatment groups. Thus, it is customary in RCTs to present a table of characteristics of the treatment groups to check that the randomization process worked well.


Design features of randomized clinical trials

• Block randomization
- For comparing two treatments (A, B), a block size of 2n is determined in advance: of every 2n patients entering the study, n are randomly assigned to treatment A and the remaining n to treatment B. A similar approach can be used in clinical trials with more than two treatment groups.
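The block scheme described above can be sketched in Python as follows. This is one common way to build such a list (shuffling each block of n A's and n B's), not necessarily the procedure used in any particular trial; the function name `block_randomize` and the numbers of patients are assumptions made for the example.

```python
import random


def block_randomize(num_patients, n_per_arm, seed=None):
    """Assign treatments A/B in permuted blocks of size 2 * n_per_arm."""
    rng = random.Random(seed)
    assignments = []
    while len(assignments) < num_patients:
        # Each block holds exactly n A's and n B's in random order.
        block = ["A"] * n_per_arm + ["B"] * n_per_arm
        rng.shuffle(block)
        assignments.extend(block)
    return assignments[:num_patients]


# Hypothetical trial: 12 patients, block size 2n = 4.
schedule = block_randomize(num_patients=12, n_per_arm=2, seed=1)
print(schedule)
```

Because every complete block is balanced, the two arms never differ in size by more than n at any point during enrollment.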


• Stratification
- Patients are subdivided into subgroups, or strata, according to characteristics thought to be important for patient outcome. Separate randomization lists are maintained for each stratum to ensure comparable patient populations within each stratum. Either random selection or block randomization may be used within each stratum.
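As a rough illustration of keeping a separate randomization list per stratum, the sketch below builds one permuted-block list (treatments A and B) for each stratum. The strata labels, the number of patients per stratum, and the function name `stratified_block_lists` are hypothetical.

```python
import random


def stratified_block_lists(strata, patients_per_stratum, n_per_arm, seed=None):
    """Build one permuted-block randomization list (A/B) per stratum."""
    rng = random.Random(seed)
    lists = {}
    for stratum in strata:
        assignments = []
        while len(assignments) < patients_per_stratum:
            block = ["A"] * n_per_arm + ["B"] * n_per_arm
            rng.shuffle(block)
            assignments.extend(block)
        lists[stratum] = assignments[:patients_per_stratum]
    return lists


# Hypothetical strata defined by one prognostic factor (illustration only).
lists = stratified_block_lists(["age < 60", "age >= 60"], 8, 2, seed=7)
for stratum, assignments in lists.items():
    print(stratum, assignments)
```

A newly enrolled patient is given the next unused assignment from the list of the stratum he or she belongs to.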


• Blinding
- Double blind: neither the physician nor the patient knows which treatment the patient is getting.
- Single blind: the patient is blinded to the treatment assignment but the physician is not (or vice versa).
- Blinding is preferable because it prevents biased reporting of outcomes by the patient and/or the physician. However, it is not feasible in every research setting.
- Double dummy: target drug + placebo of the standard drug, versus standard drug + placebo of the target drug


Estimation

• Point estimation

• Interval estimation


Point estimation

• A natural estimator to use for estimating the population mean is the sample mean.

• This estimator has desirable properties (unbiased, minimum variance).

• Sampling distribution of the sample mean
- The distribution of values of the sample mean over all possible samples of the same size that could have been selected from the study population. (Figure 6.1)
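The idea of a sampling distribution can be approximated by simulation. The sketch below assumes a made-up study population of 10,000 values (mean near 120, SD near 25), draws many samples of size 25 from it, and compares the sample means with the population mean. All numbers are hypothetical.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical finite study population (values invented for illustration).
population = [rng.gauss(120, 25) for _ in range(10_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

n = 25
# Approximate the sampling distribution with 5,000 repeated random samples.
sample_means = [statistics.fmean(rng.sample(population, n)) for _ in range(5_000)]

print("population mean      :", round(pop_mean, 2))
print("mean of sample means :", round(statistics.fmean(sample_means), 2))
print("population SD / sqrt(n):", round(pop_sd / n ** 0.5, 2))
print("SD of sample means   :", round(statistics.stdev(sample_means), 2))
```

The sample means centre on the population mean (unbiasedness), and their spread is much smaller than the spread of the raw values.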


• Standard Error of the Mean (SEM)
- The standard deviation of the sample means
- A quantitative measure of the variability of sample means obtained from repeated random samples of size n drawn from the same population.
- SEM = σ/√n, where σ is the population standard deviation
- The larger the sample size, the more precise the estimator of the mean will be.

• Example 6.23
- The SEM is estimated by s/√n, where s is the sample SD: 22.44/√10 ≈ 7.10
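A small Python sketch of the SEM calculation, using the sample SD and sample size quoted in Example 6.23. The function name `standard_error_of_mean` is an assumption made for the example.

```python
import math


def standard_error_of_mean(sd, n):
    """Estimated SEM: the sample SD divided by the square root of n."""
    return sd / math.sqrt(n)


sem_example = standard_error_of_mean(22.44, 10)
print(round(sem_example, 3))

# Quadrupling the sample size halves the SEM.
print(round(standard_error_of_mean(22.44, 40), 3))
```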


• Each estimator (e.g., sample variance, sample proportion) has its own standard error.

Ex) SE of the sample proportion (p) = √(p(1-p)/n)

• The larger the sample size, the more precise the estimator of the corresponding parameter will be.
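To see how the standard error of a proportion shrinks with sample size, here is a short sketch using the usual formula √(p(1-p)/n). The proportion 0.2 and the sample sizes are made-up values, and `se_proportion` is a hypothetical helper name.

```python
import math


def se_proportion(p, n):
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)


# Hypothetical proportion of 0.2 at increasing sample sizes.
for n in (100, 400, 1600):
    print(n, round(se_proportion(0.2, n), 4))
```

Each fourfold increase in n halves the standard error.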


• SD vs SEM

• SD
- Variability of the raw data

• SEM
- Variability of the sample means


Interval estimation

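• A point estimate is computed from one particular sample; a different sample can give a different point estimate, so the estimate itself has variability (e.g., the SEM).

• Ex)
- Sample mean of cholesterol in SMC = 192
- Sample mean of cholesterol in SNU = 181
- Sample mean of cholesterol in CMC = 185
- …

• It is therefore necessary to estimate an interval that is likely to contain the parameter of interest.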


• 95% Confidence interval for μ (population mean)
- Over the collection of all 95% confidence intervals that could be constructed from repeated random samples of size n, 95% will contain the parameter μ. (Figure 6.6)

• Factors affecting the length of a CI
- As the sample size increases, the length of the CI decreases.
- As the SD, which reflects the variability of individual observations, increases, the length of the CI increases.
- As the confidence desired (e.g., 95%) increases, the length of the CI increases.


• Example 6.30, 6.32, 6.33

• Body temperature

• 95% CI for the mean = (sample mean - 1.96*SE, sample mean + 1.96*SE), where SE = standard error of the mean = SD/√n

• Sample mean=97.2, SD=0.2 for n=10
- 95% CI for the mean = (97.08, 97.32)

• Sample mean=97.2, SD=0.2 for n=100
- 95% CI for the mean = (97.16, 97.24)

• Sample mean=97.2, SD=0.4 for n=10
- 95% CI for the mean = (96.95, 97.45)
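The three body-temperature intervals can be reproduced with a short Python sketch of the normal-based interval, sample mean ± z*SD/√n. The function name `mean_ci` is an assumption; the multiplier is taken from the standard normal distribution (about 1.96 for 95%), as on the slide.

```python
import math
from statistics import NormalDist


def mean_ci(sample_mean, sd, n, confidence=0.95):
    """Normal-based confidence interval for a mean: mean +/- z * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    half_width = z * sd / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


for sd, n in [(0.2, 10), (0.2, 100), (0.4, 10)]:
    low, high = mean_ci(97.2, sd, n)
    print(f"SD={sd}, n={n}: ({low:.2f}, {high:.2f})")
```

Comparing the three lines shows the interval narrowing as n grows and widening as the SD grows; passing a larger `confidence` widens it as well.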


• A confidence interval can be estimated using either an asymptotic method or an exact method.

• For a binary outcome, the asymptotic 95% CI for the proportion is (sample proportion - 1.96*SE, sample proportion + 1.96*SE), where SE = standard error of the proportion = √(p(1-p)/n)

• Example 6.45
- p=0.04, n=10,000
- 95% CI for the proportion = (0.036, 0.044)
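A minimal sketch of the asymptotic (normal-approximation) interval for a proportion, applied to the numbers of Example 6.45. The function name `proportion_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def proportion_ci(p_hat, n, confidence=0.95):
    """Asymptotic CI for a proportion: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se


low, high = proportion_ci(0.04, 10_000)
print(f"({low:.3f}, {high:.3f})")
```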


• When np(1-p) < 5, an exact interval should be computed from the binomial distribution.

• Example 6.47
- n=20, p=0.1; np(1-p)=1.8<5
- Exact 95% CI for p = (0.01, 0.32)
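One standard way to obtain an exact binomial interval is the Clopper-Pearson construction, which solves two binomial tail equations for the limits. The slides do not say which exact method was used, so the sketch below is an assumption; it also assumes that p=0.1 with n=20 corresponds to x=2 observed events. The limits are found by simple bisection, and the helper names are hypothetical.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def bisect_root(func, lo=0.0, hi=1.0, iters=100):
    """Find p in [lo, hi] with func(p) = 0, assuming one sign change."""
    f_lo = func(lo)
    for _ in range(iters):
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, confidence=0.95):
    """Clopper-Pearson interval for a binomial proportion with x events in n trials."""
    alpha = 1 - confidence
    # Lower limit solves P(X >= x | p) = alpha / 2 (taken as 0 when x = 0).
    lower = 0.0 if x == 0 else bisect_root(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    # Upper limit solves P(X <= x | p) = alpha / 2 (taken as 1 when x = n).
    upper = 1.0 if x == n else bisect_root(
        lambda p: binom_cdf(x, n, p) - alpha / 2)
    return lower, upper


low, high = exact_ci(x=2, n=20)
print(f"({low:.2f}, {high:.2f})")
```

Unlike the asymptotic interval, this one is not symmetric about the sample proportion and never extends below 0 or above 1.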


Testing

• Research objective

• Research question

• Research hypothesis


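Hypothesis

• Whereas the research objective is an abstract statement, the hypothesis is stated specifically and clearly enough that the study can actually be carried out (from design through reporting)

• Consistent with the research objective

• Specifies the population to be analyzed

• Clearly specifies the comparison groups

• States the comparison variable in terms of variables that are actually measured

• Reflects the expected result

• Written so that it can be tested statistically as stated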


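• Research objective: compare the efficacy of a new anticancer drug with that of an existing anticancer drug

• Hypothesis?

• The new and existing anticancer drugs differ in effect. Wrong!

• In patients with stage IV breast cancer, the 3-month response rate differs between the group receiving the new anticancer drug and the group receiving the existing anticancer drug.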


Null hypothesis vs Alternative hypothesis

• Alternative hypothesis (H1)
- States what the researcher wants to demonstrate

• Null hypothesis (H0)
- The hypothesis opposite to the alternative hypothesis


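• Alternative hypothesis: In patients with stage IV breast cancer, the 3-month response rate differs between the new-drug group and the existing-drug group. (Non-equality test) This is what we want to demonstrate.

• Null hypothesis: In patients with stage IV breast cancer, the 3-month response rate does not differ between the new-drug group and the existing-drug group.

• Alternative hypothesis: In patients undergoing prostate surgery, the proportion with an abnormal PSA level at 3 months does not differ between open surgery and robot surgery. (Equivalence test) This is what we want to demonstrate.

• Null hypothesis: In patients undergoing prostate surgery, the proportion with an abnormal PSA level at 3 months differs between open surgery and robot surgery.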


Statistical testing

• Deciding, on the basis of statistical methods and data, whether to reject the null hypothesis

• The test is carried out under the assumption that the null hypothesis is true, and the null hypothesis is rejected only when there is sufficient evidence against it

• If the null hypothesis is rejected, there is sufficient evidence against the null hypothesis, and the data support the alternative hypothesis.

• If the null hypothesis is not rejected, either the null hypothesis is true, or the alternative hypothesis is true but the evidence is not sufficient to reject the null hypothesis. Thus, if the null hypothesis is not rejected, we cannot say that the null hypothesis is true.


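• Alternative hypothesis: In patients with stage IV breast cancer, the 3-month response rate differs between the new-drug group and the existing-drug group. (Non-equality test) This is what we want to demonstrate.

• Null hypothesis: In patients with stage IV breast cancer, the 3-month response rate does not differ between the new-drug group and the existing-drug group.

• If the null hypothesis is rejected: the 3-month response rate differs between the two groups.

• If the null hypothesis is not rejected: we cannot say that the 3-month response rate differs between the two groups.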


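• Alternative hypothesis: In patients undergoing prostate surgery, the proportion with an abnormal PSA level at 3 months does not differ between open surgery and robot surgery. (Equivalence test) This is what we want to demonstrate.

• Null hypothesis: In patients undergoing prostate surgery, the proportion with an abnormal PSA level at 3 months differs between open surgery and robot surgery.

• If the null hypothesis is rejected: the proportion with an abnormal PSA level at 3 months does not differ between the two groups.

• If the null hypothesis is not rejected: we cannot say that the proportion with an abnormal PSA level at 3 months is the same in the two groups.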


• Four possible outcomes in hypothesis testing

Decision from testing | Truth: null hypothesis                        | Truth: alternative hypothesis
Do not reject null    | Correct                                       | Incorrect (Type II error; false negative error)
Reject null           | Incorrect (Type I error; false positive error) | Correct


• The probability of a type I error is usually denoted by α and is commonly referred to as the significance level of a test (the largest false positive error rate that one is willing to tolerate).

• The probability of a type II error is usually denoted by β.

• The power of a test is defined as 1-β.

• The general aim in hypothesis testing is to use statistical tests that make α and β as small as possible. This goal requires compromise, since making α small involves rejecting the null hypothesis less often, whereas making β small involves failing to reject the null hypothesis less often. The two aims conflict: for a fixed sample size, as α increases, β decreases, and vice versa. The general strategy is to fix α at some specific level (e.g., 0.1, 0.05, 0.01) and to use the test that minimizes β (or, equivalently, maximizes the power).
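The trade-off between α and β can be made concrete with a one-sided z test of H0: μ=μ0 against H1: μ<μ0 with known σ. The sketch below computes β for several α levels. The specific values (μ0=120, a true mean of 115, σ=25, n=100) and the function name `type_ii_error` are assumptions chosen purely for illustration.

```python
import math
from statistics import NormalDist


def type_ii_error(alpha, mu0, mu1, sigma, n):
    """Beta for the one-sided z test of H0: mu = mu0 vs H1: mu < mu0
    when the true mean is mu1 (sigma assumed known)."""
    se = sigma / math.sqrt(n)
    # H0 is rejected when the sample mean falls below this critical value.
    critical = mu0 + NormalDist().inv_cdf(alpha) * se
    # Beta = P(do not reject H0 | mu = mu1) = P(sample mean >= critical | mu1).
    return 1 - NormalDist(mu1, se).cdf(critical)


for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, mu0=120, mu1=115, sigma=25, n=100)
    print(f"alpha={alpha:.2f}  beta={beta:.3f}  power={1 - beta:.3f}")
```

With everything else held fixed, lowering α raises β; only a larger sample size (or a larger true difference) reduces both at once.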


One-sided test vs Two-sided test

• A one-sided test is a test in which the values of the parameter being studied (e.g., the population mean, μ) under the alternative hypothesis are allowed to be either greater than or less than the values of the parameter under the null hypothesis, but not both.

• Example 7.2, 7.10 (SD=25)
- H0: μ=120, H1: μ<120
- Sample mean=115
- How small must the sample mean be in order to reject H0? We need the rejection region (the range of values of the sample mean for which H0 is rejected).


• Use the probability of a type I error (α).

• α = P(reject H0 | H0 is true) = P(sample mean < C | μ=120)
- Standardization is needed: Z = (sample mean - μ) / (σ/√n), Z~N(0,1); that is, Z follows the standard normal distribution with mean 0 and variance 1.

• With SD=25, n=100, α=0.05:
- 0.05 = P(Z < (C-120)/(25/√100))
- (C-120)/(25/√100) = Z0.05 = -1.645
- C = 115.89

• Reject H0 if sample mean < 115.89 under α = 0.05.

• From the sample data, sample mean = 115, so reject H0 under α = 0.05.

This approach depends on the size of the type I error (α) to decide whether the null hypothesis is rejected.
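The critical-value calculation can be checked with a few lines of Python. This is a sketch of the known-σ z test only; the function name `lower_critical_value` is hypothetical.

```python
import math
from statistics import NormalDist


def lower_critical_value(mu0, sigma, n, alpha):
    """C such that P(sample mean < C | mu = mu0) = alpha (sigma known)."""
    z_alpha = NormalDist().inv_cdf(alpha)  # about -1.645 for alpha = 0.05
    return mu0 + z_alpha * sigma / math.sqrt(n)


critical = lower_critical_value(mu0=120, sigma=25, n=100, alpha=0.05)
reject_h0 = 115 < critical  # observed sample mean is 115
print(round(critical, 2), reject_h0)
```

Changing `alpha` moves the critical value, which is why the decision from this approach depends on the chosen significance level.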


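• Significance tests can be effectively performed at all α levels by obtaining the p-value for the test.

• P-value
- A probability, so it lies between 0 and 1
- A measure of how well the data agree with the null hypothesis
- The probability, assuming the null hypothesis is true, of obtaining a result at least as extreme as (that is, at least as favorable to the alternative hypothesis as) the statistic computed from the data (e.g., the sample mean)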


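• P-value = P(sample mean < 115 | μ=120) = P(standardized sample mean < (115-120)/(25/√100)) = P(Z < -2.0) = 0.02275

• Assuming the null hypothesis is true, we take the null value (μ=120) as the reference point and ask how far the observed statistic (e.g., 115) lies from it.

• If it is far away, the p-value is small and the null hypothesis is rejected; if it is close, the p-value is large and the null hypothesis is not rejected.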
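A short Python sketch of the one-sided p-value for this example, assuming the known-σ z test with H1: μ<μ0. The function name `one_sided_p_value` is hypothetical. Once the p-value is known, it can be compared with any α level.

```python
import math
from statistics import NormalDist


def one_sided_p_value(sample_mean, mu0, sigma, n):
    """P(Z < z) for the lower-tailed z test of H0: mu = mu0 (sigma known)."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    return NormalDist().cdf(z)


p_value = one_sided_p_value(sample_mean=115, mu0=120, sigma=25, n=100)
print(round(p_value, 5))

# The same p-value settles the decision at every significance level.
for alpha in (0.10, 0.05, 0.01):
    print(alpha, "reject H0" if p_value < alpha else "do not reject H0")
```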


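• Significance level (α): a pre-chosen probability

• P-value: a probability calculated after a given study

• With a large sample, the p-value can be small, and the result therefore statistically significant, even for a difference that is clinically or practically meaningless; statistical significance does not guarantee clinical or practical significance.

• It is therefore advisable to report the p-value together with the confidence interval.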


• A two-sided test is a test in which the values of the parameter being studied under the alternative hypothesis are allowed to be either greater than or less than the values of the parameter under the null hypothesis.

• Example 7.19
- H0: μ=190, H1: μ≠190

• A reasonable decision rule for testing alternatives on either side of the null mean is to reject H0 if the sample mean is either too small or too large.


• α = P(reject H0 | H0 is true) = P(sample mean < C1 or > C2 | H0 is true) = P(sample mean < C1 | H0 is true) + P(sample mean > C2 | H0 is true)

• For comparison of two means, half of the type I error is arbitrarily assigned to each of the probabilities.

• P(sample mean < C1 | H0 is true) = P(sample mean > C2 | H0 is true) = α/2


• Sample mean given the data = 181.52

• P-value = 2*P(sample mean < 181.52 | μ=190) = 2*P(Z < (181.52-190)/(40/√100)) = 2*P(Z < -2.12) = 2*0.017 = 0.034
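The two-sided p-value can be sketched in Python by doubling the tail area beyond the observed |z|, again assuming the known-σ z test. The function name `two_sided_p_value` is hypothetical.

```python
import math
from statistics import NormalDist


def two_sided_p_value(sample_mean, mu0, sigma, n):
    """Two-sided z-test p-value: twice the tail area beyond |z| (sigma known)."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    return 2 * NormalDist().cdf(-abs(z))


p_value = two_sided_p_value(sample_mean=181.52, mu0=190, sigma=40, n=100)
print(round(p_value, 3))
```

Using `-abs(z)` makes the function give the same answer whether the sample mean lies below or above the null value.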


Relationship between hypothesis testing and confidence interval

• For two-sided cases, H0 (μ=μ0) is rejected with a two-sided level α test if and only if the two-sided 100%*(1-α) confidence interval for the parameter does not contain μ0.

• Example 7.40, H0 (μ=190)
- 95% CI for the cholesterol mean = (sample mean - 1.96*σ/√n, sample mean + 1.96*σ/√n) = (181.52 - 1.96*40/√100, 181.52 + 1.96*40/√100) = (173.68, 189.36)
- P-value was 0.034.
- The interval does not contain 190, which agrees with rejecting H0 at α = 0.05.
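The equivalence between the two-sided test and the confidence interval can be checked numerically. The sketch below uses the cholesterol example (sample mean 181.52, σ=40, n=100) and tries several null values; the null values other than 190 and both function names are hypothetical.

```python
import math
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, alpha=0.05):
    """Two-sided 100*(1 - alpha)% CI for the mean (sigma known)."""
    half = NormalDist().inv_cdf(1 - alpha / 2) * sigma / math.sqrt(n)
    return sample_mean - half, sample_mean + half


def two_sided_p_value(sample_mean, mu0, sigma, n):
    """Two-sided z-test p-value for H0: mu = mu0 (sigma known)."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    return 2 * NormalDist().cdf(-abs(z))


low, high = z_confidence_interval(181.52, 40, 100)
print(f"95% CI = ({low:.2f}, {high:.2f})")

results = []
for mu0 in (170, 175, 185, 190, 195):
    rejected = two_sided_p_value(181.52, mu0, 40, 100) < 0.05
    outside_ci = not (low <= mu0 <= high)
    results.append((mu0, rejected, outside_ci))
    print(mu0, rejected, outside_ci)
```

For every null value tried, H0 is rejected at α = 0.05 exactly when that value lies outside the 95% interval.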
