Nurture Your Prejudices! (Pflege Deine Vorurteile!) - uni-kiel.de

A statistician is a person who draws a mathematically precise line from an unwarranted assumption to a foregone conclusion.

Quo vadis?

Statistics as scientific quality control

1. A p-value is a p-value is a p-value (and not a test)

Two kinds of logical inference

[Diagram: cause and effect, linked by two arrows labeled "deductive" and "inductive"]

Inference in statistics

[Diagram: hypothesis and observation, linked by two arrows, each marked "?"]

Inference in statistics (inductive): the p-value

Sir Ronald A. Fisher (1890-1962)

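The p-value

Observation: $x_{\mathrm{obs}}$

Test statistic: $T: x \mapsto T(x)$

Hypothesis: $H_0$

$p = P_{H_0}(x : T(x) \ge T(x_{\mathrm{obs}}))$

[Figure: distribution of $T$ under $P_{H_0}$, with $T(x_{\mathrm{obs}})$ marked on the axis and $p$ as the tail area beyond it]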
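A minimal sketch of this definition for one concrete case. The setting is an assumption made purely for illustration and does not come from the slides: $n$ independent binary trials, $T(x)$ the number of successes, and $H_0$ a success probability of 0.5. The function name `binomial_p_value` is likewise hypothetical.

```python
from math import comb


def binomial_p_value(n: int, t_obs: int, prob: float = 0.5) -> float:
    """Return P_H0(T >= t_obs) when T ~ Binomial(n, prob) under H0."""
    return sum(
        comb(n, k) * prob**k * (1 - prob) ** (n - k)
        for k in range(t_obs, n + 1)
    )


# Hypothetical observation: 8 successes in 10 trials.
p = binomial_p_value(n=10, t_obs=8)
print(f"p = {p:.4f}")  # p = 0.0547
```

The result is a tail probability computed entirely under $H_0$. It is not the probability that $H_0$ is true.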

“... an informal index to be used as a measure of discrepancy between the data and the null hypothesis.”

Goodman SN (1999) Ann Intern Med 130: 995-1004

“No test based upon the theory of probability can by itself provide any valuable evidence of the truth or falsehood of a hypothesis”

Neyman J, Pearson E (1933) Phil Trans R Soc A, 231:289-337

Bayes' theorem

Thomas Bayes (1702-1761)

$P(H_0 \mid x_{\mathrm{obs}}) \propto P(H_0) \cdot p$

The p-value as a "decision criterion"

p = 0.10
  unexpected: "shows that the two groups are equivalent"
  expected: "trend of borderline significance"
            "not statistically significant, most probably because of small sample size"

p = 0.01
  unexpected: "in all likelihood represents a false positive"
              "reflects unknown bias"
  expected: "clearly demonstrates a treatment effect"

Inference in statistics (deductive): statistical testing

Jerzy Neyman (1894-1981)

Egon Pearson (1895-1980)

[Diagram: hypothesis and observation]

“Without hoping to know whether each separate hypothesis is true or false, we may search for rules to govern our behavior with regard to them, in following which we insure that, in the long run of experience, we shall not often be wrong.”

Neyman J, Pearson E (1933) Phil Trans R Soc A, 231:289-337

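Statistical testing

Observations: $x$

Test statistic: $T: x \mapsto T(x)$

Hypotheses: $H_0$, $H_1$

Choose $C_\alpha$ such that

Type I error $\alpha$: $P_{H_0}(x : T(x) > C_\alpha) \le \alpha$

Type II error $\beta$: $P_{H_1}(x : T(x) \le C_\alpha) \le \beta$

Decision rule:

$T(x_{\mathrm{obs}}) \le C_\alpha$: decide for $H_0$

$T(x_{\mathrm{obs}}) > C_\alpha$: decide for $H_1$

[Figure: distributions of $T$ under $P_{H_0}$ and $P_{H_1}$, with the critical value $C_\alpha$ and the areas $\alpha$ and $\beta$ marked]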
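The decision rule can be sketched in a few lines. The two normal distributions for $T$ (mean 0 under $H_0$, mean 3 under $H_1$, unit variance) are hypothetical choices for illustration only, as are the function names. The point is that $C_\alpha$ is fixed before any data are seen and that $\beta$ follows from it.

```python
from statistics import NormalDist


def critical_value(alpha: float, h0: NormalDist) -> float:
    """Smallest C with P_H0(T > C) <= alpha for a continuous T."""
    return h0.inv_cdf(1.0 - alpha)


def type_two_error(c_alpha: float, h1: NormalDist) -> float:
    """beta = P_H1(T <= C_alpha)."""
    return h1.cdf(c_alpha)


def decide(t_obs: float, c_alpha: float) -> str:
    """Apply the rule: H1 if T(x_obs) > C_alpha, otherwise H0."""
    return "H1" if t_obs > c_alpha else "H0"


# Hypothetical distributions of the test statistic under H0 and H1.
h0 = NormalDist(mu=0.0, sigma=1.0)
h1 = NormalDist(mu=3.0, sigma=1.0)

alpha = 0.05
c_alpha = critical_value(alpha, h0)
beta = type_two_error(c_alpha, h1)

print(f"C_alpha = {c_alpha:.3f}, beta = {beta:.3f}")
print(decide(t_obs=1.0, c_alpha=c_alpha), decide(t_obs=2.0, c_alpha=c_alpha))
```

With these assumed distributions the critical value is about 1.645 and $\beta$ is roughly 0.09. The output of the procedure is a decision, not a measure of evidence, and it is the same for every observation on the same side of $C_\alpha$.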

The so-called "multiple testing problem"

Comparison                 p       p_crit   p_crit
Placebo vs. treatment A    0.125   0.05/3   0.05/5
Placebo vs. treatment B    0.015   0.05/3   0.05/5
Placebo vs. treatment C    0.045   0.05/3   0.05/5
Placebo vs. treatment D    0.020   0.05/2   0.05/5
Placebo vs. treatment E    0.005   0.05/2   0.05/5

[Diagram: three alternative markings of the null hypotheses H0A, H0B, H0C, H0D, H0E, introduced as "either", "or", "but not"]
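The effect of the critical values can be checked directly. The sketch below rests on an assumption: the critical values are read as 0.05 divided by the number of comparisons in a family (3 for A to C, 2 for D and E, 5 for all treatments together). The p-values are those listed above; the function name `rejected` is hypothetical.

```python
ALPHA = 0.05

# p-values for the comparisons of placebo against treatments A to E.
p_values = {"A": 0.125, "B": 0.015, "C": 0.045, "D": 0.020, "E": 0.005}


def rejected(p_values: dict, families: list, alpha: float = ALPHA) -> list:
    """Return the hypotheses rejected when each family gets p_crit = alpha / size."""
    result = []
    for family in families:
        p_crit = alpha / len(family)
        result.extend(name for name in family if p_values[name] <= p_crit)
    return result


# Two separate families (A-C and D-E) versus one family of all five.
per_family = rejected(p_values, [["A", "B", "C"], ["D", "E"]])
overall = rejected(p_values, [["A", "B", "C", "D", "E"]])

print("per family:", per_family)  # per family: ['B', 'D', 'E']
print("overall:   ", overall)     # overall:    ['E']
```

Which null hypotheses are rejected depends on how the family of tests is defined, and that definition has to be made before the p-values are inspected.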

2. What is "normal" anyway (any more)?

Distribution: fact and fiction

p-value: $P_{H_0}(x : T(x) \ge T(x_{\mathrm{obs}}))$

Test: $P_{H_0}(x : T(x) > C_\alpha)$

Both quantities depend on the distribution $P_{H_0}$ of the test statistic under $H_0$.

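Randomization tests ("statistics without tears")

Observed data: $x_{\mathrm{obs}} = (x_1, \dots, x_9)$, divided into cases and controls, with test statistic $T(x_{\mathrm{obs}})$.

Permutations $\pi_1(x_{\mathrm{obs}}), \pi_2(x_{\mathrm{obs}}), \pi_3(x_{\mathrm{obs}}), \pi_4(x_{\mathrm{obs}}), \dots$ reassign the nine observations to cases and controls and yield $T(\pi_1(x_{\mathrm{obs}})), T(\pi_2(x_{\mathrm{obs}})), T(\pi_3(x_{\mathrm{obs}})), T(\pi_4(x_{\mathrm{obs}})), \dots$

The values $T(\pi_i(x_{\mathrm{obs}}))$ together give the distribution $P_{H_0}(T(x))$.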

3. A p-value does not measure effect size (significant is not the same as "significant")

Small study, large effect ...

           Success   No success       Σ
Verum           40           10      50
Placebo         25           25      50
Σ               65           35     100

χ² = 8.62, 1 df, p = 0.004
OR = 4.000, CI: 1.517-10.749

           Success   No success       Σ
Verum         2648         2352    5000
Placebo       2500         2500    5000
Σ             5148         4852   10000

χ² = 8.62, 1 df, p = 0.004
OR = 1.126, CI: 1.040-1.219
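The two tables can be recomputed with a short script. One assumption is needed: the χ² values appear to include the Yates continuity correction, since the uncorrected Pearson statistic for the small table would be about 9.89. The helper names are hypothetical, and the results may differ from the rounded slide values in the second decimal.

```python
from math import erfc, sqrt


def yates_chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square statistic with Yates correction for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * max(abs(a * d - b * c) - n / 2, 0) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def chi_square_p_value_1df(chi2: float) -> float:
    """Upper tail probability of a chi-square distribution with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2))


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio of the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


small = (40, 10, 25, 25)          # verum: 40/10, placebo: 25/25
large = (2648, 2352, 2500, 2500)  # verum: 2648/2352, placebo: 2500/2500

chi2_small, chi2_large = yates_chi_square(*small), yates_chi_square(*large)
p_small, p_large = chi_square_p_value_1df(chi2_small), chi_square_p_value_1df(chi2_large)
or_small, or_large = odds_ratio(*small), odds_ratio(*large)

for name, chi2, p, oratio in [("small", chi2_small, p_small, or_small),
                              ("large", chi2_large, p_large, or_large)]:
    print(f"{name}: chi2 = {chi2:.2f}, p = {p:.4f}, OR = {oratio:.3f}")
```

Both tables give a χ² of roughly 8.6 and therefore nearly the same p-value, while the odds ratios are 4.0 and about 1.13. The p-value reflects sample size as much as effect size.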

4. Post hoc ergo propter hoc (the business with the stork)

Of storks and babies

[Figure: monthly numbers of storks and births, January to December, on a scale from 0 to 80; r = 0.898]

Spurious correlation/association

[Diagram: two causal structures, I and II, linking the variables A, B and C]

Examples:

A: sex, B: behavior, C: upbringing

A: form of therapy, B: morbidity, C: mobility

A: diet, B: lifespan, C: socialization

A: mobile phone use, B: sleep disorders, C: lifestyle
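A small simulation illustrates one possible structure of this kind. It assumes, purely for illustration, that C influences both A and B and that there is no direct link between A and B. The data are simulated, and all names and parameters are hypothetical.

```python
import random
from math import sqrt


def pearson_r(xs: list, ys: list) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def partial_r(r_ab: float, r_ac: float, r_bc: float) -> float:
    """Correlation of A and B after removing the linear influence of C."""
    return (r_ab - r_ac * r_bc) / sqrt((1 - r_ac**2) * (1 - r_bc**2))


rng = random.Random(0)  # fixed seed so the run is reproducible
c = [rng.gauss(0, 1) for _ in range(2000)]
a = [ci + rng.gauss(0, 0.5) for ci in c]  # A depends on C only
b = [ci + rng.gauss(0, 0.5) for ci in c]  # B depends on C only

r_ab = pearson_r(a, b)
r_ab_given_c = partial_r(r_ab, pearson_r(a, c), pearson_r(b, c))
print(f"r(A, B) = {r_ab:.2f}, r(A, B | C) = {r_ab_given_c:.2f}")
```

A and B are strongly correlated although neither influences the other. Once C is accounted for, the association all but disappears.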

All confounders considered?

... eyes left

... eyes right

"The DDT ban myth"

Malaria prevalence in Sri Lanka

1948   2,800,000
1958   Start of DDT use
1962   Silent Spring (Rachel Carson)
1963   17
1964   Ban on DDT
1968   1,000,000
1969   2,500,000

Development of resistance!

5. Scaling, normalizing, standardizing (a picture is worth a thousand words)

Weak trends, strong trends

[Figure: incidence by year, 1960 to 1970, shown once on a y-axis from 0 to 120 and once on a y-axis from 100 to 110]

[Figure: incidence by year, 1960 to 1990, on a y-axis from 0 to 120, and incidence (%) for 1960/80 versus 1970/90 on a y-axis from 100 to 150]
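How much the choice of axis range changes the visual impression can be quantified with a rough sketch. The incidence values (a rise from 100 to 108) are hypothetical, since the plotted data are not given. The two axis ranges are taken from the tick labels of the first pair of plots.

```python
def visible_share(start: float, end: float, y_min: float, y_max: float) -> float:
    """Fraction of the y-axis height covered by the change from start to end."""
    return abs(end - start) / (y_max - y_min)


# Hypothetical rise in incidence from 100 to 108.
full_axis = visible_share(100, 108, y_min=0, y_max=120)
cut_axis = visible_share(100, 108, y_min=100, y_max=110)

print(f"axis 0-120:   {full_axis:.1%} of the plot height")
print(f"axis 100-110: {cut_axis:.1%} of the plot height")
```

The same change fills about 7% of the plot height on the full axis and 80% on the truncated one, a twelvefold visual magnification without any change in the data.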


The End
