Sample Size

Get some input from a statistician

This part of the design is vital and

mistakes can be costly!

Take all calculations with a few grains of

salt

Fudge factor is important!

Analysis Follows Design

Power

Basic Sample Size Information

Examples

Changes to the basic formula

Poor proposal sample size statements

Conclusion

Power = 1- = P( reject H0 | H1 true )

Probability of rejecting the null hypothesis if the alternative hypothesis

is true.

More subjects Higher power

With the given sample size,

Variation in the outcome (2; variance)

2 power Significance level ()

power Difference (effect) to be detected ()

power One-tailed vs. two-tailed tests

Power is greater in one-tailed tests than in comparable two-tailed tests

2n = 32, 2 sample test, 81% power, =2,

= 2, = 0.05, 2-sided test

Variance/Standard deviation

: 2 1 Power: 81% 99.99% : 2 3 Power: 81% 47%

Significance level ()

: 0.05 0.01 Power: 81% 69% : 0.05 0.10 Power: 81% 94%

2n = 32, 2 sample test, 81% power, =2, = 2, = 0.05, 2-sided test

Difference to be detected () : 2 1 Power: 81% 29% : 2 3 Power: 81% 99%

Sample size (n) n: 32 64 Power: 81% 98% n: 32 28 Power: 81% 75%

One-tailed vs. two-tailed tests Power: 81% 88%

Phase III: industry minimum = 80%

Proteomics/genomics studies: aim for

high power because Type II error a bear!

Phase II: sometime > 80%

Depends on study design

Not hard, but can be VERY algebra

intensive

May want to use a computer program or

statistician

Variables of interest (Primary outcome)

type of data e.g. continuous, categorical

Desired power

Desired significance level

Effect/difference of clinical importance

Variability (Standard deviations)

One or two-sided tests

Randomized controlled trial (RCT)

Block/stratified-block randomized trial

Equivalence trial

Non-randomized intervention study

Observational study

Prevalence study

Measuring sensitivity and specificity

The number of compared groups

One group

Two groups

More than two groups

Independence of the compared groups

Independent groups

Paired (Matched) groups

Repeated measures

Groups of equal sizes

Hierarchical data

Non-randomized studies looking for

differences or associations

require larger sample to allow

adjustment for confounding factors

Absolute sample size is of interest

surveys sometimes take % of population

approach

Studys primary outcome is the variable you do the sample size calculation for

If secondary outcome variables considered important make sure sample size is sufficient

Increase the real sample size to reflect loss to follow up, expected response rate, lack of compliance, etc.

Make the link between the calculation and increase

Demonstrate superiority

Sample size sufficient to detect

difference between treatments

Demonstrate equally effective

Equivalence trial

Sample size required to demonstrate

equivalence larger than required to

demonstrate a difference

Two groups

Continuous outcome

Mean difference

Similar ideas hold for other outcomes

Primary objective: determine if patients taking

the new sleep aid have longer sleep

1 sample test

Primary outcome: Change difference of sleep

time from baseline to after taking the

medication for one week

Two-sided test, = 0.05, power = 90%

Effect size = 1 (4 hours of sleep to 5)

Standard deviation of the primary outcome = 2

hr

1 sample test

2-sided test, = 0.05, 1- = 90%

= 2hr (standard deviation)

= 1 hr (difference of interest)

2 2 2 21 / 2 1

2 2

( ) (1.960 1.282) 242.04 43

1

Z Zn

Change difference of interest from 1hr to 2 hr

n goes from 43 to 11

2 2

2

(1.960 1.282) 210.51 11

2n

Change power from 90% to 80%

n goes from 11 to 8

(Small sample: start thinking about using the t distribution)

2 2

2

(1.960 0.841) 27.85 8

2n

Change the standard deviation from 2 to 3

n goes from 8 to 18

2 2

2

(1.960 0.841) 317.65 18

2n

Original design (2-sided test, = 0.05, 1- = 90%, = 2hr, = 1 hr)

Two sample randomized parallel design

Needed 43 in the one-sample design

In 2-sample need twice that, in each group!

4 times as many people are needed in this design

2 2 2 21 / 2 1

2 2

2( ) 2(1.960 1.282) 284.1 85 170 total!

1

Z Zn

Changes in the detectable difference have HUGE impacts on sample size 20 point difference 25 patients/group 10 point difference 100 patients/group 5 point difference 400 patients/group

Changes in , , , number of samples, if it is a 1- or 2-sided test can all have a large impact on your sample size calculation

2 2

1 / 2 1

2

4( )2

Z ZN

Primary objective: Compare the proportions

of the success (for the primary outcome)

between the control group and the

treatment group

Proportion of the success in control group

(p1)

Effect size ()( The expected proportion of the success in treatment group; p2=p1+

, p2=p1- )

Similar to 1-sample formula

Means (paired t-test)

Mean difference from paired data

Variance of differences

Proportions

Based on discordant pairs

Similar to non-equality study

Use equivalence margin instead of effect

size

Use Z1- instead of Z1-/2Use Z1-/2 instead of Z1-

Similar to non-equality study

Use non-inferiority margin instead of

effect size

Ratio of cases to controls

Use if want patients randomized to the treatment arm for every patient randomized to the placebo arm

Take no more than 4-5 controls/case

2 1

2 2 2

1 / 2 1 1 2

1 2

controls for every case

( ) ( / )

n n

Z Zn

Cohort of exposed and unexposed people

Relative Risk = R

Prevalence in the unexposed population =

1

2

1 / 2 1

1 2

2 1

1 2

Risk of event in exposed group

Risk of event in unxposed group

( )#of events in unexposed group

2( 1)

#events in exposed group

and are the number of events in the two groups

req

R

Z Zn

R

n Rn

n n

1 1

uired to detect a relative risk of R with power 1-

/ # subjects per groupN n

At least 10 subjects for every variable investigated In logistic regression No general justification This is stability, not power Peduzzi et al., (1985) biased regression

coefficients and variance estimatesPrinciple component analysis (PCA)

(Thorndike 1978 p 184): N10m+50 or even N m2 + 50

Equal numbers in two groups is the

easiest to handle

If you have more than two groups, still,

equal sample sizes easiest

Complicated design = simulations

Done by the statistician

"A previous study in this area recruited

150 subjects and found highly significant

results (p=0.014), and therefore a similar

sample size should be sufficient here."

Previous studies may have been 'lucky'

to find significant results, due to

random sampling variation.

"Sample sizes are not provided because there is no prior information on which to base them."

Find previously published information

Conduct small pre-study

If a very preliminary pilot study, sample size calculations not usually necessary

No prior information on standard

deviations

Give the size of difference that may be

detected in terms of number of standard

deviations

Make a sample size or power table

Use a wide variety of possible standard

deviations

Protect with high sample size if possible

"The clinic sees around 50 patients a year, of whom 10% may refuse to take part in the study. Therefore over the 2 years of the study, the sample size will be 90 patients. "

Although most studies need to balance feasibility with study power, the sample size should not be decided on the number of available patients alone.

If you know # of patients is an issue, can phrase in terms of power

Questions Hypotheses

Experimental Design Samples

Data Analyses Conclusions

Take all of your design information to a

statistician early and often

Guidance

Assumptions

Documents

Sample Size