1 Statistical Tools for the Quality Control Laboratory and Validation Studies: Session 1 STEVEN S. KUWAHARA, Ph.D. GXP BioTechnology LLC PMB #506 1669-2 Hollenbeck Avenue Sunnyvale, CA 94087-5042 USA Tel. & FAX 408-530-9338 e-Mail: [email protected] Website: www.gxpbiotech.org IVTPHL1012S1

Statistical Tools for the Quality Control Laboratory and Validation Studies


DESCRIPTION

An overview of statistical control, design of experiment, and monitoring and controlling validated processes.


Page 1: Statistical Tools for the Quality Control Laboratory and Validation Studies

1

Statistical Tools for the Quality Control Laboratory and Validation Studies:

Session 1

l  STEVEN S. KUWAHARA, Ph.D. l  GXP BioTechnology LLC

l  PMB #506 l  1669-2 Hollenbeck Avenue

l  Sunnyvale, CA 94087-5042 USA l  Tel. & FAX 408-530-9338

l  e-Mail: [email protected] l  Website: www.gxpbiotech.org

IVTPHL1012S1

Page 2: Statistical Tools for the Quality Control Laboratory and Validation Studies

2

NORMAL DISTRIBUTION

$$Y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{X_i-\mu}{\sigma}\right)^{2}}$$

IVTPHL1012S1

Page 3: Statistical Tools for the Quality Control Laboratory and Validation Studies

3 IVTPHL1012S1

Page 4: Statistical Tools for the Quality Control Laboratory and Validation Studies

4 IVTPHL1012S1

Page 5: Statistical Tools for the Quality Control Laboratory and Validation Studies

5 IVTPHL1012S1

Page 6: Statistical Tools for the Quality Control Laboratory and Validation Studies

6

NORMAL DISTRIBUTION PROPERTIES

The normal distribution has the following properties:
•  Bell-shaped
•  Unimodal
•  Symmetrical
•  Extends from −∞ to +∞ (tails never reach zero frequency)
•  Same value for mean, median, and mode
This pattern of variation is common for manufacturing processes.

IVTPHL1012S1

Page 7: Statistical Tools for the Quality Control Laboratory and Validation Studies

7 IVTPHL1012S1

Page 8: Statistical Tools for the Quality Control Laboratory and Validation Studies

8

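VARIANCE (S²)

$$S^2 = \frac{\sum \left(X_i-\bar{X}\right)^2}{n-1}$$

$$S^2 = \frac{\sum X_i^2 - \frac{\left(\sum X_i\right)^2}{n}}{n-1}$$

$$S^2 = \frac{n\sum X_i^2 - \left(\sum X_i\right)^2}{n(n-1)}$$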

IVTPHL1012S1

Page 9: Statistical Tools for the Quality Control Laboratory and Validation Studies

Averages and Standard Deviations and the SEM. 1.

•  All of the n measurements that go into the mean (x̄) must be measurements of the same thing.
   –  The mean of fruits and the mean of oranges are different things unless all of the fruits are oranges.
   –  But then it is still the mean of oranges, not fruits.
•  The standard deviation (s) is a measure of the variation among the n components of x̄, NOT the variation of x̄ itself.
   –  Thus the next item (n + 1) from the original population should have a 95% chance of being within ±1.96s of x̄, but not the next average (x̄1).

Page 10: Statistical Tools for the Quality Control Laboratory and Validation Studies

Averages and Standard Deviations and the SEM. 2.

•  The variation in the averages is the standard error of the mean (SEM), which is s/√n.
   –  Thus the next average (x̄1) has a 95% probability of being within ±1.96(s/√n), or ±1.96 SEM, of the original mean (x̄).
•  When dealing with single numbers, s is used, but when dealing with means the SEM is the number to use.
   –  It is incorrect to use s to set a specification on a value that is actually an average.

Page 11: Statistical Tools for the Quality Control Laboratory and Validation Studies

11

RANGE AND C.V.

•  The range can be related to the standard deviation for n < 16.

$$s = \frac{X_L - X_s}{d_2}, \qquad d_2 \text{ is a tabular value.}$$

$$C.V. = \frac{S}{\bar{X}} \times 100\% = RSD$$

IVTPHL1012S1

Page 12: Statistical Tools for the Quality Control Laboratory and Validation Studies

12

F - TEST

$$F_{\alpha,\,df_1,\,df_2} = \frac{s_1^2}{s_2^2}$$

Note: This is slightly different from the F-test that is used for ANOVA and factorial experiments.

Note: F0.05, 3, 3 = 9.28; F0.05, 10, 10 = 2.98

Page 13: Statistical Tools for the Quality Control Laboratory and Validation Studies

13

Student’s t

Basic form:

$$t = \frac{\bar{x}-\mu}{s/\sqrt{n}}, \qquad df = n-1$$

Independent averages, known variances:

$$t = \frac{\bar{x}_1-\bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}}$$

Page 14: Statistical Tools for the Quality Control Laboratory and Validation Studies

14

t-TEST vs THEORETICAL OR KNOWN VALUE

•  CHON analysis. 9.55% H calculated.
•  Data: 9.17, 9.09, 9.14, 9.10, 9.13, 9.27. n = 6, x̄ = 9.15, s = ±0.0654
•  t0.05/2, 5 = 2.57, t0.01/2, 5 = 4.032, t0.001/2, 5 = 6.869, p < 0.001

$$|t| = \frac{|\bar{x}-\mu|}{s/\sqrt{n}} = \frac{|9.15-9.55|}{0.0654/\sqrt{6}} = 14.98$$
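The CHON calculation can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `one_sample_t` is illustrative, and the critical value is simply the tabular t quoted on the slide.

```python
import math
import statistics

def one_sample_t(data, mu):
    """Return (mean, s, t) for a one-sample t-test of `data` against a known value `mu`."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)              # sample standard deviation (n - 1 denominator)
    t = (mean - mu) / (s / math.sqrt(n))    # basic form: t = (x-bar - mu) / (s / sqrt(n))
    return mean, s, t

hydrogen = [9.17, 9.09, 9.14, 9.10, 9.13, 9.27]   # % H results from the slide
mean, s, t = one_sample_t(hydrogen, 9.55)          # 9.55% H is the calculated value

T_CRIT = 6.869   # tabular t(0.001/2, 5 df), as quoted on the slide
print(f"mean = {mean:.2f}, s = {s:.4f}, t = {t:.2f}, significant: {abs(t) > T_CRIT}")
```

The sign of t is negative because the measured mean is below the theoretical value; only its magnitude is compared with the table.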

Page 15: Statistical Tools for the Quality Control Laboratory and Validation Studies

15

KNOWN VARIANCES, t-TEST OF TWO AVERAGES

•  Karl Fischer H2O. σ = 0.025 from historical data.
•  Data: Lot A: 0.50, 0.53, 0.47. Lot B: 0.53, 0.56, 0.51, 0.53, 0.50.
•  n1 = 3, n2 = 5, x̄1 = 0.500, x̄2 = 0.526
•  t0.05/2, ∞ = 1.96; df = n1 + n2 − 2 = 6, t0.05/2, 6 = 2.447

$$|t| = \frac{|0.500-0.526|}{\sqrt{\frac{(0.025)^2}{3}+\frac{(0.025)^2}{5}}} = 1.424$$

Page 16: Statistical Tools for the Quality Control Laboratory and Validation Studies

16

t for Unknown and Equal Variances

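$$t = \frac{\bar{x}_1-\bar{x}_2}{s_p\sqrt{\frac{n_1+n_2}{n_1 n_2}}}, \qquad df = n_1+n_2-2$$

$$\text{if } n_1 = n_2 = n: \quad t = \frac{\bar{x}_1-\bar{x}_2}{s_p\sqrt{\frac{2}{n}}}$$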

Page 17: Statistical Tools for the Quality Control Laboratory and Validation Studies

17

t-TEST, UNKNOWN BUT EQUAL VARIANCES, 1.

•  Data (mg/L Fe3+): Lot A: 6.1, 5.8, 7.0. Lot B: 5.9, 5.7, 6.1.
•  x̄A = 6.30, sA = 0.6245, x̄B = 5.90, sB = 0.2000.

$$F = \frac{(0.6245)^2}{(0.2000)^2} = 9.75, \qquad F_{0.05/2,\,2,\,2} = 39.00$$

$$s_p = \sqrt{\frac{2(0.6245)^2 + 2(0.2000)^2}{(3-1)+(3-1)}} = 0.4637$$

Page 18: Statistical Tools for the Quality Control Laboratory and Validation Studies

18

t-TEST UNKNOWN BUT EQUAL VARIANCES. 2.

•  df = n1 + n2 − 2, df = 4

$$t = \frac{6.30-5.90}{0.4637\sqrt{\frac{3+3}{3 \times 3}}} = 1.056, \qquad t_{0.05/2,\,4} = 2.78$$
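The two-part Fe3+ example (F check, pooled standard deviation, then t) can be reproduced as follows. This is a sketch under the equal-variance assumption stated on the slide; `pooled_t` is an illustrative name, not part of any library.

```python
import math
import statistics

def pooled_t(a, b):
    """Two-sample t-test with a pooled standard deviation (equal variances assumed)."""
    n1, n2 = len(a), len(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    sp = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    t = (statistics.mean(a) - statistics.mean(b)) / (sp * math.sqrt((n1 + n2) / (n1 * n2)))
    return sp, t, n1 + n2 - 2

lot_a = [6.1, 5.8, 7.0]   # mg/L Fe3+, Lot A
lot_b = [5.9, 5.7, 6.1]   # mg/L Fe3+, Lot B

f_ratio = statistics.variance(lot_a) / statistics.variance(lot_b)   # larger variance on top
sp, t, df = pooled_t(lot_a, lot_b)
print(f"F = {f_ratio:.2f}, sp = {sp:.4f}, t = {t:.3f}, df = {df}")
```

With F = 9.75 below the tabular 39.00 the variances are treated as equal, and t = 1.056 is below the tabular 2.78, so no significant difference is shown.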

Page 19: Statistical Tools for the Quality Control Laboratory and Validation Studies

19

POOLED VARIANCE

$$s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}$$

Page 20: Statistical Tools for the Quality Control Laboratory and Validation Studies

20

t for Independent Averages with unknown and unequal variances.

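$$t = \frac{\bar{x}_1-\bar{x}_2}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}$$

$$df = \frac{\left(\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1+1}+\frac{\left(s_2^2/n_2\right)^2}{n_2+1}} - 2$$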

Page 21: Statistical Tools for the Quality Control Laboratory and Validation Studies

21

t-TEST UNKNOWN AND UNEQUAL VARIANCES, 1.

•  Data: Extension of previous Fe3+ mg/L study
•  x̄A = 6.13, sA = 0.3529
•  x̄B = 5.76, sB = 0.1647
•  nA = nB = 10
•  F0.05/2, 9, 9 = 4.03
•  F = (0.3529)² / (0.1647)² = 4.59

Sample   Lot A   Lot B
1        6.1     5.9
2        5.8     5.7
3        7.0     6.1
4        6.1     5.8
5        6.1     5.7
6        6.4     5.6
7        6.1     5.6
8        6.0     5.9
9        5.9     5.7
10       5.8     5.6

Page 22: Statistical Tools for the Quality Control Laboratory and Validation Studies

22

t-TEST UNKNOWN AND UNEQUAL VARIANCES, 2.

•  t0.05/2, 17 = 2.110

$$t = \frac{6.13-5.76}{\sqrt{\frac{(0.3529)^2}{10}+\frac{(0.1647)^2}{10}}} = \frac{0.37}{\sqrt{0.0151664}} = 3.0044$$

Page 23: Statistical Tools for the Quality Control Laboratory and Validation Studies

23

t-TEST UNKNOWN AND UNEQUAL VARIANCES, 3.

$$df = \frac{\left(\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1+1}+\frac{\left(s_2^2/n_2\right)^2}{n_2+1}} - 2$$

Page 24: Statistical Tools for the Quality Control Laboratory and Validation Studies

24

t-TEST UNKNOWN AND UNEQUAL VARIANCES, 4.

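With s²A/nA = (0.3529)²/10 = 0.0124538 and s²B/nB = (0.1647)²/10 = 0.0027126 (sum 0.0151664):

$$df = \frac{(0.0151664)^2}{\frac{(0.0124538)^2}{10+1}+\frac{(0.0027126)^2}{10+1}} - 2$$

$$df = \frac{0.00023002}{0.00001410 + 0.00000067} - 2$$

$$df = \frac{0.00023002}{0.00001477} - 2 = 15.57 - 2 = 13.57, \text{ rounded to a whole number: } 14$$

The tabular t is 2.145 at 14 df (2.110 at 17 df). The calculated t of 3.0044 exceeds either value, so the difference is significant.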
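The unequal-variance calculation is easy to get wrong by hand, because both variance terms must be divided by n before they enter the degrees-of-freedom formula. The sketch below works from the summary statistics on the slides (x̄A = 6.13, sA = 0.3529, x̄B = 5.76, sB = 0.1647, n = 10). It evaluates the df formula as written on the slides (n + 1 in the denominators, then minus 2) and, for comparison, the more common Welch-Satterthwaite form (n − 1 in the denominators); the second form is an addition for illustration and is not taken from the slides.

```python
import math

def welch_t_and_df(x1, s1, n1, x2, s2, n2):
    """t for unequal variances plus two approximations of the degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2          # each variance divided by its own n
    t = (x1 - x2) / math.sqrt(v1 + v2)
    # Form shown on the slides: (n + 1) in the denominators, then subtract 2.
    df_slide = (v1 + v2) ** 2 / (v1**2 / (n1 + 1) + v2**2 / (n2 + 1)) - 2
    # Welch-Satterthwaite form (assumption: included only for comparison).
    df_ws = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df_slide, df_ws

t, df_slide, df_ws = welch_t_and_df(6.13, 0.3529, 10, 5.76, 0.1647, 10)
print(f"t = {t:.4f}, df (slide form) = {df_slide:.2f}, df (Welch-Satterthwaite) = {df_ws:.2f}")
```

Both approximations give a df in the low teens for these data, and the calculated t stays well above the tabular value at any df in that neighbourhood.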

Page 25: Statistical Tools for the Quality Control Laboratory and Validation Studies

25

Paired t-Test

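$$t = \frac{\bar{d}\sqrt{n}}{s_d}, \qquad d_i = x_{1i} - x_{2i}, \qquad df = n-1$$

$$s_d = \sqrt{\frac{\sum d_i^2 - \frac{\left(\sum d_i\right)^2}{n}}{n-1}}$$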

Page 26: Statistical Tools for the Quality Control Laboratory and Validation Studies

26

DATA FOR t-TESTS

Sample   New      Original   d
1        12.1%    14.7%      2.6%
2        10.9     14.0       3.1
3        13.1     12.9       −0.2
4        14.5     16.2       1.7
5        9.6      10.2       0.6
6        11.2     12.4       1.2
7        9.8      12.0       2.2
8        13.7     14.8       1.1
9        12.0     11.8       −0.2
10       9.1      9.7        0.6
ave.     11.60    12.87      1.27
s        1.814    2.075      1.126

Page 27: Statistical Tools for the Quality Control Laboratory and Validation Studies

27

Paired t-Test Calculation

$$t = \frac{\bar{d}\sqrt{n}}{S_d} = \frac{1.27\sqrt{10}}{1.126} = 3.567, \qquad t_{0.05/2,\,9} = 2.26$$

Therefore a significant difference exists.
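The paired calculation can be reproduced from the table of New and Original results. A minimal sketch, taking d = Original − New as in the data table; the variable names are illustrative.

```python
import math
import statistics

new      = [12.1, 10.9, 13.1, 14.5, 9.6, 11.2, 9.8, 13.7, 12.0, 9.1]
original = [14.7, 14.0, 12.9, 16.2, 10.2, 12.4, 12.0, 14.8, 11.8, 9.7]

d = [o - n_ for o, n_ in zip(original, new)]   # signed difference for each pair
n = len(d)
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)                      # standard deviation of the differences
t = d_bar * math.sqrt(n) / s_d                 # paired t with n - 1 degrees of freedom

T_CRIT = 2.26                                  # tabular t(0.05/2, 9 df) from the slide
print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.3f}, t = {t:.3f}, significant: {t > T_CRIT}")
```

Working from unrounded intermediate values gives t of about 3.565 rather than 3.567; the conclusion is the same.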

Page 28: Statistical Tools for the Quality Control Laboratory and Validation Studies

28

t-Test for unknown but equal variances.

•  Showing that there is no significant difference?

$$|t| = \frac{|\bar{X}_1-\bar{X}_2|}{S_p\sqrt{\frac{n_1+n_2}{n_1 n_2}}} = \frac{|11.60-12.87|}{1.9488\sqrt{\frac{10+10}{100}}} = 1.457$$

$$df = 2n - 2 = 18, \qquad t_{0.05/2,\,18} = 2.10$$

Page 29: Statistical Tools for the Quality Control Laboratory and Validation Studies

29

Student’s t to a C.I.

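Basic form:

$$t = \frac{\bar{x}-\mu}{s/\sqrt{n}}, \qquad df = n-1$$

$$\bar{x}-\mu = \pm\frac{ts}{\sqrt{n}}, \qquad \mu = \bar{x} \pm \frac{ts}{\sqrt{n}}$$

The value of t is taken from a t-table for n − 1 degrees of freedom and the desired confidence.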

Page 30: Statistical Tools for the Quality Control Laboratory and Validation Studies

30

CONFIDENCE INTERVAL 1.

$$C.I. = \bar{X} \pm \frac{t_{0.05,\,n-1}\, s}{\sqrt{n}}$$

$$C.I. = \mu \pm 1.96\frac{\sigma}{\sqrt{n}}$$

$$t_{0.05,\,2} = 4.30$$

Page 31: Statistical Tools for the Quality Control Laboratory and Validation Studies

DATA SET FOR SETTING SPECS. 1.

67.0  65.8  78.1  66.4  69.0  70.5
67.5  75.6  74.2  74.5  85.0  81.1
76.0  71.9  70.8  67.3  75.0  74.0
72.7  68.8  84.9  73.2  74.7  76.6
73.1  82.6  72.2  68.7  69.5  64.2

•  n = 30, range = 64.2 − 85.0 = 20.8
•  Ave. = 73.03, s or σ = 5.4416, SQRT(30) = 5.4772
•  t0.995, 29 = 2.756, 99% C.I.(t) = 70.29 − 75.77
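The summary statistics and the 99% confidence interval for the 30-value data set can be recomputed as a check. This sketch uses only the standard library, so the t value is entered by hand from the table (t0.995, 29 = 2.756) rather than computed.

```python
import math
import statistics

data = [
    67.0, 65.8, 78.1, 66.4, 69.0, 70.5,
    67.5, 75.6, 74.2, 74.5, 85.0, 81.1,
    76.0, 71.9, 70.8, 67.3, 75.0, 74.0,
    72.7, 68.8, 84.9, 73.2, 74.7, 76.6,
    73.1, 82.6, 72.2, 68.7, 69.5, 64.2,
]

T_0995_29 = 2.756                       # tabular value quoted on the slide

n = len(data)
mean = statistics.mean(data)
s = statistics.stdev(data)
half_width = T_0995_29 * s / math.sqrt(n)   # C.I. = mean +/- t * s / sqrt(n)
ci_low, ci_high = mean - half_width, mean + half_width
print(f"n = {n}, mean = {mean:.2f}, s = {s:.4f}, 99% C.I. = {ci_low:.2f} to {ci_high:.2f}")
```

The same function body applies to the smaller subsets on the following slides once the matching t value for n − 1 degrees of freedom is substituted.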

Page 32: Statistical Tools for the Quality Control Laboratory and Validation Studies

DATA SET FOR SETTING SPECS. 2. SETS OF 3

Set        1      2      3      4      5      6      7      8      9      10
           67.0   72.7   71.9   82.6   70.8   66.4   73.2   85.0   69.5   74.0
           67.5   73.1   68.8   78.1   84.9   74.5   68.7   75.0   70.5   64.2
           76.0   65.8   75.6   74.2   72.2   67.3   69.0   74.7   81.1   76.6
Ave.       70.2   70.5   72.1   78.3   76.0   69.4   70.3   78.2   73.7   71.6
s          5.06   4.10   3.40   4.20   7.77   4.44   2.52   5.86   6.43   6.54
C.V. (%)   7.21   5.82   4.72   5.37   10.23  6.40   3.58   7.49   8.72   9.13
C.I. ±     29.0   23.5   19.5   24.1   44.5   25.4   14.4   33.6   36.8   37.5

•  X̄3 = 73.03, s = 3.36, C.V. = 4.6%, n = 10, t0.995, 9 = 3.250
•  99% C.I.(ave) = ±3.46 = 69.57 − 76.49

Page 33: Statistical Tools for the Quality Control Laboratory and Validation Studies

DATA SET FOR SETTING SPECS. 3. SETS OF 10

•  Set A: 67.0 67.5 76.0 72.7 73.1 65.8 75.6 71.9 68.8 82.6
•  Set B: 78.1 74.2 70.8 84.9 72.2 66.4 74.5 67.3 73.2 68.7
•  Set C: 69.0 85.0 75.0 74.7 69.5 70.5 81.1 74.0 76.6 64.2
•  SQRT(10) = 3.162278, t0.995, 9 = 3.250

Set   Ave. ± s       C.V.   99% C.I.         C.I. ±
A     72.1 ± 5.13    7.1%   66.83 − 77.37    5.27
B     73.0 ± 5.49    7.5%   67.36 − 78.64    5.64
C     74.0 ± 6.08    8.2%   67.75 − 80.25    6.25

•  Ave. (of the 3 set averages) = 73.03, s = 0.9300, C.V. = 1.3%, SQRT(3) = 1.7321, t0.995, 2 = 9.925
•  99% C.I. = ±5.33 = 67.7 − 78.4

Page 34: Statistical Tools for the Quality Control Laboratory and Validation Studies

DATA SET FOR SETTING SPECS. 4. CUMULATIVE

n     Ave.    s      C.V.   99% C.I. (±)   SQRT(n)   t0.995, n−1
2     67.25   0.35   0.5    15.9           1.4142    63.66
3     70.17   5.06   7.2    29.0           1.7321    9.925
4     70.80   4.32   6.1    12.6           2.0000    5.841
5     71.26   3.88   5.4    8.0            2.2361    4.604
6     70.35   4.12   5.9    6.8            2.4495    4.032
9     70.93   3.78   5.3    4.2            3.0000    3.355
12    72.78   4.97   6.8    4.5            3.4641    3.106
18    72.74   5.40   7.4    3.7            4.2426    2.898
24    73.13   5.45   7.5    3.1            4.8990    2.807
30    73.03   5.44   7.5    2.7            5.4773    2.756

Page 35: Statistical Tools for the Quality Control Laboratory and Validation Studies

35

Wilcoxon’s Signed Rank Test 1.

•  Nonparametric test for paired test results.
•  Does the same thing as the paired t-test but without the assumption of normality.
•  First, take your paired data and calculate the differences, including their signs.
•  Second, place the differences in order (low to high) based on their absolute values.
•  Third, assign a rank to the differences and assign to the rank a sign according to the sign of the original difference. (continued)

Page 36: Statistical Tools for the Quality Control Laboratory and Validation Studies

36

Wilcoxon’s Signed Rank Test 2.

•  Fourth, count the number of positive and negative ranks, take the group with the smaller number of members, and sum the absolute values of the ranks in that group. This will give a value, Tn, where n = the number of pairs.

l  Go to a Wilcoxon table for n pairs and significance level of at least 95% to obtain a tabular value of Tn. For significance, the calculated value must be smaller than the tabular value for Tn.

Page 37: Statistical Tools for the Quality Control Laboratory and Validation Studies

37

Signed Rank Test: Example

•  A minimum of 6 pairs is needed.
•  With 6 pairs, all of the differences must have the same sign. This gives T6 = 0, which is significant at the 95% level.
•  Differences from 19 pairs of test results:

Diff:  +2     −4     −6    +8    +10   −11   −12   +13   +22   −25
Rank:  +1     −2     −3    +4    +5    −6    −7    +8    +9    −10
Diff:  −33    +33    +41   −45   +45   +45   +81   +92   +139
Rank:  −11.5  +11.5  +13   −15   +15   +15   +17   +18   +19

Page 38: Statistical Tools for the Quality Control Laboratory and Validation Studies

38

Signed Rank Test: Example: Continued

•  There are 7 negative ranks and 12 positive ranks, so the absolute sum is taken of: −2, −3, −6, −7, −10, −11.5, and −15.
•  This gives T19 = 54.5. The tabular value for T0.05, 19 is 46. The calculated value is not smaller than the tabular value, so the data show no significant difference between the groups.
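The ranking steps, including the averaged ranks for tied absolute differences (the two 33s share rank 11.5 and the three 45s share rank 15), can be sketched as below. The function follows the procedure as described on the slides: it sums the ranks of the sign group with fewer members. `signed_rank_T` is an illustrative name, and the tabular value 46 is taken from the slide.

```python
def signed_rank_T(diffs):
    """Return (T, n_positive, n_negative) for Wilcoxon's signed rank test.

    T is the sum of ranks for the sign with fewer members; tied absolute
    differences receive the average of the ranks they span.
    """
    ordered = sorted((d for d in diffs if d != 0), key=abs)   # low to high by |d|
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1                                            # extend over a run of ties
        average_rank = ((i + 1) + (j + 1)) / 2                # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    positive = [r for d, r in zip(ordered, ranks) if d > 0]
    negative = [r for d, r in zip(ordered, ranks) if d < 0]
    smaller_group = positive if len(positive) < len(negative) else negative
    return sum(smaller_group), len(positive), len(negative)

diffs = [2, -4, -6, 8, 10, -11, -12, 13, 22, -25,
         -33, 33, 41, -45, 45, 45, 81, 92, 139]
T, n_pos, n_neg = signed_rank_T(diffs)

T_TABLE_19 = 46   # tabular T(0.05, 19) from the slide
print(f"T = {T}, positives = {n_pos}, negatives = {n_neg}, significant: {T < T_TABLE_19}")
```

Zero differences are dropped before ranking here; that is a common convention and an assumption of this sketch, since the slides do not address it.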

Page 39: Statistical Tools for the Quality Control Laboratory and Validation Studies

39

A Simpler Nonparametric Test 1.

•  The following is not as powerful as the Signed Rank Test, but is faster and easier. It tests the hypothesis that p = 0.5 for a given sign. It is a Chi-square (χ²) test.

$$\chi^2 = \frac{\left(|n_1-n_2|-1\right)^2}{n_1+n_2}$$

Page 40: Statistical Tools for the Quality Control Laboratory and Validation Studies

40

Simpler Signed Rank Test 2.

•  n1 and n2 are the number of positive and negative differences. From the previous data there are 12 positive and 7 negative differences, so:

$$\chi^2 = \frac{\left(|12-7|-1\right)^2}{12+7} = \frac{4^2}{19} = \frac{16}{19} < 1.0$$
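Because this test needs only the counts of positive and negative differences, it reduces to a one-line function. A minimal sketch; the critical value 3.841 (χ² at α = 0.05 with 1 df) is a standard table value added here for the comparison and does not appear on the slide.

```python
def sign_chi_square(n_pos, n_neg):
    """Chi-square statistic for the sign test, with the continuity correction of 1."""
    return (abs(n_pos - n_neg) - 1) ** 2 / (n_pos + n_neg)

chi2 = sign_chi_square(12, 7)    # 12 positive and 7 negative differences
CHI2_CRIT = 3.841                # tabular chi-square, alpha = 0.05, 1 df (assumed table value)
print(f"chi-square = {chi2:.3f}, significant: {chi2 > CHI2_CRIT}")
```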

Page 41: Statistical Tools for the Quality Control Laboratory and Validation Studies

41

Simpler Signed Rank Test 3.

•  The tabular χ² is usually > 1.0 (for example, 3.84 at 1 df and α = 0.05), so a calculated χ² below 1.0 indicates that there is no significance, since the calculated χ² must be larger than the tabular χ² for significance.

•  This test can be adopted as a rapid and easy method to decide if further investigation is required. It is even possible to have prepared tables for use.

Page 42: Statistical Tools for the Quality Control Laboratory and Validation Studies

ValWkPHL1012S2 1

Basic Statistics for Quality Control and Validation Studies: Session 2

•  Steven S. Kuwahara, Ph.D.

•  GXP BioTechnology, LLC •  PMB 506, 1669-2 Hollenbeck Ave.

•  Sunnyvale, CA 94087-5402

•  Tel. & FAX (408) 530-9338 •  E-Mail: [email protected]

•  Website: www.gxpbiotech.org

Page 43: Statistical Tools for the Quality Control Laboratory and Validation Studies

2

Sample Number Determination 1.

•  One of the major difficulties with setting the number of samples to take lies in determining the levels of risk that are acceptable. It is in this area that managerial inaction is often found, leaving a QC supervisor or senior analyst to make the decision on the level of risk the company will accept. If this happens, management has failed its responsibility.

ValWkPHL1012S2

Page 44: Statistical Tools for the Quality Control Laboratory and Validation Studies

3

Sample Number Determination 2.

•  The problem is that all sampling plans, being statistical in nature, will possess some risk. For instance, if we randomly draw a new sample from a population we could assume or predict that a test result from that sample will fall within ±3σ of the true average 99.7% of the time, but there is still 0.3% (3 parts-per-thousand) of the time when the result will be outside the range for no reason other than random error. Thus a good lot could be rejected. This is known as a false positive or a Type I error.

•  This is the type of error that is most commonly considered, but there is type II error also.

ValWkPHL1012S2

Page 45: Statistical Tools for the Quality Control Laboratory and Validation Studies

4

Sample Number Determination 3.

•  False positives occur when you declare that there is a difference when one does not really exist (example given in the previous slide). Sometimes called producer’s risk, because the producer will dump a lot that was okay.

•  False negatives occur when you declare that a difference does not exist when, in fact, the difference does exist. Sometimes called customer’s risk, because the customer ends up with a defective product. It is also known as a Type II error.

ValWkPHL1012S2

Page 46: Statistical Tools for the Quality Control Laboratory and Validation Studies

5

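SIMPLIFIED FORM OF n CALCULATION
n for an x̄ to compare with a µ

$$t = \frac{\bar{x}-\mu}{s/\sqrt{n}} \quad\Rightarrow\quad \sqrt{n} = \frac{ts}{\bar{x}-\mu} \quad\Rightarrow\quad n = \frac{t^2s^2}{(\bar{x}-\mu)^2} = \frac{t^2s^2}{\Delta^2}, \qquad \Delta = \bar{x}-\mu$$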

ValWkPHL1012S2

Page 47: Statistical Tools for the Quality Control Laboratory and Validation Studies

6

EXAMPLE OF SIMPLIFIED METHOD WITH ITERATION

•  Δ = 51 − 50 = 1, s = ±2, Z0.025 = 1.96
•  n = (1.96)² (2)² / (1)² = 3.8416 × 4 = 15.4 ~ 16
•  t0.025,15 = 2.131, (2.131)² = 4.541161, n = 4.541161 × 4 = 18.16 ~ 19
•  t0.025,18 = 2.101, (2.101)² = 4.414201, n = 4.414201 × 4 = 17.66 ~ 18
•  t0.025,17 = 2.110, (2.110)² = 4.4521, n = 4.4521 × 4 = 17.81 ~ 18
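The iteration can be written as a short loop: start from the normal deviate, round n up, look up t for n − 1 degrees of freedom, and repeat until n stops changing. In this sketch the t values are a hand-entered excerpt of a t-table containing only the entries quoted on the slide, so other starting conditions would need a fuller table; `iterate_sample_size` and `T_TABLE` are illustrative names.

```python
import math

# Two-sided 95% t values keyed by degrees of freedom (excerpt quoted on the slide).
T_TABLE = {15: 2.131, 17: 2.110, 18: 2.101}

def iterate_sample_size(delta, s, z=1.96, table=T_TABLE, max_iter=20):
    """Iterate n = t^2 s^2 / delta^2, rounding up, until n repeats."""
    n = math.ceil(z**2 * s**2 / delta**2)        # first pass uses the normal deviate
    history = [n]
    for _ in range(max_iter):
        t = table[n - 1]                         # KeyError if the excerpt lacks this df
        n_next = math.ceil(t**2 * s**2 / delta**2)
        history.append(n_next)
        if n_next == n:                          # converged
            break
        n = n_next
    return n_next, history

n_final, history = iterate_sample_size(delta=1, s=2)
print(f"n = {n_final}, iteration path: {history}")
```

The path 16, 19, 18, 18 matches the slide's sequence, and the iteration settles at n = 18.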

ValWkPHL1012S2

Page 48: Statistical Tools for the Quality Control Laboratory and Validation Studies

7

Sample Number Determination 6.

•  Because of the need to define risk and consider the level of variation that is present, sampling plans that do not allow for these factors are not valid.

•  Examples of these are: Take 10% of the lot below N = 200 and then 5% thereafter. The more famous one is to take √N + 1 samples.

ValWkPHL1012S2

Page 49: Statistical Tools for the Quality Control Laboratory and Validation Studies

DEVELOPMENT OF A SAMPLING PLAN

•  Consider a situation where a product must contain at least 42 mg/mL of a drug. At 41 mg/mL the product fails. Because we want to allow for the test and product variability, we decide that we want a 95% probability of accepting a lot that is at 42 mg/mL, but we want only a 1% chance of accepting a lot that is at 41 mg/mL.

•  For the sampling plan we need to know the number (n) of test results to take and average.

•  We will accept the lot if the average (x̄) exceeds k mg/mL.

8 ValWkPHL1012S2

Page 50: Statistical Tools for the Quality Control Laboratory and Validation Studies

SAMPLING PLAN CALCULATIONS A. You will need the table of the normal distribution for this.

•  Suppose we have a lot that is at 42.0 mg/mL.
•  x̄ would be normally distributed with µ = 42.0
   –  And the SEM = s/√n. We want x̄ > k.

$$x = \frac{\bar{x}-42.0}{s/\sqrt{n}} > \frac{k-42.0}{s/\sqrt{n}}, \qquad x = \text{standard normal deviate}$$

From a “normal” table (or a t table with ν = ∞) we want a probability of 0.95 that “x” will be greater than the “k” expression.

9 ValWkPHL1012S2

Page 51: Statistical Tools for the Quality Control Laboratory and Validation Studies

SAMPLING PLAN CALCULATIONS A1. You will need a normal distribution table for this

•  x0.95,∞ = 1.645 (cumulative probability of 0.95)
•  We know that this must be greater than the “k” expression.
•  We also know that k must be less than 42.0 since the smallest acceptable x̄ will be 42.0.
•  Therefore:

$$\frac{k-42.0}{s/\sqrt{n}} = -1.645 \quad \text{since } k < \bar{x}$$

10 ValWkPHL1012S2

Page 52: Statistical Tools for the Quality Control Laboratory and Validation Studies

SAMPLING PLAN CALCULATIONS B.

•  Now suppose that the correct value for the lot is 41.0 mg/mL. So now µ = 41.0 and we want a probability of 0.01 that x̄ > k. Now:

$$x = \frac{\bar{x}-41.0}{s/\sqrt{n}} > \frac{k-41.0}{s/\sqrt{n}} = 2.326$$

$$\frac{k-42.0}{k-41.0} = \frac{-1.645}{2.326} = -0.707$$

$$k = 41.59$$

11 ValWkPHL1012S2

Page 53: Statistical Tools for the Quality Control Laboratory and Validation Studies

SAMPLING PLAN CALCULATIONS C.

•  Going back to the original equation for a passing result and knowing that s = ±0.45 (from our assay validation studies?):

$$\frac{k-42.0}{s/\sqrt{n}} = \frac{41.59-42.0}{s/\sqrt{n}} = \frac{-0.41}{s/\sqrt{n}} = -1.64$$

$$\sqrt{n} = \frac{[1.64]\,s}{0.41} \quad \text{or} \quad n = \frac{[1.64]^2\,[0.45]^2}{(0.41)^2}$$

$$n = \frac{0.544644}{0.1681} = 3.24$$
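Calculations A through C solve two simultaneous conditions for the acceptance limit k and the number of results n. The sketch below does the same with unrounded normal deviates from the standard library, so it returns n of about 3.19 rather than the 3.24 obtained with k rounded to 41.59 and the deviate rounded to 1.64; both round up to four samples. The function and parameter names are illustrative.

```python
import math
from statistics import NormalDist

def sampling_plan(mu_good, mu_bad, p_accept_good, p_accept_bad, s):
    """Return (k, n): acceptance limit for the mean and the unrounded sample number."""
    z_good = NormalDist().inv_cdf(p_accept_good)       # about 1.645 for 0.95
    z_bad = NormalDist().inv_cdf(1 - p_accept_bad)     # about 2.326 for 0.01
    # (k - mu_good)/SEM = -z_good and (k - mu_bad)/SEM = +z_bad; subtract to get SEM.
    sem = (mu_good - mu_bad) / (z_good + z_bad)
    k = mu_good - z_good * sem
    n = (s / sem) ** 2                                 # because SEM = s / sqrt(n)
    return k, n

k, n = sampling_plan(mu_good=42.0, mu_bad=41.0, p_accept_good=0.95, p_accept_bad=0.01, s=0.45)
print(f"k = {k:.2f} mg/mL, n = {n:.2f}, samples to run = {math.ceil(n)}")
```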

12 ValWkPHL1012S2

Page 54: Statistical Tools for the Quality Control Laboratory and Validation Studies

SAMPLING PLAN

•  The sampling plan now says: To have a 95% probability of accepting a lot at 42.0 mg/mL or better and a 1% probability of accepting a lot at 41.0 mg/mL or worse, given a standard deviation of ± 0.45 mg/mL for the test method; run four samples and average them. Accept the lot if the mean is 41.59 mg/mL or better.

•  Note that the calculated value of n is close enough to 3 that some would argue for 3 samples.

13 ValWkPHL1012S2

Page 55: Statistical Tools for the Quality Control Laboratory and Validation Studies

SAMPLE SIZES FOR MEANS

•  Suppose we want to determine µ using a test where we know the standard deviation (s) of the population.
•  How many replicates will we need in the sample?
•  The length of a confidence interval = L

$$L = \frac{2ts}{\sqrt{n}}, \qquad L^2 = \frac{4t^2s^2}{n}, \qquad n = \frac{4t^2s^2}{L^2}, \qquad L = 2\Delta$$

14 ValWkPHL1012S2

Page 56: Statistical Tools for the Quality Control Laboratory and Validation Studies

Recalculation of Earlier Problem.

L = 2, s = ±2, t0.95,∞ = 1.960 (two sided)

$$n = \frac{4t^2s^2}{L^2} = \frac{4(1.96)^2(2)^2}{(2)^2} = \frac{61.4656}{4} = 15.4 \text{ or } 16$$

Iterate: t0.95,15 = 2.131, n = 18.16; t0.95,18 = 2.101, n = 17.66; t0.95,17 = 2.110, n = 17.81, so n = 18.

15 ValWkPHL1012S2

Page 57: Statistical Tools for the Quality Control Laboratory and Validation Studies

Sample size for estimating µ

•  Note the statement: We are determining the % of drug present and we wish to bracket the true amount (µ%) by ±0.5% and do this with 95% confidence, so L = 2 × 0.5 = 1.0
•  We have 22 previous estimates for which s = 0.45
•  Now at the 95% confidence level (α = 1 − 0.95 = 0.05), t0.975,21 = 2.080.

$$n = \frac{4(2.080)^2(0.45)^2}{(1.0)^2} = 3.5$$

16 ValWkPHL1012S2

Page 58: Statistical Tools for the Quality Control Laboratory and Validation Studies

17

POOLED VARIANCE

$$s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}$$

ValWkPHL1012S2

Page 59: Statistical Tools for the Quality Control Laboratory and Validation Studies

Calculating the Confidence Interval, Sp

•  The results of the four determinations are: 42.37%, 42.18%, 42.71%, 42.41%.
•  x̄ = 42.42% and s = 0.22%, (n2 − 1) = 3
•  Using the extra 3 df and s = 0.22% we have:

$$S_p = \sqrt{\frac{21(0.45)^2 + 3(0.22)^2}{21+3}} = 0.43$$

18 ValWkPHL1012S2

Page 60: Statistical Tools for the Quality Control Laboratory and Validation Studies

Calculating the Confidence Interval, L

•  Sp = s, the new estimate of the standard deviation, so a new confidence interval can be calculated with 24 df. t(0.975, 24) = 2.064.

$$L = \frac{2(2.064)(0.43)}{\sqrt{4}} = 0.88752, \text{ rather than } 1.0$$

$$C.I. = \pm\frac{L}{2} = \pm 0.44376 \text{ or } \pm 0.45$$

$$95\%\ C.I. = 42.42 \pm 0.45 \text{ or } 41.97 - 42.87$$

Note that n = 4, not 25, for calculating L.
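The last two slides combine a historical standard deviation (0.45 with 21 df) with the new one from the four determinations, then build the interval from the pooled value. A minimal sketch of that sequence; the t value is the tabular t(0.975, 24) quoted above, and the variable names are illustrative. Carrying unrounded intermediate values gives a half-width of about 0.44 rather than the 0.44376 obtained from Sp rounded to 0.43.

```python
import math
import statistics

results = [42.37, 42.18, 42.71, 42.41]    # the four new determinations, in %
s_hist, df_hist = 0.45, 21                # historical estimate: 22 results, 21 df

mean = statistics.mean(results)
s_new = statistics.stdev(results)
df_new = len(results) - 1

# Pool the two variance estimates, weighting each by its degrees of freedom.
sp = math.sqrt((df_hist * s_hist**2 + df_new * s_new**2) / (df_hist + df_new))

T_0975_24 = 2.064                         # tabular t for 24 df, from the slide
# n here is the number of new determinations (4), not the 25 results behind Sp.
half_width = T_0975_24 * sp / math.sqrt(len(results))
print(f"mean = {mean:.2f}, s_new = {s_new:.2f}, Sp = {sp:.2f}, C.I. = +/- {half_width:.2f}")
```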

19 ValWkPHL1012S2

Page 61: Statistical Tools for the Quality Control Laboratory and Validation Studies

Sample Sizes for Estimating Standard Deviations. I.

•  The problem is to choose n so that s at n – 1 will be within a given ratio of s/σ.

•  Examples are found in reproducibility, repeatability, and intermediate precision measurements.

•  s = standard deviation experimentally determined. σ = population or true standard deviation. s2 and σ2 are corresponding variances.

•  You will use n to derive s.

20 ValWkPHL1012S2

Page 62: Statistical Tools for the Quality Control Laboratory and Validation Studies

Sample Sizes for Estimating Standard Deviations. χ2

•  This is the asymmetric distribution for σ².
•  Now as an example, assume n − 1 = 12. At 12 df, χ² will exceed 21.0261 5% of the time and it will exceed 5.2260 95% of the time. Therefore 90% of the time, χ² will lie between 5.2260 and 21.0261 for 12 df.
•  Check your tables to confirm this.

$$\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma^2}, \qquad \frac{s^2}{\sigma^2} = \frac{\chi^2_{n-1}}{n-1}, \qquad \sigma^2 = s^2\left(\frac{n-1}{\chi^2_{n-1}}\right)$$

21 ValWkPHL1012S2

Page 63: Statistical Tools for the Quality Control Laboratory and Validation Studies

Confidence interval for the standard deviation.

•  Given the data in the previous slide, we know that (s2/σ2) will lie between (5.2260/12) and (21.0261/12), or between 0.4355 and 1.7522.

•  Thus the ratio of s/σ will lie between the square roots of these numbers or between 0.66 and 1.32 or 0.66 < s/σ < 1.32. This gives:

•  s/1.32 < σ < s/0.66. If you know s this gives you a 90% confidence interval for the standard deviation.
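The bounds on s/σ follow directly from the two χ² table values, and inverting them gives the interval for σ. A short sketch: the χ² values are the 12 df entries quoted on the previous slide, while the value of s passed in at the end (0.45) is a hypothetical input chosen only to show the call.

```python
import math

def sd_ratio_bounds(chi2_low, chi2_high, df):
    """Bounds on s/sigma from the lower and upper chi-square table values."""
    return math.sqrt(chi2_low / df), math.sqrt(chi2_high / df)

def sigma_interval(s, ratio_low, ratio_high):
    """Invert low < s/sigma < high into s/high < sigma < s/low."""
    return s / ratio_high, s / ratio_low

ratio_low, ratio_high = sd_ratio_bounds(5.2260, 21.0261, df=12)
sigma_low, sigma_high = sigma_interval(0.45, ratio_low, ratio_high)   # s = 0.45 is hypothetical
print(f"{ratio_low:.2f} < s/sigma < {ratio_high:.2f}; "
      f"90% interval for sigma: {sigma_low:.3f} to {sigma_high:.3f}")
```

Note that the interval is not symmetric about s, which reflects the asymmetry of the χ² distribution.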

•  Now let’s reverse our thinking. 22 ValWkPHL1012S2

Page 64: Statistical Tools for the Quality Control Laboratory and Validation Studies

Sample Sizes for Estimating Standard Deviations. Continued. I.

•  Instead of the confidence interval, suppose we say that we want to determine s to be within ± 20% of σ with 90% confidence. So:

•  1 – 0.2 < s/σ < 1 + 0.2 or 0.8 < s/σ < 1.2
•  This is the same as: 0.64 < (s/σ)2 < 1.44
•  Since we want 90% confidence we use levels of significance at 0.05 and 0.95.
•  Now go to the χ2 table under the 0.95 column and look for a combination where χ2/df is not < 0.64, but df is as large as possible.

23 ValWkPHL1012S2

Page 65: Statistical Tools for the Quality Control Laboratory and Validation Studies

Sample Sizes for Estimating Standard Deviations. Continued. II.

•  Trial and error shows this number to be about 50.
•  Next we go to the column under 0.05 and look for a ratio that does not exceed 1.44, but df is as small as possible.
•  Trial and error will show this number to be between 30 and 40.
•  You must take the larger of the two numbers and since df = n – 1, n = 51 replicates.

24 ValWkPHL1012S2

Page 66: Statistical Tools for the Quality Control Laboratory and Validation Studies

Do Not Panic. Consider This!

•  Instead of the confidence interval, suppose we say that we want to determine s to be within ± 50% of σ with 95% confidence. So:

•  1 – 0.5 < s/σ < 1 + 0.5 or 0.5 < s/σ < 1.5
•  This is the same as: 0.25 < (s/σ)2 < 2.25
•  Since we want 95% confidence we use levels of significance at 0.025 and 0.975.
•  Now go to the χ2 table under the 0.975 column and look for a combination where χ2/df is not < 0.25, but df is as large as possible.

25 ValWkPHL1012S2

Page 67: Statistical Tools for the Quality Control Laboratory and Validation Studies

Greater Confidence, But Lesser Certainty

•  Trial and error shows this number to be 8.
•  Next we go to the column under 0.025 and look for a ratio that does not exceed 2.25, but df is as small as possible.
•  Trial and error will show this number to be 8, the same as the other df.
•  You must take the larger of the two numbers, but in this case both give df = 8, so n = 9.
•  You have a greater confidence level, but a wider interval, for a smaller n.

26 ValWkPHL1012S2

Page 68: Statistical Tools for the Quality Control Laboratory and Validation Studies

n for Comparing Two Averages

ValWkPHL1012S2 27

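$$t_{\alpha,\,df} = \frac{\bar{x}_1-\bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}}, \qquad \Delta = \bar{x}_1-\bar{x}_2, \qquad n_1 = n_2 = n$$

$$t_{\alpha,\,df} = \frac{\Delta}{\sqrt{\frac{\sigma_1^2+\sigma_2^2}{n}}}, \qquad t^2_{\alpha,\,df} = \frac{n\,\Delta^2}{\sigma_1^2+\sigma_2^2}$$

$$n = \frac{t^2_{\alpha,\,df}\left(\sigma_1^2+\sigma_2^2\right)}{\Delta^2}$$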

Page 69: Statistical Tools for the Quality Control Laboratory and Validation Studies

Introduction to the Analysis of Variance (ANOVA) I.

This method was aimed at deciding whether or not differences among averages were due to experimental or natural variations or true differences among averages. R.A. Fisher developed a method based on comparing the variances of the treatment means and the variances of the individual measurements that generated the means. The technique has been extended into the field known as DOE or factorial experiments

28 ValWkPHL1012S2

Page 70: Statistical Tools for the Quality Control Laboratory and Validation Studies

Introduction to the Analysis of Variance (ANOVA) II.

•  The method is based on the use of the F-test and the F-distribution (named after Fisher).
   –  The F-distribution, and all distributions related to errors, is a skewed, unsymmetrical distribution.
   –  s²y represents the variance among the treatment means and s²pooled is the variance of the individual results (system noise).

$$F = \frac{n\,s_y^2}{s_{pooled}^2}$$

29 ValWkPHL1012S2

Page 71: Statistical Tools for the Quality Control Laboratory and Validation Studies

Introduction to the Analysis of Variance (ANOVA) III.

•  F increases as the number of replicates increases.
   –  In simple ANOVA systems n is the same for all treatments.
   –  By increasing n you amplify small differences between the variances of the treatment means and the system noise.
   –  An F value of 1.0 or less says that the system noise is greater than the variance of the means. This suggests that the differences among the means are due to experimental or environmental variations.

30 ValWkPHL1012S2

Page 72: Statistical Tools for the Quality Control Laboratory and Validation Studies

Introduction to the Analysis of Variance (ANOVA) IV.

•  Because of the importance of system noise, before doing an ANOVA or factorial experiment, you should reduce variation in the system to a minimum.
   –  You should remove all special cause variation and minimize common cause variation.
   –  Methods such as Statistical Process Control (SPC) should be used to reduce variations.
•  Note: A system where special cause variation has been eliminated and only common cause variation is left is known as a system under statistical control.

31 ValWkPHL1012S2

Page 73: Statistical Tools for the Quality Control Laboratory and Validation Studies

Introduction to the Analysis of Variance (ANOVA) V.

•  The F-distribution depends on the number of degrees of freedom of the numerator and denominator and the level of type 1 error that you will accept.
   –  For each level of type 1 error there are different distribution tables. The exact value of F then depends on the number of degrees of freedom of the numerator and denominator.
•  If the calculated F exceeds the tabular F, it is then significant at the 1−α level, where α is the level of type 1 error that you are willing to accept.
•  α is the threshold against which the p value is compared. Most statistical software programs will calculate the p value. Normally, you want α = 0.05 or 0.01.
•  Type 1 error is where you falsely conclude that there is a difference. AKA: False positive, producer’s risk.

ValWkPHL1012S2 32

Page 74: Statistical Tools for the Quality Control Laboratory and Validation Studies

Fairness of 4 sets of dice. (Taken from Anderson, MJ and Whitcomb, PJ, DOE Simplified, CRC Press, Boca Raton, FL, 2007.)

•  Frequency distribution for 56 rolls of dice.
•  Grand average = Total of all dots/56 dice (4 × 14)

Dots        White       Blue      Green     Purple
6           6+6         6+6       6+6       6
5           5           5         5         5
4           4           4+4       4+4       4
3           3+3+3+3+3   3+3+3+3   3+3+3+3   3+3+3+3+3
2           2+2+2       2+2+2+2   2+2+2+2   2+2+2+2+2
1           1+1         1         1         1
Mean (y)    3.14        3.29      3.29      2.93
Var. (s2)   2.59        2.37      2.37      1.76

n = 14, Grand Ave. = 3.1625

Page 75: Statistical Tools for the Quality Control Laboratory and Validation Studies

Fairness of 4 sets of dice. Calculation of F. Note differences in denominator.

$$s_y^2 = \frac{(3.14-3.1625)^2+(3.29-3.1625)^2+(3.29-3.1625)^2+(2.93-3.1625)^2}{4-1} = 0.029$$

$$s_{pooled}^2 = \frac{2.59+2.37+2.37+1.76}{4} = 2.28$$

$$F = \frac{n\,s_y^2}{s_{pooled}^2} = \frac{14 \times 0.029}{2.28} = 0.18$$

Since F is much less than 1.0 we can assume that there is no significant difference among the colors even without looking at an F table.
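The F calculation for the fair dice can be repeated from the frequency table itself. In this sketch the roll counts are read off the table above (for example, white shows two 6s, one 5, one 4, five 3s, three 2s and two 1s). Because it works from the unrounded means, the grand average comes out as 3.161 rather than the 3.1625 obtained by averaging the rounded means, and F is about 0.175, which rounds to the 0.18 on the slide.

```python
import statistics

# Number of dice showing each dot count, read from the frequency table.
counts = {
    "white":  {6: 2, 5: 1, 4: 1, 3: 5, 2: 3, 1: 2},
    "blue":   {6: 2, 5: 1, 4: 2, 3: 4, 2: 4, 1: 1},
    "green":  {6: 2, 5: 1, 4: 2, 3: 4, 2: 4, 1: 1},
    "purple": {6: 1, 5: 1, 4: 1, 3: 5, 2: 5, 1: 1},
}

# Expand the frequencies into the 14 individual rolls per color.
rolls = {color: [dots for dots, k in freq.items() for _ in range(k)]
         for color, freq in counts.items()}

n = 14                                                     # replicates per treatment
means = [statistics.mean(r) for r in rolls.values()]
var_of_means = statistics.variance(means)                  # s_y^2, df = 4 - 1
pooled_var = statistics.mean([statistics.variance(r) for r in rolls.values()])
F = n * var_of_means / pooled_var
print(f"s_y^2 = {var_of_means:.3f}, s_pooled^2 = {pooled_var:.2f}, F = {F:.2f}")
```

Averaging the four variances is valid as a pooled variance here only because every treatment has the same n.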

Page 76: Statistical Tools for the Quality Control Laboratory and Validation Studies

Fairness of 4 sets of dice. How about a loaded set?

Number of dice showing each count:

Dots        White    Blue     Green    Purple
6           1        3        6        1
5           1        2        5        2
4           1        3        1        3
3           2        4        1        1
2           5        1        0        2
1           4        1        1        5
Mean (y)    2.50     3.93     4.93     2.86
Var. (s2)   2.42     2.38     2.07     3.21
δ           −1.055   0.375    1.375    −0.695
δ2          1.1130   0.1406   1.8906   0.4830

n = 14, Grand Ave. = 3.555, δ = mean − grand average
Σδ2 = 3.6245, Σδ2/3 = s2y = 1.2082

ValWkPHL1012S2 35

Page 77: Statistical Tools for the Quality Control Laboratory and Validation Studies

Fairness of 4 sets of dice. How about a loaded set? ANOVA

ValWkPHL1012S2 36

$$s_{pooled}^2 = \frac{2.42+2.38+2.07+3.21}{4} = 2.52, \qquad df = 4(14-1) = 52$$

$$s_y^2 = 1.2082 \approx 1.21, \qquad df = 3\ (4-1)$$

$$F_{3,\,52} = \frac{14 \times 1.2082}{2.52} = 6.71$$

Tabular F3,52 lies between F3,40 and F3,60. Range: 2.839 − 2.758 at 5% (p = 0.05), 4.313 − 4.126 at 1%, and 6.595 − 6.171 at 0.1%. Significant at p = 0.001.

Least Significant Difference Lucy in the Sky with Diamonds (LSD)

•  DO NOT EVER USE THIS METHOD WITHOUT THE PROTECTION OF A SIGNIFICANT ANOVA RESULT ! ! !

•  There are 45 combinations of 10 results taken in pairs. If you focus mainly on the high and low results, you are almost guaranteed to encounter a type-1 error.
–  This is why you need to use the ANOVA coupled with an LSD determination.
•  The LSD is based on the equations for confidence intervals.

$$ LSD = \pm\, t_{1-\alpha/2,\,df} \times s_{pooled} \sqrt{2/n}, \qquad s_{pooled} = \sqrt{\frac{\sum_{i=1}^{k} s_i^2}{k}} $$

Here n is the number of replicates per group and k is the number of groups whose variances are pooled.

LSD for the Current Problem

•  The (1-α) level of the t determines the level of significance for the LSD.

•  n = 14 for replicates, but s²pooled had 4 × (14-1) = 52 df.

$$ s_{pooled} = \sqrt{\frac{2.42 + 2.38 + 2.07 + 3.21}{4}} = \sqrt{2.52} = 1.59 $$

$$ LSD = \pm\, 2.01 \times 1.59 \times \sqrt{2/14} = \pm 1.21 $$

At 99%, LSD = ±1.333 for $t_{0.99,\,52\,df} \cong 2.68$.
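The 95% LSD arithmetic can be reproduced in a few lines. This is a minimal sketch, assuming the t value of 2.01 used on the slide for 52 df; the function name `lsd` is a hypothetical helper, not something from the original material.

```python
import math

# Group variances for White, Blue, Green, Purple (from the slide).
variances = [2.42, 2.38, 2.07, 3.21]
n = 14        # replicates per color
t_95 = 2.01   # t value used on the slide for 52 df (taken as given)


def lsd(t_value, group_variances, n_per_group):
    """LSD = t * s_pooled * sqrt(2/n), with variances pooled by simple averaging."""
    s_pooled = math.sqrt(sum(group_variances) / len(group_variances))
    return t_value * s_pooled * math.sqrt(2.0 / n_per_group)


lsd_95 = lsd(t_95, variances, n)
print(f"LSD (95%) = +/-{lsd_95:.2f}")
```

Simple averaging of the variances is only appropriate here because every group has the same number of replicates.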

So where are the bad dice?

•  Given the LSD = ±1.333, the result can be displayed in different ways.

•  Plot the mean count for each treatment (color) ± ½ LSD.
–  Then look for overlaps. A significant difference will not have an overlap.
•  Or take the differences between means and compare them to the LSD.
–  In the present case, the white and purple dice are similar, but the green dice are definitely higher, with the blue dice different from the white, but not from the green and only marginally different from the purple.

For 95% confidence, the LSD is ±1.21 and for 99%, the LSD is ±1.33. So blue and green are different from white, and green is different from purple and white at the 99% level. White and purple are the same, as are blue and green. Purple is also similar to blue, but not to green. All of this holds at the 99% level; thus at p = 0.01 we conclude that blue and green dice run to higher numbers than white and purple.

Differences between means:

                 White = 2.50   Blue = 3.93   Green = 4.93   Purple = 2.86
White = 2.50     -              1.43          2.43           0.36
Blue = 3.93      1.43           -             1.00           1.07
Green = 4.93     2.43           1.00          -              2.07
Purple = 2.86    0.36           1.07          2.07           -
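The comparison of pairwise differences against the LSD can be sketched as follows. The LSD values are the ones quoted on the slide and are taken as given; `significant_pairs` is a hypothetical helper name.

```python
from itertools import combinations

means = {"White": 2.50, "Blue": 3.93, "Green": 4.93, "Purple": 2.86}
lsd_values = {"95%": 1.21, "99%": 1.33}  # as quoted on the slide


def significant_pairs(group_means, lsd):
    """Return the pairs whose absolute difference in means exceeds the LSD."""
    pairs = []
    for (name_a, mean_a), (name_b, mean_b) in combinations(group_means.items(), 2):
        if abs(mean_a - mean_b) > lsd:
            pairs.append((name_a, name_b))
    return pairs


for level, lsd in lsd_values.items():
    print(level, significant_pairs(means, lsd))
```

As the first bullet on the LSD slide warns, this screening of pairs is only meaningful after the ANOVA has shown a significant overall effect.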

“Designing an efficient process with an effective process control approach is dependent on the process knowledge and understanding obtained. Design of Experiment (DOE) studies can help develop process knowledge by revealing relationships, including multi-factorial interactions, between the variable inputs … and the resulting outputs. Risk analysis tools can be used to screen potential variables for DOE studies to minimize the total number of experiments conducted while maximizing knowledge gained. The results of DOE studies can provide justification for establishing ranges of incoming component quality, equipment parameters, and in process material quality attributes.”

What is it? The ability to accurately predict/control process responses.

How do we acquire it? Scientific experimentation and modeling.

How do we communicate it? Tell a compelling scientific story. Give the prior knowledge, theory, assumptions. Show the model. Quantify the risks and uncertainties. Outline the boundaries of the model. Use pictures. Demonstrate predictability.

Screening Designs
•  2-level factorial / fractional factorial designs
•  Weed out the less important factors
•  Skeleton for a follow-up RSM design

Response Surface Designs
•  3+ level designs
•  Find design space
•  Explore limits of experimental region

Confirmatory Designs
•  Confirm findings
•  Characterize variability

Caution: EVERYTHING depends on getting this right !!!

Key Factors
Key Responses

Process: Make ACE Tablets

Fixed Factors
•  Disint (A or B)
•  Drug% (5-15%)
•  Disint% (1-4%)
•  DrugPS (10-40%)
•  Lub% (1-2%)

Random Factors
•  Day

Responses
•  Dissolution% (>90%)
•  WeightRSD% (<2%)

Trial   DrugPS   Lub%   Disso%
1       25       1      85
2       25       2      95
3       10       1.5    90
4       40       1.5    70

[Figure: Disso% at the four trial points, plotted on DrugPS (10 to 40) versus Lubricant% (1 to 2) axes]

$$ \text{Disso\%} = 86.667 + 10 \times \text{Lub\%} - 0.667 \times \text{DrugPS} $$

[Figure: Disso% at the four trial points (85, 95, 90, 70), plotted on DrugPS (10 to 40) versus Lubricant% (1 to 2) axes]

•  Previous example had only 2 factors.
–  Factor space is 2D. We can visualize on paper.
•  With 3 factors we need 3D paper.
–  Corners even further away
•  Most new processes have >3 factors
•  OFAT can only accommodate additive models
•  We need a more efficient approach

•  Goal: Maximize response
•  Fix Factor 2 at A.
•  Optimize Factor 1 to B.
•  Fix Factor 1 at B.
•  Optimize Factor 2 to C.
•  Done? True optimum is Factor 1 = D and Factor 2 = E.
•  We need to accommodate curvature and interactions

[Figure: contour plot of the true response (contours at 40, 60, 80) over Factor 1 and Factor 2, with levels A to E marked]

[Figure: response versus factor level, with levels A, B, C, and D marked]

•  A to B may give poor signal to noise
•  A to C gives better signal to noise and relationship is still nearly linear
•  A to D may give poor signal to noise and completely miss curvature
•  Rule of thumb: Be bold (but not too bold)

Trial   DrugPS   Lub%   Disso%
1       10       1      75
2       10       2      100
3       40       1      75
4       40       2      80

[Figure: Disso% at the four corners of the design, plotted on DrugPS (10 to 40) versus Lubricant% (1 to 2) axes]

[Figure: Disso% at the four corners of the design (75, 100, 75, 80), plotted on DrugPS (10 to 40) versus Lubricant% (1 to 2) axes]

$$ \text{Disso\%} = 43.33 + 0.667 \times \text{DrugPS} + 31.667 \times \text{Lub\%} - 0.667 \times \text{DrugPS} \times \text{Lub\%} $$

•  Model non-additive behavior
–  interactions, curvature
•  Efficiently explore the factor space
•  Take advantage of hidden replication

Planar: no interaction

$$ Y = a + b \cdot X_1 + c \cdot X_2 $$

Non-planar: interaction

$$ Y = a + b \cdot X_1 + c \cdot X_2 + d \cdot X_1 \cdot X_2 $$


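[Figure: 2 × 2 design square on DrugPS (10, 40) versus Lub% (1, 2) axes, with corner responses A at (10, 2), B at (40, 2), C at (10, 1), and D at (40, 1)]

$$ \text{MainEffect}_{DrugPS} = \frac{B + D}{2} - \frac{A + C}{2} $$

$$ \text{MainEffect}_{Lub\%} = \frac{A + B}{2} - \frac{C + D}{2} $$

$$ \text{InteractionEffect}_{DrugPS \times Lub\%} = \frac{C + B}{2} - \frac{A + D}{2} $$

Trial   DrugPS   Lub%   Disso%
1       10       1      C
2       10       2      A
3       40       1      D
4       40       2      B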

Uncoded Units                  Coded Units
Trial   DrugPS   Lub%          Trial   DrugPS   Lub%
1       10       1             1       -1       -1
2       10       2             2       -1       +1
3       40       1             3       +1       -1
4       40       2             4       +1       +1

•  Coding helps us evaluate design properties
•  Some statistical tests use coded factor units for analysis (automatically handled by software)
•  Easy to convert between coded (C) and uncoded (U) factor levels

$$ C = \frac{U - U_{mid}}{U_{max} - U_{mid}} \iff U = C\,(U_{max} - U_{mid}) + U_{mid} $$
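The conversion between coded and uncoded levels is simple enough to express directly. This is a minimal sketch; the function names and the choice to pass the low and high levels (and derive the midpoint from them) are illustrative assumptions.

```python
def to_coded(u, u_min, u_max):
    """Convert an uncoded level to coded units (-1 at u_min, +1 at u_max)."""
    u_mid = (u_min + u_max) / 2.0
    return (u - u_mid) / (u_max - u_mid)


def to_uncoded(c, u_min, u_max):
    """Convert a coded level back to uncoded units."""
    u_mid = (u_min + u_max) / 2.0
    return c * (u_max - u_mid) + u_mid


# DrugPS runs from 10 to 40, Lub% from 1 to 2 in the example.
print(to_coded(10, 10, 40), to_coded(25, 10, 40), to_coded(40, 10, 40))
print(to_uncoded(-1, 1, 2), to_uncoded(0, 1, 2), to_uncoded(+1, 1, 2))
```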

$$ \text{Disso} = a + b \times \text{Lub\%} + c \times \text{DrugPS} + d \times \text{Lub\%} \times \text{DrugPS} $$

$$ a = (+A + B + C + D)/4 $$

$$ b = ME_{Lub\%}/2 = (+A + B - C - D)/4 $$

$$ c = ME_{DrugPS}/2 = (-A + B - C + D)/4 $$

$$ d = IE_{DrugPS \times Lub\%}/2 = (-A + B + C - D)/4 $$

[Figure: 2 × 2 design square in coded units (-1, +1) on DrugPS versus Lub% axes, with corner responses A, B, C, and D]

Trial   DrugPS   Lub%   DrugPS*Lub%   Disso%
1       -1       -1     +1            C
2       -1       +1     -1            A
3       +1       -1     -1            D
4       +1       +1     +1            B
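As a worked check, the coded-unit coefficients can be computed from the four corner responses of the earlier 2 × 2 example (75, 100, 75, 80). This is a sketch under the assumption that b multiplies Lub% and c multiplies DrugPS, as in the model equation; the function name `predict` is hypothetical.

```python
# Corner responses from the earlier 2x2 example, labeled as on the slide:
# A at (DrugPS=10, Lub%=2), B at (40, 2), C at (10, 1), D at (40, 1).
A, B, C, D = 100.0, 80.0, 75.0, 75.0

a = (A + B + C + D) / 4     # overall average
b = (A + B - C - D) / 4     # half the Lub% main effect
c = (-A + B - C + D) / 4    # half the DrugPS main effect
d = (-A + B + C - D) / 4    # half the interaction effect


def predict(drug_ps, lub):
    """Predicted Disso% for factor levels given in coded units (-1 to +1)."""
    return a + b * lub + c * drug_ps + d * lub * drug_ps


print(a, b, c, d)
print(predict(-1, -1), predict(-1, +1), predict(+1, -1), predict(+1, +1))
```

With four coefficients and four design points the model reproduces the corner responses exactly, which is why replicates or center points are needed before any statistical test can be made.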

$$ \text{Disso} = a + b \times \text{Lub} + c \times \text{DrugPS} + d \times \text{Lub} \times \text{DrugPS} + \varepsilon $$

•  It is obtained through the “magic” of regression.
•  b measures the “main effect” of Lub
•  c measures the “main effect” of DrugPS
•  d measures the “interaction effect” between Lub and DrugPS
–  if d = 0, effects of Lub and DrugPS are additive
–  if d ≠ 0, effects of Lub and DrugPS are non-additive
•  ε represents trial to trial random noise

Design 1
Trial   DrugPS   Lub%
1       -1       -1
2       -1       +1
3       +1       -1
4       +1       +1
Inner product: +1-1-1+1 = 0

Design 2
Trial   DrugPS   Lub%
1       -1       -1
2       -1       -1
3       +1       +1
4       +1       +1
Inner product: +1+1+1+1 = 4

Design 3
Trial   DrugPS   Lub%
1       -1       -1
2       -1       0
3       +1       0
4       +1       +1
Inner product: +1+0+0+1 = 2

[Figures: each design plotted on coded DrugPS versus Lub% axes (-1 to +1)]
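The inner products quoted above can be verified with a short check. This is a minimal sketch; the design labels in the dictionary are illustrative names, not terms used on the slide.

```python
def inner_product(x, y):
    """Sum of element-wise products of two coded factor columns."""
    return sum(xi * yi for xi, yi in zip(x, y))


# (DrugPS column, Lub% column) for the three candidate designs.
designs = {
    "full factorial":     ([-1, -1, +1, +1], [-1, +1, -1, +1]),
    "fully confounded":   ([-1, -1, +1, +1], [-1, -1, +1, +1]),
    "partly confounded":  ([-1, -1, +1, +1], [-1, 0, 0, +1]),
}

for name, (drug_ps, lub) in designs.items():
    print(name, inner_product(drug_ps, lub))
```

An inner product of zero means the two columns are orthogonal, so the two main effects can be estimated independently of each other.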

[Figure: interaction plot of Dissolution (%LC) versus DrugPS (10 to 40), with separate lines for 1% Lubricant and 2% Lubricant; 90 marked on the Dissolution axis]

Number of Factors (k)   Number of Trials (df = 2^k)
0                       1
1                       2
2                       4
3                       8
4                       16
5                       32
6                       64

•  Average
•  Main effects
•  2-way interactions
•  Higher order interactions (or estimates of noise)

$$ y = a + bA + cB + dC + eAB + fAC + gBC + hABC + \varepsilon $$

Trial   I   A   B   C   D=AB   E=AC   F=BC   ABC
1       +   -   -   -   +      +      +      -
2       +   +   -   -   -      -      +      +
3       +   -   +   -   -      +      -      +
4       +   +   +   -   +      -      -      -
5       +   -   -   +   +      -      -      +
6       +   +   -   +   -      +      -      -
7       +   -   +   +   -      -      +      -
8       +   +   +   +   +      +      +      +

Main effects:

$$ y = a + bA + cB + dC + eD + fE + gF + \varepsilon $$

•  Can include additional variables in our experiment by aliasing with interaction columns.
•  Leave some columns to estimate residual error for statistical tests

[Figure: cube with axes A, B, and C, each running from -1 to +1]

$$ y = a + bA + cB + dC $$

Trial   I   A   B   C   AB   AC   BC   ABC
1       +   -   -   -   +    +    +    -
2       +   +   -   -   -    -    +    +
3       +   -   +   -   -    +    -    +
4       +   +   +   -   +    -    -    -
5       +   -   -   +   +    -    -    +
6       +   +   -   +   -    +    -    -
7       +   -   +   +   -    -    +    -
8       +   +   +   +   +    +    +    +

•  Create a half fraction by running only the ABC = +1 trials
•  Note confounding between main effects and interactions
•  Compromise: must assume interactions are negligible
•  In this case (not always) design is “saturated” (no df for statistical tests).

•  “I=ABC” for this $2^{3-1}$ half fraction is called the “Defining Relation”
•  Note that “I=ABC” implies that “A=BC”, “B=AC”, and “C=AB”.
•  3-way interactions are confounded with the intercept
•  Main effects are confounded with 2-way interactions
•  The number of factors in a defining relation is called the “Resolution”
•  This $2^{3-1}$ half fraction has resolution III
•  We denote this fractional factorial design as $2_{III}^{3-1}$

We like our screening designs to be at least resolution IV (I=ABCD)

•  I=ABCD for this $2^{4-1}$ half fraction is called the Defining Relation
•  Note that I=ABCD implies
–  A=BCD, B=ACD, C=ABD, and D=ABC.
–  AB=CD, AC=BD, AD=BC
•  Main effects are confounded with 3-way interactions
•  Some 2-way interactions are confounded with others.
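The aliases implied by a defining relation follow from multiplying each effect by the defining word and cancelling any squared letter (A × A = I). The sketch below illustrates this; `multiply` and `alias` are hypothetical helper names.

```python
def multiply(word_a, word_b):
    """Multiply two effect words; letters appearing in both cancel (A*A = I)."""
    letters = set(word_a) ^ set(word_b)   # symmetric difference
    return "".join(sorted(letters)) or "I"


def alias(effect, defining_word):
    """Alias of an effect under the defining relation I = defining_word."""
    return multiply(effect, defining_word)


# Resolution III example: I = ABC
print([f"{e}={alias(e, 'ABC')}" for e in ("A", "B", "C")])

# Resolution IV example: I = ABCD
print([f"{e}={alias(e, 'ABCD')}" for e in ("A", "B", "C", "D", "AB", "AC", "AD")])
```

The length of the defining word (3 for ABC, 4 for ABCD) is the resolution of the half fraction.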

Design resolution by Number of Factors (columns) and Number of Design Points (rows); "-" marks an empty cell.

Points   2      3      4      5      6      7      8      9      10     11     12     13     14     15
4        Full   III    -      -      -      -      -      -      -      -      -      -      -      -
6        -      IV     -      -      -      -      -      -      -      -      -      -      -      -
8        -      Full   IV     III    III    III    -      -      -      -      -      -      -      -
12       -      -      V      IV     IV     III    III    III    III    III    -      -      -      -
16       -      -      Full   V      IV     IV     IV     III    III    III    III    III    III    III
20       -      -      -      -      -      -      -      -      -      III    III    III    III    III
24       -      -      -      -      -      -      -      IV     IV     IV     IV     III    III    III
32       -      -      -      Full   VI     IV     IV     IV     IV     IV     IV     IV     IV     IV
48       -      -      -      -      -      V      V      -      -      -      -      -      -      -
64       -      -      -      -      Full   VII    V      IV     IV     IV     IV     IV     IV     IV
96       -      -      -      -      -      -      -      V      V      V      -      -      -      -
128      -      -      -      -      -      Full   VIII   VI     V      V      IV     IV     IV     IV

Trial   DrugPS   Lub%   Disso%
1       10       1      76
2       10       2      98
3       40       1      73
4       40       2      82
5       10       1      84
6       10       2      102
7       40       1      77
8       40       2      88

[Figure: replicate Disso% values at each design point on DrugPS (10, 40) versus Lub% (1, 2) axes: 76, 84 at (10, 1); 98, 102 at (10, 2); 73, 77 at (40, 1); 82, 88 at (40, 2)]

Fitted model is based on averages:

$$ SD_{average} = \frac{SD_{individual}}{\sqrt{\text{number of replicates}}} $$

Repeated measurement
•  1 batch
•  3 measurements per batch

Replicating batch production
•  3 batches
•  1 measurement per batch

Replication
1.  Every operation that contributes to variation is redone with each trial.
2.  Measurements are independent.
3.  Individual responses are analyzed.

Trial   DrugPS   Lub%   Disso%
1       10       1      76
2       10       2      98
3       40       1      73
4       40       2      82
5       10       1      84
6       10       2      102
7       40       1      77
8       40       2      88

Repetition
1.  Some operations that contribute variation are not redone.
2.  Measurements are correlated.
3.  The averages of the repeats should be analyzed (usually).

Trial   DrugPS   Lub%   Disso%
1       10       1      76, 84
2       10       2      98, 102
3       40       1      73, 77
4       40       2      82, 88

•  Fractional factorial designs are generally used for “screening”
•  Statistical tests (e.g., t-test) are used to “detect” an effect.
•  The power of a statistical test to detect an effect depends on the total number of replicates = (trials/design) × (replicates/trial)
•  If our experiment is underpowered, we will miss important effects.
•  If our experiment is overpowered, we will waste resources.
•  Prior to experimenting, we need to assess the need for replication.

$$ N = (\#\text{points in design}) \times (\text{replicates/point}) \cong 4 \left( z_{1-\alpha/2} + z_{1-\beta} \right)^2 \left( \frac{\sigma}{\delta} \right)^2 $$

•  While not exact, this rule of thumb (ROT) is easy to apply and useful.
•  Commercial software will have more accurate formulas.

α      z(1-α/2)
0.01   2.58
0.05   1.96
0.10   1.65

β      z(1-β)
0.1    1.28
0.2    0.85
0.5    0.00

σ = replicate SD
δ = size of effect (high – low) to be detected
α = probability of false detection
β = probability of failure to detect an effect of size δ

$$ N \cong 16 \left( \frac{\sigma}{\delta} \right)^2 $$

                                         Disso%   WtRSD
Replicate SD                    σ        1.3      0.1
Difference to detect            δ        2.0      0.2
False detection probability     α        0.05     0.05
                                z(1-α/2) 1.96     1.96
Detection failure probability   β        0.2      0.2
                                z(1-β)   0.85     0.85
Required number of trials       N        13.3     8

$$ N = (\#\text{points in design}) \times (\text{replicates/point}) \cong 4 \left( z_{1-\alpha/2} + z_{1-\beta} \right)^2 \left( \frac{\sigma}{\delta} \right)^2 $$
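The two N values in the table can be reproduced from the rule of thumb. This is a minimal sketch that takes the z values from the slide as given; `required_trials` is a hypothetical function name.

```python
import math


def required_trials(sigma, delta, z_alpha, z_beta):
    """Rule-of-thumb total trials: N ~ 4 * (z_alpha + z_beta)^2 * (sigma/delta)^2."""
    return 4.0 * (z_alpha + z_beta) ** 2 * (sigma / delta) ** 2


# z values as tabulated on the slide for alpha = 0.05 and beta = 0.2.
n_disso = required_trials(sigma=1.3, delta=2.0, z_alpha=1.96, z_beta=0.85)
n_wtrsd = required_trials(sigma=0.1, delta=0.2, z_alpha=1.96, z_beta=0.85)

print(f"Disso%: N = {n_disso:.1f}, round up to {math.ceil(n_disso)}")
print(f"WtRSD:  N = {n_wtrsd:.1f}, round up to {math.ceil(n_wtrsd)}")
```

The design has to satisfy the more demanding of the two responses, so the larger N governs.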

Run   A   B   C   D   E
1     -   -   -   -   +
2     +   -   -   -   -
3     -   +   -   -   -
4     +   +   -   -   +
5     -   -   +   -   -
6     +   -   +   -   +
7     -   +   +   -   +
8     +   +   +   -   -
9     -   -   -   +   -
10    +   -   -   +   +
11    -   +   -   +   +
12    +   +   -   +   -
13    -   -   +   +   +
14    +   -   +   +   -
15    -   +   +   +   -
16    +   +   +   +   +

Confounding Table
I = ABCDE
A = BCDE     B = ACDE     C = ABDE     D = ABCE     E = ABCD
AB = CDE     AC = BDE     AD = BCE     AE = BCD     BC = ADE
BD = ACE     BE = ACD     CD = ABE     CE = ABD     DE = ABC
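The 16-run half fraction above can be generated by building the full factorial in A to D and setting E = ABCD, which is what I = ABCDE implies. This is a sketch with a hypothetical function name; the loop order is chosen so that A varies fastest, matching the run order in the table.

```python
from itertools import product


def half_fraction_five_factors():
    """Generate the 2^(5-1) design with generator E = A*B*C*D (coded -1/+1)."""
    runs = []
    # product() varies its last position fastest, so unpack in reverse
    # to make A the fastest-changing factor, as in the table.
    for d, c, b, a in product((-1, 1), repeat=4):
        runs.append((a, b, c, d, a * b * c * d))
    return runs


design = half_fraction_five_factors()
for run_number, levels in enumerate(design, start=1):
    print(run_number, " ".join("+" if x > 0 else "-" for x in levels))
```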

•  Statistical test for presence of curvature (lack of fit)
•  Additional degrees of freedom for statistical tests
•  May be process “target” settings
•  Used as “controls” in sequential experiments.
•  Spaced out in run order as a check for drift.

Complete Randomization:
•  Is the cornerstone of statistical analysis
•  Ensures observations are independent
•  Protects against “lurking variables”
•  Requires a process (e.g., draw from a hat)
•  May be costly / impractical

Restricted Randomization:
•  “Difficult to change” factors (e.g., bath temperature) are “batched”
•  Analysis requires special approaches (split plot analysis)

Blocking:
•  Include uncontrollable random variable (e.g., day) in design.
•  Assume no interaction between block variable and other factors
•  Excellent way to reduce variation.
•  Rule of thumb: “Block when you can. Randomize when you can’t block.”

Confounding Table
I = ABCDE
Blk = AB = CDE
A = BCDE     B = ACDE     C = ABDE     D = ABCE     E = ABCD
AC = BDE     AD = BCE     AE = BCD     BC = ADE     BD = ACE
BE = ACD     CD = ABE     CE = ABD     DE = ABC

StdOrder   RunOrder   CenterPt   Blocks   Disint   Drug%   Disint%   DrugPS   Lub%
11         1          1          2        A        5       1.0       10       2.0
13         2          1          2        A        5       4.0       10       1.0
19         3          0          2        A        10      2.5       25       1.5
15         4          1          2        A        5       1.0       40       1.0
18         5          1          2        B        15      4.0       40       2.0
14         6          1          2        B        15      4.0       10       1.0
20         7          0          2        B        10      2.5       25       1.5
16         8          1          2        B        15      1.0       40       1.0
17         9          1          2        A        5       4.0       40       2.0
12         10         1          2        B        15      1.0       10       2.0
9          11         0          1        A        10      2.5       25       1.5
7          12         1          1        B        5       4.0       40       1.0
1          13         1          1        B        5       1.0       10       1.0
2          14         1          1        A        15      1.0       10       1.0
4          15         1          1        A        15      4.0       10       2.0
3          16         1          1        B        5       4.0       10       2.0
10         17         0          1        B        10      2.5       25       1.5
5          18         1          1        B        5       1.0       40       2.0
8          19         1          1        A        15      4.0       40       1.0
6          20         1          1        A        15      1.0       40       2.0

RunOrder   CenterPt   Blocks   Disint   Drug%   Disint%   DrugPS   Lub%   Disso%   WtRSD
1          1          2        A        5       1.0       10       2.0    100.4    1.6
2          1          2        A        5       4.0       10       1.0    103.0    2.1
3          0          2        A        10      2.5       25       1.5    88.8     1.6
4          1          2        A        5       1.0       40       1.0    94.3     2.3
5          1          2        B        15      4.0       40       2.0    78.9     1.6
6          1          2        B        15      4.0       10       1.0    102.9    2.0
7          0          2        B        10      2.5       25       1.5    90.9     1.4
8          1          2        B        15      1.0       40       1.0    91.8     2.2
9          1          2        A        5       4.0       40       2.0    76.3     1.4
10         1          2        B        15      1.0       10       2.0    103.4    1.6
11         0          1        A        10      2.5       25       1.5    89.9     1.8
12         1          1        B        5       4.0       40       1.0    91.8     2.2
13         1          1        B        5       1.0       10       1.0    101.2    2.2
14         1          1        A        15      1.0       10       1.0    101.8    2.6
15         1          1        A        15      4.0       10       2.0    102.5    1.4
16         1          1        B        5       4.0       10       2.0    100.3    1.5
17         0          1        B        10      2.5       25       1.5    91.2     1.6
18         1          1        B        5       1.0       40       2.0    76.3     1.3
19         1          1        A        15      4.0       40       1.0    92.4     2.1
20         1          1        A        15      1.0       40       2.0    76.8     1.6


Source      DF   Adj MS    F       P
Blocks      1    2.21      0.11    0.745
Disint      1    0.30      0.01    0.905
Drug%       1    2.94      0.15    0.707
Disint%     1    0.30      0.01    0.905
DrugPS      1    1174.45   58.93   0.000
Lub%        1    258.61    12.98   0.004
Curvature   1    32.68     1.64    0.225
Res Error   12   19.93

2.179 is the (1-α/2)th quantile of the t-distribution having 12 df.
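Each F value in the table is the term's adjusted mean square divided by the residual error mean square. For a term with 1 df, comparing F against the square of the t quantile is equivalent to a two-sided t-test. The sketch below reproduces this under those assumptions; the variable names are illustrative.

```python
import math

# Adjusted mean squares from the ANOVA table above (each term has 1 df).
adj_ms = {
    "Blocks": 2.21, "Disint": 0.30, "Drug%": 2.94, "Disint%": 0.30,
    "DrugPS": 1174.45, "Lub%": 258.61, "Curvature": 32.68,
}
ms_error = 19.93   # residual error mean square, 12 df
t_crit = 2.179     # t quantile quoted on the slide for 12 df

f_crit = t_crit ** 2   # for 1-df terms, the critical F(1, 12) equals t squared
f_ratios = {term: ms / ms_error for term, ms in adj_ms.items()}
significant = [term for term, f in f_ratios.items() if f > f_crit]

print({term: round(f, 2) for term, f in f_ratios.items()})
print("Significant terms:", significant)
print("Residual SD:", round(math.sqrt(ms_error), 2))
```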

Source      DF   Adj MS    F       P
Blocks      1    0.01090   0.51    0.487
Disint      1    0.03751   1.77    0.208
Drug%       1    0.00847   0.40    0.539
Disint%     1    0.08282   3.91    0.071
DrugPS      1    0.00189   0.09    0.770
Lub%        1    2.10586   99.46   0.000
Curvature   1    0.21198   10.01   0.008
Res Error   12   0.02117

Disso%
•  Only DrugPS and Lub% show significant main effects
•  Plot of Disso% residuals vs predicted Disso% shows systematic pattern.
•  The residual SD (4.5) is considerably larger than expected (1.3)

WtRSD
•  Only Lub% shows a statistically significant main effect
•  Curvature is significant for WtRSD

Therefore
•  Only DrugPS and Lub% need to be considered further
•  The other 3 factors can be fixed at nominal levels.
•  The prediction model is inadequate. Additional experimentation is needed.

$$ \text{Disso} = a + b \times \text{Lub\%} + c \times \text{DrugPS} + d \times \text{Lub\%} \times \text{DrugPS} + \varepsilon $$

[Figure: design square on DrugPS (10 to 40) versus Lub% (1 to 2) axes, with corner points A to D, edge midpoints E to H, and center point I]

Trial   DrugPS   Lub%   Disso%
1       10       1      C
2       10       2      A
3       40       1      D
4       40       2      B
5       25       1      E
6       25       2      F
7       10       1.5    G
8       40       1.5    H
9       25       1.5    I

$$ \text{Disso} = a + b \times \text{Lub\%} + c \times \text{DrugPS} + d \times \text{Lub\%} \times \text{DrugPS} + e \times \text{Lub\%}^2 + f \times \text{DrugPS}^2 + \varepsilon $$

[Figure: Response versus Factor]

Factorial or fractional factorial screening design

Response surface design

Central Composite Design
•  “Cube Oriented”
•  3 or 5 levels for each factor

In 3 factors:

Factorial or Fractional Factorial + Axial Points + Center Points = Central Composite Design

Std     Run     Center
Order   Order   Point    Block   Disint   Drug%   Disint%   DrugPS   Lub%   Disso%   WtRSD
11      1       1        2       A        5       1.0       10       2.0    100.4    1.6
13      2       1        2       A        5       4.0       10       1.0    103.0    2.1
19      3       0        2       A        10      2.5       25       1.5    88.8     1.6
15      4       1        2       A        5       1.0       40       1.0    94.3     2.3
18      5       1        2       B        15      4.0       40       2.0    78.9     1.6
…
10      17      0        1       B        10      2.5       25       1.5    91.2     1.6
5       18      1        1       B        5       1.0       40       2.0    76.3     1.3
8       19      1        1       A        15      4.0       40       1.0    92.4     2.1
6       20      1        1       A        15      1.0       40       2.0    76.8     1.6
21      21      -1       3       A        10      2.5       10       1.5
22      22      -1       3       A        10      2.5       40       1.5
23      23      -1       3       A        10      2.5       25       1.0
24      24      -1       3       A        10      2.5       25       2.0
25      25      0        3       A        10      2.5       25       1.5
26      26      0        3       A        10      2.5       25       1.5

Std     Run     Center
Order   Order   Point    Block   Disint   Drug%   Disint%   DrugPS   Lub%   Disso%   WtRSD
11      1       1        2       A        5       1.0       10       2.0    100.4    1.6
13      2       1        2       A        5       4.0       10       1.0    103.0    2.1
19      3       0        2       A        10      2.5       25       1.5    88.8     1.6
15      4       1        2       A        5       1.0       40       1.0    94.3     2.3
18      5       1        2       B        15      4.0       40       2.0    78.9     1.6
…
10      17      0        1       B        10      2.5       25       1.5    91.2     1.6
5       18      1        1       B        5       1.0       40       2.0    76.3     1.3
8       19      1        1       A        15      4.0       40       1.0    92.4     2.1
6       20      1        1       A        15      1.0       40       2.0    76.8     1.6
21      21      -1       3       A        10      2.5       10       1.5    101.8    1.7
22      22      -1       3       A        10      2.5       40       1.5    84.0     1.7
23      23      -1       3       A        10      2.5       25       1.0    96.7     2.1
24      24      -1       3       A        10      2.5       25       2.0    82.8     1.4
25      25      0        3       A        10      2.5       25       1.5    92.3     1.5
26      26      0        3       A        10      2.5       25       1.5    91.9     1.2

$$ Y = a + b \cdot \text{DrugPS} + c \cdot \text{Lub\%} + d \cdot \text{DrugPS}^2 + e \cdot \text{Lub\%}^2 + f \cdot \text{DrugPS} \cdot \text{Lub\%} + \varepsilon $$

Source              DF   Adj SS    Adj MS    F        P
Blocks              2    2.27      1.13      0.48     0.625
Regression
  Linear
    DrugPS          1    1331.87   1331.87   567.73   0.000
    Lub%            1    340.61    340.61    145.19   0.000
  Square
    DrugPS*DrugPS   1    27.39     27.39     11.68    0.003
    Lub%*Lub%       1    0.14      0.14      0.06     0.811
  Interaction
    DrugPS*Lub%     1    222.98    222.98    95.05    0.000
Residual Error      18   42.23     2.35
  Lack-of-Fit       7    25.15     3.59      2.32     0.103
  Pure Error        11   17.07     1.55

Source              DF   Adj SS    Adj MS    F       P
Blocks              2    0.02341   0.01171   0.41    0.671
Regression
  Linear
    DrugPS          1    0.00118   0.00118   0.04    0.842
    Lub%            1    2.31351   2.31351   80.72   0.000
  Square
    DrugPS*DrugPS   1    0.04980   0.04980   1.74    0.204
    Lub%*Lub%       1    0.09743   0.09743   3.40    0.082
  Interaction
    DrugPS*Lub%     1    0.00234   0.00234   0.08    0.778
Residual Error      18   0.51589   0.02866
  Lack-of-Fit       7    0.28587   0.04084   1.95    0.154
  Pure Error        11   0.23003   0.02091

Statistical Significance?

Model Term        Disso%   WtRSD
DrugPS            P        P
Lub%              P        P
DrugPS²           P        P
Lub%²             ?
DrugPS × Lub%     P        P
Lack of Fit

$$ Y = a + b \cdot \text{DrugPS} + c \cdot \text{Lub\%} + d \cdot \text{DrugPS}^2 + e \cdot \text{Lub\%}^2 + f \cdot \text{DrugPS} \cdot \text{Lub\%} + \varepsilon $$

•  The simplest model that explains the data is best (Occam’s razor, rule of parsimony)
•  Eliminate “least significant” terms one at a time followed by re-analysis
•  Always eliminate highest order terms first
•  Don’t eliminate lower order terms which are contained in significant higher order terms
•  Any existing theory or prior knowledge trumps these rules.

$$ Y = a + b \cdot \text{DrugPS} + c \cdot \text{Lub\%} + d \cdot \text{DrugPS}^2 + e \cdot \text{Lub\%}^2 + f \cdot \text{DrugPS} \cdot \text{Lub\%} + \varepsilon $$

Estimated Regression Coefficients for Disso% using data in uncoded units

Term            Coef
Constant        105.321
DrugPS          -0.478970
Lub%            6.62343
DrugPS*DrugPS   0.0130426
Lub%*Lub%       -0.959956
DrugPS*Lub%     -0.497745

S = 1.49153   PRESS = 83.4051
R-Sq = 97.76%   R-Sq(pred) = 95.79%   R-Sq(adj) = 97.20%

Estimated Regression Coefficients for WtRSD using data in uncoded units

Term            Coef
Constant        4.66698
DrugPS          -0.0293187
Lub%            -2.96608
DrugPS*DrugPS   0.000623945
Lub%*Lub%       0.763118
DrugPS*Lub%     -0.00161165

S = 0.164211   PRESS = 0.850996
R-Sq = 83.93%   R-Sq(pred) = 74.65%   R-Sq(adj) = 79.92%

[Figure: overlay plot with the annotation “Acceptable performance more likely”]

•  Difficult to do with > 2 factors
•  Does not take into account
–  estimation uncertainty
–  correlation among responses
–  variability in control of factor levels
–  variability in the underlying true model over time

Global Solution
DrugPS = 11.2121
Lub% = 1.93939

Predicted Responses
Disso% = 100.002, desirability = 1.000
WtRSD = 1.500, desirability = 0.117927

Composite Desirability = 0.343404
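The predicted responses at the global solution can be reproduced by evaluating the two fitted quadratic models with the uncoded coefficients reported earlier. This is a minimal sketch; the dictionary keys and the `predict` function are illustrative names, and the desirability calculation itself is not reproduced here.

```python
# Uncoded regression coefficients as reported for each response.
DISSO = {"const": 105.321, "ps": -0.478970, "lub": 6.62343,
         "ps2": 0.0130426, "lub2": -0.959956, "ps_lub": -0.497745}
WTRSD = {"const": 4.66698, "ps": -0.0293187, "lub": -2.96608,
         "ps2": 0.000623945, "lub2": 0.763118, "ps_lub": -0.00161165}


def predict(coef, drug_ps, lub):
    """Evaluate the full quadratic model at uncoded factor settings."""
    return (coef["const"]
            + coef["ps"] * drug_ps
            + coef["lub"] * lub
            + coef["ps2"] * drug_ps ** 2
            + coef["lub2"] * lub ** 2
            + coef["ps_lub"] * drug_ps * lub)


drug_ps_opt, lub_opt = 11.2121, 1.93939   # global solution from the slide
disso_hat = predict(DISSO, drug_ps_opt, lub_opt)
wtrsd_hat = predict(WTRSD, drug_ps_opt, lub_opt)
print(f"Disso% = {disso_hat:.3f}, WtRSD = {wtrsd_hat:.3f}")
```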

Predicted Response for New Design Points Using Model for Disso%

Point   Fit       SE Fit      95% CI               95% PI
1       100.002   0.621070    (98.7063, 101.297)   (96.6316, 103.372)

Predicted Response for New Design Points Using Model for WtRSD

Point   Fit       SE Fit      95% CI               95% PI
1       1.49952   0.0683772   (1.35689, 1.64216)   (1.12848, 1.87057)
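The prediction interval (PI) is wider than the confidence interval (CI) because it adds the residual variance to the uncertainty of the fitted mean. The sketch below assumes the usual form, half-width = t × sqrt(S² + SE_fit²), takes S from the regression summaries, and backs the t value out of the reported CI rather than assuming the residual degrees of freedom; `prediction_interval` is a hypothetical helper.

```python
import math


def prediction_interval(fit, se_fit, s_resid, ci_upper):
    """Rebuild the 95% PI from the fit, its SE, the residual SD, and the CI."""
    t_value = (ci_upper - fit) / se_fit   # t implied by the reported CI
    half_width = t_value * math.sqrt(s_resid ** 2 + se_fit ** 2)
    return fit - half_width, fit + half_width


disso_pi = prediction_interval(100.002, 0.621070, 1.49153, 101.297)
wtrsd_pi = prediction_interval(1.49952, 0.0683772, 0.164211, 1.64216)
print("Disso% PI:", tuple(round(x, 2) for x in disso_pi))
print("WtRSD PI: ", tuple(round(x, 3) for x in wtrsd_pi))
```

Both reconstructed intervals agree with the software output to within rounding of the reported inputs.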

1.  Number of trials ≥ Number of model coefficients
2.  Each coded column adds to 0 (balance)
3.  Inner product of any 2 coded columns = 0 (orthogonality)
4.  Use resolution V (or at least IV) for screening designs
5.  Factor ranges are bold (but not too bold)
6.  Incorporate process knowledge & sequential strategies
7.  Assure adequate sample size (power)
8.  Randomize processing order
9.  Block when you cannot randomize
10. Incorporate tests for model adequacy (e.g., center points)
11. Avoid PARC (Planning After Research is Complete)

1.  Use graphics (picture = 1,000 words)
2.  Always verify model assumptions (normality, independence, variance homogeneity)
3.  In model reduction, follow rules of hierarchy tempered by prior process knowledge
4.  Use coded factor levels in judging statistical significance of model coefficients.
5.  Consider prediction uncertainty when identifying optimal factor settings
6.  Take advantage of curvature & interactions when choosing optimal factor settings
7.  Always perform independent trials to confirm predictions.

Page 162: Statistical Tools for the Quality Control Laboratory and Validation Studies

81

Minitab
•  General purpose stat package
•  User friendly
•  Good learning tool

JMP
•  General purpose stat package
•  Excellent for DOE & SPC
•  Very advanced features
•  Monte-Carlo simulation of DOE models
•  Good D-optimal design features
•  May need statistical support for some features

Design Expert
•  Exclusive focus on DOE (may want additional tools)
•  I have not used it, but my impression is very good

[Figure: Surface Plot of Hard%RSD; axes MixTime(min) and Water(L)]

[Figure: Overlaid Contour Plot of Hardness...Hard%RSD; axes MixTime(min) and Water(L); contours show the lower and upper bounds for Hardness and Hard%RSD. White area: feasible region]

Page 163: Statistical Tools for the Quality Control Laboratory and Validation Studies

82

Contour profiling and overlay for design space identification

Monte-Carlo simulation to determine effect of poor factor control on future batch failure rate

Page 164: Statistical Tools for the Quality Control Laboratory and Validation Studies

83

•  Robust design & Taguchi designs
•  Mixture (e.g., gasoline blend) and constrained designs
•  D-optimal designs and custom augmentation
•  Bayesian approaches
•  Probability of meeting specifications
•  Multiple correlated responses
•  Incorporation of prior knowledge
•  Variance component analysis & Gage R&R
•  Split-plot experiments

Page 165: Statistical Tools for the Quality Control Laboratory and Validation Studies

84

1.  Box, G. E. P.; Hunter, W. G., and Hunter, J. S. (1978). Statistics for Experimenters: An Introduction to Design, Data Analysis, and Model Building. John Wiley and Sons.

2.  Montgomery D (2005) Design and analysis of experiments, 6th edition, Wiley.

3.  Myers R, Montgomery D, and Anderson-Cook C (2009) Response surface methodology, Wiley.

4.  Diamond W (1981) Practical Experiment Designs, Wadsworth, Belmont CA

5.  Altan S, et al (2010) Statistical Considerations in Design Space Development (Parts I-III) PharmTech Nov 2, 2010. Available on line at http://www.pharmtech.com/pharmtech/author/authorInfo.jsp?id=53118

6.  Conformia CMC-IM Working Group (2008) Pharmaceutical Development case study: “ACE Tablets”. Available from the following web site: http://www.pharmaqbd.com/files/articles/QBD_ACE_Case_History.pdf

7.  ICH Expert Working Group (2008) GUIDELINE on PHARMACEUTICAL DEVELOPMENT Q8(R1) Step 4 version dated 13 November 2008

8.  ICH Expert Working Group (2005) Guideline on QUALITY RISK MANAGEMENT Q9 Step 4 version dated 9 November 2005

9.  FDA CDER/CBER/CVM (November 2008) Draft Guidance for Industry Process Validation: General Principles and Practices (CGMP)

Thank You!!

Page 166: Statistical Tools for the Quality Control Laboratory and Validation Studies

GMPWkPHL1012S4 1

Statistical Control to Monitor and Control Validated Processes.

Session 4

Steven S. Kuwahara, Ph.D. GXP BioTechnology LLC

PMB 506, 1669-2 Hollenbeck Avenue Sunnyvale, CA 94087-5402 USA

Tel. & FAX: (408) 530-9338 e-Mail: [email protected]

Website: www.gxpbiotech.org

Page 167: Statistical Tools for the Quality Control Laboratory and Validation Studies

•  All functions mentioned in the GMP are GMP processes.
   –  Not all GMP processes need to be monitored.
   –  The processes that introduce variability into the product or can cause the production of unacceptable product need to be monitored.
   –  This goes beyond manufacturing processes.
   –  QC test methods need to be monitored.
   –  The annual product review needs to be monitored to see if batch records are being properly executed.
   –  Processes that generate critical business information need to be monitored.

What to Monitor 1.

2 GMPWkPHL1012S4

Page 168: Statistical Tools for the Quality Control Laboratory and Validation Studies

•  Use information from the process validation and risk assessment to determine what needs to be monitored. (You need proof of your assertion.)
   –  Even if the process validation shows that a step is very stable, if the risk assessment shows that the step can create a serious threat to the product or patient, it should be monitored.
   –  Steps or processes can be monitored. If there are multiple small steps in a process, the outcome of the process can be monitored in place of each of the steps.
   –  If the outcome of a process or step is immaterial to the quality of the product, it does not need to be monitored.

What to Monitor 2.

3 GMPWkPHL1012S4

Page 169: Statistical Tools for the Quality Control Laboratory and Validation Studies

•  There can be two types of data.
   –  Variable data forms a continuum and an individual result can fall anywhere within it. Most statistical procedures are designed to deal with variable data.
   –  Attribute data comes in discrete units: yes/no; red, green, blue, or yellow; pass/fail; high, medium, low.
   –  One of the problems with attribute data is that it may be subjective or actually part of a continuum. Is a thing red, reddish, or pink? Brown, brownish, or rust? What counts as high, medium, or passing?
   –  Since attribute data must be discrete, a clear and firm definition is needed.

Data 1.

4 GMPWkPHL1012S4

Page 170: Statistical Tools for the Quality Control Laboratory and Validation Studies

•  There are statistical methods for dealing with attribute data, especially since such data may not follow the normal distribution.
   –  One technique is to convert attribute data into variable data by using fractions or percentages, but this is good only with large numbers. With small numbers the discreteness creates large jumps.
   –  Attribute data that is in binary (+/-, yes/no) form will follow a binomial distribution.
   –  Attribute data is not as strong as variable data, so you need more of it.

•  In what follows we will assume that you have variable data.
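The point about discreteness can be illustrated with a short sketch. It lists the percentages that are actually attainable for a given number of pass/fail results and evaluates the binomial probability of each count. The sample size of 5 and the 10% failure rate are illustrative assumptions, not values from these slides.

```python
from math import comb

def possible_percentages(n):
    """Percentages that can actually be observed with n pass/fail results."""
    return [100 * k / n for k in range(n + 1)]

def binomial_pmf(k, n, p):
    """Probability of exactly k 'fail' results in n trials with failure rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# With n = 5 the observed percentage can only move in jumps of 20 points.
print(possible_percentages(5))

# Hypothetical 10% true failure rate: probability of each failure count in 5 units.
for k in range(6):
    print(k, round(binomial_pmf(k, 5, 0.10), 5))
```

With 200 results the attainable percentages are 0.5 points apart, which is why converting attribute data to percentages behaves more like variable data only when the counts are large.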

Data 2.

5 GMPWkPHL1012S4

Page 171: Statistical Tools for the Quality Control Laboratory and Validation Studies

TOLERANCE vs. CONFIDENCE LIMITS

With confidence limits you are trying to determine the value of the “true mean” (µ). The result gives an interval within which you expect to find µ with a certain probability (confidence). The results are means with the same n that was used to calculate the limits.

With a tolerance interval you are asking about the values of the next n results. The result gives an interval for which you have x % confidence that it will contain y % of the results. It assumes that all of the results are from the same population. Each result here is a single determination, not a mean.

6 IVTPHL1012S1

Page 172: Statistical Tools for the Quality Control Laboratory and Validation Studies

TOLERANCE LIMITS FOR A SINGLE RESULT 1.

•  DEFINITION: $\bar{x}$ ± ks, where k is a value chosen to allow a statement that we have 100(1 - α) percent confidence that the limits will contain proportion p of the population.
•  Requires a normal distribution of the population.
•  Allows one to set limits for single determinations (not averages) with a limited number of replicates available.

7 IVTPHL1012S1

Page 173: Statistical Tools for the Quality Control Laboratory and Validation Studies

TOLERANCE LIMITS 2. Selected Portions of a Table for Two-sided Limits

Values of k
         95% Confidence        99% Confidence
n        p=95%     p=99%       p=95%     p=99%
2        37.67     48.43       188.5     242.3
3        9.916     12.86       22.40     29.06
4        6.370     8.299       11.15     14.53
5        5.079     6.634       7.855     10.26
10       3.379     4.433       4.265     5.594
20       2.752     3.615       3.168     4.161
50       2.379     3.126       2.576     3.385

•  Table XIV, Applied Statistics and Probability for Engineers, D.C. Montgomery & G.C. Runger, John Wiley & Sons, 1994.

8 IVTPHL1012S1

Page 174: Statistical Tools for the Quality Control Laboratory and Validation Studies

TOLERANCE LIMITS FOR A SINGLE RESULT 3. Example

•  A product is made at 2.25 mg/mL. n = 10 samples are taken from 5 early lots. The assays show 2.27, 2.25, 2.24, 2.22, 2.26, 2.23, 2.23, 2.24, 2.27, and 2.25 mg/mL. We calculate $\bar{x}$ = 2.246, s = 0.017. From the table at n = 10, 99% confidence for 95% of results, k = 4.265, so the Tolerance Interval is:
•  2.246 ± (4.265)(0.017), or 2.173 to 2.319.
•  We have 99% confidence that 95% of the results from this population will be in the interval.
•  The 95% confidence interval for the mean is $\bar{x}$ ± 0.012, giving 2.234 to 2.258.

9 IVTPHL1012S1
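The arithmetic of the example can be reproduced with a short sketch. The k factor is the n = 10, 99% confidence, p = 95% entry of the two-sided table. The t value (2.262 for 9 degrees of freedom, two-sided 95%) comes from a standard t table and is not shown on the slides; the function names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

assays = [2.27, 2.25, 2.24, 2.22, 2.26, 2.23, 2.23, 2.24, 2.27, 2.25]

K_N10_CONF99_P95 = 4.265  # two-sided tolerance factor from the table
T_975_DF9 = 2.262         # assumption: standard t-table value, 9 degrees of freedom

def tolerance_interval(data, k):
    """Interval x-bar +/- k*s for single results."""
    xbar, s = mean(data), stdev(data)
    return xbar - k * s, xbar + k * s

def mean_confidence_interval(data, t):
    """Interval x-bar +/- t*s/sqrt(n) for the true mean."""
    xbar, s, n = mean(data), stdev(data), len(data)
    half_width = t * s / sqrt(n)
    return xbar - half_width, xbar + half_width

tol_lo, tol_hi = tolerance_interval(assays, K_N10_CONF99_P95)
ci_lo, ci_hi = mean_confidence_interval(assays, T_975_DF9)
print(f"tolerance interval: {tol_lo:.3f} to {tol_hi:.3f}")
print(f"95% CI for mean:    {ci_lo:.3f} to {ci_hi:.3f}")
```

The tolerance interval is much wider than the confidence interval because it has to cover individual results rather than the mean of ten.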

Page 175: Statistical Tools for the Quality Control Laboratory and Validation Studies

Statistical Process Control (SPC) Basics. 1.

•  At some level of discrimination, NO TWO THINGS ARE ALIKE. There is always some variation.
•  If things are not being hand made, the variation should be small enough to make each unit interchangeable with any other unit.
   –  A 100 mg tablet may vary by ± 1 mg and be okay, but not a 1 mg tablet.
•  The allowable variation should not be larger than the variation that causes a unit to be “different.”

10 GMPWkPHL1012S4

Page 176: Statistical Tools for the Quality Control Laboratory and Validation Studies

Statistical Process Control (SPC) Basics. 2.

•  In the old days, people made many units of product and the “production lot” underwent 100% inspection to segregate “good” units from the “bad.” “Bad” units were scrapped. Thus the name “Quality Control.”
   –  It cost just as much to make scrap as to make a “good” unit.
   –  100% inspection is never 100% effective.
   –  “Quality Control” was basically a sorting operation. Quality was really not controlled since the “bad” product had already been made.

11 GMPWkPHL1012S4

Page 177: Statistical Tools for the Quality Control Laboratory and Validation Studies

Statistical Process Control (SPC) Basics. 3.

•  Walter Shewhart, in the late 1920s, and, later, W. Edwards Deming basically proposed the idea that making scrap was not cost effective.
•  They proposed the idea that the elimination of scrap would produce the most cost effective manufacturing process.
•  Scrap was the result of excessive variation.
•  Variation comes in two forms. In the Shewhart/Deming terminology there are common cause and assignable cause variations.

12 GMPWkPHL1012S4

Page 178: Statistical Tools for the Quality Control Laboratory and Validation Studies

Statistical Process Control (SPC) Common Cause Variation.

•  Common (controlled) cause variation is the basic variation that is inherent in a process. It is the sum of all of the small, inherent, random errors associated with the process. Sources of error may be undetectable.

•  Since common cause variation is inherent in a process, it changes only when the process, or the way that the process is managed, changes.

•  Common cause variation, therefore, can only be reduced through the actions of management.

13 GMPWkPHL1012S4

Page 179: Statistical Tools for the Quality Control Laboratory and Validation Studies

SPC: Assignable Cause Variation.

•  Assignable (special) cause variations result from unusual situations and are not built-in as a part of the process.

•  Because they arise from unusual events, assignable cause variations should be detectable by careful inspection of the system.

•  While management action may be needed, most assignable cause variations can be corrected by line workers themselves, if they have the required knowledge and experience.

14 GMPWkPHL1012S4

Page 180: Statistical Tools for the Quality Control Laboratory and Validation Studies

X-bar Chart

[Figure: X-bar chart; subgroup averages (x) plotted against Time around the X bar center line, between the UCL and LCL]

15 GMPWkPHL1012S4

Page 181: Statistical Tools for the Quality Control Laboratory and Validation Studies

R Chart

[Figure: R chart; subgroup ranges (o) plotted against Time, between the UCL and LCL]

16 GMPWkPHL1012S4


Page 183: Statistical Tools for the Quality Control Laboratory and Validation Studies

GMPWkPHL1012S4 18

Chart and Process Interpretation

•  Look at R chart first. x bar Chart is meaningless if R has changed significantly or is large.

•  If R has increased, identify any special causes that are responsible for the increased variability and change the process to stop them.

•  If R has decreased, identify special causes responsible and change process to incorporate them.

Page 184: Statistical Tools for the Quality Control Laboratory and Validation Studies

GMPWkPHL1012S4 19

SETTING CONTROL LIMITS FOR YOUR SPC CHARTS. 1.

•  Control limits for the SPC charts should be incorporated as specifications, unless you are setting your release specifications to be wider than your control limits.

•  The problem here is that your SPC control limits should be based upon what your process is capable of delivering, if it is operating in a state of control. Therefore, if the limits are exceeded, it is an indication that the process is out of control, even if it is within the release limits.

Page 185: Statistical Tools for the Quality Control Laboratory and Validation Studies

GMPWkPHL1012S4 20

SETTING CONTROL LIMITS FOR YOUR SPC CHARTS. 2.

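•  First, calculate the average of the averages that you are using. This is:

$$\bar{\bar{x}} = \frac{1}{N}\sum_{i=1}^{N} \bar{x}_i$$

Now calculate the average range that comes from the ranges of the data that were used for the averages. This is:

$$\bar{R} = \frac{1}{N}\sum_{i=1}^{N} R_i$$

where N is the number of subgroups.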

Page 186: Statistical Tools for the Quality Control Laboratory and Validation Studies

GMPWkPHL1012S4 21

SETTING CONTROL LIMITS FOR YOUR SPC CHARTS. 3.

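•  For “3-sigma limits” the Shewhart formulae are:

$$\mathrm{UCL}_{\bar{x}} = \bar{\bar{x}} + A_2\bar{R}$$
$$\mathrm{CL}_{\bar{x}} = \bar{\bar{x}}$$
$$\mathrm{LCL}_{\bar{x}} = \bar{\bar{x}} - A_2\bar{R}$$
$$\mathrm{UCL}_{R} = D_4\bar{R}$$
$$\mathrm{CL}_{R} = \bar{R}$$
$$\mathrm{LCL}_{R} = D_3\bar{R} \quad \text{if not negative. Note that } D_3 = 0 \text{ for subgroup size} < 7.$$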

Page 187: Statistical Tools for the Quality Control Laboratory and Validation Studies

GMPWkPHL1012S4 22

SETTING CONTROL LIMITS FOR YOUR SPC CHARTS. 4.

•  The constants, A2, D3, and D4 are combinations of other constants and are set for 3σ and change with the subgroup size.

•  The subgroup size (n) is the number of replicates for each average. (Also known as a “rational subgroup.”)

•  When possible, 20 to 30 subgroups should be used to establish the grand average and the average range. Fewer subgroups may be used in the beginning, but the numbers should evolve until 30 is reached.

•  The idea is to reach the point where you are dealing with µ and σ, but recent computer studies have shown that instead of 30, the real number should be around 200.
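The limit calculation can be sketched as follows. The A2, D3, and D4 values are the commonly published Shewhart constants for subgroup sizes 2 to 7; they are not listed on the slides and should be checked against the table you normally use. The function name and the example data are illustrative.

```python
# (A2, D3, D4) by subgroup size n; assumption: standard published table values.
SHEWHART_CONSTANTS = {
    2: (1.880, 0.0, 3.267),
    3: (1.023, 0.0, 2.574),
    4: (0.729, 0.0, 2.282),
    5: (0.577, 0.0, 2.114),
    6: (0.483, 0.0, 2.004),
    7: (0.419, 0.076, 1.924),
}

def xbar_r_limits(subgroups):
    """Return (LCL, CL, UCL) for the x-bar chart and for the R chart."""
    n = len(subgroups[0])
    if any(len(group) != n for group in subgroups):
        raise ValueError("all subgroups must have the same size")
    a2, d3, d4 = SHEWHART_CONSTANTS[n]

    means = [sum(group) / n for group in subgroups]
    ranges = [max(group) - min(group) for group in subgroups]
    grand_mean = sum(means) / len(means)   # average of the averages
    r_bar = sum(ranges) / len(ranges)      # average range

    return {
        "xbar": (grand_mean - a2 * r_bar, grand_mean, grand_mean + a2 * r_bar),
        "range": (d3 * r_bar, r_bar, d4 * r_bar),
    }

# Three hypothetical subgroups of size 2 (far fewer than the 20 to 30 recommended).
limits = xbar_r_limits([[10, 12], [11, 13], [9, 11]])
print(limits)
```

In practice the limits would be recomputed as subgroups accumulate, in line with the recommendation to build up to 20 to 30 subgroups or more.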

Page 188: Statistical Tools for the Quality Control Laboratory and Validation Studies

Western Electric Rules 1. Zones

GMPWkPHL1012S4 23

[Figure: control chart divided into zones at ±1σ, ±2σ, and ±3σ about the center line]

Page 189: Statistical Tools for the Quality Control Laboratory and Validation Studies

Western Electric Rules 2.

•  The process is out of control when any one of the following happens:
   1. One point plots outside the 3σ control limits.
   2. Two out of 3 consecutive points are beyond the 2σ limits.
   3. Four out of 5 consecutive points plot beyond the 1σ control limits.
   4. Eight consecutive points plot on one side of the center line ($\bar{\bar{x}}$).
•  These are one-sided rules. They only apply to events happening on one side of the chart.
   –  If a point is beyond +2σ and the next point plots beyond -2σ, they are not 2 points beyond 2σ.
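The four rules above can be sketched as a simple check. The function assumes the center line and σ for the plotted points are already known, and it applies each rule separately to the upper and lower sides of the chart so that the rules stay one-sided. The function name and the return format are illustrative.

```python
def western_electric_signals(points, center, sigma):
    """Return (index, rule) pairs for every point at which a rule fires."""
    z = [(p - center) / sigma for p in points]  # distance from center in sigma units
    signals = []
    for i in range(len(z)):
        # Rule 1: one point outside the 3-sigma limits (either side).
        if abs(z[i]) > 3:
            signals.append((i, 1))
        for side in (1, -1):
            # Flip the sign so the side being checked is always positive.
            recent = [side * v for v in z[max(0, i - 7): i + 1]]
            # Rule 2: two out of 3 consecutive points beyond 2 sigma, same side.
            if len(recent) >= 3 and sum(v > 2 for v in recent[-3:]) >= 2:
                signals.append((i, 2))
            # Rule 3: four out of 5 consecutive points beyond 1 sigma, same side.
            if len(recent) >= 5 and sum(v > 1 for v in recent[-5:]) >= 4:
                signals.append((i, 3))
            # Rule 4: eight consecutive points on one side of the center line.
            if len(recent) == 8 and all(v > 0 for v in recent):
                signals.append((i, 4))
    return signals

# Two of three points beyond +2 sigma: rule 2 fires at the last point.
print(western_electric_signals([100.5, 102.5, 99.0, 102.2], center=100, sigma=1))
# One point beyond +2 sigma and one beyond -2 sigma: no signal.
print(western_electric_signals([102.5, 97.5, 100.0], center=100, sigma=1))
```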

GMPWkPHL1012S4 24

Page 190: Statistical Tools for the Quality Control Laboratory and Validation Studies

Western Electric Rules 3. Considerations

•  Based on the work of Shewhart, Deming, and others when they were with Bell Telephone.

•  Because they are based on the standard deviation zones, these are sometimes known as “zone rules.”

•  These rules enhance the sensitivity of the control charts, but require the use of standard deviations, not ranges.

•  A “point” on a control chart usually represents an average that was calculated for a “rational subgroup.”

GMPWkPHL1012S4 25

Page 191: Statistical Tools for the Quality Control Laboratory and Validation Studies

Rational Subgroups

•  The subgroup that is chosen to represent a “point” should be chosen so that it properly represents the product unit.
   –  If you are checking lots over time then the subgroup units should be randomly chosen from the lot.
   –  If you are more interested in the different parts of a lot, you should define the parts and choose from within the “part.”
   –  The number of units in the subgroup should be calculated as an “n” value based on your desired confidence interval and the level of risk that you are willing to accept.

GMPWkPHL1012S4 26

Page 192: Statistical Tools for the Quality Control Laboratory and Validation Studies

CUSUM Charts I.

•  These are CUmulative SUM control charts.
   –  They are based on the Cumulative Sum of the Deviations from a target value.
•  The CUSUM chart is a relatively new invention designed to overcome the lack of sensitivity of “Shewhart Charts” and the tendency of the charts to give false alarms, especially when the Western Electric Rules are employed.
•  CUSUM chart: a quality control chart with memory.

GMPWkPHL1012S4 27

Page 193: Statistical Tools for the Quality Control Laboratory and Validation Studies

CUSUM Charts II.

•  For deviations greater than ± 2.5σ the Shewhart chart is as good or may be better than the CUSUM chart.

•  CUSUM charts are more sensitive and will detect changes (especially trends) in the ± 0.5σ - ± 2.5σ range.

GMPWkPHL1012S4 28

Page 194: Statistical Tools for the Quality Control Laboratory and Validation Studies

Types of Time-weighted Charts

•  There are 3 types of time-weighted charts.
•  Moving Average
   –  Chart of un-weighted moving averages
•  Exponentially Weighted Moving Average (EWMA)
   –  Chart of exponentially weighted moving averages
•  CUSUM
   –  Chart of the cumulative sum of the deviations from a nominal specification
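The first two chart statistics can be sketched as follows. The window length, the smoothing weight `lam`, and the starting value are illustrative assumptions (the slides do not give values), and the EWMA recursion shown is the usual textbook form z = λx + (1 − λ)z_prev.

```python
def moving_average(values, window):
    """Un-weighted average of the most recent `window` results."""
    return [
        sum(values[i - window + 1: i + 1]) / window
        for i in range(window - 1, len(values))
    ]

def ewma(values, lam, start):
    """Exponentially weighted moving average, starting from `start` (e.g. the target)."""
    smoothed = []
    z = start
    for x in values:
        z = lam * x + (1 - lam) * z  # older results fade geometrically
        smoothed.append(z)
    return smoothed

results = [99, 101, 99, 100]
print(moving_average(results, window=2))
print(ewma(results, lam=0.5, start=100))
```

A smaller `lam` gives the chart a longer memory, which makes it slower to react but more sensitive to small sustained shifts.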

GMPWkPHL1012S4 29

Page 195: Statistical Tools for the Quality Control Laboratory and Validation Studies

Equations for CUSUM

GMPWkPHL1012S4 30

For single point data, where $x_i$ is the observation at $i$:

$$S_j = \sum_{i=1}^{j} (x_i - \mu)$$

For averages, where $\bar{x}_i$ is the average at $i$:

$$S_j = \sum_{i=1}^{j} (\bar{x}_i - \mu)$$

$S_j$ is the CUSUM point at $j$ and $\mu$ is the target value. $j$ is the time period over which you will sum the deviations.

Page 196: Statistical Tools for the Quality Control Laboratory and Validation Studies

Starting a CUSUM I.

•  Specification: Target Value: 100 = µ (j = 8)

n   xj    Calculation                                  CuSum value = Sj
1   99    99 – 100                                     -1
2   101   99+101 – 2*100                                0
3   99    99+101+99 – 3*100                            -1
4   100   99+101+99+100 – 4*100                        -1
5   102   99+101+99+100+102 – 5*100                     1
6   101   99+101+99+100+102+101 – 6*100                 2
7   100   99+101+99+100+102+101+100 – 7*100             2
8   101   99+101+99+100+102+101+100+101 – 8*100         3

•  Sj = (Σxj) – jµ
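The running calculation in the table can be reproduced with a few lines. This is a minimal sketch; the function name is illustrative.

```python
from itertools import accumulate

def cusum(values, target):
    """Cumulative sum of deviations from the target: S_j = sum(x_i - target) for i up to j."""
    return list(accumulate(x - target for x in values))

results = [99, 101, 99, 100, 102, 101, 100, 101]
print(cusum(results, target=100))
```

Accumulating the deviations one at a time gives the same values as summing the raw results and subtracting j times the target, as done in the table.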

GMPWkPHL1012S4 31

Page 197: Statistical Tools for the Quality Control Laboratory and Validation Studies

Starting a CUSUM II.

•  In theory, the sum of the deviations should be zero, if the process is in statistical control.

•  So the cumulative run length (j) should be large enough that you would expect it to be at zero, if the target value (µ) is really the center point.

•  The target value (µ) should be the specification for the measurement.

•  Therefore the center line in a CUSUM chart should be at zero.

GMPWkPHL1012S4 32

Page 198: Statistical Tools for the Quality Control Laboratory and Validation Studies

Shift then return to Normal

GMPWkPHL1012S4 33

[Figure: two CUSUM plots of a shift followed by a return to normal; panels: Theory, Reality]

Page 199: Statistical Tools for the Quality Control Laboratory and Validation Studies

The V-Mask I.

•  A statistical t-test to determine if the process is in a state of statistical control.
   –  The V-mask is placed at a distance d from the last point of measurement.
   –  The opening of the V-mask is drawn at an angle of ± θ.
•  First, obtain an estimate of the standard deviation or standard error (σx) for the value of Sj. This should be known from the specification setting process or from development work.

GMPWkPHL1012S4 34

Page 200: Statistical Tools for the Quality Control Laboratory and Validation Studies

The V-Mask II.

•  Decide on the smallest deviation you want to detect (D). Calculate: δ = D/σx. If D = σx then δ = 1.
•  Decide on the probability level (α) at which you wish to make decisions.
   –  For the usual ± 3σ level, α = 0.00135.
•  Determine the scale factor (k), which is the value of the statistic to be plotted (vertical scale) per unit change in the horizontal scale (lot or sample number). It is suggested that k should be between 1σx and 2σx, preferably closer to 2σx.
•  Using the value of δ, obtain the value of the lead time (d) from the following table.

GMPWkPHL1012S4 35

Page 201: Statistical Tools for the Quality Control Laboratory and Validation Studies

The V-Mask III.

•  Obtain the angle (ϴ) from the same table by setting δ = D/k. (Table BB in Juran’s Book)
•  Construct the V-Mask from these data.
•  Truncated Table BB.

δ      ϴ          d
0.2    5° 43’     330.4
0.5    14° 00’    52.9
1.0    26° 34’    13.2
1.8    41° 59’    4.1
2.0    45° 00’    3.3
2.6    52° 26’    2.0
3.0    56° 19’    1.5

GMPWkPHL1012S4 36
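The tabulated values appear to be consistent with the relations ϴ = arctan(δ/2) and d = −2 ln(α)/δ² at α = 0.00135. This is an observation about the numbers, not a formula stated in the slides, so the sketch below should be treated as a cross-check only; the table in Juran’s handbook remains the reference. The function name is illustrative. The computed angles agree with the table to within a few minutes of arc.

```python
from math import atan, degrees, log

def vmask_parameters(delta, alpha=0.00135):
    """Approximate V-mask half-angle (degrees) and lead distance d for a given delta.

    Assumption: theta = arctan(delta / 2) and d = -2 * ln(alpha) / delta**2,
    relations that reproduce the truncated table closely.
    """
    theta_deg = degrees(atan(delta / 2))
    d = -2 * log(alpha) / delta**2
    return theta_deg, d

for delta in (0.2, 0.5, 1.0, 1.8, 2.0, 2.6, 3.0):
    theta_deg, d = vmask_parameters(delta)
    whole = int(theta_deg)
    minutes = (theta_deg - whole) * 60
    print(f"delta={delta:<4} theta={whole} deg {minutes:4.1f} min  d={d:.1f}")
```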

Page 202: Statistical Tools for the Quality Control Laboratory and Validation Studies

V-Masked CUSUM

GMPWkPHL1012S4 37

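[Figure: V-masked CUSUM chart of X-bar against Time, showing the lead distance d and the angle ϴ of the mask, with points labeled “Action?” and “Failure”]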

Page 203: Statistical Tools for the Quality Control Laboratory and Validation Studies

CUSUM Rules

•  The sample size can be the same as used in an x-bar chart. One recommendation is to use n = 2.25s²/D, where s² is an estimate of the process (lot to lot) variance.
•  The V-Mask is always placed at the last point measured.
•  If all points are within the V-mask, the process is under control.
•  Any point out of the V-Mask shows a lack of control.
•  The first point out of the V-mask shows when the shift started, even if later points are within the mask.

GMPWkPHL1012S4 38

Page 204: Statistical Tools for the Quality Control Laboratory and Validation Studies

Moving V-Mask

GMPWkPHL1012S4 39

[Figure: moving V-mask applied along a CUSUM chart, with a point labeled “Failure”]

Page 205: Statistical Tools for the Quality Control Laboratory and Validation Studies

References:

•  ISO/TR 7871:1997: Cumulative sum charts -- Guidance on quality control and data analysis using CUSUM techniques. http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=14804
•  British Standards No. BS 5700 ff SR, 05.03.2009, page 16.
•  Wadsworth, H.M.; “Statistical Process Control,” in Juran’s Quality Handbook 5th Ed.; Juran, J.M. and Godfrey, A.B. (eds.); McGraw-Hill, New York, NY, 1999, Sect. 45, page 45-17.

GMPWkPHL1012S4 40