Two-Way Analysis of Variance (TWO-WAY ANOVA) Presented By: Richa Rai Kinshuk Sharma Moiz Ashraf Ansari Geeta Verma Shruti Godra

ANOVA Presentation


Page 1: ANOVA Presentation

Two-Way Analysis of Variance (TWO-WAY ANOVA)

Presented By: Richa Rai, Kinshuk Sharma, Moiz Ashraf Ansari, Geeta Verma, Shruti Godra

Page 2: ANOVA Presentation


Overview

• What is ANOVA?
• When is it useful?
• How does it work?
• Some Examples
• Limitations
• Conclusions

Page 3: ANOVA Presentation


Definitions

ANOVA: analysis of variation in an experimental outcome, and especially of a statistical variance, in order to determine the contributions of given factors or variables to the variance.

Remember: Variance: the square of the standard deviation

Remember: R. A. Fisher, 1919 (evolutionary biology)

Page 4: ANOVA Presentation


Introduction

Any data set has variability

Variability exists within groups… and between groups

Question that ANOVA allows us to answer: Is this variability significant, or merely due to chance?

Page 5: ANOVA Presentation


Comparing the variation within groups with the variation between groups helps us decide. If the two are about equal, the differences between groups are likely due to chance and not significant.

H0: Variability within groups = variability between groups; that is, µ1 = µ2 = … = µn

Ha: Variability within groups ≠ variability between groups; that is, at least one group mean differs from the others

Page 6: ANOVA Presentation


Two-Way ANOVA

Two factors (manipulated variables)

Analysis of variance can be used to test for the effects of two factors.

Two or more groups to compare

Page 7: ANOVA Presentation

Example

A tea company can analyse its sales on the basis of four salesmen and three seasons: winter, summer and rainy season.

Educationalists can analyse three methods of teaching a subject in four different ways.


Page 8: ANOVA Presentation

Procedure of Two-Way ANOVA

1. Take the total $T$ of the values of the individual items in all the samples.

2. Correction factor: $\text{C.F.} = T^2/N$

3. Total sum of squares: $SST = \left(\sum X_1^2 + \sum X_2^2 + \dots + \sum X_k^2\right) - T^2/N$

4. Sum of squares between the columns: $SSC = \sum \left\{ \left(\sum X_c\right)^2 / n_c \right\} - T^2/N$

5. Sum of squares between the rows: $SSR = \sum \left\{ \left(\sum X_r\right)^2 / n_r \right\} - T^2/N$

6. Sum of squares for the residual: $SSE = SST - (SSC + SSR)$


Page 9: ANOVA Presentation

7. Determination of degrees of freedom

Total number of items = cr
Total degrees of freedom = cr - 1
Degrees of freedom for columns = c - 1
Degrees of freedom for rows = r - 1
Degrees of freedom for the residual = (c - 1)(r - 1)

8. Calculation of the variance ratio, or F coefficient

F coefficient for variance between columns = variance between columns (MSC) / residual variance (MSE)

F coefficient for variance between rows = variance between rows (MSR) / residual variance (MSE)

Page 10: ANOVA Presentation

9. Preparation of Two-Way ANOVA Table

| Source of variation | Sum of squares (SS) | Degrees of freedom (d.f.) | Mean square (M.S.) | Variance ratio |
| --- | --- | --- | --- | --- |
| Between columns | $SSC = \sum\{(\sum X_c)^2/n_c\} - T^2/N$ | c - 1 | SSC/(c - 1) = MSC | F = MSC/MSE |
| Between rows | $SSR = \sum\{(\sum X_r)^2/n_r\} - T^2/N$ | r - 1 | SSR/(r - 1) = MSR | F = MSR/MSE |
| Residual | $SSE = SST - (SSC + SSR)$ | (c - 1)(r - 1) | SSE/{(c - 1)(r - 1)} = MSE | |
| Total | $SST = (\sum X_1^2 + \sum X_2^2 + \dots + \sum X_k^2) - T^2/N$ | cr - 1 | | |
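The nine steps above can be sketched in plain Python for a table with one observation per cell. This is only an illustrative sketch: the function name `two_way_anova` and the small 2 x 3 data set are hypothetical and do not come from the slides.

```python
def two_way_anova(table):
    """Two-way ANOVA for a rows x columns table with one value per cell."""
    r = len(table)            # number of rows
    c = len(table[0])         # number of columns
    n = r * c                 # total number of items, N = cr

    total = sum(sum(row) for row in table)        # step 1: grand total T
    cf = total ** 2 / n                           # step 2: correction factor T^2/N
    sst = sum(x * x for row in table for x in row) - cf   # step 3

    col_totals = [sum(row[j] for row in table) for j in range(c)]
    row_totals = [sum(row) for row in table]
    ssc = sum(t * t for t in col_totals) / r - cf  # step 4: each column holds r items
    ssr = sum(t * t for t in row_totals) / c - cf  # step 5: each row holds c items
    sse = sst - (ssc + ssr)                        # step 6: residual

    # step 7: degrees of freedom
    df_c, df_r, df_e = c - 1, r - 1, (c - 1) * (r - 1)

    # step 8: mean squares and variance ratios
    msc, msr, mse = ssc / df_c, ssr / df_r, sse / df_e
    return {
        "SST": sst, "SSC": ssc, "SSR": ssr, "SSE": sse,
        "MSC": msc, "MSR": msr, "MSE": mse,
        "F_columns": msc / mse, "F_rows": msr / mse,
    }


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
example = [
    [4, 6, 8],
    [5, 7, 12],
]
result = two_way_anova(example)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The function assumes every cell holds exactly one value. With repeated observations per cell, an interaction term also has to be separated out.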

Page 11: ANOVA Presentation

Example of Two-Way ANOVA

Set up the ANOVA table for the following information relating to three drugs tested to judge their effectiveness in reducing blood pressure for three different groups of people. Each cell holds two readings.

| Group of people | Drug X | Drug Y | Drug Z |
| --- | --- | --- | --- |
| A | 14, 15 | 10, 9 | 11, 11 |
| B | 12, 11 | 7, 8 | 10, 11 |
| C | 10, 11 | 11, 11 | 8, 7 |

1. Do the drugs act differently?
2. Are the different groups of people affected differently?
3. Is the interaction term significant?
4. Answer the above questions taking a significance level of 5%.

Page 12: ANOVA Presentation

Solution:

Step 1. T = 187, n = 18, thus the correction factor C.F. = 187 × 187 / 18 = 1942.72

Step 2. Total SS: $SST = \sum X^2 - \text{C.F.} = 76.28$, where the sum runs over all 18 readings

Step 3. SS between the columns (that is, between the drugs): $SSC = \left[ \frac{(\sum C_1)^2}{n_{C_1}} + \frac{(\sum C_2)^2}{n_{C_2}} + \frac{(\sum C_3)^2}{n_{C_3}} \right] - \text{C.F.} = 28.77$

Step 4. SS between the rows (that is, between people): $SSR = \sum \left\{ (\sum X_r)^2 / n_r \right\} - T^2/N = 14.78$

Page 13: ANOVA Presentation


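Step 5. SS within samples (error): $SSE = \sum (X - \bar{X}_{\text{cell}})^2 = 3.50$, where $\bar{X}_{\text{cell}}$ is the mean of the two readings in the cell (group and drug combination) that contains $X$

Step 6. SS for interaction variation: $SSI = SST - (SSC + SSR + SSE) = 29.23$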
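The sums of squares for the drug example can be checked with a short Python sketch. It assumes the two readings per cell pair up as laid out in the data table (for example, group A under drug X has readings 14 and 15). The variable and function names are illustrative only.

```python
# data[group][drug] = the two readings in that cell, taken from the example
data = {
    "A": {"X": [14, 15], "Y": [10, 9],  "Z": [11, 11]},
    "B": {"X": [12, 11], "Y": [7, 8],   "Z": [10, 11]},
    "C": {"X": [10, 11], "Y": [11, 11], "Z": [8, 7]},
}
groups = ["A", "B", "C"]
drugs = ["X", "Y", "Z"]

cells = [data[g][d] for g in groups for d in drugs]
values = [x for cell in cells for x in cell]

n = len(values)                 # 18 readings
total = sum(values)             # T = 187
cf = total ** 2 / n             # correction factor T^2/N
sst = sum(x * x for x in values) - cf


def ss_between(blocks):
    """Sum of (block total)^2 / (block size) over blocks, minus C.F."""
    return sum(sum(b) ** 2 / len(b) for b in blocks) - cf


columns = [[x for g in groups for x in data[g][d]] for d in drugs]  # by drug
rows = [[x for d in drugs for x in data[g][d]] for g in groups]     # by group

ssc = ss_between(columns)
ssr = ss_between(rows)
# Within-sample (error) SS: squared deviations from each cell's own mean
sse = sum((x - sum(cell) / len(cell)) ** 2 for cell in cells for x in cell)
ssi = sst - (ssc + ssr + sse)

print(f"T = {total}, n = {n}, C.F. = {cf:.2f}")
print(f"SST = {sst:.2f}, SSC = {ssc:.2f}, SSR = {ssr:.2f}")
print(f"SSE = {sse:.2f}, SSI = {ssi:.2f}")
```

Run at full precision, SSC and SSI come out as 28.78 and 29.22. The slide values 28.77 and 29.23 differ in the last digit because of rounding in the hand calculation. The conclusions are unaffected.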

Page 14: ANOVA Presentation

Two-Factor ANOVA Table

| Source of variation | SS | d.f. | MS | F-ratio | 5% F-limit |
| --- | --- | --- | --- | --- | --- |
| Between columns (between drugs) | 28.77 | (3 - 1) = 2 | 28.77/2 = 14.385 | 14.385/0.389 = 36.98 | F(2, 9) = 4.26 |
| Between rows (between people) | 14.78 | (3 - 1) = 2 | 14.78/2 = 7.390 | 7.390/0.389 = 19.0 | F(2, 9) = 4.26 |
| Interaction | 29.23 | 4 | 29.23/4 = 7.308 | 7.308/0.389 = 18.79 | F(4, 9) = 3.63 |
| Within samples (error) | 3.50 | (18 - 9) = 9 | 3.50/9 = 0.389 | | |
| Total | 76.28 | (18 - 1) = 17 | | | |

Page 15: ANOVA Presentation

Conclusion:

The above table shows that all three F-ratios are significant at the 5% level, which means that the drugs act differently, the different groups of people are affected differently, and the interaction term is significant.

Page 16: ANOVA Presentation

Another example using ANOVA table

The following table shows the number of items produced by five different workers in a factory on five different machines:

| Worker | M1 | M2 | M3 | M4 | M5 |
| --- | --- | --- | --- | --- | --- |
| A | 10 | 14 | 23 | 18 | 20 |
| B | 11 | 15 | 24 | 17 | 21 |
| C | 9 | 12 | 20 | 16 | 19 |
| D | 8 | 13 | 17 | 17 | 20 |
| E | 12 | 15 | 19 | 15 | 22 |

Discuss, with the help of analysis of variance, whether the mean productivity of (a) workers and (b) machines differs significantly.

Page 17: ANOVA Presentation

Solution:

Let us take the hypotheses that

(a) the mean productivity is the same for the five different machines;
(b) the five workers do not differ with regard to mean productivity.

This question is based on a two-way classification, and it explicitly asks for two-way analysis of variance.

In order to simplify calculations, we code the data by subtracting 16 from each figure. The data in the coded form are given below:
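Subtracting the same constant from every figure shifts the data without changing any deviation from the mean, so the sums of squares are unaffected. A minimal Python check of this for the total sum of squares, using the production figures from the example (the helper name `total_ss` is illustrative):

```python
# Items produced by workers A-E (rows) on machines M1-M5 (columns)
raw = [
    [10, 14, 23, 18, 20],
    [11, 15, 24, 17, 21],
    [9, 12, 20, 16, 19],
    [8, 13, 17, 17, 20],
    [12, 15, 19, 15, 22],
]


def total_ss(table):
    """Total sum of squares: sum of X^2 minus the correction factor T^2/N."""
    values = [x for row in table for x in row]
    return sum(x * x for x in values) - sum(values) ** 2 / len(values)


# Code the data by subtracting 16 from each figure
coded = [[x - 16 for x in row] for row in raw]

print(f"SST from raw data:   {total_ss(raw):.2f}")
print(f"SST from coded data: {total_ss(coded):.2f}")
```

Both lines print the same value, which is why the hand calculation can work with the smaller coded numbers.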

Page 18: ANOVA Presentation

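Calculation of Squares of Deviations Between Columns (Machines) and Rows (Workers)

Coded Data

| Worker | M1 | M2 | M3 | M4 | M5 | Total |
| --- | --- | --- | --- | --- | --- | --- |
| A | -6 | -2 | 7 | 2 | 4 | 5 (∑r1) |
| B | -5 | -1 | 8 | 1 | 5 | 8 (∑r2) |
| C | -7 | -4 | 4 | 0 | 3 | -4 (∑r3) |
| D | -8 | -3 | 1 | 1 | 4 | -5 (∑r4) |
| E | -4 | -1 | 3 | -1 | 6 | 3 (∑r5) |
| Total | -30 (∑C1) | -11 (∑C2) | 23 (∑C3) | 3 (∑C4) | 22 (∑C5) | 7 |

Squares of the Coded Data

| Worker | M1 | M2 | M3 | M4 | M5 | Total |
| --- | --- | --- | --- | --- | --- | --- |
| A | 36 | 4 | 49 | 4 | 16 | 109 |
| B | 25 | 1 | 64 | 1 | 25 | 116 |
| C | 49 | 16 | 16 | 0 | 9 | 90 |
| D | 64 | 9 | 1 | 1 | 16 | 91 |
| E | 16 | 1 | 9 | 1 | 36 | 63 |
| Total | 190 | 31 | 139 | 7 | 102 | 469 |

(1) Correction factor: $\text{C.F.} = T^2/N = 7^2/25 = 1.96$

(2) Total sum of squares of deviations: $SST = \left(\sum X_1^2 + \sum X_2^2 + \dots + \sum X_5^2\right) - \text{C.F.} = 109 + 116 + 90 + 91 + 63 - 1.96 = 467.04$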

Page 19: ANOVA Presentation

(3) Sum of squares between machines (S.S. between columns):

$SSC = \left[ \frac{(\sum C_1)^2}{n_{C_1}} + \frac{(\sum C_2)^2}{n_{C_2}} + \dots + \frac{(\sum C_5)^2}{n_{C_5}} \right] - \text{C.F.} = \frac{(-30)^2}{5} + \frac{(-11)^2}{5} + \frac{(23)^2}{5} + \frac{(3)^2}{5} + \frac{(22)^2}{5} - 1.96 = \frac{900 + 121 + 529 + 9 + 484}{5} - 1.96 = \frac{2043}{5} - 1.96 = 408.6 - 1.96 = 406.64$

(4) Sum of squares between workers (S.S. between rows):

$SSR = \left[ \frac{(\sum r_1)^2}{n_{r_1}} + \frac{(\sum r_2)^2}{n_{r_2}} + \dots + \frac{(\sum r_5)^2}{n_{r_5}} \right] - \text{C.F.} = \frac{(5)^2}{5} + \frac{(8)^2}{5} + \frac{(-4)^2}{5} + \frac{(-5)^2}{5} + \frac{(3)^2}{5} - 1.96 = \frac{25 + 64 + 16 + 25 + 9}{5} - 1.96 = \frac{139}{5} - 1.96 = 27.8 - 1.96 = 25.84$

(5) Residual sum of squares:

$SSE = SST - (SSC + SSR) = 467.04 - (406.64 + 25.84) = 467.04 - 432.48 = 34.56$
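The hand calculation in steps (1) to (5) can be cross-checked with a few lines of Python working from the coded data. This is a sketch for verification only; the variable names are illustrative.

```python
# Coded data (original figures minus 16): rows are workers A-E,
# columns are machines M1-M5
coded = [
    [-6, -2, 7, 2, 4],
    [-5, -1, 8, 1, 5],
    [-7, -4, 4, 0, 3],
    [-8, -3, 1, 1, 4],
    [-4, -1, 3, -1, 6],
]
n_rows, n_cols = len(coded), len(coded[0])

row_totals = [sum(row) for row in coded]          # worker totals
col_totals = [sum(col) for col in zip(*coded)]    # machine totals
grand_total = sum(row_totals)

cf = grand_total ** 2 / (n_rows * n_cols)                  # (1)
sst = sum(x * x for row in coded for x in row) - cf        # (2)
ssc = sum(t * t for t in col_totals) / n_rows - cf         # (3) between machines
ssr = sum(t * t for t in row_totals) / n_cols - cf         # (4) between workers
sse = sst - (ssc + ssr)                                    # (5) residual

print("Machine totals:", col_totals)
print("Worker totals: ", row_totals)
print(f"C.F. = {cf:.2f}, SST = {sst:.2f}")
print(f"SSC = {ssc:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
```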


Page 20: ANOVA Presentation

Two-factor ANOVA Table

| Source of variance (i) | Sum of squares (ii) | Degrees of freedom (iii) | Mean squares (ii) ÷ (iii) = (iv) | Variance ratios (v) |
| --- | --- | --- | --- | --- |
| Between machines | SSC = 406.64 | c - 1 = 5 - 1 = 4 | MSC = 101.66 | F = 101.66/2.16 = 47.06 |
| Between workers | SSR = 25.84 | r - 1 = 5 - 1 = 4 | MSR = 6.46 | F = 6.46/2.16 = 2.99 |
| Residual error | SSE = 34.56 | (c - 1)(r - 1) = 16 | MSE = 2.16 | |
| Total | SST = 467.04 | cr - 1 = 24 | | |

For v1 = 4 and v2 = 16, F0.05 = 3.01.

Conclusion:

(i) The calculated value of F between machines (that is, between columns) is 47.06, while the corresponding table value is 3.01. The calculated value is greater than the table value, so the hypothesis is rejected: the mean productivity does not appear to be the same for the different machines.

(ii) The calculated value of F between workers (that is, between rows) is 2.99, while the corresponding table value is 3.01. The calculated value is less than the table value, so the hypothesis is not rejected: the five workers do not appear to differ significantly with regard to mean productivity.
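The decision rule used in the conclusion can be written as a small Python sketch. The sums of squares, degrees of freedom and the 5% table value of 3.01 are taken from the worked example. The dictionary layout and names are illustrative.

```python
# Sums of squares and degrees of freedom from the two-factor ANOVA table
ss = {"machines": 406.64, "workers": 25.84, "residual": 34.56}
df = {"machines": 4, "workers": 4, "residual": 16}

# 5% table value of F for v1 = 4 and v2 = 16, as quoted in the example
f_critical = 3.01

# Mean square = sum of squares / degrees of freedom
ms = {source: ss[source] / df[source] for source in ss}

results = {}
for factor in ("machines", "workers"):
    f_ratio = ms[factor] / ms["residual"]
    # Reject H0 (equal means) only when the calculated F exceeds the table value
    decision = "reject H0" if f_ratio > f_critical else "fail to reject H0"
    results[factor] = (f_ratio, decision)
    print(f"{factor}: F = {f_ratio:.2f}, table value = {f_critical}, {decision}")
```

Note how close the workers' F-ratio of 2.99 is to the table value of 3.01. The formal conclusion at the 5% level is still that the hypothesis is not rejected.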

Page 21: ANOVA Presentation

Application

It is a useful technique for research in the following fields:

• Economics
• Biology
• Education
• Psychology
• Sociology
• Business/industry, etc.


Page 22: ANOVA Presentation

Thank You