
One-Factor Analysis of Variance

Lecture 16

Sections 20.6–20.8

• F-Distribution

• Types of Variation

• One-Factor ANOVA

• ANOVA Table

• Multiple Comparisons

Motivation: One-Factor ANOVA

• Scenario: Comparing average math SAT scores for 5 students in each of three different majors: computer science, economics, and history

• Question: Are the mean math SAT scores equal across all three majors?

• Answer: From the side-by-side boxplots:

  • Means appear to be __________________________

  • But there is a great deal of _________________________________________

  • Sample size is ___________________

• Takeaway: Need an inferential technique that can compare 3 means simultaneously

One-Factor Analysis of Variance (ANOVA)

• One-Factor Analysis of Variance (ANOVA): statistical technique used to compare the means of three or more populations

• Uses two sources of variability to compare means

• Between Group Variation: measures the amount of variability between the sample means of individual groups

  • "How different are the sample means from one another?"

• Within Group Variation: measures the amount of variability that exists within the samples

  • "How different are the individual observations from one another within each group?"

Comparing Types of Variation

Small Between Group Variation

• Means _____________________ (___, ___, and ___)

Large Within Group Variation

• Observations within groups _______________

• Range for each sample is about ____

Large Between Group Variation

• Means ____________________ (___, ___, and ___)

Small Within Group Variation

• Observations within groups ________________________

• Range for each sample is about ____

One-Factor ANOVA: Hypotheses and Conditions

• Hypotheses: Let $k$ be the number of groups being compared

  • $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$

  • $H_A$: At least two means are not equal

• Assumptions and Conditions:

  • Independence: Both the groups being compared and the individuals sampled must be independent of one another

  • Randomization: Subjects come from a random sample

  • Equal Variance: Variances of the populations from which the samples have been drawn are approximately equal

  • Nearly Normal: Distributions of all sample means are approximately normal

Example: One-Factor ANOVA

• Scenario: Comparing average math SAT scores for 5 students in each of three different majors: computer science, economics, and history

• Question: Are the mean math SAT scores equal across all three majors?

• Hypotheses:

  • $H_0$: ________________________

  • $H_A$: ______________________________________________________________________________

Example: One-Factor ANOVA

• Scenario: Comparing average math SAT scores for 5 students in each of three different majors: computer science, economics, and history

• Question: Are the mean math SAT scores equal across all three majors?

• Conditions:

  • Independence: Groups are __________________ unless a student is __________ ____________; while at the __________________, students are likely independent

  • Randomization: Subjects randomly sampled ______________________________

  • Equal Variance: Boxplots show _____________________________________; largest variance is ______________________________________________________________________

  • Nearly Normal: All boxplots look _________________ so the distributions of all ____________________ are ________________________________

Grand Mean

• Grand Mean: the mean of all observations, disregarding the group from which the observations were sampled

$$\bar{\bar{x}} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \cdots + n_k \bar{x}_k}{n_1 + n_2 + \cdots + n_k}$$

where $\bar{x}_i$ is the mean of the observations from group $i$ and $n_i$ is the number of observations sampled from group $i$

• Used in the calculation of the between group variation because it helps us understand how different the sample means are.

Warning: Do not average the sample means! This tactic to find the grand mean only works if all of the sample sizes are the same.
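The warning can be checked numerically. The sketch below uses the group means and sample sizes reported for the coupon example in these notes (15%, 20%, 25%, and 30% off); the function and variable names are illustrative assumptions, not part of the lecture.

```python
def grand_mean(means, sizes):
    """Weighted average of group means, weighting each mean by its sample size."""
    weighted_total = sum(n * m for m, n in zip(means, sizes))
    return weighted_total / sum(sizes)

# Group means and sample sizes from the coupon example (15%, 20%, 25%, 30% off).
coupon_means = [149.56, 140.64, 142.71, 165.30]
coupon_sizes = [25, 21, 16, 10]

weighted = grand_mean(coupon_means, coupon_sizes)
naive = sum(coupon_means) / len(coupon_means)  # only valid when all sizes are equal

print(f"grand mean (weighted by sample size): {weighted:.2f}")
print(f"plain average of the sample means:    {naive:.2f}")
```

Because the four groups have different sample sizes, the two numbers disagree, and only the weighted version is the grand mean.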

One-Factor ANOVA: Types of Variation

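• Between Group Variation: How different are the sample means?

$$SSTr = n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + \cdots + n_k(\bar{x}_k - \bar{\bar{x}})^2$$

where $n_i$ is the sample size and $\bar{x}_i$ is the sample mean of group $i$

• Within Group Variation: How different are the observations within each group?

$$SSE = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2$$

where $n_i$ is the sample size and $s_i^2$ is the sample variance of group $i$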
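As a rough illustration of the two sums of squares, the sketch below computes the between group sum of squares from the SAT group means and sample sizes given in the review table of these notes. The sample variances for the SAT groups are not listed in the notes, so the within group function is exercised on clearly hypothetical variances. All function names are assumptions made for this example.

```python
def between_group_ss(means, sizes):
    """Between group sum of squares: sum of n_i * (group mean - grand mean)^2."""
    grand = sum(n * m for m, n in zip(means, sizes)) / sum(sizes)
    return sum(n * (m - grand) ** 2 for m, n in zip(means, sizes))

def within_group_ss(variances, sizes):
    """Within group sum of squares: sum of (n_i - 1) * sample variance."""
    return sum((n - 1) * v for v, n in zip(variances, sizes))

# SAT example: Comp. Sci., Economics, History (means and sizes from the notes).
sat_means = [720, 640, 536]
sat_sizes = [5, 5, 5]
ss_between = between_group_ss(sat_means, sat_sizes)
print(f"between group sum of squares: {ss_between:,.0f}")

# Hypothetical variances, used only to show the within group calculation.
ss_within_demo = within_group_ss([4.0, 9.0], [3, 5])
print(f"within group sum of squares (hypothetical data): {ss_within_demo}")
```

The between group value agrees with the 85,120 shown in the SAT ANOVA table.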

Example: One-Factor ANOVA

• Grand Mean:

$\bar{\bar{x}}$ = _____________________________________ = __________

• Between Group Variation:

$SSTr$ = __________________________________________________________________

= _______________________________________

= _____________

• Within Group Variation:

$SSE$ = ___________________________________________________________________

= ________________________________________

= ______________

F-Distribution and Test Statistic

• F-Distribution: continuous probability distribution that has the following properties:

  • Unimodal and right-skewed

  • Always non-negative

  • Two parameters for degrees of freedom

    • One for numerator and one for denominator

  • Used to compare the ratio of two sources of variability

• Test Statistic:

$$F_{k-1,\,n-k} = \frac{MSTr}{MSE} = \frac{SSTr/(k-1)}{SSE/(n-k)}$$

where $k$ is the number of categories and $n$ is the total sample size. The numerator measures between group variation and the denominator measures within group variation.
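A minimal sketch of the test statistic calculation, assuming the sums of squares are already known. It uses the SAT values reported in the review table of these notes (between 85,120, within 60,520, with 3 groups and 15 students); the function name and return layout are assumptions for illustration.

```python
def f_statistic(ss_between, ss_within, k, n):
    """Return (F, numerator df, denominator df) for a one-factor ANOVA."""
    df_num = k - 1          # between group degrees of freedom
    df_den = n - k          # within group degrees of freedom
    ms_between = ss_between / df_num   # mean squared treatment
    ms_within = ss_within / df_den     # mean squared error
    return ms_between / ms_within, df_num, df_den

# SAT example: 3 majors, 15 students in total.
f_stat, df_num, df_den = f_statistic(85_120, 60_520, k=3, n=15)
print(f"F = {f_stat:.2f} on {df_num} and {df_den} degrees of freedom")
```

The result rounds to the F value of 8.44 listed in the SAT ANOVA table.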

Example: One-Factor ANOVA

• Degrees of Freedom:

  • Numerator: ________________________

  • Denominator: _______________________

• Mean Squared Treatment:

$MSTr$ = _________________ = ____________

• Mean Squared Error:

$MSE$ = _________________ = ____________

Example: One-Factor ANOVA

• Scenario: Comparing average math SAT scores for 5 students in each of three different majors: computer science, economics, and history

• Question: Are the mean math SAT scores equal across all three majors?

• Mechanics:

  • Test Statistic:

$F$ = _______________ = _________

  • Degrees of Freedom: ________________

  • P-Value: ________ (Using software)

______

______
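The notes obtain the p-value from software. As one possible illustration, the sketch below approximates the upper-tail F probability with only the Python standard library, using the identity P(F > f) = I_x(d2/2, d1/2) with x = d2 / (d2 + d1·f) and Simpson's rule for the incomplete beta integral. This is an assumption-laden stand-in for a statistics package, not the method used in the lecture; it assumes f > 0 and a denominator df of at least 2.

```python
import math

def f_upper_tail(f_stat, df_num, df_den, steps=2000):
    """Approximate P(F > f_stat) for an F(df_num, df_den) distribution.

    Integrates t^(a-1) * (1-t)^(b-1) from 0 to x with Simpson's rule, where
    a = df_den/2, b = df_num/2, x = df_den / (df_den + df_num * f_stat),
    then divides by the complete beta function B(a, b).
    Assumes f_stat > 0, df_den >= 2, and an even number of steps.
    """
    a, b = df_den / 2, df_num / 2
    x = df_den / (df_den + df_num * f_stat)
    h = x / steps

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    integral = total * h / 3

    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return integral / math.exp(log_beta)

# SAT example: F computed from the sums of squares in the notes, df = (2, 12).
sat_f = (85_120 / 2) / (60_520 / 12)
p_sat = f_upper_tail(sat_f, 2, 12)

# Coupon example: F = 1.414 with df = (3, 68), as listed in the notes.
p_coupon = f_upper_tail(1.414, 3, 68)

print(f"SAT example:    p = {p_sat:.4f}")
print(f"Coupon example: p = {p_coupon:.4f}")
```

The SAT p-value comes out well below 0.01 and the coupon p-value well above 0.05, consistent with the conclusions stated in the review slides.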

Example: One-Factor ANOVA

• Scenario: Comparing average math SAT scores for 5 students in each of three different majors: computer science, economics, and history

• Question: Are the mean math SAT scores equal across all three majors?

• Conclusion: With a p-value of ______, we _____________________________ and conclude that the mean math SAT scores are ___________________ ___________________________________________________________________________.

Limitation: One-Factor ANOVA can only determine whether a significant difference exists between at least two means, not where that difference exists.

ANOVA Table

• ANOVA Table: summary of the sums of squares, degrees of freedom, and mean squared terms from an ANOVA

| Source | Sums of Squares | DF | Mean Squares | Test Statistic |
|---|---|---|---|---|
| Between Group | $SSTr$ | $k - 1$ | $MSTr = \frac{SSTr}{k - 1}$ | $F = \frac{MSTr}{MSE}$ |
| Within Group | $SSE$ | $n - k$ | $MSE = \frac{SSE}{n - k}$ | |
| Total | $SST$ | $n - 1$ | | |

Note 1: Between group and within group sums of squares sum to total sums of squares

Note 2: Degrees of freedom in numerator and denominator sum to total degrees of freedom
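Both notes can be verified on raw data. The sketch below uses small hypothetical samples (invented purely for illustration, not taken from the lecture) and checks that the between and within sums of squares add up to the total, and likewise for the degrees of freedom.

```python
from statistics import mean, variance

# Hypothetical observations for three groups of unequal size (illustrative only).
groups = {
    "A": [12, 15, 11, 14],
    "B": [18, 17, 21, 20, 19],
    "C": [10, 9, 13],
}

all_obs = [x for obs in groups.values() for x in obs]
n, k = len(all_obs), len(groups)
grand = mean(all_obs)  # grand mean over every observation

# Between group: n_i * (group mean - grand mean)^2, summed over groups.
ss_between = sum(len(obs) * (mean(obs) - grand) ** 2 for obs in groups.values())
# Within group: (n_i - 1) * sample variance, summed over groups.
ss_within = sum((len(obs) - 1) * variance(obs) for obs in groups.values())
# Total: squared distance of every observation from the grand mean.
ss_total = sum((x - grand) ** 2 for x in all_obs)

print(f"between + within = {ss_between + ss_within:.4f}")
print(f"total            = {ss_total:.4f}")
print(f"df: ({k - 1}) + ({n - k}) = {n - 1}")
```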

Example: One-Factor ANOVA

• Scenario: Large department store gives out scratch-off coupons at the door for 15%, 20%, 25%, or 30% off the entire purchase. Randomly sample 72 customers and record the total amount of their purchase before the coupon.

• Question: Do customers spend different amounts before applying the coupon depending on the percentage off they received?

• Hypotheses:

  • $H_0$: ____________________________________________

  • $H_A$: _______________________________________________________________________________

Using Excel

Example: One-Factor ANOVA

• Scenario: Large department store gives out scratch-off coupons at the door for 15%, 20%, 25%, or 30% off the entire purchase. Randomly sample 72 customers and record the total amount of their purchase before the coupon.

• Question: Do customers spend different amounts before applying the coupon depending on the percentage off they received?

• Task: Complete the ANOVA table

| Source | Sums of Squares | DF | Mean Squares | Test Statistic |
|---|---|---|---|---|
| Between Group | 4,627 | 3 | 1,543 | 1.414 |
| Within Group | 74,206 | 68 | 1,091 | |
| Total | 78,833 | 71 | | |

Using Excel

Critical value assuming 5% level of significance

Example: One-Factor ANOVA

• Scenario: Large department store gives out scratch-off coupons at the door for 15%, 20%, 25%, or 30% off the entire purchase. Randomly sample 72 customers and record the total amount of their purchase before the coupon.

• Question: Do customers spend different amounts before applying the coupon depending on the percentage off they received?

• Mechanics:

  • Degrees of Freedom: _________________

  • Test Statistic: _______________

  • P-Value: __________

Example: One-Factor ANOVA

• Scenario: Large department store gives out scratch-off coupons at the door for 15%, 20%, 25%, or 30% off the entire purchase. Randomly sample 72 customers and record the total amount of their purchase before the coupon.

• Question: Do customers spend different amounts before applying the coupon depending on the percentage off they received?

• Conclusion: With a p-value of ________, we ___________________________ _____________ and conclude that the amount of money people spend at this department store _______________________________________________ _____________________________________.

Drawback of ANOVA

• When a significant difference is found, the only conclusion that can be drawn at this point is that at least two means are not equal.

• Problem: Many different ways of rejecting $H_0$

  • $\mu_1 \ne \mu_2$, $\mu_1 = \mu_3$, $\mu_2 = \mu_3$

  • $\mu_1 = \mu_2$, $\mu_1 \ne \mu_3$, $\mu_2 = \mu_3$

  • $\mu_1 = \mu_2$, $\mu_1 = \mu_3$, $\mu_2 \ne \mu_3$

  • $\mu_1 \ne \mu_2$, $\mu_1 \ne \mu_3$, $\mu_2 = \mu_3$

  • $\mu_1 \ne \mu_2$, $\mu_1 = \mu_3$, $\mu_2 \ne \mu_3$

  • $\mu_1 = \mu_2$, $\mu_1 \ne \mu_3$, $\mu_2 \ne \mu_3$

  • $\mu_1 \ne \mu_2 \ne \mu_3$

• Solution: ______________________________ will tell us which of these scenarios is true

___________ of means not equal

____________ of means not equal

____________ are equal

Number of possibilities increases exponentially as the number of groups being compared increases
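One way to count these possibilities, consistent with the seven scenarios listed for three groups, is to treat each pair of means as either "equal" or "not equal" and exclude the single pattern where every pair is equal. The sketch below follows that counting convention, which is an assumption of this example; it does not check whether a pattern is logically consistent.

```python
from math import comb

def count_rejection_patterns(k):
    """Return (number of pairs, number of equal/not-equal patterns that reject H0)."""
    pairs = comb(k, 2)           # pairwise comparisons among k means
    patterns = 2 ** pairs - 1    # every pattern except "all pairs equal"
    return pairs, patterns

for k in range(3, 7):
    pairs, patterns = count_rejection_patterns(k)
    print(f"k = {k}: {pairs} pairs, {patterns} patterns")
```

For three groups this gives 3 pairs and 7 patterns, matching the list above; the pattern count doubles with every additional pair.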

Multiple Comparisons

• Multiple Comparisons: procedure used to determine exactly which pairs of means are significantly different

• Extension of ANOVA

• Calculates a confidence interval for each pair of means, but adjusts the confidence interval based on how many comparisons need to be made

• Many different techniques

  • Fisher's Least Significant Difference Method

  • Bonferroni Adjustment Method

  • Tukey's Multiple Comparison Method

Note: Your textbook mentions the Bonferroni adjustment method in passing but does a poor job of elaborating. Do not rely on the textbook for notes on multiple comparisons.

Fisher’s Least Significant Difference (LSD) Method

• Fisher's LSD Method: for each possible pairing of means, calculate the following confidence interval:

$$(\bar{x}_i - \bar{x}_j) \pm t_{n-k}\sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$

where $n_i$ and $n_j$ are the sample sizes of the groups being compared, $MSE$ is the mean squared error from the ANOVA, and $n - k$ is the denominator df from the ANOVA

• Interval does not contain zero: Conclude means are significantly different

• Interval contains zero: Conclude means not significantly different

Difference of Two Means vs. Multiple Comparisons

Difference of Two Means

$$t_{n-2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

• More degrees of freedom leads to smaller multiplier

• Weighted sum of two variances

Fisher's LSD Method

$$t_{n-k}\sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$

• Fewer degrees of freedom leads to larger multiplier

• MSE is combination of 3 or more variances

Takeaway: Because $t_{n-2} < t_{n-k}$ and $\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} < MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)$, the margin of error for Fisher's LSD Method will always be wider than that of a confidence interval for the difference of two means.

Review: SAT ANOVA Example

• Scenario: Comparing average math SAT scores for 5 students in each of three different majors: computer science, economics, and history

• Question: Are the mean math SAT scores equal across all three majors?

• Conclusion: Strong evidence at least two means differed

• Statistics:

ANOVA Table:

| Source | SS | DF | MS | F |
|---|---|---|---|---|
| Between | 85,120 | 2 | 42,560 | 8.44 |
| Within | 60,520 | 12 | 5,043 | |
| Total | 145,640 | 14 | | |

| Group | Mean | Sample Size |
|---|---|---|
| Comp. Sci. | 720 | 5 |
| Economics | 640 | 5 |
| History | 536 | 5 |

Example: Multiple Comparisons on SAT Data

• Scenario: Comparing average math SAT scores for 5 students in each of three different majors: computer science, economics, and history

• Question: How many confidence intervals do we need to calculate?

• Answer: ____

  • ________________________________________

  • ________________________________________

  • ________________________________________

Note: With $k$ groups there are $k(k-1)/2$ pairs to compare, so this number grows quickly as the number of groups being compared increases. Use software to find these confidence intervals.
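To make the count concrete, the pairs can be enumerated directly. This short sketch lists every pair among the three majors with the standard library's `itertools.combinations`; the helper name `pair_count` is an assumption for illustration.

```python
from itertools import combinations

majors = ["Computer Science", "Economics", "History"]

# Each unordered pair of majors needs its own confidence interval.
pairs = list(combinations(majors, 2))
for first, second in pairs:
    print(f"{first} vs. {second}")

def pair_count(k):
    """Number of pairwise comparisons among k groups: k * (k - 1) / 2."""
    return k * (k - 1) // 2

print([pair_count(k) for k in range(3, 9)])
```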

Example: Multiple Comparisons on SAT Data

• Scenario: Comparing average math SAT scores for 5 students in each of three different majors: computer science, economics, and history

• Question: What is the margin of error for the confidence intervals?

• Answer:

  • t-Statistic: $t_{12}$ = ____________

  • Mean Squared Error: _________

  • Sample Sizes: ____________________________

  • Margin of Error:

_____________________________________________

Important Note: The margins of error will only be the same if the sample sizes taken from each group are ___________. Otherwise, each confidence interval will have its own _____________________

Example: Multiple Comparisons on SAT Data

• Scenario: Comparing average math SAT scores for 5 students in each of three different majors: computer science, economics, and history

• Question: What are the confidence intervals for the difference in means between each pair?

• Answer:

  • CS vs. Economics: _______________________________ = _____________________

    • CI __________________________

  • CS vs. History: _______________________________ = _____________________

    • CI __________________________

  • Economics vs. History: _______________________________ = _____________________

    • CI __________________________
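A sketch of how software might produce these intervals with Fisher's LSD formula, using the SAT group means, sample sizes, and within group sum of squares from the review table. The critical value 2.179 is an assumed approximation to the 97.5th percentile of a t distribution with 12 degrees of freedom, which presumes a 95% confidence level; the notes do not state the level, and the function name is hypothetical.

```python
from itertools import combinations
from math import sqrt

def lsd_interval(mean_i, mean_j, n_i, n_j, mse, t_crit):
    """Fisher's LSD interval for mean_i - mean_j, given MSE and a t critical value."""
    diff = mean_i - mean_j
    margin = t_crit * sqrt(mse * (1 / n_i + 1 / n_j))
    return diff - margin, diff + margin

# (mean, sample size) for each major, from the SAT review table.
sat = {"Comp. Sci.": (720, 5), "Economics": (640, 5), "History": (536, 5)}
mse = 60_520 / 12   # within group sum of squares / denominator df
t_crit = 2.179      # assumed: approx. t critical value, 12 df, 95% confidence

intervals = {}
for (g1, (m1, n1)), (g2, (m2, n2)) in combinations(sat.items(), 2):
    low, high = lsd_interval(m1, m2, n1, n2, mse, t_crit)
    intervals[(g1, g2)] = (low, high)
    contains_zero = low <= 0 <= high
    verdict = "not significantly different" if contains_zero else "significantly different"
    print(f"{g1} vs. {g2}: ({low:.1f}, {high:.1f}) -> {verdict}")
```

Because all three groups have the same sample size, every interval here shares one margin of error; with unequal sizes each pair would get its own.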

Example: Multiple Comparisons on SAT Data

• Scenario: Comparing average math SAT scores for 5 students in each of three different majors: computer science, economics, and history

• Question: What conclusions can be drawn about the mean math SAT scores of the three majors?

• Answer:

  • Both ____________________ and _____________ majors score significantly higher than __________ majors on the math portion of the SAT.

  • While __________________________ majors had a larger sample mean than _____________ majors, the difference between them was ______________________

Review: Coupon ANOVA Example

• Scenario: Comparing average amount spent for coupon discounts of 15%, 20%, 25%, and 30% off entire purchase

• Question: Do customers spend different amounts before applying the coupon depending on the percentage off they received?

• Conclusion: Little to no evidence of a significant difference

• Statistics:

ANOVA Table:

| Source | SS | DF | MS | F |
|---|---|---|---|---|
| Between | 4,627 | 3 | 1,543 | 1.414 |
| Within | 74,206 | 68 | 1,091 | |
| Total | 78,833 | 71 | | |

| Group | Mean | Sample Size |
|---|---|---|
| 15% Off | 149.56 | 25 |
| 20% Off | 140.64 | 21 |
| 25% Off | 142.71 | 16 |
| 30% Off | 165.30 | 10 |

Example: Mult. Comp. When Failing to Reject ANOVA

• Scenario: Large department store gives out scratch-off coupons at the door for 15%, 20%, 25%, or 30% off the entire purchase. After doing the ANOVA, we concluded there was no evidence any of the mean amounts differed.

• Question: Without calculating out the multiple comparisons confidence intervals, what do we know about all of them?

• Answer: _____________________________

  • ANOVA allowed us to conclude ________________________________________

• Takeaway: If we fail to reject the null hypothesis in ANOVA, there is _________________________________________________________________________

Using Excel

Formulas for 15% vs. 20% in cells C13, D13, and E13. Change the rows to get confidence intervals for other groups.

Example: Group Sample Sizes

• Scenario: Large department store gives out scratch-off coupons at the door for 15%, 20%, 25%, or 30% off the entire purchase

• Question: What relationship exists with the widths of the intervals?

• Answer: Confidence intervals comparing groups with _______________________________________________________

| Groups | Sample Sizes | Interval Width |
|---|---|---|
| 15% vs. 20% | 25 and 21 | 39.025 |
| 15% vs. 25% | 25 and 16 | 42.209 |
| 15% vs. 30% | 25 and 10 | 49.329 |
| 20% vs. 25% | 21 and 16 | 43.749 |
| 20% vs. 30% | 21 and 10 | 50.654 |
| 25% vs. 30% | 16 and 10 | 53.145 |
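The widths in this table can be reproduced, at least approximately, from the coupon ANOVA table with Fisher's LSD formula, since the interval width is twice the margin of error. The sketch assumes a 95% confidence level and uses 1.9955 as an approximate t critical value for 68 degrees of freedom; both are assumptions of this example, as is the function name.

```python
from itertools import combinations
from math import sqrt

def lsd_width(n_i, n_j, mse, t_crit):
    """Width of a Fisher's LSD interval: twice the margin of error."""
    return 2 * t_crit * sqrt(mse * (1 / n_i + 1 / n_j))

mse = 74_206 / 68   # within group sum of squares / denominator df
t_crit = 1.9955     # assumed: approx. t critical value, 68 df, 95% confidence

sizes = {"15%": 25, "20%": 21, "25%": 16, "30%": 10}

widths = {}
for (g1, n1), (g2, n2) in combinations(sizes.items(), 2):
    widths[(g1, g2)] = lsd_width(n1, n2, mse, t_crit)
    print(f"{g1} vs. {g2}: n = {n1} and {n2}, width = {widths[(g1, g2)]:.2f}")
```

Only the sample sizes change from row to row, so any difference in width comes from the $1/n_i + 1/n_j$ term.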