Statistics, Spring 2004. Instructor: 余清祥 (Department of Statistics). Date: March 23, 2004. Week 6: Tests of Goodness of Fit and Independence. Chapter 12 Tests of Goodness of Fit and Independence. Goodness of Fit Test: A Multinomial Population. Tests of Independence: Contingency Tables.
Slide 1
Statistics, Spring 2004
Instructor: 余清祥 (Department of Statistics)
Date: March 23, 2004
Week 6: Tests of Goodness of Fit and Independence
Slide 2
Chapter 12 Tests of Goodness of Fit and Independence
• Goodness of Fit Test: A Multinomial Population
• Tests of Independence: Contingency Tables
• Goodness of Fit Test: Poisson and Normal Distributions
Slide 3
Goodness of Fit Test: A Multinomial Population
1. Set up the null and alternative hypotheses.
2. Select a random sample and record the observed frequency, $f_i$, for each of the $k$ categories.
3. Assuming $H_0$ is true, compute the expected frequency, $e_i$, in each category by multiplying the category probability by the sample size.
(continued)
Slide 4
Goodness of Fit Test: A Multinomial Population
4. Compute the value of the test statistic:
$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$$
5. Reject $H_0$ if $\chi^2 > \chi^2_\alpha$ (where $\alpha$ is the significance level and there are $k - 1$ degrees of freedom).
Slide 5
Example: Finger Lakes Homes (A)
Multinomial Distribution Goodness of Fit Test
Finger Lakes Homes manufactures four models of prefabricated homes: a two-story colonial, a ranch, a split-level, and an A-frame. To help in production planning, management would like to determine if previous customer purchases indicate that there is a preference in the style selected.
Slide 6
Example: Finger Lakes Homes (A)
Multinomial Distribution Goodness of Fit Test
The number of homes sold of each model for 100 sales over the past two years is shown below.

Model     Colonial   Ranch   Split-Level   A-Frame
# Sold       30        20        35           15
Slide 7
Example: Finger Lakes Homes (A)
Multinomial Distribution Goodness of Fit Test
• Notation
  $p_C$ = population proportion that purchase a colonial
  $p_R$ = population proportion that purchase a ranch
  $p_S$ = population proportion that purchase a split-level
  $p_A$ = population proportion that purchase an A-frame
• Hypotheses
  $H_0$: $p_C = p_R = p_S = p_A = .25$
  $H_a$: The population proportions are not $p_C = .25$, $p_R = .25$, $p_S = .25$, and $p_A = .25$
Slide 8
Example: Finger Lakes Homes (A)
Multinomial Distribution Goodness of Fit Test
• Expected Frequencies
  $e_1 = .25(100) = 25$    $e_2 = .25(100) = 25$
  $e_3 = .25(100) = 25$    $e_4 = .25(100) = 25$
• Test Statistic
$$\chi^2 = \frac{(30 - 25)^2}{25} + \frac{(20 - 25)^2}{25} + \frac{(35 - 25)^2}{25} + \frac{(15 - 25)^2}{25} = 1 + 1 + 4 + 4 = 10$$
Slide 9
Example: Finger Lakes Homes (A)
Multinomial Distribution Goodness of Fit Test
• Rejection Rule
  With $\alpha = .05$ and $k - 1 = 4 - 1 = 3$ degrees of freedom, $\chi^2_{.05} = 7.81$.
  Do not reject $H_0$ if $\chi^2 \le 7.81$; reject $H_0$ if $\chi^2 > 7.81$.
  [Figure: chi-square distribution with the rejection region to the right of 7.81]
Slide 10
Example: Finger Lakes Homes (A)
Multinomial Distribution Goodness of Fit Test
• Conclusion
  $\chi^2 = 10 > 7.81$, so we reject the assumption that there is no home style preference, at the .05 level of significance.
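The Finger Lakes Homes (A) calculation can be reproduced with a few lines of Python. This is a minimal sketch using only the figures quoted on the slides; the variable names are illustrative, and the critical value 7.81 is copied from the slides rather than computed.

```python
# Observed sales for the four models (from the example).
observed = {"Colonial": 30, "Ranch": 20, "Split-Level": 35, "A-Frame": 15}

# Under H0 every model has the same purchase probability.
null_probability = 0.25
sample_size = sum(observed.values())  # 100 sales
expected = {model: null_probability * sample_size for model in observed}

# Chi-square statistic: sum over categories of (f_i - e_i)^2 / e_i.
chi_square = sum(
    (observed[m] - expected[m]) ** 2 / expected[m] for m in observed
)

degrees_of_freedom = len(observed) - 1  # k - 1 = 3
critical_value = 7.81                   # chi-square table value at alpha = .05, 3 d.f.

reject_h0 = chi_square > critical_value
print(chi_square, degrees_of_freedom, reject_h0)
```

The statistic comes out to 10 with 3 degrees of freedom, so the comparison against 7.81 leads to rejecting $H_0$, in agreement with the conclusion above.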
Slide 11
Test of Independence: Contingency Tables
1. Set up the null and alternative hypotheses.
2. Select a random sample and record the observed frequency, $f_{ij}$, for each cell of the contingency table.
3. Compute the expected frequency, $e_{ij}$, for each cell:
$$e_{ij} = \frac{(\text{Row } i \text{ Total})(\text{Column } j \text{ Total})}{\text{Sample Size}}$$
Slide 12
Test of Independence: Contingency Tables
4. Compute the test statistic:
$$\chi^2 = \sum_{i} \sum_{j} \frac{(f_{ij} - e_{ij})^2}{e_{ij}}$$
5. Reject $H_0$ if $\chi^2 > \chi^2_\alpha$ (where $\alpha$ is the significance level and with $n$ rows and $m$ columns there are $(n - 1)(m - 1)$ degrees of freedom).
Slide 13
Example: Finger Lakes Homes (B)
Contingency Table (Independence) Test
Each home sold can be classified according to price and to style. Finger Lakes Homes' manager would like to determine if the price of the home and the style of the home are independent variables.
The number of homes sold for each model and price for the past two years is shown below. For convenience, the price of the home is listed as either "$65,000 or less" or "more than $65,000".

Price        Colonial   Ranch   Split-Level   A-Frame
≤ $65,000       18         6        19           12
> $65,000       12        14        16            3
Slide 14
Example: Finger Lakes Homes (B)
Contingency Table (Independence) Test
• Hypotheses
  H0: Price of the home is independent of the style of the home that is purchased
  Ha: Price of the home is not independent of the style of the home that is purchased
• Expected Frequencies
  Observed frequencies with the row and column totals used to compute the expected frequencies:

  Price        Colonial   Ranch   Split-Level   A-Frame   Total
  ≤ $65,000       18         6        19           12        55
  > $65,000       12        14        16            3        45
  Total           30        20        35           15       100

  Each expected frequency is (row total)(column total)/100; for example, the first cell gives (55)(30)/100 = 16.5.
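As an illustration of the expected-frequency rule for a contingency table, the sketch below applies (row total)(column total)/(sample size) to every cell of the Finger Lakes Homes (B) data. The function name `expected_frequencies` is hypothetical and not part of the original slides.

```python
def expected_frequencies(table):
    """Return e_ij = (row i total)(column j total) / sample size for each cell."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(col) for col in zip(*table)]
    sample_size = sum(row_totals)
    return [
        [r * c / sample_size for c in column_totals]
        for r in row_totals
    ]


# Rows: price $65,000 or less, price more than $65,000.
# Columns: Colonial, Ranch, Split-Level, A-Frame.
observed = [
    [18, 6, 19, 12],
    [12, 14, 16, 3],
]

expected = expected_frequencies(observed)
for row in expected:
    print(row)
# prints [16.5, 11.0, 19.25, 8.25]
# prints [13.5, 9.0, 15.75, 6.75]
```

Every expected count is above 5, so no cells need to be combined before computing the test statistic.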
Slide 15
Example: Finger Lakes Homes (B)
Contingency Table (Independence) Test
• Test Statistic
$$\chi^2 = \frac{(18 - 16.5)^2}{16.5} + \frac{(6 - 11)^2}{11} + \cdots + \frac{(3 - 6.75)^2}{6.75} = .1364 + 2.2727 + \cdots + 2.0833 = 9.1486$$
• Rejection Rule
  With $\alpha = .05$ and (2 - 1)(4 - 1) = 3 d.f., $\chi^2_{.05} = 7.81$. Reject $H_0$ if $\chi^2 > 7.81$.
• Conclusion
  Since $\chi^2 = 9.1486 > 7.81$, we reject $H_0$, the assumption that the price of the home is independent of the style of the home that is purchased.
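The full independence test for the Finger Lakes Homes (B) table can be sketched as follows. The helper `independence_test` is a hypothetical name introduced for illustration, and the critical value 7.81 is taken from the slides rather than computed.

```python
def independence_test(table, critical_value):
    """Chi-square test of independence for a table of observed counts.

    Returns (chi_square, degrees_of_freedom, reject_h0).
    """
    row_totals = [sum(row) for row in table]
    column_totals = [sum(col) for col in zip(*table)]
    sample_size = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(table):
        for j, f_ij in enumerate(row):
            # Expected count under independence.
            e_ij = row_totals[i] * column_totals[j] / sample_size
            chi_square += (f_ij - e_ij) ** 2 / e_ij

    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, degrees_of_freedom, chi_square > critical_value


observed = [
    [18, 6, 19, 12],   # price $65,000 or less
    [12, 14, 16, 3],   # price more than $65,000
]

chi_square, df, reject_h0 = independence_test(observed, critical_value=7.81)
print(round(chi_square, 4), df, reject_h0)
```

Summing all eight cell contributions gives about 9.1486 with 3 degrees of freedom, which exceeds 7.81.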
Slide 16
Goodness of Fit Test: Poisson Distribution
1. Set up the null and alternative hypotheses.
2. Select a random sample and
   a. Record the observed frequency, $f_i$, for each of the $k$ values of the Poisson random variable.
   b. Compute the mean number of occurrences, $\mu$.
3. Compute the expected frequency of occurrences, $e_i$, for each value of the Poisson random variable.
(continued)
Slide 17
Goodness of Fit Test: Poisson Distribution
4. Compute the value of the test statistic:
$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$$
5. Reject $H_0$ if $\chi^2 > \chi^2_\alpha$ (where $\alpha$ is the significance level and there are $k - 2$ degrees of freedom).
Slide 18
Example: Troy Parking Garage
Poisson Distribution Goodness of Fit Test
In studying the need for an additional entrance to a city parking garage, a consultant has recommended an approach that is applicable only in situations where the number of cars entering during a specified time period follows a Poisson distribution.
Slide 19
Example: Troy Parking Garage
Poisson Distribution Goodness of Fit Test
A random sample of 100 one-minute time intervals resulted in the customer arrivals listed below. A statistical test must be conducted to see if the assumption of a Poisson distribution is reasonable.

# Arrivals   0   1   2   3   4   5   6   7   8   9  10  11  12
Frequency    0   1   4  10  14  20  12  12   9   8   6   3   1
Slide 20
Example: Troy Parking Garage
Poisson Distribution Goodness of Fit Test
• Hypotheses
  $H_0$: Number of cars entering the garage during a one-minute interval is Poisson distributed.
  $H_a$: Number of cars entering the garage during a one-minute interval is not Poisson distributed.
Slide 21
Example: Troy Parking Garage
Poisson Distribution Goodness of Fit Test
• Estimate of Poisson Probability Function
  Total Arrivals = 0(0) + 1(1) + 2(4) + . . . + 12(1) = 600
  Total Time Periods = 100
  Estimate of $\mu$ = 600/100 = 6
  Hence,
$$f(x) = \frac{6^x e^{-6}}{x!}$$
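The estimate of the Poisson mean and the resulting probability function can be checked with a short script. This is a sketch under the assumption that the mean is estimated as total arrivals divided by total time periods, as in the example; `poisson_pmf` is an illustrative helper name. Probabilities computed this way may differ in the third or fourth decimal place from values read off a printed Poisson table.

```python
import math

# Observed data: number of arrivals per one-minute interval and how often each occurred.
arrivals = list(range(13))                                  # 0, 1, ..., 12
frequency = [0, 1, 4, 10, 14, 20, 12, 12, 9, 8, 6, 3, 1]

total_periods = sum(frequency)                              # number of intervals sampled
total_arrivals = sum(x * f for x, f in zip(arrivals, frequency))
mean_estimate = total_arrivals / total_periods              # estimate of the Poisson mean


def poisson_pmf(x, mu):
    """Poisson probability f(x) = mu^x * e^(-mu) / x!."""
    return mu ** x * math.exp(-mu) / math.factorial(x)


print(total_arrivals, total_periods, mean_estimate)
for x in range(4):
    print(x, round(poisson_pmf(x, mean_estimate), 4))
```

With these data the script reports 600 arrivals over 100 periods, a mean estimate of 6, and probabilities .0025, .0149, .0446, and .0892 for x = 0 through 3.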
Slide 22
Example: Troy Parking Garage
Poisson Distribution Goodness of Fit Test
• Expected Frequencies (n = 100 time periods)

  x     f(x)    nf(x)        x       f(x)     nf(x)
  0    .0025      .25        7      .1389     13.89
  1    .0149     1.49        8      .1041     10.41
  2    .0446     4.46        9      .0694      6.94
  3    .0892     8.92       10      .0417      4.17
  4    .1339    13.39       11      .0227      2.27
  5    .1620    16.20       12      .0155      1.55
  6    .1606    16.06      Total   1.0000    100.00
Slide 23
Example: Troy Parking Garage
Poisson Distribution Goodness of Fit Test
• Observed and Expected Frequencies

  i              f_i     e_i     f_i - e_i
  0 or 1 or 2      5     6.20      -1.20
  3               10     8.92       1.08
  4               14    13.39        .61
  5               20    16.20       3.80
  6               12    16.06      -4.06
  7               12    13.89      -1.89
  8                9    10.41      -1.41
  9                8     6.94       1.06
  10 or more      10     7.99       2.01
Slide 24
Example: Troy Parking Garage
Poisson Distribution Goodness of Fit Test
• Test Statistic
$$\chi^2 = \frac{(-1.20)^2}{6.20} + \frac{(1.08)^2}{8.92} + \cdots + \frac{(2.01)^2}{7.99} = 3.42$$
• Rejection Rule
  With $\alpha = .05$ and $k - p - 1 = 9 - 1 - 1 = 7$ d.f. (where $k$ = number of categories and $p$ = number of population parameters estimated), $\chi^2_{.05} = 14.07$. Reject $H_0$ if $\chi^2 > 14.07$.
• Conclusion
  Since $\chi^2 = 3.42 < 14.07$, we cannot reject $H_0$. There is no reason to doubt the assumption of a Poisson distribution.
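The pooling of sparse categories and the resulting test statistic can be sketched in Python. The expected frequencies below are copied from the example's table rather than recomputed, and `pool_tails` is a hypothetical helper that only merges cells at the two ends of the list (it does not check interior cells), which is sufficient for these data.

```python
def pool_tails(observed, expected, minimum=5.0):
    """Merge cells from each end until both end cells have expected count >= minimum."""
    obs, exp = list(observed), list(expected)
    # Merge the first two cells while the lowest cell is too sparse.
    while len(exp) > 1 and exp[0] < minimum:
        obs[:2] = [obs[0] + obs[1]]
        exp[:2] = [exp[0] + exp[1]]
    # Merge the last two cells while the highest cell is too sparse.
    while len(exp) > 1 and exp[-1] < minimum:
        obs[-2:] = [obs[-2] + obs[-1]]
        exp[-2:] = [exp[-2] + exp[-1]]
    return obs, exp


# Observed counts and tabulated expected counts for x = 0, 1, ..., 12.
observed = [0, 1, 4, 10, 14, 20, 12, 12, 9, 8, 6, 3, 1]
expected = [0.25, 1.49, 4.46, 8.92, 13.39, 16.20, 16.06,
            13.89, 10.41, 6.94, 4.17, 2.27, 1.55]

pooled_obs, pooled_exp = pool_tails(observed, expected)

chi_square = sum((f - e) ** 2 / e for f, e in zip(pooled_obs, pooled_exp))

parameters_estimated = 1                                       # the Poisson mean
degrees_of_freedom = len(pooled_obs) - parameters_estimated - 1  # k - p - 1
critical_value = 14.07                                         # table value at alpha = .05, 7 d.f.

print(pooled_obs)
print(round(chi_square, 2), degrees_of_freedom, chi_square > critical_value)
```

Pooling leaves nine categories (0 to 2, 3 through 9 individually, and 10 or more), and the statistic rounds to 3.42 with 7 degrees of freedom, well below 14.07.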
Slide 25
Goodness of Fit Test: Normal Distribution
1. Set up the null and alternative hypotheses.
2. Select a random sample and
   a. Compute the mean and standard deviation.
   b. Define intervals of values so that the expected frequency is at least 5 for each interval.
   c. For each interval record the observed frequencies.
3. Compute the expected frequency, $e_i$, for each interval.
(continued)
Slide 26
Goodness of Fit Test: Normal Distribution
4. Compute the value of the test statistic:
$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$$
5. Reject $H_0$ if $\chi^2 > \chi^2_\alpha$ (where $\alpha$ is the significance level and there are $k - 3$ degrees of freedom).
Slide 27
Example: Victor Computers
Normal Distribution Goodness of Fit Test
Victor Computers manufactures and sells a general purpose microcomputer. As part of a study to evaluate sales personnel, management wants to determine if the annual sales volume (number of units sold by a salesperson) follows a normal probability distribution.
Slide 28
Example: Victor Computers
Normal Distribution Goodness of Fit Test
A simple random sample of 30 of the salespeople was taken and their numbers of units sold are below.

33   43   44   45   52   52   56   58   63   64
64   65   66   68   70   72   73   73   74   75
83   84   85   86   91   92   94   98  102  105

(mean = 71, standard deviation = 18.54)
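The quoted summary statistics can be verified from the 30 data values. The sketch below assumes the standard deviation is the sample standard deviation (divisor n - 1), which is what `statistics.stdev` computes.

```python
import statistics

units_sold = [
    33, 43, 44, 45, 52, 52, 56, 58, 63, 64,
    64, 65, 66, 68, 70, 72, 73, 73, 74, 75,
    83, 84, 85, 86, 91, 92, 94, 98, 102, 105,
]

sample_mean = statistics.mean(units_sold)
sample_std_dev = statistics.stdev(units_sold)  # sample standard deviation (n - 1 divisor)

print(len(units_sold), sample_mean, round(sample_std_dev, 2))
```

The sample of 30 values has mean 71 and a standard deviation that rounds to 18.54.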
Slide 29
Example: Victor Computers
Normal Distribution Goodness of Fit Test
• Hypotheses
  $H_0$: The population of number of units sold has a normal distribution with mean 71 and standard deviation 18.54.
  $H_a$: The population of number of units sold does not have a normal distribution with mean 71 and standard deviation 18.54.
Slide 30
Example: Victor Computers
Normal Distribution Goodness of Fit Test
• Interval Definition
  To satisfy the requirement of an expected frequency of at least 5 in each interval, we will divide the normal distribution into 30/5 = 6 equal probability intervals.
Slide 31
Example: Victor Computers
Normal Distribution Goodness of Fit Test
• Interval Definition
  [Figure: normal curve centered at 71 and divided into six intervals, each with area 1.00/6 = .1667; the interval boundaries are 53.02, 63.03, 71, 78.97, and 88.98, where 88.98 = 71 + .97(18.54)]
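The interval boundaries can be reproduced with the standard library's `statistics.NormalDist`. This sketch assumes, as the boundary 88.98 = 71 + .97(18.54) suggests, that each z value is rounded to two decimals (as when reading a normal table) before it is converted to units sold; using unrounded z values would shift the boundaries slightly.

```python
from statistics import NormalDist

mean, std_dev = 71, 18.54
intervals = 6                      # 30 / 5 equal-probability intervals

standard_normal = NormalDist()     # mean 0, standard deviation 1
boundaries = []
for i in range(1, intervals):
    # z value with cumulative probability i/6, rounded as in a normal table.
    z = round(standard_normal.inv_cdf(i / intervals), 2)
    boundaries.append(round(mean + z * std_dev, 2))

print(boundaries)
# prints [53.02, 63.03, 71.0, 78.97, 88.98]
```

The rounded z values are -.97, -.43, 0, .43, and .97, which give the five boundaries that separate the six intervals.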
Slide 32
Example: Victor Computers
Normal Distribution Goodness of Fit Test
• Observed and Expected Frequencies

  i                   f_i    e_i    f_i - e_i
  Less than 53.02       6      5        1
  53.02 to 63.03        3      5       -2
  63.03 to 71.00        6      5        1
  71.00 to 78.97        5      5        0
  78.97 to 88.98        4      5       -1
  More than 88.98       6      5        1
  Total                30     30
Slide 33
Example: Victor Computers
Normal Distribution Goodness of Fit Test
• Test Statistic
$$\chi^2 = \frac{(1)^2}{5} + \frac{(-2)^2}{5} + \frac{(1)^2}{5} + \frac{(0)^2}{5} + \frac{(-1)^2}{5} + \frac{(1)^2}{5} = 1.60$$
• Rejection Rule
  With $\alpha = .05$ and $k - p - 1 = 6 - 2 - 1 = 3$ d.f., $\chi^2_{.05} = 7.81$. Reject $H_0$ if $\chi^2 > 7.81$.
• Conclusion
  Since $\chi^2 = 1.60 < 7.81$, we cannot reject $H_0$. There is little evidence to support rejecting the assumption that the population is normally distributed with $\mu = 71$ and $\sigma = 18.54$.
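The Victor Computers test can be sketched end to end: count the observations in each of the six intervals, then compute the statistic. The boundaries and the critical value 7.81 are taken from the example, and the use of `bisect_right` to assign values to intervals is an implementation choice made here for illustration.

```python
from bisect import bisect_right

units_sold = [
    33, 43, 44, 45, 52, 52, 56, 58, 63, 64,
    64, 65, 66, 68, 70, 72, 73, 73, 74, 75,
    83, 84, 85, 86, 91, 92, 94, 98, 102, 105,
]
boundaries = [53.02, 63.03, 71.00, 78.97, 88.98]  # interval boundaries from the example

# Count how many observations fall in each of the six intervals.
observed = [0] * (len(boundaries) + 1)
for value in units_sold:
    observed[bisect_right(boundaries, value)] += 1

# Equal-probability intervals give the same expected count in each.
expected_count = len(units_sold) / len(observed)  # 30 / 6 = 5

chi_square = sum((f - expected_count) ** 2 / expected_count for f in observed)

parameters_estimated = 2                          # mean and standard deviation
degrees_of_freedom = len(observed) - parameters_estimated - 1  # k - p - 1
critical_value = 7.81                             # table value at alpha = .05, 3 d.f.

print(observed)
print(round(chi_square, 2), degrees_of_freedom, chi_square > critical_value)
```

The counts come out as 6, 3, 6, 5, 4, and 6, giving a statistic of about 1.60 with 3 degrees of freedom, which does not exceed 7.81.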
Slide 34
End of Chapter 12