Discrete Choice Modeling
William Greene, Stern School of Business
IFS at UCL, February 11-13, 2004
http://cemmap.ifs.org.uk/resources/files/resources_greene_discrete.shtml
Part 3
Modeling Binary Choice
A Model for Binary Choice
Yes or No decision (Buy/Not buy)
Example: choose to fly or not to fly to a destination when there are alternatives.
Model: Net utility of flying
  $U_{fly} = \alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,\text{Income} + \varepsilon$
Choose to fly if net utility is positive
Data: X = [1, cost, terminal time]
      Z = [income]
      y = 1 if choose fly ($U_{fly} > 0$), 0 if not.
What Can Be Learned from the Data? (A Sample of Consumers, i = 1,…,N)
• Are the attributes “relevant?”
• Predicting behavior
- Individual
- Aggregate
• Analyze changes in behavior when attributes change
Application
210 commuters between Sydney and Melbourne
Available modes = Air, Train, Bus, Car
Observed:
  Choice
  Attributes: Cost, terminal time, other
  Characteristics: Household income
First application: Fly or other
Binary Choice Data

Choose Air   Gen.Cost   Term Time   Income
  1.0000      86.000     25.000     70.000
  .00000      67.000     69.000     60.000
  .00000      77.000     64.000     20.000
  .00000      69.000     69.000     15.000
  .00000      77.000     64.000     30.000
  .00000      71.000     64.000     26.000
  .00000      58.000     64.000     35.000
  .00000      71.000     69.000     12.000
  .00000      100.00     64.000     70.000
  1.0000      158.00     30.000     50.000
  1.0000      136.00     45.000     40.000
  1.0000      103.00     30.000     70.000
  .00000      77.000     69.000     10.000
  1.0000      197.00     45.000     26.000
  .00000      129.00     64.000     50.000
  .00000      123.00     64.000     70.000
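An Econometric Model
Choose to fly iff $U_{fly} > 0$
  $U_{fly} = \alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,\text{Income} + \varepsilon$
  $U_{fly} > 0 \iff \varepsilon > -(\alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,\text{Income})$
Probability model: For any person observed by the analyst,
  $\text{Prob(fly)} = \text{Prob}[\varepsilon > -(\alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,\text{Income})]$
Note the relationship between the unobserved $\varepsilon$ and the outcome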
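A short worked step may help connect the inequality to a probability. It assumes that F denotes the cumulative distribution function of $\varepsilon$ (the notation F is the one used later in these slides); the symmetry condition in the last line is an added assumption that holds for the normal and logistic cases but not for every distribution.

```latex
\begin{aligned}
\text{Prob(fly)}
  &= \text{Prob}\big[\varepsilon > -(\alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,\text{Income})\big] \\
  &= 1 - F\big(-(\alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,\text{Income})\big) \\
  &= F\big(\alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,\text{Income}\big)
     \quad \text{if the density of } \varepsilon \text{ is symmetric about zero.}
\end{aligned}
```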
A Regression-Like Model
[Figure: Pr[Fly] (vertical axis, .0 to 1.0) plotted against the INDEX $\alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,\text{Income}$ (horizontal axis, -3.0 to 3.0).]
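Econometrics
How to estimate $\alpha$, $\beta_1$, $\beta_2$, $\gamma$?
  It’s not regression
  The technique of maximum likelihood
  $\text{Prob}[y=1] = \text{Prob}[\varepsilon > -(\alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,\text{Income})]$
  $\text{Prob}[y=0] = 1 - \text{Prob}[y=1]$
  Requires a model for the probability
  $L = \prod_{y=0} \text{Prob}[y=0] \times \prod_{y=1} \text{Prob}[y=1]$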
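The likelihood can be sketched in a few lines. This is only an illustration under stated assumptions: a logistic F is assumed for Prob[y=1], the function names are hypothetical, the four data rows are the first four rows of the Binary Choice Data slide, and the trial parameter values are arbitrary rather than estimates.

```python
import math

def logistic_cdf(z):
    """Assumed choice of F for this sketch: the logistic distribution."""
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(params, rows, cdf=logistic_cdf):
    """log L = sum over y=1 of log Prob[y=1] + sum over y=0 of log Prob[y=0]."""
    a, b1, b2, c = params
    total = 0.0
    for y, cost, time, income in rows:
        p = cdf(a + b1 * cost + b2 * time + c * income)  # Prob[y=1]
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total

# (Choose Air, Gen.Cost, Term Time, Income): first four rows of the data slide
rows = [
    (1, 86.0, 25.0, 70.0),
    (0, 67.0, 69.0, 60.0),
    (0, 77.0, 64.0, 20.0),
    (0, 69.0, 69.0, 15.0),
]

ll_zero = log_likelihood((0.0, 0.0, 0.0, 0.0), rows)     # every Prob[y=1] = 0.5
ll_trial = log_likelihood((0.0, 0.0, -0.05, 0.0), rows)  # arbitrary trial values
print(ll_zero, ll_trial)
```

Maximum likelihood searches for the parameter values that make this function as large as possible; the sketch only evaluates it at two points and does not perform the search.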
Completing the Model: F($\varepsilon$)
The distribution
  Normal: PROBIT, natural for behavior
  Logistic: LOGIT, allows “thicker tails”
  Gompertz: EXTREME VALUE, asymmetric, underlies the basic logit model for multiple choice
Does it matter?
  Yes, large difference in estimates
  Not much, quantities of interest are more stable.
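The three candidate distributions can be compared numerically. The sketch below uses hypothetical function names; the normal and logistic forms are standard, but the slides do not print the extreme value formula, so the form 1 - exp(-exp(z)) used here is an assumption (the mirror image exp(-exp(-z)) also appears under this name in some references).

```python
import math

def normal_cdf(z):
    """Standard normal CDF (probit)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_cdf(z):
    """Logistic CDF (logit)."""
    return 1.0 / (1.0 + math.exp(-z))

def extreme_value_cdf(z):
    """ASSUMED form of the extreme value (Gompertz) model; not stated in the slides."""
    return 1.0 - math.exp(-math.exp(z))

for z in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(z, round(normal_cdf(z), 4), round(logistic_cdf(z), 4),
          round(extreme_value_cdf(z), 4))
```

The normal and logistic curves both pass through 0.5 at z = 0 and are symmetric around it; the extreme value curve is not, which is the asymmetry the slide refers to. Because the three distributions have different scales, the coefficient estimates differ even when the implied probabilities are similar.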
Estimated Binary Choice Model
+---------------------------------------------+
| Binomial Probit Model                       |
| Maximum Likelihood Estimates                |
| Model estimated: Jan 20, 2004 at 04:08:11PM.|
| Dependent variable                 MODE     |
| Weighting variable                 None     |
| Number of observations              210     |
| Iterations completed                  6     |
| Log likelihood function       -84.09172     |
| Restricted log likelihood     -123.7570     |
| Chi squared                    79.33066     |
| Degrees of freedom                    3     |
| Prob[ChiSqd > value] =         .0000000     |
| Hosmer-Lemeshow chi-squared =  46.96547     |
| P-value=  .00000 with deg.fr. =       8     |
+---------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
          Index function for probability
 Constant      .43877183       .62467004      .702    .4824
 GC            .01256304       .00368079     3.413    .0006  102.647619
 TTME         -.04778261       .00718440    -6.651    .0000  61.0095238
 HINC          .01442242       .00573994     2.513    .0120  34.5476190
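Two entries in this output can be checked by hand. The sketch below recomputes the chi squared statistic as twice the gap between the two log likelihoods, and the b/St.Er. column as coefficient divided by standard error. The variable names are illustrative; all numbers are copied from the output above.

```python
log_l = -84.09172    # Log likelihood function
log_l0 = -123.7570   # Restricted log likelihood (constant only)

# Likelihood ratio statistic; 3 degrees of freedom (GC, TTME, HINC)
chi_squared = 2.0 * (log_l - log_l0)

# (coefficient, standard error) pairs from the probit output
estimates = {
    "Constant": (0.43877183, 0.62467004),
    "GC": (0.01256304, 0.00368079),
    "TTME": (-0.04778261, 0.00718440),
    "HINC": (0.01442242, 0.00573994),
}
z_ratios = {name: b / se for name, (b, se) in estimates.items()}

print(round(chi_squared, 4))
for name, z in z_ratios.items():
    print(name, round(z, 3))
```

The recomputed chi squared agrees with the reported 79.33066 only to about four digits because the restricted log likelihood is printed with seven significant figures.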
Estimated Binary Choice Models

                 LOGIT                  PROBIT               EXTREME VALUE
Variable   Estimate     t-ratio   Estimate     t-ratio   Estimate     t-ratio
Constant    1.78458     1.40591    0.438772    0.702406   1.45189     1.34775
GC          0.0214688   3.15342    0.012563    3.41314    0.0177719   3.14153
TTME       -0.098467   -5.9612    -0.0477826  -6.65089   -0.0868632  -5.91658
HINC        0.0223234   2.16781    0.0144224   2.51264    0.0176815   2.02876
Log-L     -80.9658               -84.0917               -76.5422
Log-L(0)  -123.757               -123.757               -123.757
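The estimates differ a good deal across columns, but the implied probabilities are closer. As a rough check, the sketch below evaluates the logit and probit models at the sample means of GC, TTME and HINC reported in the probit output (102.647619, 61.0095238, 34.5476190). The helper names are hypothetical, and the extreme value column is left out because the slides do not state its functional form.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

# Sample means of GC, TTME, HINC ("Mean of X" column of the probit output)
means = (102.647619, 61.0095238, 34.5476190)

# (constant, GC, TTME, HINC) from the table of estimates
logit_b = (1.78458, 0.0214688, -0.098467, 0.0223234)
probit_b = (0.438772, 0.012563, -0.0477826, 0.0144224)

def index(b, x):
    """Constant plus the sum of coefficient * regressor."""
    return b[0] + sum(bk * xk for bk, xk in zip(b[1:], x))

p_logit = logistic_cdf(index(logit_b, means))
p_probit = normal_cdf(index(probit_b, means))
print(round(p_logit, 3), round(p_probit, 3))
```

The logit coefficients are roughly one and a half to two times the probit ones, yet the two predicted probabilities of flying at the means differ by only a few percentage points.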
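A Regression-Like Model
[Figure: Pr[Fly] (vertical axis, .0 to 1.0) plotted against the INDEX (horizontal axis, -3.0 to 3.0), with the index evaluated at $\alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,(\text{Income}+1)$.]
Effect on predicted probability of an increase in income ($\gamma$ is positive)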
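The size of this effect can be illustrated with the probit estimates reported earlier. The sketch below evaluates the probability at the sample means, raises income by one unit, and compares the change with the derivative approximation c·f(index), where f is the normal density. Evaluating at the means is a choice made for this example, not something the slide specifies, and the names are hypothetical.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

# Probit estimates: constant, GC (cost), TTME (time), HINC (income)
a, b1, b2, c = 0.43877183, 0.01256304, -0.04778261, 0.01442242

# Sample means from the "Mean of X" column
cost, time, income = 102.647619, 61.0095238, 34.5476190

base_index = a + b1 * cost + b2 * time + c * income
p_before = normal_cdf(base_index)
p_after = normal_cdf(base_index + c)       # index with Income + 1

effect = p_after - p_before                # discrete change in Pr[Fly]
approx = c * normal_pdf(base_index)        # derivative-based approximation
print(round(p_before, 4), round(effect, 5), round(approx, 5))
```

Because c is positive the probability rises, but by well under one percentage point per unit of income at this point on the curve.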
How Well Does the Model Fit?
There is no R squared
“Fit measures” computed from log L
  “Pseudo R squared” = 1 - logL/logL0
  Others… - these do not measure fit.
Direct assessment of the effectiveness of the model at predicting the outcome
Fit Measures for Binary Choice
Likelihood Ratio Index
  Bounded by 0 and 1
  Rises when the model is expanded
Cramer (and others)
  $\hat{\lambda} = (\text{Mean } \hat{F} \mid y = 1) - (\text{Mean } \hat{F} \mid y = 0)$
    = reward for correct predictions minus penalty for incorrect predictions
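Cramer's measure is simple to compute once fitted probabilities are available. The sketch below applies it to the eleven observations listed in the individual-predictions table later in these slides (observed Y and Pr[Y=1] from the logit model). It therefore illustrates the arithmetic only and does not reproduce the full-sample value. The function name is hypothetical.

```python
def cramer_lambda(y, p):
    """Mean fitted probability among the ones minus the mean among the zeros."""
    ones = [pi for yi, pi in zip(y, p) if yi == 1]
    zeros = [pi for yi, pi in zip(y, p) if yi == 0]
    return sum(ones) / len(ones) - sum(zeros) / len(zeros)

# Observed Y and Pr[Y=1] for observations 81, 85, ..., 117, 445
y = [0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0]
p = [.0325, .1006, .0644, .6924, .9361, .9202, .0384, .5078, .1024, .0332, .5074]

lam = cramer_lambda(y, p)
print(round(lam, 4))
```

A model with no explanatory power gives the same average probability to both groups, so the measure is near zero; a model that separates the groups well pushes it toward one.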
Fit Measures for the Probit Model
+----------------------------------------+
| Fit Measures for Binomial Choice Model |
| Probit model for variable MODE         |
+----------------------------------------+
| Proportions P0= .723810   P1= .276190  |
| N = 210   N0= 152   N1= 58             |
| LogL = -84.09172  LogL0 = -123.7570    |
| Estrella = 1-(L/L0)^(-2L0/n) = .36583  |
+----------------------------------------+
|    Efron    |  McFadden   | Ben./Lerman|
|    .45620   |   .32051    |   .75897   |
|    Cramer   | Veall/Zim.  |  Rsqrd_ML  |
|    .40834   |   .50682    |   .31461   |
+----------------------------------------+
| Information  Akaike I.C.  Schwarz I.C. |
| Criteria        .83897      189.57187  |
+----------------------------------------+
Pseudo R-squared: the McFadden entry, 1 - LogL/LogL0 = .32051
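Several of these figures depend only on LogL, LogL0, N and the number of parameters, so they can be recomputed as a check. Only the Estrella formula is printed in the output; the formulas used below for McFadden, Rsqrd_ML, Veall/Zimmermann and the two information criteria are the usual textbook definitions, assumed here and then confirmed against the reported values. Efron, Ben./Lerman and Cramer need the individual fitted probabilities and are not recomputed.

```python
import math

log_l, log_l0 = -84.09172, -123.7570
n, k = 210, 4                      # observations; parameters (constant + 3 slopes)
lr = 2.0 * (log_l - log_l0)        # likelihood ratio statistic

mcfadden = 1.0 - log_l / log_l0
estrella = 1.0 - (log_l / log_l0) ** (-2.0 * log_l0 / n)
rsqrd_ml = 1.0 - math.exp(-lr / n)
veall_zim = (lr / (lr + n)) / (-2.0 * log_l0 / (n - 2.0 * log_l0))
akaike = (-2.0 * log_l + 2.0 * k) / n          # reported per observation
schwarz = -2.0 * log_l + k * math.log(n)       # reported as a total

for name, value in [("McFadden", mcfadden), ("Estrella", estrella),
                    ("Rsqrd_ML", rsqrd_ml), ("Veall/Zim.", veall_zim),
                    ("Akaike I.C.", akaike), ("Schwarz I.C.", schwarz)]:
    print(name, round(value, 5))
```

Note that the two information criteria in the output are on different scales: the Akaike value matches the per-observation form, while the Schwarz value matches the unnormalized form.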
Predicting the Outcome
Predicted probabilities
  P = F(a + b1Cost + b2Time + cIncome)
Predicting outcomes
  Predict y=1 if P is large
  Use 0.5 for “large” (more likely than not)
Count successes and failures
Individual Predictions from a Logit Model

Observation  Observed Y  Predicted Y  Residual    x(i)b   Pr[Y=1]
     81        .00000      .00000       .0000   -3.3944    .0325
     85        .00000      .00000       .0000   -2.1901    .1006
     89        1.0000      .00000      1.0000   -2.6766    .0644
     93        1.0000      1.0000       .0000     .8113    .6924
     97        1.0000      1.0000       .0000    2.6845    .9361
    101        1.0000      1.0000       .0000    2.4457    .9202
    105        1.0000      .00000      1.0000   -3.2204    .0384
    109        1.0000      1.0000       .0000     .0311    .5078
    113        .00000      .00000       .0000   -2.1704    .1024
    117        .00000      .00000       .0000   -3.3729    .0332
    445        .00000      1.0000     -1.0000     .0295    .5074
Note two types of errors and two types of successes.
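The columns of this table can be rebuilt from the index alone. The sketch below takes the observed Y and x(i)b values from the table, applies the logistic function to get Pr[Y=1], predicts 1 when the probability exceeds 0.5, and tallies the two kinds of success and the two kinds of error. The function and variable names are illustrative.

```python
import math

# (observation, observed y, index x(i)b) copied from the table above
rows = [
    (81, 0, -3.3944), (85, 0, -2.1901), (89, 1, -2.6766), (93, 1, 0.8113),
    (97, 1, 2.6845), (101, 1, 2.4457), (105, 1, -3.2204), (109, 1, 0.0311),
    (113, 0, -2.1704), (117, 0, -3.3729), (445, 0, 0.0295),
]

def logit_probability(index):
    """Pr[Y=1] under the logit model."""
    return 1.0 / (1.0 + math.exp(-index))

counts = {"true_pos": 0, "true_neg": 0, "false_pos": 0, "false_neg": 0}
results = []
for obs, y, xb in rows:
    p = logit_probability(xb)
    y_hat = 1 if p > 0.5 else 0          # threshold of 0.5
    residual = y - y_hat                 # +1 = missed a 1, -1 = false alarm
    results.append((obs, y, y_hat, residual, p))
    if y_hat == 1:
        counts["true_pos" if y == 1 else "false_pos"] += 1
    else:
        counts["true_neg" if y == 0 else "false_neg"] += 1

for obs, y, y_hat, residual, p in results:
    print(obs, y, y_hat, residual, round(p, 4))
print(counts)
```

Observations 109 and 445 sit almost exactly at the threshold (probabilities of about .508 and .507), so their classification is very sensitive to the choice of cutoff.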
Predictions in Binary Choice
Predict y = 1 if P > P*
  Success depends on the assumed P*
ROC Curve
  Plot % of actual 1s correctly predicted (sensitivity) vs. % of actual 0s incorrectly predicted as 1 (1 - specificity), as P* varies
  The 45° line is no fit. Curvature implies fit.
  Area under the curve compares models
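A minimal sketch of the ROC construction follows, using the eleven observed outcomes and logit probabilities from the individual-predictions table in these slides. The threshold P* is lowered one fitted probability at a time, each step adds a point, and the area is obtained by the trapezoid rule. Function names are hypothetical, and eleven observations are far too few for a meaningful comparison of models.

```python
def roc_points(y, p):
    """(false positive rate, true positive rate) pairs as P* is lowered."""
    pairs = sorted(zip(p, y), reverse=True)
    n_pos = sum(y)
    n_neg = len(y) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        score = pairs[i][0]
        # Handle tied probabilities as one step
        while i < len(pairs) and pairs[i][0] == score:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points

def area_under_curve(points):
    """Trapezoid rule over consecutive ROC points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

y = [0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0]
p = [.0325, .1006, .0644, .6924, .9361, .9202, .0384, .5078, .1024, .0332, .5074]

points = roc_points(y, p)
auc = area_under_curve(points)
print(points)
print(round(auc, 3))
```

An area of 0.5 corresponds to the 45° line; values closer to 1 indicate that the fitted probabilities rank the actual 1s above the actual 0s more often.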
Aggregate Predictions
Frequencies of actual & predicted outcomes
Predicted outcome has maximum probability.
Threshold value for predicting Y=1 = .5000

            Predicted
------  ----------  +  -----
Actual     0     1  |  Total
------  ----------  +  -----
  0      151     1  |    152
  1       20    38  |     58
------  ----------  +  -----
Total    171    39  |    210
Analyzing Predictions
Frequencies of actual & predicted outcomes
Predicted outcome has maximum probability.
Threshold value for predicting Y=1 is P* = .5000.
(This table can be computed with any P*.)

              Predicted
------  --------------------  +  -----
Actual      0          1      |  Total
------  --------------------  +  -----
  0     N(a0,p0)   N(a0,p1)   |  N(a0)
  1     N(a1,p0)   N(a1,p1)   |  N(a1)
------  --------------------  +  -----
Total   N(p0)      N(p1)      |  N
Analyzing Predictions - Success
Sensitivity = % actual 1s correctly predicted = 100N(a1,p1)/N(a1) % [100(38/58)=65.5%]
Specificity = % actual 0s correctly predicted = 100N(a0,p0)/N(a0) % [100(151/152)=99.3%]
Positive predictive value = % predicted 1s that were actual 1s = 100N(a1,p1)/N(p1) % [100(38/39)=97.4%]
Negative predictive value = % predicted 0s that were actual 0s = 100N(a0,p0)/N(p0) % [100(151/171)=88.3%]
Correct prediction = %actual 1s and 0s correctly predicted = 100[N(a1,p1)+N(a0,p0)]/N [100(151+38)/210=90.0%]
Analyzing Predictions - Failures
False positive for true negative = %actual 0s predicted as 1s = 100N(a0,p1)/N(a0) % [100(1/152)=0.658%]
False negative for true positive = %actual 1s predicted as 0s = 100N(a1,p0)/N(a1) % [100(20/58)=34.5%]
False positive for predicted positive = % predicted 1s that were actual 0s = 100N(a0,p1)/N(p1) % [100(1/39)=2.56%]
False negative for predicted negative = % predicted 0s that were actual 1s = 100N(a1,p0)/N(p0) % [100(20/171)=11.7%]
False predictions = %actual 1s and 0s incorrectly predicted = 100[N(a0,p1)+N(a1,p0)]/N [100(1+20)/210=10.0%]
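All of the success and failure percentages follow from the four cells of the prediction table. The sketch below computes them for the table with cells 151, 1, 20 and 38; the function name and dictionary keys are illustrative.

```python
def prediction_measures(n_a0_p0, n_a0_p1, n_a1_p0, n_a1_p1):
    """Success and failure percentages from a 2x2 actual-by-predicted table."""
    n_a0 = n_a0_p0 + n_a0_p1          # actual 0s
    n_a1 = n_a1_p0 + n_a1_p1          # actual 1s
    n_p0 = n_a0_p0 + n_a1_p0          # predicted 0s
    n_p1 = n_a0_p1 + n_a1_p1          # predicted 1s
    n = n_a0 + n_a1
    return {
        "sensitivity": 100.0 * n_a1_p1 / n_a1,
        "specificity": 100.0 * n_a0_p0 / n_a0,
        "positive_predictive_value": 100.0 * n_a1_p1 / n_p1,
        "negative_predictive_value": 100.0 * n_a0_p0 / n_p0,
        "correct_prediction": 100.0 * (n_a1_p1 + n_a0_p0) / n,
        "false_pos_for_true_neg": 100.0 * n_a0_p1 / n_a0,
        "false_neg_for_true_pos": 100.0 * n_a1_p0 / n_a1,
        "false_pos_for_pred_pos": 100.0 * n_a0_p1 / n_p1,
        "false_neg_for_pred_neg": 100.0 * n_a1_p0 / n_p0,
        "false_predictions": 100.0 * (n_a0_p1 + n_a1_p0) / n,
    }

m = prediction_measures(151, 1, 20, 38)
for name, value in m.items():
    print(name, round(value, 2))
```

Each failure measure is 100 minus the matching success measure, which gives a quick consistency check on hand calculations.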
Aggregate Prediction is a Useful Way to Assess the Importance of a Variable
Frequencies of actual & predicted outcomes.
Predicted outcome has maximum probability.
Threshold value for predicting Y=1 = .5000

Model fit without TTME
            Predicted
------  ----------  +  -----
Actual     0     1  |  Total
------  ----------  +  -----
  0      145     7  |    152
  1       48    10  |     58
------  ----------  +  -----
Total    193    17  |    210

Model fit with TTME
            Predicted
------  ----------  +  -----
Actual     0     1  |  Total
------  ----------  +  -----
  0      151     1  |    152
  1       20    38  |     58
------  ----------  +  -----
Total    171    39  |    210
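The comparison can be summarized with two numbers per table. The sketch below computes the share of correct predictions and the share of actual 1s correctly predicted for each model. It also computes a naive benchmark that always predicts the more common outcome; that benchmark is an addition for context and is not part of the slide. Names are illustrative.

```python
# Cells in the order N(a0,p0), N(a0,p1), N(a1,p0), N(a1,p1)
tables = {
    "without TTME": (145, 7, 48, 10),
    "with TTME": (151, 1, 20, 38),
}

summary = {}
for label, (a0p0, a0p1, a1p0, a1p1) in tables.items():
    n = a0p0 + a0p1 + a1p0 + a1p1
    summary[label] = {
        "correct": 100.0 * (a0p0 + a1p1) / n,          # % of all predictions correct
        "sensitivity": 100.0 * a1p1 / (a1p0 + a1p1),   # % of actual 1s found
    }

# Benchmark (an assumption of this example): always predict 0, the larger group
naive_correct = 100.0 * 152 / 210

for label, s in summary.items():
    print(label, round(s["correct"], 1), round(s["sensitivity"], 1))
print("always predict 0:", round(naive_correct, 1))
```

Without TTME the model is right about 74% of the time, barely above the 72% obtained by always predicting "not fly", and it finds only 10 of the 58 flyers. Adding TTME raises the correct share to 90% and finds 38 of the 58.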