Evaluation of public policies: alternative methods
Second week. Topic 1: Intro and Diff in Diff
Orazio Attanasio and Marcos Vera-Hernandez
Los Andes, July 14 - July 25, 2008
Outline
• Step back to look at the evaluation problem and state where we stand.
• Provide basic notation again.
• Difference in Difference
Blundell, R. and M. Costa Dias: “Alternative Approaches to Evaluation in Empirical Microeconomics”, December 2007. http://www.ucl.ac.uk/~uctp39a/Blundell-CostaDias-Dec-2007.pdf
Outline
• Regression Discontinuity Design
• Instrumental Variables
• Control Functions
• Structural Models for Policy Evaluation
The basic evaluation problem
• We want to establish how a policy affects a specific outcome for an individual who receives or is exposed to that policy.
• The ‘policy’ can be understood quite widely.
  – For example, estimating the return to education can be seen as an evaluation problem.
• The basic difficulty lies in establishing the counterfactual:
  – What the outcome would have been in the absence of the policy.
The fundamental evaluation problem: some basic notation
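Potential outcomes:
$$y^1_{it} = \beta + \alpha_i + u_{it}$$
$$y^0_{it} = \beta + u_{it}$$
Observed outcomes:
$$y_{it} = d_{it}\, y^1_{it} + (1 - d_{it})\, y^0_{it} = \beta + \alpha_i d_{it} + u_{it}$$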
The basic evaluation problem
• The ‘counterfactual problem’ cannot be solved at the individual level.
• Randomization is the classic solution.
• It allows us to estimate ‘average treatment effects’ (ATE).
• Or related parameters, such as the average treatment effect on the treated (ATT).
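The role of randomization can be illustrated with a small simulation. This is only a sketch: the baseline of 1.0, the mean effect of 2.0 and the selection rule "participate if u > 0" are invented for illustration and do not come from the slides. Because both potential outcomes are simulated, the true average effect is known and can be compared with the naive treated-minus-untreated difference.

```python
import random

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def naive_difference(units, d):
    """Observed mean outcome of participants minus that of non-participants."""
    treated = [y1 for (_, y0, y1), di in zip(units, d) if di]
    control = [y0 for (_, y0, y1), di in zip(units, d) if not di]
    return mean(treated) - mean(control)

rng = random.Random(0)
units = []
for _ in range(20000):
    u = rng.gauss(0, 1)                # unobservable component
    alpha = 2.0 + rng.gauss(0, 0.5)    # individual effect (made-up distribution)
    y0 = 1.0 + u                       # potential outcome without the policy
    y1 = y0 + alpha                    # potential outcome with the policy
    units.append((u, y0, y1))

# Known only because the simulation generates both potential outcomes.
ate = mean(y1 - y0 for _, y0, y1 in units)

d_random = [rng.random() < 0.5 for _ in units]  # participation by coin flip
d_select = [u > 0 for u, _, _ in units]         # participation correlated with u

est_random = naive_difference(units, d_random)
est_select = naive_difference(units, d_select)
print(f"ATE {ate:.3f} | randomized {est_random:.3f} | self-selected {est_select:.3f}")
```

Under the coin flip the naive difference should land close to the average effect, while under self-selection it also picks up the difference in u between participants and non-participants.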
Estimation of treatment effects
• The alternatives differ in the amount of structure one is willing to impose on the problem.
• Randomization:
  – (Almost) no structure or special assumptions
  – No ability to extrapolate
• Matching:
  – Selection on observables
  – Allows for heterogeneous treatment effects
  – No unobservable differences between treatment and control
• Difference in Difference:
  – Allows for unobservable differences, provided they are constant through time
Estimation of treatment effects
• Instrumental variables:
  – Model participation to correct selection bias
  – Homogeneous effects?
• Control functions:
  – Model participation
  – Allow more naturally for heterogeneous effects
  – Allow more easily for non linear models
• Structural models:
  – Strong assumptions
  – Allow extrapolation and simulations of program changes
The basic evaluation problem: additional notation
The basic evaluation problem
• The ‘counterfactual problem’ is common to many sciences, not only social sciences.
• Why is it particularly difficult in social sciences?
  – Randomization is seldom available
  – There are good reasons to believe that in many situations there is correlation between u and d
Homogeneous v heterogeneous treatment
• Homogeneous effects: the policy has the same impact on every individual.
• Heterogeneous effects: the impact varies across individuals.
Difference in difference and natural experiments
• Natural experiments are changes in legislation or other historical accidents that can be interpreted as ‘random’.
• That is, they can be interpreted as affecting participation without affecting outcomes directly.
• They can therefore be used to create ‘control’ groups.
Changes over time
• Treated individuals observed before the program can constitute a ‘control’ for themselves.
• We can interpret a before/after comparison as identifying treatment effects if nothing other than the program changes the outcome variable.
• Of course this is a very strong and untestable assumption.
Difference in difference
• But suppose we have one group affected by the program and one that is not, and we observe them twice, once before and once after the program.
• We can use the pre-program differences to estimate permanent differences between the groups.
• The comparison after the program can then be ‘corrected’ to identify the effect of the program.
• Notice that the diff in diff estimator can be obtained both when we have longitudinal data and when we have repeated cross sections from representative groups.
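The correction described above only needs the four group-by-period means, which is why repeated cross sections are enough. A minimal sketch follows; the outcome values are made up for illustration.

```python
from statistics import mean

# Made-up outcomes; with repeated cross sections the individuals in the
# "before" and "after" samples need not be the same people.
treated_before = [10.0, 12.0]
treated_after = [15.0, 17.0]
control_before = [8.0, 10.0]
control_after = [9.0, 11.0]

# Permanent difference between the groups, measured before the program.
pre_gap = mean(treated_before) - mean(control_before)
# Raw difference after the program: program effect plus permanent difference.
post_gap = mean(treated_after) - mean(control_after)

did = post_gap - pre_gap

# For comparison: a simple before/after estimate on the treated group,
# which also absorbs whatever changed over time for everyone.
before_after = mean(treated_after) - mean(treated_before)

print(did, post_gap, before_after)
```

With these numbers the post-program comparison gives 6, the before/after comparison gives 5, and the diff in diff gives 4, because it removes both the permanent gap of 2 and the common time change of 1.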
Difference in difference
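Policy change at $t = k$. Data at $t_0$ and $t_1$, with $t_0 < k < t_1$.
$d_i$ identifies the treatment group.
$$d_{it} = \begin{cases} 1 & \text{if } d_i = 1 \text{ and } t > k \\ 0 & \text{otherwise} \end{cases}$$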
Difference in difference
Assume:
Difference in difference
• Notice that the previous equation can be estimated by OLS consistently on longitudinal data.
• However, the same equation can be estimated from repeated cross sections.
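As an illustration of estimation from repeated cross sections, the sketch below assumes the usual saturated specification, y = b0 + b1·d + b2·t + b3·d·t + error, where b3 is the diff in diff coefficient. This specification, the simulated coefficients and the helper `ols` are assumptions made for the example, not taken from the slides. Fresh individuals are drawn in every group and period, and the interaction coefficient is compared with the diff in diff of the four cell means.

```python
import random

def ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y, Gauss-Jordan solve."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

rng = random.Random(1)
TRUE_EFFECT = 2.0  # made-up program effect
X, y, cells = [], [], {}
for t in (0, 1):
    for d in (0, 1):
        for _ in range(2000):  # a new sample of individuals in every cell
            yi = 1.0 + 0.5 * d + 0.3 * t + TRUE_EFFECT * d * t + rng.gauss(0, 1)
            X.append([1.0, d, t, d * t])
            y.append(yi)
            cells.setdefault((d, t), []).append(yi)

b = ols(X, y)
m = {key: sum(v) / len(v) for key, v in cells.items()}
did_means = (m[1, 1] - m[1, 0]) - (m[0, 1] - m[0, 0])
print(f"interaction coefficient {b[3]:.4f}, diff in diff of means {did_means:.4f}")
```

Because the regression is saturated in group and period, the interaction coefficient coincides with the diff in diff of cell means, so no individual needs to be observed twice.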
Problems with DiD
• Ashenfelter’s dip: participants’ outcomes (e.g. earnings) often fall temporarily just before they enter a program.
• DiD does not control for such temporary shocks that affect participation.
• This can seriously bias the estimator.
Problems with DiD: different macro trends
• It is possible that the two groups are affected differently by time trends
• This would bias the estimator (see graphical example)
• We can check for this with pre-program data.
• We could also ‘correct’, under some assumptions, for this bias.
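A check with pre-program data can be sketched as a ‘placebo’ diff in diff between two periods in which nobody is treated. The group means below are invented, and the correction shown is only one possibility: it assumes the gap in trends between the groups is the same in every period.

```python
def did(treated, control, start, end):
    """Change in the treated mean minus change in the control mean."""
    return (treated[end] - treated[start]) - (control[end] - control[start])

# Made-up group means. Periods -1 and 0 are both before the program, 1 is after.
treated = {-1: 10.0, 0: 11.0, 1: 15.0}
control = {-1: 8.0, 0: 8.5, 1: 9.0}

# Placebo: no program between -1 and 0, so this should be close to zero
# if the two groups share the same trend.
placebo = did(treated, control, -1, 0)

# Standard diff in diff around the program start.
estimate = did(treated, control, 0, 1)

# One possible correction, valid only if the differential trend is constant.
corrected = estimate - placebo

print(placebo, estimate, corrected)
```

A placebo estimate clearly different from zero is a warning that the groups were already on different trends before the program.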
Problems with DID
• Compositional changes when longitudinal data are not available
Non linear DID
• The DiD idea relies on additive components to get the outcome variable.
• This assumption can be unrealistic, especially in the case of dummy variables.
Non linear DID
$o_{it}$ follows a distribution $F$ with inverse $F^{-1}$.
Non linear DID
• For simplicity, assume the absence of heterogeneous fixed effects
  – (But still maintain heterogeneous impacts)
• Under normality (F is the normal distribution), the model becomes a standard probit
• It is tempting to obtain the diff in diff by running a simple probit on the pooled data (t=0 and t=1), with time dummies, group dummies and their interaction.
• Could the interaction coefficient be interpreted as the impact as in the linear (OLS) case?
• NO!!!!
Non linear DID
Requires distributional assumptions on this term
You cannot use a standard probit
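A small numerical sketch shows why the interaction coefficient is not the impact. It assumes a probit-type index with group, time and interaction terms, F standard normal, and no heterogeneity; all coefficient values are invented. The last step applies the inverse of F to the cell probabilities, differences on the index scale, and transforms back, which is one way to use the invertibility of F.

```python
from statistics import NormalDist

F = NormalDist().cdf          # assumed F: standard normal
F_inv = NormalDist().inv_cdf  # its inverse

def cell_prob(d, t, inter, const=-1.0, group=1.0, time=0.5):
    """P(y = 1) for group d in period t under the assumed index model."""
    return F(const + group * d + time * t + inter * d * t)

INTER = 0.4  # made-up interaction coefficient on the index
p = {(d, t): cell_prob(d, t, INTER) for d in (0, 1) for t in (0, 1)}

# Impact on the treated in probability terms: with vs without the interaction.
true_effect = p[1, 1] - cell_prob(1, 1, 0.0)

# Linear diff in diff applied directly to the probabilities.
linear_did = (p[1, 1] - p[1, 0]) - (p[0, 1] - p[0, 0])

# Same linear diff in diff when the program has no effect at all.
q = {(d, t): cell_prob(d, t, 0.0) for d in (0, 1) for t in (0, 1)}
spurious = (q[1, 1] - q[1, 0]) - (q[0, 1] - q[0, 0])

# Diff in diff on the index scale, then mapped back through F.
counterfactual_index = F_inv(p[1, 0]) + (F_inv(p[0, 1]) - F_inv(p[0, 0]))
index_did = p[1, 1] - F(counterfactual_index)

print(f"coefficient {INTER}, true {true_effect:.4f}, "
      f"linear {linear_did:.4f}, spurious {spurious:.4f}, index {index_did:.4f}")
```

In this example the coefficient, the linear diff in diff and the true impact all differ, the linear diff in diff is non-zero even without any program effect, and only the index-based calculation recovers the impact.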
Additional tricks
• Combining matching and DID
• Could DiD be the right thing to do in the case of a randomized experiment?
• Would one want to add controls to a randomized experiment?
Combining matching and DiD
• One could allow the constant β to be a function of observables.
• One could also assume that the effects are a function of these observables.
• Therefore one could use matching techniques on first differenced data
  – (if longitudinal data are available)
• Or use repeated cross sections.
Would DiD be the right thing to do in a randomized trial?
• Yes, if the randomization sample turns out to be small.
  – The treatment and control samples could be imbalanced out of bad luck.
• What about efficiency arguments?
  – It depends. It could go either way.
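$$u_{it} = \eta_i + o_{it}; \qquad \Delta u_{it} = \Delta o_{it}$$
$$Var(u_{it}) = Var(\eta_i) + Var(o_{it})$$
$$Var(\Delta u_{it}) = 2\,Var(o_{it}) - 2\,Cov(o_{it}, o_{it-1})$$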
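Why efficiency “could go either way” can be seen by comparing the error variance in levels with the error variance in first differences. The sketch assumes the error is the sum of a permanent individual component and a transitory shock with the same variance in both periods; the function names and the numbers in the two cases are hypothetical.

```python
def var_levels(var_permanent, var_shock):
    """Error variance when comparing outcome levels after randomization."""
    return var_permanent + var_shock

def var_first_difference(var_shock, cov_shock):
    """Error variance after differencing: the permanent component drops out."""
    return 2 * var_shock - 2 * cov_shock

def did_is_more_precise(var_permanent, var_shock, cov_shock):
    return var_first_difference(var_shock, cov_shock) < var_levels(var_permanent, var_shock)

# Case A: large permanent heterogeneity, so differencing removes a lot of noise.
case_a = did_is_more_precise(var_permanent=4.0, var_shock=1.0, cov_shock=0.2)

# Case B: little permanent heterogeneity and uncorrelated shocks,
# so differencing mostly doubles the transitory noise.
case_b = did_is_more_precise(var_permanent=0.2, var_shock=1.0, cov_shock=0.0)

print(case_a, case_b)
```

Differencing pays off when the permanent component is large relative to the transitory shocks, or when the shocks are strongly positively correlated over time.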
Example of diff in diff evaluation: The evaluation of Familias en Accion
• Familias en Accion is a conditional cash transfer (CCT) program started in Colombia in 2002 with a loan from the World Bank and the IADB.
• The evaluation was one of the conditions imposed by the loan
• The antecedent was PROGRESA in Mexico, where the expansion phase of the program was used to build an evaluation based on the randomized allocation of the program across communities.
Example of diff in diff evaluation: The evaluation of Familias en Accion
• The expansion phase in Colombia was also quite long.
• However, for political reasons it was not possible to randomize across communities.
• It was decided to use a treatment/control comparison and use diff in diff combined with matching techniques.
Example of diff in diff evaluation: The evaluation of Familias en Accion
• The treatment municipios were chosen by the government.
• Treatment list (627 municipalities):
  – Municipios with fewer than 100k inhabitants (no capitals),
  – with enough infrastructure,
  – with a complete SISBEN registry as of Dec. 1999,
  – with a bank!
• Treatment sample:
  – Representative list of municipalities
  – Stratified by region (5 regions) and by level of infrastructure (5)
• Control sample:
  – Municipalities in the same strata used to stratify the treatment sample that were ‘similar’ to the treatment municipalities actually in the sample.
Example of diff in diff evaluation: The evaluation of Familias en Accion
• The government became very anxious to start the program quickly in 2002 (approaching elections? End of mandate effects? Crisis?)
• In early 2002 it became clear that it would be difficult to collect a baseline before the start of the program in the treatment municipalities.
• Two decisions:
  – Strong negotiations with the government (intermediated by the WB and the IADB) to prevent the start of the program in at least some municipalities
  – Introduction of retrospective questions in the questionnaire
• Possible problems:
  – Low power (reduction in sample size)
  – Anticipation effects
Example of diff in diff evaluation: The evaluation of Familias en Accion
• For the evaluation we collected three surveys.
  – Baseline 2002 (but some of the treatment group were already receiving payments)
  – First follow-up 2003 (attrition 6%)
  – Second follow-up 2005/6 (attrition 10%)
• Very long household survey (11,500 hh) complemented with surveys of schools, hospitals, health centres, localities, HCs.
Difference in difference
Y(i,t) = e D + bt + gi + u
where D = 1 for the treatment group once the program is in place, bt is a time effect and gi is a group effect.

Example 1: gT = gC
         i = Treatment             i = Control
t=1      YT1 = b1 + gC + u         YC1 = b1 + gC + u
t=2      YT2 = e + b2 + gC + u     YC2 = b2 + gC + u
e = YT2 - YC2

Example 2: gT ≠ gC
         i = Treatment             i = Control
t=1      YT1 = b1 + gT + u         YC1 = b1 + gC + u
t=2      YT2 = e + b2 + gT + u     YC2 = b2 + gC + u
YT2 - YC2 = e + gT - gC
YT1 - YC1 = gT - gC
e = (YT2 - YC2) - (YT1 - YC1)
Necessary extensions
• The program started operating in some treatment municipalities before the baseline.
  – Although this allowed an early evaluation of impacts, it now gives rise to methodological and efficiency problems.
• Differences in the intensity of the intervention.
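TCP and TSP
TCP: treatment municipalities already receiving payments at baseline. TSP: treatment municipalities not yet receiving payments at baseline.
                   TSP                    TCP    Control
Pre-baseline       D=0                    D=0    D=0
Baseline           D=0 (pre-registered)   D=1    D=0
First follow-up    D=1                    D=1    D=0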
TCP and TSP (cont)
We have both TCP and TSP. For some variables we have pre-baseline frequencies.
                   TSP                     TCP                     Control
Pre-baseline       YT0 = b0 + gT + u       YP0 = b0 + gT + u       YC0 = b0 + gC + u
Baseline           YT1 = b1 + gT + u       YP1 = e + b1 + gT + u   YC1 = b1 + gC + u
First follow-up    YT2 = e + b2 + gT + u   YP2 = e + b2 + gT + u   YC2 = b2 + gC + u
TSP and TCP (cont)
                   TSP                     TCP                     Control
Baseline           YT1 = b1 + gT + u       YP1 = e + b1 + gT + u   YC1 = b1 + gC + u
First follow-up    YT2 = e + b2 + gT + u   YP2 = e + b2 + gT + u   YC2 = b2 + gC + u
e = (YT2 - YC2) - (YT1 - YC1)
e = (YT2 - YP2) - (YT1 - YP1)
These estimates can be combined to increase efficiency.
Possible early effects (cont)
                   TSP                      TCP                      Control
Baseline           YT1 = b1 + gT + u        YP1 = e1 + b1 + gT + u   YC1 = b1 + gC + u
First follow-up    YT2 = e1 + b2 + gT + u   YP2 = e2 + b2 + gT + u   YC2 = b2 + gC + u
e1 = (YT2 - YC2) - (YT1 - YC1)
e2 = (YP2 - YC2) - (YT1 - YC1)
e1 is the effect of the program after one period; e2 is the effect of the program after two periods.
Possible early effects (cont): intensity effects
• This framework can be generalized to allow the effect to be a function of the number of payments.
• For some outcomes, there is evidence that this matters.
• Outcomes for which the cumulative effect matters: nutritional status.
Possible early effects (cont): anticipation effects
                   TSP                     TCP                     Control
Pre-baseline       YT0 = b0 + gT + u       YP0 = b0 + gT + u       YC0 = b0 + gC + u
Baseline           YT1 = b1 + a + gT + u   YP1 = e + b1 + gT + u   YC1 = b1 + gC + u
First follow-up    YT2 = e + b2 + gT + u   YP2 = e + b2 + gT + u   YC2 = b2 + gC + u
e = (YT2 - YC2) - (YT0 - YC0)
e = (YP2 - YC2) - (YP0 - YC0)
e = (YP1 - YC1) - (YP0 - YC0)
a = (YT1 - YC1) - (YT0 - YC0)
a is the anticipation effect in TSP municipalities at baseline, when households are pre-registered but not yet paid.
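The estimators for the program effect e and the anticipation effect a can be checked on expected cell means. In the sketch below the values of the time effects, group effects, e and a are all invented; T stands for TSP (paid from the first follow-up, possibly anticipating at baseline), P for TCP (paid from baseline) and C for control.

```python
b = {0: 1.0, 1: 1.5, 2: 2.5}          # time effects: pre-baseline, baseline, follow-up
g = {"T": 3.0, "P": 3.0, "C": 2.0}    # group effects (TSP and TCP share gT)
e, a = 4.0, 0.5                       # program effect and anticipation effect

def expected_outcome(group, t):
    y = b[t] + g[group]
    if group == "P" and t >= 1:
        y += e   # TCP: receiving payments from baseline onwards
    if group == "T" and t == 2:
        y += e   # TSP: receiving payments at first follow-up
    if group == "T" and t == 1:
        y += a   # TSP: pre-registered at baseline, may anticipate
    return y

Y = {(grp, t): expected_outcome(grp, t) for grp in "TPC" for t in (0, 1, 2)}

def did(grp, t_after, t_before=0):
    """Diff in diff of group grp against control between two periods."""
    return (Y[grp, t_after] - Y["C", t_after]) - (Y[grp, t_before] - Y["C", t_before])

e_from_tsp = did("T", 2)        # (YT2 - YC2) - (YT0 - YC0)
e_from_tcp = did("P", 2)        # (YP2 - YC2) - (YP0 - YC0)
e_from_tcp_early = did("P", 1)  # (YP1 - YC1) - (YP0 - YC0)
a_hat = did("T", 1)             # (YT1 - YC1) - (YT0 - YC0)

# Using the baseline as the "before" period for TSP is contaminated by anticipation.
contaminated = did("T", 2, t_before=1)

print(e_from_tsp, e_from_tcp, e_from_tcp_early, a_hat, contaminated)
```

With pre-baseline data as the reference period all three comparisons recover e and the fourth recovers a, whereas taking the baseline as the reference period for TSP yields e - a.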
Growth in average height
Variable                                Rural               Urban
Height-for-age (standard deviations)    0.167** (0.08)      0.007 (0.114)
Weight-for-age (standard deviations)    0.185*** (0.068)    0.024 (0.107)
* Significance level of 10% or less; ** 5% or less; *** 1% or less
Positive impacts in rural areas, but none in urban areas.
Growth in average height
Variable                                      Rural 0-36          Rural 36-84         Urban 0-36         Urban 36-84
Height-for-age (standard deviations)          0.226 (0.195)       0.141* (0.076)      -0.151 (0.214)     0.039 (0.124)
Weight-for-age (standard deviations)          0.255* (0.149)      0.176** (0.073)     -0.331* (0.178)    0.068 (0.104)
Probability of chronic malnutrition           -0.154*** (0.041)   0.003 (0.031)       0.014 (0.015)      -0.001 (0.032)
Probability of global malnutrition            -0.011 (0.009)      0.012** (0.006)     0.00 (0.1)         0.01 (0.008)
Probability of risk of chronic malnutrition   0.016 (0.095)       -0.106** (0.049)    -0.007 (0.114)     -0.03 (0.062)
Probability of risk of global malnutrition    -0.16* (0.086)      -0.128*** (0.041)   0.162** (0.07)     -0.047 (0.054)
Second follow-up: impact on school attendance, ages 8-13 and 14-17
             Urban                                     Rural
             Pure treatment      All treatment         Pure treatment      All treatment
Age 8-13     0.0133 (0.0067)*    0.0119 (0.0064)       0.0293 (0.0073)**   0.0282 (0.0076)**
Age 14-17    0.0526 (0.0195)**   0.0500 (0.0185)**     0.0712 (0.0275)**   0.0805 (0.0243)**
Notes: Parametric estimates using the four periods. * Significance level of 5% or less; ** significance level of 1% or less.
Is consumption affected by the program? DD, treatment vs control
                              First follow-up         Second follow-up
Total consumption, urban      52,576 (13,551)***      25,636.3 (18,868.2)
Total consumption, rural      53,831.1 (18,888)***    39,177.8 (15,701.2)**
Food consumption, urban       37,018 (9,898)***       21,813.6 (11,194.9)*
Food consumption, rural       41,956.6 (16,075)***    28,418.2 (11,331.7)**
* p<0.1, ** p<0.05, *** p<0.01
Note: The second follow-up includes the municipalities that moved from control to treatment between baseline and second follow-up.
Is consumption affected by the program? DD, treatment vs control
                                  First follow-up     Second follow-up
Log total consumption, urban      0.147 (0.034)***    0.092 (0.040)**
Log total consumption, rural      0.145 (0.051)***    0.112 (0.040)***
Log food consumption, urban       0.158 (0.034)***    0.111 (0.039)***
Log food consumption, rural       0.157 (0.056)***    0.116 (0.039)***
* p<0.1, ** p<0.05, *** p<0.01
Note: The second follow-up includes the municipalities that moved from control to treatment between baseline and second follow-up.