Evaluation of public policies: alternative methods. Second week. Topic 1. Intro and Diff in Diff. Orazio Attanasio and Marcos Vera-Hernandez, Los Andes

Evaluation of public policies: alternative methods.

Second week. Topic 1. Intro and Diff in Diff

Orazio Attanasio and Marcos Vera-Hernandez

Los Andes, July 14 - July 25, 2008

Outline

• Step back to look at the evaluation problem and state where we stand.

• Provide basic notation again.

• Difference in Difference

Blundell, R. and M. Costa Dias: “Alternative Approaches to Evaluation in Empirical Microeconomics”, December 2007. http://www.ucl.ac.uk/~uctp39a/Blundell-CostaDias-Dec-2007.pdf

Outline

• Regression Discontinuity Design

• Instrumental Variables

• Control Functions

• Structural Models for Policy Evaluation

The basic evaluation problem

• We want to establish how a specific outcome for an individual who receives, or is exposed to, a public policy is affected by that policy.

• The ‘policy’ can be interpreted quite broadly.
  – For example, estimating the return to education can be seen as an evaluation problem.

• The basic difficulty is establishing the counterfactual:
  – What the outcome would have been in the absence of the policy.

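The fundamental evaluation problem: some basic notation

Potential outcomes:

$$y^1_{it} = \beta + \alpha_i + u_{it}$$

$$y^0_{it} = \beta + u_{it}$$

Observed outcomes:

$$y_{it} = d_{it}\, y^1_{it} + (1 - d_{it})\, y^0_{it} = \beta + \alpha_i d_{it} + u_{it}$$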

The fundamental evaluation problem: some basic notation


• The ‘counterfactual problem’ cannot be solved at the individual level.

• Randomization is the classic solution.

• This would allow us to estimate ‘average treatment effects’.

• Or related parameters, such as the average treatment effect on the treated (ATT).
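The points above can be illustrated with a small simulation. This is only a sketch under assumed, made-up parameter values (a constant of 1, an average effect of 2, standard normal unobservables); the names `simulate` and `naive_difference` are hypothetical. It compares the treated/untreated difference in means when participation is correlated with the unobservable against the same difference under random assignment.

```python
import random
import statistics


def simulate(n=20000, seed=0):
    """Draw potential outcomes y0 = beta + u and y1 = beta + alpha_i + u."""
    rng = random.Random(seed)
    beta = 1.0                                   # assumed constant
    rows = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)                  # unobservable
        alpha = 2.0 + rng.gauss(0.0, 0.5)        # heterogeneous effect, mean 2 (assumed)
        y0 = beta + u
        y1 = beta + alpha + u
        d_selected = 1 if u > 0 else 0           # self-selection: d correlated with u
        d_random = 1 if rng.random() < 0.5 else 0  # randomized assignment
        rows.append((y0, y1, d_selected, d_random))
    return rows


def naive_difference(rows, d_index):
    """Mean observed outcome of the treated minus that of the untreated."""
    treated = [r[1] for r in rows if r[d_index] == 1]   # y1 is observed if d = 1
    control = [r[0] for r in rows if r[d_index] == 0]   # y0 is observed if d = 0
    return statistics.mean(treated) - statistics.mean(control)


rows = simulate()
true_ate = statistics.mean(r[1] - r[0] for r in rows)
biased = naive_difference(rows, d_index=2)       # selection on u
randomized = naive_difference(rows, d_index=3)   # random assignment
print(round(true_ate, 2), round(biased, 2), round(randomized, 2))
```

Under self-selection the naive comparison overstates the effect because treated individuals have systematically higher u; under randomization it should be close to the average treatment effect.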

Estimation of treatment effects

• Different alternatives differ in the amount of structure one wants to put into the problem.

• Randomization:
  – (Almost) no structure or special assumptions
  – No ability to extrapolate

• Matching:
  – Selection on observables
  – Allows for heterogeneous treatment effects
  – No unobservable differences between treatment and control

• Difference in Difference:
  – Allows for unobservable differences, provided they are constant through time

Estimation of treatment effects

• Instrumental variables:
  – Model participation to correct selection bias
  – Homogeneous effects?

• Control functions:
  – Model participation
  – Allow more naturally for heterogeneous effects
  – Allow more easily for non-linear models

• Structural models:
  – Strong assumptions
  – Allow extrapolation and simulations of program changes

The basic evaluation problem: additional notation


• The ‘counterfactual problem’ is common to many sciences, not only social sciences.

• Why is it particularly difficult in social sciences?
  – Randomization is seldom available
  – There are good reasons to believe that in many situations there is correlation between u and d


Homogeneous v heterogeneous treatment

• Homogeneous effects: the treatment effect is the same for all individuals.

• Heterogeneous effects: the treatment effect varies across individuals.

Difference in difference and natural experiments

• Natural experiments are changes in legislation or other historical accidents that can be interpreted as ‘random’.

• That is, they can be interpreted as affecting participation without affecting outcomes.

• They can therefore be used to create ‘control’ groups.

Changes over time

• Treated individuals observed before the program can constitute a ‘control’ for themselves.

• We can interpret a before/after comparison as identifying treatment effects if nothing other than the program changes the outcome variable.

• Of course this is a very strong and untestable assumption.

Difference in difference

• But suppose we have one group affected by the program and one that is not, and we observe them twice, once before and once after the program:

• We can use the pre-program differences to estimate permanent differences between the groups.

• The comparison after the program can then be ‘corrected’ to identify the effect of the program.

• Notice that the diff in diff estimator can be obtained both when we have longitudinal data and when we have repeated cross sections from representative groups.
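The correction described above amounts to subtracting the pre-program gap between the groups from the post-program gap. A minimal sketch, with made-up numbers and a hypothetical helper `did_estimate`, could look as follows. Because it only uses group means in each period, it applies equally to longitudinal data and to repeated cross sections.

```python
def did_estimate(treat_before, treat_after, control_before, control_after):
    """Difference in differences computed from the four group-by-period means."""
    def mean(values):
        return sum(values) / len(values)

    post_gap = mean(treat_after) - mean(control_after)    # program effect + permanent gap
    pre_gap = mean(treat_before) - mean(control_before)   # permanent gap only
    return post_gap - pre_gap


# Hypothetical data: permanent gap of 3, common time change of 2, program effect of 5.
treat_before = [10, 12, 14]
control_before = [7, 9, 11]
control_after = [9, 11, 13]
treat_after = [17, 19, 21]

effect = did_estimate(treat_before, treat_after, control_before, control_after)
print(effect)
```

In this constructed example the simple post-program comparison would give 8, mixing the program effect with the permanent gap of 3; the diff in diff removes the gap.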


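Policy change at $t = k$. Data at $t_0$ and $t_1$, with $t_0 < k < t_1$.

$d_i$ identifies the treatment group.

$$d_{it} = \begin{cases} 1 & \text{if } d_i = 1 \text{ and } t > k \\ 0 & \text{otherwise} \end{cases}$$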

Assume:





• Notice that the previous equation can be estimated by OLS consistently on longitudinal data.

• However, the same equation can be estimated from repeated cross sections.
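As an illustration of the repeated cross-section case, the sketch below assumes the estimating equation takes the usual form with a constant, a group dummy, a time dummy and their interaction (this specification, the parameter values and the helper names `solve`, `ols` and `cell_mean` are assumptions, not taken from the slides). Each group-by-period cell is filled with fresh draws, so no individual is observed twice, and the OLS interaction coefficient is compared with the diff in diff of cell means.

```python
import random


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(x_rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(x_rows[0])
    xtx = [[sum(r[i] * r[j] for r in x_rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x_rows, y)) for i in range(k)]
    return solve(xtx, xty)


rng = random.Random(1)
x_rows, y = [], []
for group in (0, 1):            # 1 = treatment group
    for post in (0, 1):         # 1 = after the policy change
        for _ in range(500):    # fresh individuals in every cell
            outcome = 1.0 + 0.5 * group + 0.3 * post + 2.0 * group * post + rng.gauss(0, 1)
            x_rows.append([1.0, group, post, group * post])
            y.append(outcome)

coef = ols(x_rows, y)           # [constant, group, time, interaction]


def cell_mean(group, post):
    values = [yi for r, yi in zip(x_rows, y) if r[1] == group and r[2] == post]
    return sum(values) / len(values)


did = (cell_mean(1, 1) - cell_mean(0, 1)) - (cell_mean(1, 0) - cell_mean(0, 0))
print(round(coef[3], 4), round(did, 4))
```

Because the regression is saturated in group and time, the interaction coefficient coincides with the diff in diff of the four cell means up to rounding error.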

Problems with DiD

• Ashenfelter’s dip.

• The procedure does not control for temporary shocks that affect participation.

• This can seriously bias the procedure.

Problems with DiD: Different macro trends.

• It is possible that the two groups are affected differently by time trends.

• This would bias the estimator (see graphical example)

• We can check for this with pre-program data.

• We could also ‘correct’, under some assumptions, for this bias.
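One way to carry out the check with pre-program data is a placebo diff in diff between two pre-program periods, which should be close to zero if the groups share the same trend. The sketch below uses made-up group means and a hypothetical helper `did`. The ‘corrected’ estimate is valid only under the additional assumption, made here for illustration, that the differential trend is the same in the pre-program and the post-program interval.

```python
def did(treat_early, treat_late, control_early, control_late):
    """Diff in diff between an earlier and a later period, from group means."""
    return (treat_late - control_late) - (treat_early - control_early)


# Hypothetical group means: t = -1 and t = 0 are pre-program, t = 1 is post-program.
treat = {-1: 10.0, 0: 11.5, 1: 16.0}
control = {-1: 8.0, 0: 9.0, 1: 10.0}

# Placebo: no program operated between t = -1 and t = 0, so this should be zero
# under common trends. A non-zero value signals different macro trends.
placebo = did(treat[-1], treat[0], control[-1], control[0])

# Standard estimate, contaminated by the differential trend.
raw = did(treat[0], treat[1], control[0], control[1])

# Correction, assuming the differential trend is constant across intervals.
adjusted = raw - placebo

print(placebo, raw, adjusted)
```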

Problems with DID

• Compositional changes when longitudinal data are not available

Non linear DID

• The DiD idea relies on additive components to get the outcome variable.

• This assumption can be unrealistic, especially in the case of dummy variables.

Non linear DID

$o_{it}$ follows a distribution $F$ with inverse $F^{-1}$

Non linear DID

• For simplicity, assume the absence of heterogeneous fixed effects
  – (But still maintain heterogeneous impacts)

• Under normality (F is the normal distribution), the model becomes a standard probit

• The temptation to obtain the diff in diff is to run a simple probit on the pooled data (t=0 and t=1), with time dummies, group dummies and interaction.

• Could the interaction coefficient be interpreted as the impact as in the linear (OLS) case?

• NO!!!!
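A small numerical sketch may help to see why. It assumes a probit index with a constant, a group dummy, a time dummy and an interaction, with made-up coefficients; the function `prob` and all values are hypothetical. With the interaction set to zero, the diff in diff of probabilities is still non-zero, because the normal CDF is non-linear. With a non-zero interaction, the impact on the treated is a difference of two probabilities, not the coefficient itself.

```python
from statistics import NormalDist

Phi = NormalDist().cdf  # standard normal CDF, i.e. F under normality


def prob(const, group_coef, time_coef, inter_coef, g, t):
    """P(y = 1) in a probit with group dummy g, time dummy t and their interaction."""
    return Phi(const + group_coef * g + time_coef * t + inter_coef * g * t)


# Hypothetical index coefficients; the interaction is zero on purpose.
coefs = dict(const=-1.0, group_coef=1.0, time_coef=0.5, inter_coef=0.0)

did_in_probabilities = (
    (prob(g=1, t=1, **coefs) - prob(g=1, t=0, **coefs))
    - (prob(g=0, t=1, **coefs) - prob(g=0, t=0, **coefs))
)

# With a non-zero interaction, compare the treated group's probability with the
# counterfactual probability obtained by switching the interaction off.
coefs_treated = dict(coefs, inter_coef=0.4)
no_treatment = dict(coefs_treated, inter_coef=0.0)
impact_on_treated = prob(g=1, t=1, **coefs_treated) - prob(g=1, t=1, **no_treatment)

print(round(did_in_probabilities, 4), round(impact_on_treated, 4))
```

The counterfactual used here relies on the assumption that common trends hold on the index scale rather than on the probability scale.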

Non linear DID

Require distributional assumptions on this term

You cannot do standard probit

Additional tricks

• Combining matching and DID

• Could DiD be the right thing to do in the case of a randomized experiment?

• Would one add controls to a randomized experiment?

Combining matching and DiD

• One could allow the constant β to be a function of observables.

• One could also assume that the effects are a function of these observables.

• Therefore one could use matching techniques on first-differenced data
  – (if longitudinal data are available)

• Or use repeated cross sections.

Would DiD be the right thing to do in a randomized trial?

• Yes, if it turns out that the randomization sample is small.
  – It could be that the treatment and control samples are imbalanced out of bad luck

• What about efficiency arguments?
  – It depends. It could go either way.

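$$u_{it} = \eta_i + o_{it}; \qquad \Delta u_{it} = \Delta o_{it}$$

$$\mathrm{Var}(u_{it}) = \mathrm{Var}(\eta_i) + \mathrm{Var}(o_{it})$$

$$\mathrm{Var}(\Delta u_{it}) = 2\,\mathrm{Var}(o_{it}) - 2\,\mathrm{Cov}(o_{it}, o_{it-1})$$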
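The efficiency comparison can be made concrete with a short sketch. It assumes the unobservable is the sum of a permanent individual component and a transitory shock o, so that the variance in levels is the sum of the two variances and the variance in first differences is twice the shock variance minus twice its first-order autocovariance. The function name `variance_ratio` and the numbers are hypothetical.

```python
def variance_ratio(var_permanent, var_shock, cov_shock):
    """Variance of the first-differenced error relative to the error in levels.

    A ratio below 1 means differencing (DiD) is the less noisy option;
    a ratio above 1 means the simple comparison in levels is less noisy.
    """
    var_levels = var_permanent + var_shock
    var_differences = 2 * var_shock - 2 * cov_shock
    return var_differences / var_levels


# Large permanent heterogeneity, uncorrelated shocks: differencing helps.
print(variance_ratio(var_permanent=4.0, var_shock=1.0, cov_shock=0.0))

# Little permanent heterogeneity, uncorrelated shocks: differencing hurts.
print(variance_ratio(var_permanent=0.5, var_shock=1.0, cov_shock=0.0))
```

Under these assumptions, differencing is more efficient exactly when the permanent variance exceeds the shock variance minus twice the autocovariance of the shock.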

Example of diff in diff evaluation: The evaluation of Familias en Accion

• Familias en Accion is a conditional cash transfer (CCT) program started in Colombia in 2002 with a loan from the World Bank and the IADB.

• The evaluation was one of the conditions imposed by the loan.

• The antecedent was PROGRESA in Mexico, where the expansion phase of the program was used to build an evaluation based on the randomized allocation of the program across communities.


• The expansion phase in Colombia was also quite long.

• However, for political reasons it was not possible to randomize across communities.

• It was decided to use a treatment/control comparison and use diff in diff combined with matching techniques.


• The treatment municipios were chosen by the government.

• Treatment list (627 municipalities):
  – Municipios with fewer than 100k inhabitants (no capitals)
  – With enough infrastructure
  – With a complete SISBEN registry as of Dec. 1999
  – With a bank!

• Treatment sample:
  – Representative list of municipalities
  – Stratified by region (5 regions) and by level of infrastructure (5)

• Control sample:
  – Municipalities in the same strata used to stratify the treatment sample that were ‘similar’ to the treatment municipalities actually in the sample.


• The government became very anxious to start the program quickly in 2002 (approaching elections? End-of-mandate effects? Crisis?)

• In early 2002 it became clear that it would be difficult to collect a baseline before the start of the program in the treatment municipalities.

• Two decisions:
  – Strong negotiations with the government (intermediated by the WB and the IADB) to prevent the start of the program in at least some municipalities
  – Introduction of retrospective questions in the questionnaire

• Possible problems:
  – Low power (reduction in sample size)
  – Anticipation effects


• For the evaluation we collected three surveys:
  – Baseline 2002 (but some of the treatment group were already receiving payments)
  – First follow-up 2003 (attrition 6%)
  – Second follow-up 2005/6 (attrition 10%)

• Very long household survey (11,500 hh) complemented with surveys of schools, hospitals, health centres, localities, HC’s.


Y(i,t) = e·D + bt + gi + u

Example 1: gT = gC

        i = Treatment               i = Control
t=1     YT1 = b1 + gC + u           YC1 = b1 + gC + u
t=2     YT2 = e + b2 + gC + u       YC2 = b2 + gC + u

e = YT2 - YC2

Example 2: gT ≠ gC

        i = Treatment               i = Control
t=1     YT1 = b1 + gT + u           YC1 = b1 + gC + u
t=2     YT2 = e + b2 + gT + u       YC2 = b2 + gC + u

YT2 - YC2 = e + gT - gC

YT1 - YC1 = gT - gC

e = (YT2 - YC2) - (YT1 - YC1)

Differences in the intensity of the intervention.

Necessary extensions

The program started operating in some treatment municipalities before the baseline. Although this allowed an early evaluation of impact results, it now gives rise to methodological and efficiency problems.

TCP and TSP

                   TSP                     TCP     Control
Pre-baseline       D=0                     D=0     D=0
Baseline           D=0 (pre-registered)    D=1     D=0
First follow-up    D=1                     D=1     D=0

We have TCP and TSP. For some variables we have pre-baseline frequencies.

TCP and TSP (cont.)

                   TSP                      TCP                      Control
Pre-baseline       YT0 = b0 + gT + u        YP0 = b0 + gT + u        YC0 = b0 + gC + u
Baseline           YT1 = b1 + gT + u        YP1 = e + b1 + gT + u    YC1 = b1 + gC + u
First follow-up    YT2 = e + b2 + gT + u    YP2 = e + b2 + gT + u    YC2 = b2 + gC + u

TSP and TCP (cont.)

                   TSP                      TCP                      Control
Baseline           YT1 = b1 + gT + u        YP1 = e + b1 + gT + u    YC1 = b1 + gC + u
First follow-up    YT2 = e + b2 + gT + u    YP2 = e + b2 + gT + u    YC2 = b2 + gC + u

e = (YT2 - YC2) - (YT1 - YC1)

e = (YT2 - YP2) - (YT1 - YP1)

These estimates can be combined to increase efficiency.

Possible early effects (cont.)

                   TSP                       TCP                       Control
Baseline           YT1 = b1 + gT + u         YP1 = e1 + b1 + gT + u    YC1 = b1 + gC + u
First follow-up    YT2 = e1 + b2 + gT + u    YP2 = e2 + b2 + gT + u    YC2 = b2 + gC + u

e1 = (YT2 - YC2) - (YT1 - YC1)

e2 = (YP2 - YC2) - (YT1 - YC1)

e1 is the effect of the program after one period.
e2 is the effect of the program after two periods.

Intensity effects

This framework can be generalized to allow the effect to be a function of the number of payments.

For some outcomes, there is evidence that this matters.

Outcome for which the cumulative effect is important: nutritional status.

Anticipation effects

                   TSP                      TCP                      Control
Pre-baseline       YT0 = b0 + gT + u        YP0 = b0 + gT + u        YC0 = b0 + gC + u
Baseline           YT1 = b1 + a + gT + u    YP1 = e + b1 + gT + u    YC1 = b1 + gC + u
First follow-up    YT2 = e + b2 + gT + u    YP2 = e + b2 + gT + u    YC2 = b2 + gC + u

e = (YT2 - YC2) - (YT0 - YC0)

e = (YP2 - YC2) - (YP0 - YC0)

e = (YP1 - YC1) - (YP0 - YC0)

a = (YT1 - YC1) - (YT0 - YC0)
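The role of the pre-baseline period can be checked with a small noise-free example. All parameter values below are made up, and the dictionary `Y` and helper `did` are hypothetical names. The cell means follow the additive model with a program effect e, an anticipation effect a at baseline for the TSP group, period effects b0, b1, b2 and group effects gT, gC.

```python
# Hypothetical noise-free parameters, chosen only for illustration.
e, a = 3.0, 1.0                      # program effect and anticipation effect
b = {0: 10.0, 1: 11.0, 2: 12.5}      # period effects b0, b1, b2
gT, gC = 2.0, 0.5                    # group effects

# Cell means by (group, period): 0 = pre-baseline, 1 = baseline, 2 = first follow-up.
Y = {
    ("TSP", 0): b[0] + gT, ("TSP", 1): b[1] + a + gT, ("TSP", 2): e + b[2] + gT,
    ("TCP", 0): b[0] + gT, ("TCP", 1): e + b[1] + gT, ("TCP", 2): e + b[2] + gT,
    ("C", 0): b[0] + gC,   ("C", 1): b[1] + gC,       ("C", 2): b[2] + gC,
}


def did(group, after, before):
    """Diff in diff of a treatment group against control between two periods."""
    return (Y[(group, after)] - Y[("C", after)]) - (Y[(group, before)] - Y[("C", before)])


e_from_tsp = did("TSP", 2, 0)     # uses the pre-baseline period: recovers e
e_from_tcp = did("TCP", 1, 0)     # recovers e
a_hat = did("TSP", 1, 0)          # recovers the anticipation effect a
contaminated = did("TSP", 2, 1)   # uses the baseline instead: gives e - a

print(e_from_tsp, e_from_tcp, a_hat, contaminated)
```

In this constructed case, using the baseline as the ‘before’ period for the TSP group understates the effect by exactly the anticipation effect, which is why the pre-baseline information is valuable.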

Growth in average height.

Variable                                 Rural       Urban
Height-for-age (standard deviations)     0.167**     0.007
                                         (0.08)      (0.114)
Weight-for-age (standard deviations)     0.185***    0.024
                                         (0.068)     (0.107)

* Significance level of 10% or less
** Significance level of 5% or less
*** Significance level of 1% or less

Positive impacts in rural areas, but none in urban areas.

Growth in average height.

Variable                                       Rural 0-36   Rural 36-84   Urban 0-36   Urban 36-84
Height-for-age (standard deviations)           0.226        0.141*        -0.151       0.039
                                               (0.195)      (0.076)       (0.214)      (0.124)
Weight-for-age (standard deviations)           0.255*       0.176**       -0.331*      0.068
                                               (0.149)      (0.073)       (0.178)      (0.104)
Probability of chronic malnutrition            -0.154***    0.003         0.014        -0.001
                                               (0.041)      (0.031)       (0.015)      (0.032)
Probability of global malnutrition             -0.011       0.012**       0.00         0.01
                                               (0.009)      (0.006)       (0.1)        (0.008)
Probability of risk of chronic malnutrition    0.016        -0.106**      -0.007       -0.03
                                               (0.095)      (0.049)       (0.114)      (0.062)
Probability of risk of global malnutrition     -0.16*       -0.128***     0.162**      -0.047
                                               (0.086)      (0.041)       (0.07)       (0.054)

Second follow-up: impact on school attendance, ages 8-13 and 14-17

              Urban                                Rural
              Pure treatment    All treatment      Pure treatment    All treatment
Age 8-13      0.0133            0.0119             0.0293            0.0282
              (0.0067)*         (0.0064)           (0.0073)**        (0.0076)**
Age 14-17     0.0526            0.0500             0.0712            0.0805
              (0.0195)**        (0.0185)**         (0.0275)**        (0.0243)**

Notes: Parametric estimates using all four periods.
* Significance level of 5% or less
** Significance level of 1% or less

Is consumption affected by the Program? DD. Treatment vs Control

                              First follow-up         Second follow-up
Total consumption, urban      52,576 (13,551)***      25,636.3 (18,868.2)
Total consumption, rural      53,831.1 (18,888)***    39,177.8 (15,701.2)**
Food consumption, urban       37,018 (9,898)***       21,813.6 (11,194.9)*
Food consumption, rural       41,956.6 (16,075)***    28,418.2 (11,331.7)**

* p<0.1, ** p<0.05, *** p<0.01
Note: The second follow-up includes the municipalities that moved from control to treatment between the baseline and the second follow-up.

Is consumption affected by the Program? DD. Treatment vs Control

                                  First follow-up     Second follow-up
Log total consumption, urban      0.147 (0.034)***    0.092 (0.040)**
Log total consumption, rural      0.145 (0.051)***    0.112 (0.040)***
Log food consumption, urban       0.158 (0.034)***    0.111 (0.039)***
Log food consumption, rural       0.157 (0.056)***    0.116 (0.039)***

* p<0.1, ** p<0.05, *** p<0.01
Note: The second follow-up includes the municipalities that moved from control to treatment between the baseline and the second follow-up.
