Large Sample

Large Sample TheoryLarge Sample Results for

Consistency of S2

Large Sample Results for the Linear Model

Walter Sosa-Escudero

Econ 507. Econometric Analysis. Spring 2009

February 19, 2009

Walter Sosa-Escudero Large Sample Results for the Linear Model


Consistency of S2

Why Large Sample Theory?

= (X X)1X Y , then E() = . Easy, mostly due tolinearity.

What about = g(X,Y ). In many cases, impossible toderive finite sample properties without being too specificabout g(.).



Consistency of S2

Consider H0 : = 0 vs. HA : 6= 0. We relied on checking = 0, based on the distribution of . We had to assume unormal and, through linearity, we got the distribution of .

It is equivalent to think about H0 : = 0 vs. HA : 6= 0for 6= 0 and check whether = 0. Why? If we choose cleverly, the distribution of is much easier to get than thatof . For example, if we set =

n, when n, ' N .



Consistency of S2

Example: X (, 2), H0 : = 0, S2 = (n 1)1

(Xi X)2

X normal:

Z =X 0S2/n

a' tn1

Reject if |z| > c, with Pr(|Z| > c) = .

X not normal:

Z =X 0S2/n

' N(0, 1)

Reject if |z| > c, with Pr(|Z| > c) = .If X is normal, we use the t distribution. The result holds for anysample size.

If X is not normal, we use the normal distribution. The result holdsasymptotically.



Consistency of S2

What do we mean by valid asymptotically?

How relevant is the previous result in practice, when n isnecesarily finite



Consistency of S2

X is normal

We should use the t distributionto compute c.

What if we use the normal?

The approximation works fine.For n = 30 differences arenegligible.



Consistency of S2

X is not-normal

Here we do not haveinformation tocompute c.

What if we use thenormal?

Uniform case: theapproximation workswell!



Consistency of S2

Large sample approximations work well in many cases, evenfar from infinity.

In most cases it is much easier to derive the asymptoticapproximations instead of the finite sample result.



Consistency of S2

Basic Concepts

Convergence in probability:A sequence {zn} of random scalars converges in probability to anon-random constant a if for every > 0:

limnPr [|zn a| > ] = 0

We use the notation znp a, or plim zn = a

It extendes naturally to sequences of random vectors ormatrices, requiring element-by-element convergence.



Consistency of S2

Almost sure convergence:A sequence {zn} of random scalars converges almost surely to aconstant a if

Pr(

limn zn =

)= 1

We use the notation zna.s. a, or plim zn = a

Almost sure convergence implies convergence in probability.



Consistency of S2

Convergence in Distribution A sequence {zn} of random scalarswith distribution cumulative distribution Fn converges indistribution to a random scalar z with distribution F

limnFn = F

at every continuity point of F .

F is the limiting distribution of {zn}.Notation: zn

d z.



Consistency of S2

Law of Large NumbersLet zn be a secuence of random scalars, and construct a newsequence

zn =n

i=1 zin

Kolmogorovs Strong LLN: {zn} and i.i.d. secuence of randomscalars with E(zi) = (exists and is finite). Then zn

a.s. .



Consistency of S2

Central Limit Theorem

Lindeberg-Levy CLT: {zn} and i.i.d. secuence of random scalarswith E(zi) = and V (zi) = 2 (both exist and are finite). Then

nzn

d N(0, 1)

Careful, the theorem does not refer to zn but to atransformation involving n.



Consistency of S2

Useful Results

Continuity: {zn} a sequence of random vectors, z a randomvector, a a vector of constants. Let g(.) be a vector valuedcontinuous function that does not depend on n

znp a g(zn) p g(a), and zn a.s. a g(zn) a.s. g(a)

znd z g(zn) d g(z)

Product rule: znp 0 and xn d x, then znxn p 0

Slutzkys Theorem:

xnd x and yn p , then xn + yn d x+ .

xnd x, An p A, then Anxn d Ax, provided An and xn are

conformable.



Consistency of S2

Cramer Wold Device (restricted): xn a sequence of K 1random vectors. If for any vector


Consistency of S2

Estimators as sequences of random variables

An estimator n is any function of the sample. The sequence{n}n refers to the sequence formed by increasing the sample sizeprogressively.

Consistency: n is consistent for 0 if n 0 (pis weak anda.s. strong consistency.

Asymptotic normality: n is asymptotically normal ifn(n 0) d N(0,).

is the asymptotic variance. Careful with this!Such estimators are usually called

n-consistent.



Consistency of S2

Large Sample Properties of OLS estimators

Aside: some new notation

Let us write our linear model

yi = 1x1i + 2x2i + + KxKi + ui, i = 1, . . . , n

as: yi = xi + ui

where xi is an K 1 vector (x1i, x2i, . . . , xKi).

Note that xixi is a K K matrix. It is easy to check

X X =n

i=1 xixi, X

Y =n

i=1 xiyi

n = (n

i=1 xixi)1 (

ni=1 xiyi)

n refers to the sequence of OLS estimators formed by increasingthe sample size.



Consistency of S2

The Model

Assumptions for asymptotic analysis

1 Linearity: yi = xi0 + ui i = 1, . . . , n.2 Random sample: {xi, ui} is a jointly i.i.d. process.3 Predeterminedness: E(xikui) = 0 for all i and k = 1, . . . ,K.4 Rank condition: x E(xixi) finite positive definite.5 V (xiui) = S finite positive definite.



Consistency of S2

The results

Consistency: np 0.

Asymptotic normality:n(n 0

)p N(0,1x S1x )

Plan: first prove results, then we discuss the assumptions.



Consistency of S2

Consistency

n = 0 + (X X)1X u

= 0 +(X Xn

)1(X un

)

We will show:

n1X u p 0(XXn

)1does not explode

This argument will appear several times in this course!



Consistency of S2

The hth element of(Xun

)is

1n

ni=1

xhiui =1n

ni=1

zi, zi xhiui

with E(zi) = 0, so by Kolmogorovs LLN

1n

ni=1

zi =n

i=1 xhiuin

p E(xhiui) = 0

hence (X un

)p 0



Consistency of S2

The (h, j) element of XXn is

ni=1 xhixji

n . Since E(xhixji) = x,hj ,which exists and is finte, by the LLN (element-by-element)

X Xn

p x

Since x is invertible and by continuity:(X Xn

)1p 1x

Summarizing:

n = 0 +(XXn

)1 (Xun

)p 1x p 0

Hence, by the product rule: np 0



Consistency of S2

Asymptotic normalityThe starting point is now:

n(n 0

)=(X Xn

)1(X un

)We have already shown:(

X Xn

)1p 1x

Now the goal is to find the asymptotic distribution of(Xun

)and

use our continuity results.



Consistency of S2

(X un

)=nX un

is a vector of K VAs. By the Cramer-Wold Device, we will findthe distribution of:

n c(X un

)for every vector c


Consistency of S2

It is easy to check (do it as exersise) that the assumptions imply:

E(zi) = 0V (zi) = cSc


Consistency of S2

Then: n(n 0

)=

(XXn

)1 (Xun

)p 1x d N(0, S)

By Slutzkys theorem and by the linearity property of multivariatenormal variables:

n(n 0

)p N(0,1x S1x )



Consistency of S2

A useful simplification

If we further assume E(u2i |xi) = 20, then using LIE:

S = V (xiui) = E(xiuiuixi) = 20E(xix

i) =

20 x

So the asymptotic variance of n reduces to

1x S1x =

201x x

1x =

201x



Consistency of S2

Comments on the assumptions

1 Predeterminedness vs. Mean independence: E(xikui) = 0 isan orthogonality assumption. If there is a constant in themodel, E(ui) = 0 and this implies Cov(xi, ui) = 0.E(ui|xik) = 0 is a stronger assumption, since it implies thatui is uncorrelated to any (measurable) function of xi.

2 Predeterminedness vs. Strict Exogeneity: our old assumptionwas E(u|X) = 0, which implies E(uiXi) = 0. We arerequiring a bit less than we did to show unbiasedness (careful).

Example: Yt = 1 + 2Yt1 + ut, t = 1, 2, ... E(utYt1) = 0(consistency), but E(u|Y1) 6= 0 (why?). The OLS estimator willbe consistent but biased!



Consistency of S2

3 Rank condition as no multicollinearity in the limit: thisguarantees the invertibility of n1 X X in the limit.

4 Rank condition requires variability in X The rank conditionimplies

1nX X p E(xixi), finite pd.

Consider the following model:

yi = 0 + 11i

+ ui, i = 1, . . . , n

Here the explanatory variable provides less information asn.



Consistency of S2

5 We are not ruling out conditional heteroskedasticity!: the iidassumption implies E(u2i ) is constant (unconditionalhomoskedasticity), but the error term can be conditionallyheteroskedastic.

6 The intercept. Note that

Cov(Xk, ui) = E(Xkiui) E(Xki)E(ui)

If there is a constant in the model, E(X1iui) = E(ui) = 0.Then, this joint with predeterminedness implies nocontemporaneous correlation between explanatory variablesand the error term.

7 Do not confuse consistency with unbiasedness.



Consistency of S2

Extensions

The iid case: No fixed regressors, heteroskedasticity,dependences.

Heterogeneity: moment restrictions (Markov, Lyapunov CLT).

Dependences: martingale difference, ergodic theorem, etc.



Consistency of S2

Consistency of S2

Result: Under Assumptions 1-4, s2p E(u2i ), provided E(u2i )

exists and is finite.

Proof:

s2 =ee

nK =n

nKeen

=n

nKuMun

NowuMu = u(I X(X X)1X )u

Ill let you finish the proof in the homework.


Large Sample TheoryLarge Sample Results for Consistency of S2

Documents

Large Sample