Statistical Analysis of the CAPM I. Sharpe{Linter CAPMfinoek.userweb.mwn.de/lehre/portfolio05/restricted/capm1.pdf · Statistical Analysis of the CAPM I. Sharpe{Linter CAPM Brief

Statistical Analysis of the CAPM

I. Sharpe–Lintner CAPM

Brief Review of the Sharpe–Lintner CAPM

• The Sharpe–Lintner CAPM assumes that all

investors act according to the µ−σ rule, can

lend and borrow any desired amount at a

common risk–free rate rf and exhibit per-

fect agreement with respect to the probabil-

ity distribution of asset returns.

• Under these (key) assumptions, the market

portfolio is given by

$$x_m = \frac{\Sigma^{-1}(\mu - r_f 1_N)}{1_N'\Sigma^{-1}(\mu - r_f 1_N)}. \tag{1}$$

• The central equation of the Sharpe–Lintner

CAPM is a direct consequence of (1) and is

given by

µi = rf + βi(µm − rf), i = 1, . . . , N, (2)

where rf is the risk–free rate, µm is the ex-

pected return of the market portfolio and

$\beta_i = \mathrm{COV}(R_i, R_m)/\sigma_m^2$, where $R_i$ is the return of asset $i$ and $R_m$ is the return of the

market portfolio.

• Equation (2) states that there is a linear relation between the expected excess return of asset i (over the risk–free rate) and the expected excess return of the market portfolio, with zero intercept.
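Equations (1) and (2) can be checked numerically. The following is a minimal sketch, assuming hypothetical values for µ, Σ and rf (three assets, illustrative numbers only, not taken from the lecture). It builds the market portfolio from (1), computes the betas, and compares the right-hand side of (2) with the assumed expected returns.

```python
import numpy as np

def tangency_weights(mu, Sigma, rf):
    """Market (tangency) portfolio weights as in equation (1)."""
    excess = mu - rf                       # mu - rf * 1_N
    z = np.linalg.solve(Sigma, excess)     # Sigma^{-1} (mu - rf * 1_N)
    return z / z.sum()                     # normalise so the weights sum to one

# Hypothetical inputs (assumptions for illustration only).
mu = np.array([0.08, 0.10, 0.12])
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
rf = 0.03

x_m = tangency_weights(mu, Sigma, rf)
mu_m = x_m @ mu                            # expected return of the market portfolio
var_m = x_m @ Sigma @ x_m                  # variance of the market portfolio
beta = Sigma @ x_m / var_m                 # beta_i = COV(R_i, R_m) / sigma_m^2
capm_mu = rf + beta * (mu_m - rf)          # right-hand side of equation (2)

print(np.allclose(capm_mu, mu))            # (2) holds when x_m comes from (1)
```

The check works for any positive definite Σ for which the entries of Σ⁻¹(µ − rf 1_N) do not sum to zero.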

Framework for Estimation and Testing

• The CAPM relationship (2) is expressed in

terms of expected values, which are not ob-

servable.

• To obtain a model with observable quan-

tities, we describe excess returns using the

excess return market model:

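$$r_{it} = \alpha_i + \beta_i r_{m,t} + \varepsilon_{it}, \quad i = 1, \ldots, N, \tag{3}$$

$$E(\varepsilon_{it}) = 0, \quad i = 1, \ldots, N, \tag{4}$$

$$E(\varepsilon_{it}\varepsilon_{jt'}) = \begin{cases} \sigma_{ij} & \text{if } t = t' \\ 0 & \text{if } t \neq t' \end{cases}, \quad i, j = 1, \ldots, N, \tag{5}$$

$$E(r_{m,t}\varepsilon_{it}) = 0, \quad i = 1, \ldots, N. \tag{6}$$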

• Here ri,t is the excess return on asset i in

period t (over risk–free rate), and rm,t is the

excess return on the market portfolio in pe-

riod t (over risk–free rate).

• At first glance, the market model we will

be using looks similar to the Single–Index

Model (SIM), but there are important dif-

ferences:

– All returns involved are excess returns over

the risk–free rate rf .

– According to equation (5), the asset–specific

error terms may be correlated. Thus, we

allow for a non-diagonal covariance ma-

trix, Σ, of the vector εt = [ε1t, . . . , εNt]′,

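$$\mathrm{COV}(\varepsilon_t) = \Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1N} \\ \sigma_{12} & \sigma_2^2 & \cdots & \sigma_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{1N} & \sigma_{2N} & \cdots & \sigma_N^2 \end{pmatrix}.$$

Conditional on the excess return of the market, we then also have

$$\mathrm{COV}(r_t) = \Sigma, \tag{7}$$

where $r_t = [r_{1t}, \ldots, r_{Nt}]'$.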

– Note, however, that we still assume that

there is no correlation over time, i.e. $E(\varepsilon_t\varepsilon_{t'}') = 0$ for $t \neq t'$, and that the covariance

matrix Σ is constant over time.

• We will assume that the betas are constant

over time. This is by no means self-evident

and can, in principle, be tested using econo-

metric techniques for detecting structural breaks

(see Greene, Chapter 7).

• We will also assume that the error terms fol-

low a multivariate normal distribution, i.e.,

$$\varepsilon_t \overset{iid}{\sim} N(0, \Sigma). \tag{8}$$

• The Sharpe–Lintner CAPM implies that the

intercept in the excess return market model

is zero, i.e., α = 0. That is, a test of this

model corresponds to a test of the hypoth-

esis

H0 : αi = 0, i = 1, . . . , N. (9)

• To perform such a test, it is necessary to

estimate the parameters of the model and

to derive an appropriate test statistic.
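The data-generating process described by (3) to (8) can be sketched in code. The simulation below is only an illustration: the parameter values, the function name `simulate_market_model`, and the choice of normally distributed market excess returns are assumptions made here, not part of the lecture. It draws errors with a non-diagonal covariance matrix and imposes α = 0, i.e. the CAPM null hypothesis (9).

```python
import numpy as np

def simulate_market_model(T, alpha, beta, Sigma, rm_mean, rm_sd, seed=0):
    """Draw T periods from r_t = alpha + beta * r_{m,t} + eps_t, eps_t iid N(0, Sigma)."""
    rng = np.random.default_rng(seed)
    N = len(beta)
    rm = rng.normal(rm_mean, rm_sd, size=T)                     # market excess returns (assumed normal)
    eps = rng.multivariate_normal(np.zeros(N), Sigma, size=T)   # correlated across assets, iid over time
    r = alpha + np.outer(rm, beta) + eps                        # (T, N) asset excess returns
    return r, rm, eps

# Hypothetical parameters (illustrative only).
beta = np.array([0.8, 1.0, 1.3])
Sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 3.0, 0.8],
                  [0.5, 0.8, 5.0]]) * 1e-4
alpha = np.zeros(3)                                             # CAPM null: all intercepts are zero

r, rm, eps = simulate_market_model(20000, alpha, beta, Sigma, rm_mean=0.005, rm_sd=0.04)
sample_cov = np.cov(eps, rowvar=False)                          # should be close to Sigma for large T
```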

Test based on Time Series Regression

• Write our excess return market model as

$$r_t = \alpha + \beta r_{m,t} + \varepsilon_t, \quad t = 1, \ldots, T, \qquad \varepsilon_t \overset{iid}{\sim} N(0, \Sigma),$$

where α = [α1, . . . , αN ]′, and β = [β1, . . . , βN ]′.

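• Given our assumptions about the distributional properties of the error process, εt, the density of excess returns, conditional on the market return, rm,t, is

$$f(r_t \mid r_{m,t}) = \frac{\exp\left\{-\frac{1}{2}(r_t - \alpha - \beta r_{m,t})'\Sigma^{-1}(r_t - \alpha - \beta r_{m,t})\right\}}{(2\pi)^{N/2}|\Sigma|^{1/2}},$$

and the joint density is

$$f(r_1, \ldots, r_T \mid r_{m,1}, \ldots, r_{m,T}) = \prod_{t=1}^{T} f(r_t \mid r_{m,t}) = \frac{\exp\left\{-\frac{1}{2}\sum_{t=1}^{T}(r_t - \alpha - \beta r_{m,t})'\Sigma^{-1}(r_t - \alpha - \beta r_{m,t})\right\}}{(2\pi)^{NT/2}|\Sigma|^{T/2}}. \tag{10}$$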

• To estimate the unknown parameters, α, β,

and Σ, of this density, we use the method

of maximum likelihood. To do so, we define

the log–likelihood function, i.e., the log of

the joint density viewed as a function of the

unknown parameters.

• The maximum likelihood estimator is then

found by maximizing this function with re-

spect to its arguments, i.e., the unknown

parameters.

• From (10), the log–likelihood function is

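$$\log L(\alpha, \beta, \Sigma) = -\frac{NT}{2}\log(2\pi) - \frac{T}{2}\log|\Sigma| - \frac{1}{2}\sum_{t=1}^{T}(r_t - \alpha - \beta r_{m,t})'\Sigma^{-1}(r_t - \alpha - \beta r_{m,t}), \tag{11}$$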

which we want to maximize with respect to

α, β and Σ.

• From (11), it is clear that the estimates of

α and β are determined by minimizing

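$$S = \sum_{t=1}^{T}(r_t - \alpha - \beta r_{m,t})'\Sigma^{-1}(r_t - \alpha - \beta r_{m,t}) = \sum_{t=1}^{T}\left\{r_t'\Sigma^{-1}r_t - 2r_t'\Sigma^{-1}(\alpha + \beta r_{m,t}) + (\alpha + \beta r_{m,t})'\Sigma^{-1}(\alpha + \beta r_{m,t})\right\}.$$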

• Note that this is a seemingly unrelated re-

gression with identical regressors,∗ so that

the MLE is identical to OLS equation by

equation. In fact, the first order conditions

are

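$$\frac{\partial S}{\partial \alpha} = -2\Sigma^{-1}\sum_{t=1}^{T}(r_t - \alpha - \beta r_{m,t}) = 0, \tag{12}$$

$$\frac{\partial S}{\partial \beta} = -2\Sigma^{-1}\sum_{t=1}^{T} r_{m,t}(r_t - \alpha - \beta r_{m,t}) = 0, \tag{13}$$

implying

$$\hat\alpha = \bar r - \hat\beta\,\bar r_m \tag{14}$$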

∗ See Greene (Chapter 14.2), and Theil (Chapter 7).

and†

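$$\hat\beta = \frac{\sum_{t=1}^{T}(r_t - \bar r)(r_{m,t} - \bar r_m)}{\sum_{t=1}^{T}(r_{m,t} - \bar r_m)^2} = \frac{\sum_{t=1}^{T}(r_t - \bar r)(r_{m,t} - \bar r_m)}{T\hat\sigma_m^2} = \frac{\sum_{t=1}^{T}(r_{m,t} - \bar r_m)\,r_t}{T\hat\sigma_m^2}, \tag{15}$$

where

$$\bar r = \frac{1}{T}\sum_{t=1}^{T} r_t, \qquad \bar r_m = \frac{1}{T}\sum_{t=1}^{T} r_{m,t}, \qquad \hat\sigma_m^2 = \frac{1}{T}\sum_{t=1}^{T}(r_{m,t} - \bar r_m)^2.$$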

†The third equality makes use of the basic identity $s_{xy} = T^{-1}\sum_t (x_t - \bar x)(y_t - \bar y) = T^{-1}\sum_t (x_t - \bar x)\,y_t$, which is easily seen as both expressions are equal to $\overline{xy} - \bar x\,\bar y$.

• To find the MLE of Σ, we make use of the

following differentiation rules for a symmet-

ric matrix X:

$$\frac{\partial\,\mathrm{tr}(XA)}{\partial X} = A + A' - \mathrm{diag}(A) \tag{16}$$

and

$$\frac{\partial \log|X|}{\partial X} = 2X^{-1} - \mathrm{diag}(X^{-1}). \tag{17}$$

The log–likelihood function can be written

as

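\begin{align*}
\log L &= -\frac{NT}{2}\log(2\pi) - \frac{T}{2}\log|\Sigma| - \frac{1}{2}\sum_{t=1}^{T}\mathrm{tr}\left(\varepsilon_t'\Sigma^{-1}\varepsilon_t\right) \\
&= -\frac{NT}{2}\log(2\pi) + \frac{T}{2}\log|\Sigma^{-1}| \tag{18} \\
&\quad - \frac{1}{2}\sum_{t=1}^{T}\mathrm{tr}\left(\Sigma^{-1}\varepsilon_t\varepsilon_t'\right), \tag{19}
\end{align*}

where $\varepsilon_t = r_t - \alpha - \beta r_{m,t}$, (18) uses $|A^{-1}| = |A|^{-1}$, and (19) uses the permutation rule $\mathrm{tr}(ABC) = \mathrm{tr}(BCA)$.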

Thus, using (16) and (17), we require

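$$\frac{\partial \log L}{\partial \Sigma^{-1}} = \frac{T}{2}\left[2\Sigma - \mathrm{diag}(\Sigma)\right] - \frac{1}{2}\left[2\sum_{t=1}^{T}\varepsilon_t\varepsilon_t' - \mathrm{diag}\left(\sum_{t=1}^{T}\varepsilon_t\varepsilon_t'\right)\right] = 0,$$

implying

$$\hat\Sigma = \frac{1}{T}\sum_{t=1}^{T}\hat\varepsilon_t\hat\varepsilon_t' = \frac{1}{T}\sum_{t=1}^{T}(r_t - \hat\alpha - \hat\beta r_{m,t})(r_t - \hat\alpha - \hat\beta r_{m,t})'. \tag{20}$$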

• The OLS estimators of α and β are unbi-

ased, normally distributed and have covari-

ance matrices

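$$\mathrm{COV}(\hat\beta) = \frac{1}{T^2\hat\sigma_m^4}\,\mathrm{COV}\left(\sum_{t=1}^{T}(r_{m,t} - \bar r_m)\,r_t\right) = \frac{1}{T^2\hat\sigma_m^4}\sum_{t=1}^{T}(r_{m,t} - \bar r_m)^2\,\mathrm{COV}(r_t) = \frac{1}{T\hat\sigma_m^2}\,\Sigma,$$

and

\begin{align*}
\mathrm{COV}(\hat\alpha) &= \mathrm{COV}(\bar r - \hat\beta\,\bar r_m) \\
&= \mathrm{COV}\left(\frac{1}{T}\sum_{t=1}^{T} r_t - \bar r_m\,\frac{\sum_{t=1}^{T}(r_{m,t} - \bar r_m)\,r_t}{T\hat\sigma_m^2}\right) \\
&= \frac{1}{T^2\hat\sigma_m^4}\sum_{t=1}^{T}\left[\hat\sigma_m^2 - \bar r_m(r_{m,t} - \bar r_m)\right]^2\mathrm{COV}(r_t) \\
&= \frac{1}{T^2\hat\sigma_m^4}\,T\left[\hat\sigma_m^4 + \bar r_m^2\hat\sigma_m^2\right]\Sigma \\
&= \frac{1}{T}\left(1 + \frac{\bar r_m^2}{\hat\sigma_m^2}\right)\Sigma. \tag{21}
\end{align*}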

• It can be shown that $T\hat\Sigma$ has a Wishart distribution, $W_N(T-2, \Sigma)$, which is a matrix generalization of the $\chi^2$.‡

Moreover, $\hat\Sigma$ is independent of both $\hat\alpha$ and $\hat\beta$.

‡ See, for example, Zellner (1971).
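The estimators (14), (15) and (20) translate directly into array operations. The sketch below is a minimal illustration on simulated data; the function name `estimate_market_model` and all parameter values are assumptions made here. Because the regressors are identical across equations, the result coincides with equation-by-equation OLS, which the last lines verify using `np.linalg.lstsq`.

```python
import numpy as np

def estimate_market_model(r, rm):
    """ML/OLS estimates for r_t = alpha + beta * r_{m,t} + eps_t.

    r  : (T, N) array of asset excess returns
    rm : (T,)   array of market excess returns
    """
    T = r.shape[0]
    rm_bar = rm.mean()
    r_bar = r.mean(axis=0)
    sigma2_m = ((rm - rm_bar) ** 2).mean()        # ML variance estimate (divides by T)
    beta = (rm - rm_bar) @ r / (T * sigma2_m)     # equation (15)
    alpha = r_bar - beta * rm_bar                 # equation (14)
    resid = r - alpha - np.outer(rm, beta)        # residuals, one row per period
    Sigma = resid.T @ resid / T                   # equation (20)
    return alpha, beta, Sigma

# Simulated data with hypothetical parameters (illustrative only).
rng = np.random.default_rng(0)
T, N = 120, 4
rm = rng.normal(0.005, 0.04, size=T)
true_beta = np.array([0.8, 1.0, 1.2, 1.5])
r = np.outer(rm, true_beta) + rng.normal(0.0, 0.02, size=(T, N))

alpha, beta, Sigma = estimate_market_model(r, rm)

# Cross-check against OLS with an intercept, run for all N assets at once.
X = np.column_stack([np.ones(T), rm])
ols = np.linalg.lstsq(X, r, rcond=None)[0]        # row 0: intercepts, row 1: slopes
print(np.allclose(ols[0], alpha), np.allclose(ols[1], beta))
```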

Testing for α = 0

• We discuss two tests of the null hypothe-

sis α = 0, in historical order. The first is a

likelihood ratio (LR) test relying on asymp-

totic arguments,§ while the second is an ex-

act finite–sample F-test. Subsequently, the

relation between the tests will be considered.

Likelihood Ratio (LR) Test

• To conduct the likelihood ratio test, we first

compute the Maximum Likelihood Estima-

tor under the null hypothesis that α = 0,

which is a regression through the origin. De-

note the corresponding estimators by β0 and

Σ0. They are given by

$$\hat\beta_0 = \frac{\sum_{t=1}^{T} r_t\,r_{m,t}}{\sum_{t=1}^{T} r_{m,t}^2}, \tag{22}$$

§This has been proposed by Gibbons, M. (1982). Multivariate Tests of Financial Models: A New Approach. Journal of Financial Economics 10, 3-28.

and

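$$\hat\Sigma_0 = \frac{1}{T}\sum_{t=1}^{T}\hat\varepsilon_t^{0}\,\hat\varepsilon_t^{0\prime} = \frac{1}{T}\sum_{t=1}^{T}(r_t - \hat\beta_0 r_{m,t})(r_t - \hat\beta_0 r_{m,t})', \tag{23}$$

where $\hat\varepsilon_t^{0} = r_t - \hat\beta_0 r_{m,t}$.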

• The Likelihood Ratio Test is based on the

comparison between the log–likelihood val-

ues of the unconstrained model and the con-

strained model.

• More precisely, the LR test statistic is given

by

LR = −2(logL0 − logL1), (24)

where logL0 is the log–likelihood function

of the constrained model, and logL1 is the

log–likelihood function of the unconstrained

model, each evaluated at the respective MLE.

• The asymptotic distribution of LR defined in

(24) is χ2 with degrees of freedom equal to

the number of parameter restrictions implied

by the null hypothesis. In our situation, this

corresponds to N degrees of freedom (N is

the number of assets), because the CAPM

implies that αi = 0 for i = 1, . . . , N .

• Now

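\begin{align*}
\log L_1 &= -\frac{NT}{2}\log(2\pi) - \frac{T}{2}\log|\hat\Sigma_1| - \frac{1}{2}\sum_{t=1}^{T}\mathrm{tr}\left(\hat\Sigma_1^{-1}\hat\varepsilon_t\hat\varepsilon_t'\right) \\
&= -\frac{NT}{2}\log(2\pi) - \frac{T}{2}\log|\hat\Sigma_1| - \frac{1}{2}\sum_{t=1}^{T}\mathrm{tr}\left[\left(\frac{1}{T}\sum_{s=1}^{T}\hat\varepsilon_s\hat\varepsilon_s'\right)^{-1}\hat\varepsilon_t\hat\varepsilon_t'\right] \\
&= -\frac{NT}{2}\log(2\pi) - \frac{T}{2}\log|\hat\Sigma_1| - \frac{T}{2}\,\mathrm{tr}\left[\left(\sum_{s=1}^{T}\hat\varepsilon_s\hat\varepsilon_s'\right)^{-1}\sum_{t=1}^{T}\hat\varepsilon_t\hat\varepsilon_t'\right] \\
&= -\frac{NT}{2}\log(2\pi) - \frac{T}{2}\log|\hat\Sigma_1| - \frac{T}{2}\,\mathrm{tr}(I_N) \\
&= -\frac{NT}{2}\left(\log(2\pi) + 1\right) - \frac{T}{2}\log|\hat\Sigma_1|.
\end{align*}

• By the same line of arguments,

$$\log L_0 = -\frac{NT}{2}\left(\log(2\pi) + 1\right) - \frac{T}{2}\log|\hat\Sigma_0|.$$

Consequently,

$$LR = T\left[\log|\hat\Sigma_0| - \log|\hat\Sigma_1|\right]. \tag{25}$$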
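The LR statistic (25) needs only the two residual covariance matrices. The following sketch, on simulated data with hypothetical parameters, fits the unconstrained and the constrained model and evaluates (25); the function name `lr_statistic` and the data are assumptions made for illustration. `np.linalg.slogdet` is used for the log-determinants because it is numerically safer than taking the log of `np.linalg.det`.

```python
import numpy as np

def lr_statistic(r, rm):
    """LR statistic (25) for H0: alpha = 0 in the excess return market model."""
    T = r.shape[0]
    # Unconstrained model: OLS with intercept, equation by equation.
    X = np.column_stack([np.ones(T), rm])
    coef1 = np.linalg.lstsq(X, r, rcond=None)[0]
    resid1 = r - X @ coef1
    Sigma1 = resid1.T @ resid1 / T
    # Constrained model: regression through the origin, equation (22).
    beta0 = rm @ r / (rm @ rm)
    resid0 = r - np.outer(rm, beta0)
    Sigma0 = resid0.T @ resid0 / T                  # equation (23)
    logdet0 = np.linalg.slogdet(Sigma0)[1]
    logdet1 = np.linalg.slogdet(Sigma1)[1]
    return T * (logdet0 - logdet1)

# Simulated data with hypothetical parameters (illustrative only).
rng = np.random.default_rng(2)
T, N = 120, 4
rm = rng.normal(0.005, 0.04, size=T)
beta = np.array([0.8, 1.0, 1.2, 1.5])
r_null = np.outer(rm, beta) + rng.normal(0.0, 0.02, size=(T, N))   # alpha = 0
r_alt = r_null + 0.05                                              # alpha = 0.05 for every asset

lr_null = lr_statistic(r_null, rm)
lr_alt = lr_statistic(r_alt, rm)
print(lr_null, lr_alt)
```

Each value would be compared with a critical value of the χ² distribution with N degrees of freedom; if SciPy is available, `scipy.stats.chi2.sf(lr, N)` gives the asymptotic p-value.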

F Test

• The finite–sample F test is based on the

following result:

Result: If the N–dimensional random vector X is N(0,Ω), the N × N random matrix A is Wishart(T,Ω), and X and A are independent, then

$$\frac{T-N+1}{N}\,X'A^{-1}X \sim F_{N,\,T-N+1}, \tag{26}$$

i.e., the quantity $[(T-N+1)/N]\,X'A^{-1}X$ has an F distribution with N degrees of freedom in the numerator and T − N + 1 degrees of freedom in the denominator.

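• Using, in (26), $X = \sqrt{T}\,[1 + \bar r_m^2/\hat\sigma_m^2]^{-1/2}\hat\alpha$ and $A = T\hat\Sigma$, and recalling the results we have for $\hat\alpha$ (in particular, normality and (21)), X is N(0,Σ) under the null hypothesis, while A is Wishart with T − 2 degrees of freedom. Applying (26) with T replaced by T − 2, the statistic

$$J = \frac{T-N-1}{N}\left(1 + \frac{\bar r_m^2}{\hat\sigma_m^2}\right)^{-1}\hat\alpha'\hat\Sigma^{-1}\hat\alpha \tag{27}$$

has an F distribution with N degrees of freedom in the numerator and T − N − 1 degrees of freedom in the denominator, i.e.,¶

$$J \sim F_{N,\,T-N-1}. \tag{28}$$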

¶This was developed in Gibbons/Ross/Shanken (1989): A Test of the Efficiency of a Given Portfolio. Econometrica 57, 1121-1152.
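A sketch of how the statistic (27) might be computed is given below, together with a small Monte Carlo experiment under the null hypothesis. The function name `grs_statistic`, the simulated data and all parameter values are assumptions made for illustration. Since an F variable with (N, T − N − 1) degrees of freedom has mean (T − N − 1)/(T − N − 3), the simulated average of J should lie close to that value if (28) holds.

```python
import numpy as np

def grs_statistic(r, rm):
    """J statistic (27) and its F degrees of freedom (N, T - N - 1)."""
    T, N = r.shape
    rm_bar = rm.mean()
    sigma2_m = rm.var()                                # divides by T (ML estimate)
    beta = (rm - rm_bar) @ r / (T * sigma2_m)          # equation (15)
    alpha = r.mean(axis=0) - beta * rm_bar             # equation (14)
    resid = r - alpha - np.outer(rm, beta)
    Sigma = resid.T @ resid / T                        # equation (20)
    quad = alpha @ np.linalg.solve(Sigma, alpha)       # alpha' Sigma^{-1} alpha
    J = (T - N - 1) / N * quad / (1.0 + rm_bar**2 / sigma2_m)
    return J, (N, T - N - 1)

# Monte Carlo under H0 (alpha = 0) with hypothetical parameters.
rng = np.random.default_rng(3)
T, N, reps = 60, 3, 1000
beta = np.array([0.8, 1.0, 1.2])
draws = np.empty(reps)
for k in range(reps):
    rm = rng.normal(0.005, 0.04, size=T)
    r = np.outer(rm, beta) + rng.normal(0.0, 0.02, size=(T, N))
    draws[k], dof = grs_statistic(r, rm)

mc_mean = draws.mean()                  # simulated mean of J under H0
f_mean = dof[1] / (dof[1] - 2)          # theoretical mean of F(N, T - N - 1)
print(mc_mean, f_mean)
```

With SciPy available, `scipy.stats.f.sf(J, N, T - N - 1)` would return the exact p-value.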

Economic Interpretation of the CAPM F Test

• Apart from following a known finite–sample distribution, the test statistic J defined in (27) also has an economic interpretation.

• Recall that the key testable implication of

the CAPM is that the market portfolio is a

µ− σ efficient portfolio.

• In the presence of a risk–free rate, this means

that the market portfolio is the tangency

portfolio.

• It can be shown that‖

$$J = \left(\frac{T-N-1}{N}\right)\frac{\theta^{*2} - \theta_m^2}{1 + \theta_m^2}, \tag{29}$$

where $\theta^*$ is the Sharpe ratio of the ex post

(i.e., using the sample mean vector and the

sample covariance matrix) efficient portfo-

lio formed from the risky assets under study

(including our market proxy) and θm is the

Sharpe ratio of the portfolio used as a mar-

ket proxy in our analysis.

• Equation (29) is particularly interesting be-

cause it uncovers what we are actually test-

ing: We test whether our market proxy is so

far away from the ex post efficient portfolio

that we are not willing to believe that it is

the population tangency portfolio, where the

distance is measured in terms of the Sharpe

Ratio.

‖Gibbons/Ross/Shanken (1989): A Test of the Efficiency of a Given Portfolio. Econometrica 57, 1121-1152.

Proof of (29)

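• Comparing (27) and (29), the equality between these quantities follows if we show that $\hat\alpha'\hat\Sigma^{-1}\hat\alpha = \theta^{*2} - \theta_m^2$.

• Let $\tilde r = [\bar r_m, \bar r']'$ denote the sample mean vector of the market proxy and the N assets. The (sample) covariance matrix of these variables is

$$V = \begin{bmatrix} \hat\sigma_m^2 & \hat\sigma_m^2\hat\beta' \\ \hat\sigma_m^2\hat\beta & \hat\Sigma + \hat\sigma_m^2\hat\beta\hat\beta' \end{bmatrix}. \tag{30}$$

• We know that the efficient portfolio using the assets in $\tilde r$ is characterized by the weight vector

$$w = \frac{V^{-1}\tilde r}{1'V^{-1}\tilde r}, \tag{31}$$

and, thus, it has squared Sharpe ratio

$$\theta^{*2} = \frac{(w'\tilde r)^2}{w'Vw} = \frac{(\tilde r'V^{-1}\tilde r)^2}{\tilde r'V^{-1}\tilde r} = \tilde r'V^{-1}\tilde r. \tag{32}$$

• Next, it is easily checked that the inverse of (30) is

$$V^{-1} = \begin{bmatrix} \hat\sigma_m^{-2} + \hat\beta'\hat\Sigma^{-1}\hat\beta & -\hat\beta'\hat\Sigma^{-1} \\ -\hat\Sigma^{-1}\hat\beta & \hat\Sigma^{-1} \end{bmatrix}. \tag{33}$$

• Using (33), we get by straightforward computation, and using (32),

\begin{align*}
\theta^{*2} &= \tilde r'V^{-1}\tilde r = [\bar r_m, \bar r']\,V^{-1}\,[\bar r_m, \bar r']' \\
&= \frac{\bar r_m^2}{\hat\sigma_m^2} + (\bar r - \hat\beta\bar r_m)'\hat\Sigma^{-1}(\bar r - \hat\beta\bar r_m) \\
&= \theta_m^2 + \hat\alpha'\hat\Sigma^{-1}\hat\alpha,
\end{align*}

recalling that $\hat\alpha = \bar r - \hat\beta\bar r_m$.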
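The identity behind (29) holds exactly in every sample, so it can be confirmed numerically. The sketch below uses simulated data with hypothetical parameters (all values are assumptions for illustration) and compares the quadratic form in the estimated intercepts with the difference of squared ex post Sharpe ratios.

```python
import numpy as np

# Simulated data with hypothetical parameters (illustrative only).
rng = np.random.default_rng(1)
T, N = 200, 3
rm = rng.normal(0.006, 0.05, size=T)
r = 0.001 + np.outer(rm, [0.7, 1.0, 1.3]) + rng.normal(0.0, 0.03, size=(T, N))

# Market model estimates, equations (14), (15) and (20).
rm_bar, sigma2_m = rm.mean(), rm.var()
beta = (rm - rm_bar) @ r / (T * sigma2_m)
alpha = r.mean(axis=0) - beta * rm_bar
resid = r - alpha - np.outer(rm, beta)
Sigma = resid.T @ resid / T

# Ex post efficient portfolio formed from the market proxy and the N assets.
full = np.column_stack([rm, r])
mean_full = full.mean(axis=0)                         # [mean of r_m, mean of r]
V = np.cov(full, rowvar=False, bias=True)             # divides by T, matching (30)
theta_star_sq = mean_full @ np.linalg.solve(V, mean_full)   # equation (32)
theta_m_sq = rm_bar**2 / sigma2_m                     # squared Sharpe ratio of the proxy

lhs = alpha @ np.linalg.solve(Sigma, alpha)           # alpha' Sigma^{-1} alpha
print(np.isclose(lhs, theta_star_sq - theta_m_sq))
```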

Relation between F and LR tests

• The finite–sample F test can also be inter-

preted as a likelihood ratio test.

• To see this, first note that for the uncon-

strained MLE of β, denoted by β1,

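\begin{align*}
\hat\beta_1 &= \frac{\sum_t (r_{m,t} - \bar r_m)\,r_t}{T\hat\sigma_m^2} = \frac{\sum_t r_{m,t}\,r_t - \bar r_m\sum_t r_t}{T\hat\sigma_m^2} \\
&= \frac{\overline{r_m^2}}{\hat\sigma_m^2}\,\hat\beta_0 - \frac{\bar r_m\,\bar r}{\hat\sigma_m^2} \\
&= \frac{\overline{r_m^2}}{\hat\sigma_m^2}\,\hat\beta_0 - \frac{\bar r_m}{\hat\sigma_m^2}\,(\bar r - \hat\beta_1\bar r_m) - \frac{\bar r_m^2}{\hat\sigma_m^2}\,\hat\beta_1 \\
&= \frac{\overline{r_m^2}}{\hat\sigma_m^2}\,\hat\beta_0 - \frac{\bar r_m}{\hat\sigma_m^2}\,\hat\alpha - \frac{\bar r_m^2}{\hat\sigma_m^2}\,\hat\beta_1.
\end{align*}

Rearranging and using the basic identity

$$\hat\sigma_m^2 = \overline{r_m^2} - \bar r_m^2 = \frac{1}{T}\sum_{t=1}^{T} r_{m,t}^2 - \bar r_m^2,$$

shows that

$$\hat\beta_0 = \hat\beta_1 + \frac{\bar r_m}{\hat\sigma_m^2 + \bar r_m^2}\,\hat\alpha. \tag{34}$$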

Inserting (34) into Σ0 (see equation (23))

and noting that the normal equations (12)

and (13) imply

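$$\sum_{t=1}^{T}(r_t - \hat\alpha - \hat\beta_1 r_{m,t})\left(1 - \frac{\bar r_m\,r_{m,t}}{\bar r_m^2 + \hat\sigma_m^2}\right)\hat\alpha' = 0,$$

we arrive at

$$\hat\Sigma_0 = \hat\Sigma_1 + \left(\frac{\hat\sigma_m^2}{\bar r_m^2 + \hat\sigma_m^2}\right)\hat\alpha\hat\alpha'. \tag{35}$$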

From our discussion of the Common Cor-

relation Model, we are already familiar with

the Sherman–Morrison formula for the de-

terminant, |A+uv′| = |A|(1+v′A−1u), which,

applied to (35), gives

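$$|\hat\Sigma_0| = |\hat\Sigma_1|\left[1 + \frac{\hat\sigma_m^2}{\bar r_m^2 + \hat\sigma_m^2}\,\hat\alpha'\hat\Sigma_1^{-1}\hat\alpha\right].$$

Thus, (25) may be written as

\begin{align*}
LR &= T\log\frac{|\hat\Sigma_0|}{|\hat\Sigma_1|} = T\log\left[1 + \frac{\hat\sigma_m^2}{\bar r_m^2 + \hat\sigma_m^2}\,\hat\alpha'\hat\Sigma_1^{-1}\hat\alpha\right] \\
&= T\log\left[\frac{N}{T-N-1}\,J + 1\right],
\end{align*}

where J is the F–statistic given by (27), or, equivalently,

$$J = \frac{T-N-1}{N}\left[\exp\left(\frac{LR}{T}\right) - 1\right], \tag{36}$$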

which, as (36) is a monotonic transforma-

tion of LR, shows that J may also be inter-

preted as a likelihood ratio test.

• As the F–test based on (27) is exact, it is,

for realistic sample sizes, clearly preferable

compared to the likelihood ratio test relying

on asymptotic arguments.
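Relations (34), (35) and (36) are exact algebraic identities, so they can be verified on any data set. The sketch below does this on simulated data; all parameter values and variable names are assumptions made for illustration.

```python
import numpy as np

# Simulated data with hypothetical parameters (illustrative only).
rng = np.random.default_rng(4)
T, N = 100, 3
rm = rng.normal(0.005, 0.04, size=T)
r = 0.002 + np.outer(rm, [0.8, 1.0, 1.2]) + rng.normal(0.0, 0.02, size=(T, N))

rm_bar, sigma2_m = rm.mean(), rm.var()

# Unconstrained estimates (alpha free).
beta1 = (rm - rm_bar) @ r / (T * sigma2_m)
alpha = r.mean(axis=0) - beta1 * rm_bar
resid1 = r - alpha - np.outer(rm, beta1)
Sigma1 = resid1.T @ resid1 / T

# Constrained estimates (alpha = 0), equations (22) and (23).
beta0 = rm @ r / (rm @ rm)
resid0 = r - np.outer(rm, beta0)
Sigma0 = resid0.T @ resid0 / T

c = sigma2_m / (rm_bar**2 + sigma2_m)                   # equals (1 + rm_bar^2 / sigma2_m)^(-1)
beta0_from_34 = beta1 + rm_bar / (sigma2_m + rm_bar**2) * alpha      # equation (34)
Sigma0_from_35 = Sigma1 + c * np.outer(alpha, alpha)                  # equation (35)

LR = T * (np.linalg.slogdet(Sigma0)[1] - np.linalg.slogdet(Sigma1)[1])   # equation (25)
J = (T - N - 1) / N * c * (alpha @ np.linalg.solve(Sigma1, alpha))       # equation (27)
J_from_LR = (T - N - 1) / N * (np.exp(LR / T) - 1.0)                     # equation (36)

print(np.allclose(beta0, beta0_from_34),
      np.allclose(Sigma0, Sigma0_from_35),
      np.isclose(J, J_from_LR))
```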

Cross-Sectional Regressions

• The classical tests (with Fama and MacBeth (1973)∗∗ one of the most prominent

examples) of the CAPM concentrated on the

implication of the model that “beta” com-

pletely captures the variation of expected ex-

cess returns.

• To outline the procedure, assume initially

that the betas are known.

• Then, for each point of time, t, we esti-

mate a cross-sectional regression, where we

regress the excess returns of the N assets

on the betas,

$$r_{i,t} = \gamma_{0,t} + \gamma_{1,t}\beta_i + \varepsilon_{i,t}, \quad i = 1, \ldots, N. \tag{37}$$

This gives us estimates $\hat\gamma_{0,t}$ and $\hat\gamma_{1,t}$.

∗∗Fama, E. and J. MacBeth (1973). Risk, Return, and Equilibrium: Empirical Tests. Journal of Political Economy 81, 607-636.

• Running the regression (37) for t = 1, . . . , T, we obtain the time series of regression coefficients, $\{\hat\gamma_{0,t}\}_{t=1,\ldots,T}$ and $\{\hat\gamma_{1,t}\}_{t=1,\ldots,T}$.

• Implications of the CAPM are E(γ0) = 0 and

E(γ1) > 0 (positive risk premium), which

can be tested by standard t–tests, assuming

that the returns are normally distributed.

• Note that, in the regression (37), further

variables can be included to investigate whether

beta completely captures cross-sectional vari-

ation in expected excess returns. For exam-

ple, we can consider the extended version of

(37)

$$r_{i,t} = \gamma_{0,t} + \gamma_{1,t}\beta_i + \gamma_{2,t}\beta_i^2 + \gamma_{3,t}\sigma_i + \varepsilon_{i,t}, \quad i = 1, \ldots, N. \tag{38}$$

Here, with the inclusion of the squared βi,

one can test whether there is a nonlinear

relationship between risk and expected (ex-

cess) return (i.e., the security market line is

nonlinear), while the σi represents the firm–

specific risk, which, according to the CAPM,

does not produce excess return. Thus, in

(38), we test for E(γ0) = E(γ2) = E(γ3) =

0.

• The cross-sectional regression methodology

has the drawback that, in practice, the betas

are not known but have to be estimated.

• This introduces an errors-in-variables prob-

lem which gives rise to biased estimators of

the γi’s in the second-pass regression (37)

or (38).
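The cross-sectional procedure built on (37) can be sketched as follows, under the simplifying assumption made in the text that the betas are known. The function name `fama_macbeth`, the simulated data and the t-statistic construction (time-series mean of the coefficients divided by its standard error) are assumptions made here for illustration, not a reproduction of the original study.

```python
import numpy as np

def fama_macbeth(r, beta):
    """Period-by-period cross-sectional regressions (37) with known betas.

    r    : (T, N) array of asset excess returns
    beta : (N,)   array of betas, treated as known
    """
    T, N = r.shape
    X = np.column_stack([np.ones(N), beta])               # (N, 2) regressors: constant and beta
    # A (N, T) right-hand side runs all T cross-sectional regressions at once.
    gammas = np.linalg.lstsq(X, r.T, rcond=None)[0].T     # (T, 2): gamma_0t, gamma_1t
    gamma_bar = gammas.mean(axis=0)                       # time-series means
    se = gammas.std(axis=0, ddof=1) / np.sqrt(T)          # standard errors of the means
    t_stats = gamma_bar / se
    return gammas, gamma_bar, t_stats

# Simulated data with hypothetical parameters (illustrative only).
rng = np.random.default_rng(5)
T, N = 240, 10
beta = np.linspace(0.5, 1.5, N)
rm = rng.normal(0.006, 0.04, size=T)
r = np.outer(rm, beta) + rng.normal(0.0, 0.02, size=(T, N))

gammas, gamma_bar, t_stats = fama_macbeth(r, beta)
print(gamma_bar, t_stats)
```

In practice the betas would come from first-pass time series regressions, which is the source of the errors-in-variables problem noted above.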

Roll’s Critique

• Roll (1977)†† emphasizes that tests of the

CAPM really only reject the mean–variance

efficiency of the market proxy we use in the

test (recall equation (29)).

• This implies that the CAPM is essentially

untestable, because “the theory is not testable

unless the exact composition of the true mar-

ket portfolio is known and used in the tests.

This implies that the theory is not testable

unless all individual assets are included in the

sample”.

• Roll argues that using a proxy for the market

portfolio is subject to two difficulties: “First,

the proxy itself might be mean–variance ef-

ficient even when the true market portfolio

††R. Roll (1977). A Critique of the Asset Pricing Theory’s Tests. Part I: On Past and Potential Testability of the Theory. Journal of Financial Economics 4, 129-176.

is not. This is a real danger since every

sample will display efficient portfolios that

satisfy perfectly all of the theory’s implica-

tions. (...) On the other hand, the chosen

proxy may turn out to be inefficient; but ob-

viously, this alone implies nothing about the

true market portfolio’s efficiency”.

• Thus, what we essentially test is a joint hypothesis: the CAPM and the hypothesis that the portfolio used in the tests as the market proxy is the true market portfolio.

• When we reject this hypothesis, we can con-

clude that either

(a) the CAPM is false, or

(b) the portfolio used was not the true mar-

ket portfolio, or

(c) both (a) and (b).

• Clearly, it is extremely difficult to measure

the “market portfolio”, because this entity

can, in principle, include not just traded fi-

nancial assets, but also consumer durables,

real estate, and human capital.

Homework:

Read the following paper:

http://gsbwww.uchicago.edu/fac/finance/

papers/capm2004a.pdf.

You may also consider this one:

http://minneapolisfed.org/research/QR/QR1941.pdf

References

• Campbell/Lo/MacKinlay (1997). The Econo-

metrics of Financial Markets. Princeton Uni-

versity Press: Princeton.

• Greene, W. H. (2003). Econometric Analy-

sis, Fifth Edition. Upper Saddle River: Pren-

tice Hall.

• Theil, H. (1971). Principles of Economet-

rics. Amsterdam: John Wiley & Sons.

• Zellner, A. (1971). Introduction to Bayesian

Inference in Econometrics. New York: John

Wiley & Sons.
