Lecture 9: Markov Switching Models
Prof. Massimo Guidolin
20192 – Financial Econometrics
Winter/Spring 2018


Overview


Defining a Markov Switching VAR model

Structure and mechanics of Markov Switching: from univariate to multivariate models

Understanding MS models through simulations

MS models as normal mixtures

The properties of Markov chain in MS models

Filtered, smoothed, and predicted state probabilities

Motivation and Introduction

One of the worst problems plaguing econometric models (regressions, ARMA, VAR, GARCH, etc.) is their instability: the estimated parametric relations can change suddenly over time
o Well known to damage the effectiveness of economic policy
o Also worrisome in financial forecasting and risk management

Four approaches/reactions:
① “Happy go lucky”: ignore it and hope for the best (dashed red lines)
② Test for breaks and shifts, and use only the data after the most recent break

Definition of a Markov Switching VAR Model

③ Use rolling window (RW) estimation schemes
o Although popular, RW schemes are optimal only under specific assumptions on how instability occurs
o An RW scheme is optimal only when there is a break in every period... odd!
④ Model and forecast instability, when it recurs in the form of regimes

In regime switching models (RSM), state variables govern how some or all of the parameters of a time series framework may change over time

In a specific type of RSM, Markov switching models (MSM), the state is latent and follows a simple (finite-state) Markov chain (MC)
o MC process = N-branch tree in which the probabilities depend on a finite history

Structure of Markov Switching VAR Models

o When M = 1 (first-order MC), we call the K×K matrix P collecting the probabilities $p_{ij} = \Pr(S_{t+1} = j \mid S_t = i)$ (the probability of switching from state i to state j) the transition matrix of the K-state Markov process

Regimes are unobservable (latent) ⟹ even with unlimited time series information, estimation never reveals the actual, true state $S_{t+1}$
o The same sample data concerning the N variables in $y_t$ are also used to produce inferences on the sample path followed by $S_t$
o Although intuition will be sought, no attempt is made to provide a formal model of the reasons why regime changes occur or to explain the timing of such changes
o Assume the absence of roots outside the unit circle in all regimes
o In the definition, the N×1 vector $\mu_{S_{t+1}}$ collects the N regime-dependent intercepts, while the p N×N vector autoregressive matrices $A_{j,S_{t+1}}$ (j = 1, …, p) capture regime-dependent VAR effects
o With p VAR lags and K regimes, there are a total of pK matrices to deal with, each potentially containing (absent restrictions) $N^2$ parameters

o The (lower triangular) matrix multiplying the shocks is the factor applicable to latent state $S_{t+1}$ in a state-dependent Cholesky factorization of the regime-specific covariance matrix
o Conditionally on the state, MSIVARH(K, p) defines a standard Gaussian reduced-form VAR(p); this is the case when $S_{t+1}$ is treated as given and observable (which, of course, we shall not do)
o In applications, K = 2 tends to be common, although the choice is not compelling
o Especially with daily/weekly series, it is common to find support for MSIH(K) models (to be precise, MSIH(K, 0))
o p = 0 may work at all frequencies because, when $K \geq 2$, it is possible that the need for $p \geq 1$ in single-state VARs arises from the omission of regimes
o The general model simplifies in univariate applications, when N = 1, to an MSIARH(K, p) model
o For instance, consider monthly data (sample 1986:01-2016:12) on US and Japanese excess equity returns (denominated in US dollars) and the rate of change in the VXO index, so that N = 3
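The model described in these bullets can be sketched as follows, assuming the standard MSIVARH(K, p) specification. The symbols $\Omega^{1/2}_{S_{t+1}}$ (regime-specific Cholesky factor), $\varepsilon_{t+1}$ (standardized shocks), and $\phi_{j,S_{t+1}}$, $\sigma_{S_{t+1}}$ (univariate AR coefficients and volatility) are assumed notation, not taken from the slides.

```latex
% MSIVARH(K, p): intercepts, VAR matrices, and covariance all depend on the regime
y_{t+1} = \mu_{S_{t+1}} + \sum_{j=1}^{p} A_{j,S_{t+1}}\, y_{t+1-j}
          + \Omega^{1/2}_{S_{t+1}} \varepsilon_{t+1},
\qquad \varepsilon_{t+1} \sim \text{IID } N(0, I_N),
\qquad S_{t+1} \in \{1, \ldots, K\}

% Univariate special case (N = 1): MSIARH(K, p)
y_{t+1} = \mu_{S_{t+1}} + \sum_{j=1}^{p} \phi_{j,S_{t+1}}\, y_{t+1-j}
          + \sigma_{S_{t+1}} \varepsilon_{t+1},
\qquad \varepsilon_{t+1} \sim \text{IID } N(0, 1)
```

Setting p = 0 removes the autoregressive sums and leaves the MSIH(K, 0) case mentioned above.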

One Application to International Equity Returns

o For each series, analyze which type of first-order MSIARH(K, p) model is most appropriate
[Table not reproduced; visible labels: Constant Expected Returns Model, ARMA Models]

The table for Japanese excess returns is in Appendix A

o The precise model favored by each information criterion may differ across series; in the end, a simple heteroskedastic MSIH(2,0) model with no AR components is always picked by BIC, for all three series
o For the transition probabilities, p-values are possible but trickier to obtain
o Stock markets feature typical bull (high risk premia and low volatility) and bear (low or even zero risk premia, in the sense of not being statistically significant, and high volatility) phases
o Both regimes are persistent ⟹ for both the US and Japan, $\Pr(S_t = \text{bull} \mid S_{t-1} = \text{bull})$ is between 0.97 and 0.98, and $\Pr(S_t = \text{bear} \mid S_{t-1} = \text{bear})$ is between 0.97 and 0.99
[Table not reproduced; visible labels: P-values, Non-persistent]

o In the case of VXO, state-dependent means are never precisely estimated; in the second regime, the volatility of implied volatility is almost double that in the first regime
o When VIX-like volatility falls, it does so slowly, following a low-variability path; when it increases, it does so erratically
o Given that all the individual series contain regimes, how many Markov states should we expect when the series are jointly modeled?
o It is naïve to expect K = 2, because the univariate state probability series above are not sufficiently “synchronized”
o Their sample Spearman rank correlations are 0.46, 0.05, and 0.13


o AIC and Hannan-Quinn (H-Q) converge on the choice of a rather richly parameterized MSIVARH(3,1) ⟹ saturation ratio just above threshold

o Unsurprisingly, the number of regimes equals three, an attempt to accommodate the different features of the state processes


o While from any of the 3 regimes it is possible to switch to any other, there is one exception: $\Pr(S_t = 1 \mid S_{t-1} = 2)$ is estimated to be 0

For large N, MSVAR models are often richly parameterized, with a total number of parameters of $K[N + pN^2 + N(N+1)/2] + K(K-1)$
o K(K – 1) is the number of elements that can be estimated from the transition matrix, once the constraint that each row sums to one is taken into account
o For instance, for K = 2, N = 8, and p = 1 (not such an extreme case, see e.g., Guidolin and Ono, 2006), this implies the estimation of 218 parameters ⟹ less than recommendable saturation ratios are possible
o ML estimation may pose serious numerical as well as statistical problems: (i) the log-likelihood may present flat regions, so that convergence of standard algorithms becomes impossible; (ii) identification issues may appear ⟹ numerical algorithms may get “confused”
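The parameter count can be checked with a short sketch. It assumes that each regime carries N intercepts, pN² VAR coefficients, and N(N+1)/2 distinct covariance terms, plus K(K-1) free transition probabilities; the function name `msvar_param_count` is hypothetical. Under these assumptions the count reproduces the 218 parameters quoted for K = 2, N = 8, p = 1.

```python
def msvar_param_count(K: int, N: int, p: int) -> int:
    """Number of free parameters in an unrestricted MSIVARH(K, p) model.

    Assumed breakdown (per regime): N intercepts, p*N*N VAR coefficients,
    N*(N+1)/2 distinct covariance terms; plus K*(K-1) transition probabilities.
    """
    intercepts = N
    var_coefficients = p * N * N
    covariance_terms = N * (N + 1) // 2
    per_regime = intercepts + var_coefficients + covariance_terms
    transition_probs = K * (K - 1)  # each row of P sums to one
    return K * per_regime + transition_probs


if __name__ == "__main__":
    print(msvar_param_count(K=2, N=8, p=1))  # 218
```

Because the VAR term grows with N², adding variables inflates the count much faster than adding regimes.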

Simulating from MS Models

Consider the simple case of a univariate MSIARH model. We use sets of 1000 identical simulated shocks to better understand what such a model can do in terms of the plausibility of the resulting time series
o When possible, “calibrate” the selected parameters to US monthly data

o The simulated MSI(2) series has an unconditional mean of 6% and a volatility of 19% per year, the same as the Gaussian IID series
o Most observers would detect the presence of “more structure” in the rightmost vs. the leftmost plot, but could not exactly identify an MSI(2)
o Some additional “variability” would be guessed, but this would be incorrect, as the 2 series are generated to have identical variance
o The key driver of the appearance of the simulations is the persistence of the Markov chain
[Figure not reproduced; panels: K = 1, K = 2; simulated regimes]
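A simulation of this kind can be sketched in a few lines. This is a minimal illustration, not the code behind the slides: the function name `simulate_msih` and all parameter values are hypothetical, and the model is a univariate MSIH(K, 0) with regime-specific mean and volatility and a first-order Markov chain.

```python
import random


def simulate_msih(T, mu, sigma, P, seed=0):
    """Simulate T observations from a univariate MSIH(K, 0) model.

    mu[k], sigma[k]: regime-specific mean and volatility (illustrative values).
    P[i][j]: probability of switching from state i to state j (rows sum to one).
    The chain is started in state 0 before the first transition draw.
    """
    rng = random.Random(seed)
    K = len(mu)
    state = 0
    states, y = [], []
    for _ in range(T):
        # Draw the next state from row `state` of the transition matrix
        u, cumulative, next_state = rng.random(), 0.0, K - 1
        for j in range(K):
            cumulative += P[state][j]
            if u < cumulative:
                next_state = j
                break
        state = next_state
        states.append(state)
        # Gaussian shock scaled by the regime-specific volatility
        y.append(mu[state] + sigma[state] * rng.gauss(0.0, 1.0))
    return states, y


if __name__ == "__main__":
    # Hypothetical monthly calibration: persistent bull and bear regimes
    P = [[0.95, 0.05], [0.10, 0.90]]
    states, y = simulate_msih(1000, mu=[0.01, -0.01], sigma=[0.03, 0.07], P=P)
    print(sum(states) / len(states))  # fraction of time spent in state 1
```

Raising or lowering the diagonal entries of `P` while holding the other parameters fixed is a simple way to see how the persistence of the chain shapes the simulated path.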

o On the right, the series has an appearance that may remind some readers of the occurrence of frequent (negative) “jumps” in returns
[Figure not reproduced; visible label: Leverage]

o In the rightmost plot, the unit root is eventually bent to stationarity by the “mixing” provided by the ergodic Markov chain
o The leftmost plot shows highly visible intercept switches, around which we find the typical “no structure” pattern of white noise
o In the rightmost plot, with the same (low) nonlinear persistence, the presence of near-unit roots in each of the two regimes becomes visible and tends to cloud the fact that there are frequent regime shifts

MS Models Are Normal Mixtures

A mixture of normal densities is a weighted sum of normal densities, in which the weights are themselves random variables and may change over time
o In the case of MS, the weights are the random state probabilities, which change over time
o Mixtures of normal distributions provide a flexible family that can be used to approximate many distributions, capturing skewness and excess kurtosis as sources of non-normality (even multi-modality)
[Figure not reproduced; panels: Skewness > 0, Skewness < 0, Excess kurtosis > 0]
o E.g., in an MS model, the variance is not simply the average of the variances across the two regimes: differences in means also have an effect, because the switch to a new regime contributes to volatility
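The point about the variance can be made concrete with the standard moments of a finite normal mixture: the variance is the weighted average of the regime variances plus a term driven by the dispersion of the regime means. The sketch below assumes fixed weights (for instance, ergodic state probabilities); the function name `mixture_mean_var` and the numbers are hypothetical.

```python
def mixture_mean_var(weights, means, sigmas):
    """Mean and variance of a finite mixture of normals with fixed weights.

    variance = sum_k w_k * sigma_k^2          (within-regime part)
             + sum_k w_k * (mu_k - mean)^2    (between-regime part)
    """
    mean = sum(w * m for w, m in zip(weights, means))
    within = sum(w * s ** 2 for w, s in zip(weights, sigmas))
    between = sum(w * (m - mean) ** 2 for w, m in zip(weights, means))
    return mean, within + between


if __name__ == "__main__":
    # Two equally likely regimes with unit variance and means +1 and -1
    mean, var = mixture_mean_var([0.5, 0.5], [1.0, -1.0], [1.0, 1.0])
    print(mean, var)  # 0.0 2.0
```

In this example the average regime variance is 1, yet the mixture variance is 2: the extra unit comes entirely from the gap between the regime means.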

MS Regressions

It is useful to generalize the framework, when specialized to N = 1, to a more general form that also includes exogenous, fixed predictors and predictors whose coefficients do not follow an MS process; this is an MS regression
o Let’s forecast monthly Japanese excess aggregate stock returns using one lag of the same, one lag of US excess stock returns, and one lag of S&P 100 implied volatility
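One way to write such an MS regression is sketched below. The notation is assumed rather than taken from the slides: $x_t$ collects the predictors whose coefficients switch with the regime, $z_t$ the predictors with regime-invariant coefficients, and $\sigma_{S_{t+1}}$ the regime-specific standard error of the regression.

```latex
y_{t+1} = x_t' \beta_{S_{t+1}} + z_t' \gamma + \sigma_{S_{t+1}} \varepsilon_{t+1},
\qquad \varepsilon_{t+1} \sim \text{IID } N(0, 1),
\qquad S_{t+1} \in \{1, \ldots, K\}
```

Restricting $\beta_{S_{t+1}}$ to be constant across regimes while letting $\sigma_{S_{t+1}}$ switch gives a regression with time-invariant coefficients and an MS standard error.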

MS Regressions: One Example

o All criteria unanimously select an MS regression in which all coefficients are time-invariant but the standard error of the regression is MS
o Extensions to 3 regimes are rejected by the information criteria
[Tables not reproduced; panels: Standard regression, 2-state MS regression]
o What makes the MS regression superior to a simple regression is the regime shifts in the standard errors that, as we expect when heteroskedasticity is dealt with, allow us to obtain more precise estimates

MS models are defined as driven by a discrete Markov state that is latent, ergodic, and irreducible

Although they can be generalized, most MS models are estimated assuming a homogeneous, first-order Markov chain

Markov Chain Processes in MS Models

① Homogeneous: the probability of a transition to state j does not depend on past values of y
② First-order: all the memory of the past of the series is retained by just one lag of $S_t$
o $S_t$ is latent because it cannot be extracted from the data with perfect precision; at most, the time series of the states may be inferred from the observed, available data
o Ergodicity ⟹ existence of a stationary K×1 vector of probabilities that is left unchanged when pre-multiplied by the transposed transition matrix P′; these are called ergodic or long-run unconditional probabilities
o All the information needed to compute the ergodic probabilities is in the transposed transition matrix
o If you start the system from a configuration of state probabilities equal to the ergodic ones, then your prediction for the probabilities of the regimes one period forward is identical ⟹ the MS model has reached a steady state
o It is the entire matrix P that matters in computing the ergodic probabilities, not only the values on its main diagonal
o Given estimates of the “stayer” probabilities $\hat{p}_{kk}$, the average estimated duration of regime k is $1/(1 - \hat{p}_{kk})$
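The steady-state property suggests a simple way to compute ergodic probabilities: start from any probability vector and keep applying the transition matrix until the vector stops changing. The sketch below uses this fixed-point iteration instead of an eigenvector routine, and assumes an ergodic chain with no absorbing state; the function names and the example matrix are hypothetical.

```python
def ergodic_probs(P, tol=1e-12, max_iter=100_000):
    """Ergodic probabilities by iterating pi_new[j] = sum_i pi[i] * P[i][j].

    P[i][j] is the probability of moving from state i to state j.
    Assumes the chain is ergodic, so the iteration converges.
    """
    K = len(P)
    pi = [1.0 / K] * K
    for _ in range(max_iter):
        new = [sum(pi[i] * P[i][j] for i in range(K)) for j in range(K)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


def expected_durations(P):
    """Average regime durations 1 / (1 - p_kk); requires every p_kk < 1."""
    return [1.0 / (1.0 - P[k][k]) for k in range(len(P))]


if __name__ == "__main__":
    P = [[0.9, 0.1], [0.2, 0.8]]  # illustrative values only
    print(ergodic_probs(P))
    print(expected_durations(P))
```

The iteration uses every entry of `P`, whereas the durations depend only on its main diagonal, which mirrors the distinction drawn in the bullets above.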

o The j-th element of the ergodic probability vector can also be interpreted as the average, long-run fraction of time that the MC spends in regime j
o Irreducibility of an MC implies that all ergodic probabilities are strictly positive, meaning that all regimes are possible and remain possible over time, and no absorbing states or cycles among states exist
o When K = 3, a transition matrix in which $p_{13} = p_{23} = 0$ implies that it is impossible to reach state 3 from the other two states
o As soon as one leaves regime 3, which will occur almost surely if $p_{33} < 1$, it becomes impossible ever to return to state 3
o The third element of the ergodic probability vector will then have to be 0
o The lecture notes show that the ergodic probability vector is the eigenvector of the transposed transition matrix P′ associated with the unit eigenvalue; there is always a unit eigenvalue because the rows of P sum to 1
o For instance, for an example P [matrix not reproduced], the eigenvalues are 1, 0.87, and 0.74 and the first eigenvector is [0.393 0.883 0.166]′

o This eigenvector is not yet the vector of ergodic probabilities because its entries do not sum to one
o It is now sufficient to rescale the eigenvector ⟹ simply divide its entries by their sum, 1.4424, resulting in ergodic probabilities of [0.27 0.61 0.12]′
o In the special case of K = 2, one obtains explicit solutions for the ergodic probabilities: $(1 - p_{22})/(2 - p_{11} - p_{22})$ for regime 1 and $(1 - p_{11})/(2 - p_{11} - p_{22})$ for regime 2
o In our earlier international equity return application, the estimated transition matrix is: [matrix not reproduced]
o As one would have expected from its persistence, the tri-variate system spends on average almost 60% of the time in the second regime
o However, in spite of their very low persistence, regimes 1 and 3 also occur on average 16% and 25% of the time; these positive rates at which they are visited are helped by the fact that regimes 1 and 3 also “communicate with each other”
o The average durations of the three regimes are 1.7, 6.3, and 1.4 months
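The rescaling step and the two-state closed form are easy to verify numerically. The sketch below takes the eigenvector entries quoted in the slides as given (the underlying matrix is not available here) and uses hypothetical function names; the two-state inputs 0.98 and 0.97 are illustrative values only.

```python
def normalize_to_probabilities(vec):
    """Rescale a non-negative vector so that its entries sum to one."""
    total = sum(vec)
    return [v / total for v in vec]


def two_state_ergodic(p11, p22):
    """Closed-form ergodic probabilities for a 2-state chain (p11 + p22 < 2)."""
    denom = 2.0 - p11 - p22
    return [(1.0 - p22) / denom, (1.0 - p11) / denom]


if __name__ == "__main__":
    eigvec = [0.393, 0.883, 0.166]  # eigenvector entries quoted in the slides
    probs = normalize_to_probabilities(eigvec)
    print([round(p, 2) for p in probs])  # [0.27, 0.61, 0.12]
    print(two_state_ergodic(0.98, 0.97))
```

With the rounded entries the sum is 1.442, which matches the 1.4424 reported in the slides up to rounding.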

Inference on the State Process in MS Models

Several types of inferences on the state $S_t$ can be derived from MS models
o The fact that one needs to use filtered and smoothed probabilities to extract inferences concerning the dynamics of regimes over time (technically, concerning the sample path followed by $S_t$) derives from the latent nature of regimes in an MS model
o The notion of predicted state probabilities is instead useful in forecasting problems
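The three kinds of state probabilities differ only in the information set they condition on. The definitions below are a sketch in assumed notation: $\mathcal{F}_t$ denotes the data observed up to time t, T is the sample size, and $\hat{\xi}_{t|s}$ is the K×1 vector stacking $\Pr(S_t = k \mid \mathcal{F}_s)$.

```latex
% Filtered: real-time inference using data up to t
\Pr(S_t = k \mid \mathcal{F}_t)

% Smoothed: full-sample inference, with T \geq t
\Pr(S_t = k \mid \mathcal{F}_T)

% Predicted: h-step-ahead forecast; for a homogeneous first-order chain
\Pr(S_{t+h} = k \mid \mathcal{F}_t),
\qquad \hat{\xi}_{t+h|t} = (P')^{h}\, \hat{\xi}_{t|t}
```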

o The filtered probabilities are the product of a limited-information recursion
o Once the full sequence of filtered probabilities has been calculated, the lecture notes describe an algorithm by Kim to recover the sequence of smoothed probabilities
o The difference between filtered and smoothed probabilities is similar to asking:
(i) Given what I know about the weather in the past few weeks, what is the chance of recording a high temperature today (also given observed conditions today)? This requires a real-time, recursive assessment
(ii) Given the information on the weather in the past 12 months and up to today, what was the chance of a high temperature being recorded 4 months ago? This requires a full-information, but backward-looking assessment
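The two recursions can be sketched for a univariate MSIH(K, 0) model with known parameters. This is a minimal illustration under stated assumptions, not the estimation code behind the slides: the forward pass is the usual prediction and Bayes-update filter, the backward pass is the standard form of the Kim smoother, and all function names, data, and parameter values are hypothetical.

```python
import math


def normal_pdf(y, mu, sigma):
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def hamilton_filter(y, mu, sigma, P, init):
    """Forward recursion: returns (predicted, filtered) probabilities per date.

    predicted[t][j] = Pr(S_t = j | data up to t-1)
    filtered[t][j]  = Pr(S_t = j | data up to t)
    """
    K = len(mu)
    predicted, filtered, prev = [], [], list(init)
    for obs in y:
        # Prediction step: push last period's filtered probabilities through P
        pred = [sum(prev[i] * P[i][j] for i in range(K)) for j in range(K)]
        # Update step: Bayes' rule with the regime-specific Gaussian densities
        joint = [pred[j] * normal_pdf(obs, mu[j], sigma[j]) for j in range(K)]
        total = sum(joint)
        filt = [v / total for v in joint]
        predicted.append(pred)
        filtered.append(filt)
        prev = filt
    return predicted, filtered


def kim_smoother(predicted, filtered, P):
    """Backward recursion from the last filtered probabilities."""
    T, K = len(filtered), len(P)
    smoothed = [None] * T
    smoothed[-1] = list(filtered[-1])  # the two coincide at the sample end
    for t in range(T - 2, -1, -1):
        smoothed[t] = [
            filtered[t][i]
            * sum(P[i][j] * smoothed[t + 1][j] / predicted[t + 1][j] for j in range(K))
            for i in range(K)
        ]
    return smoothed


if __name__ == "__main__":
    y = [0.5, -2.0, 0.1, 3.0, -0.3]  # made-up data
    P = [[0.9, 0.1], [0.2, 0.8]]
    pred, filt = hamilton_filter(y, [1.0, -1.0], [1.0, 2.0], P, [2 / 3, 1 / 3])
    smooth = kim_smoother(pred, filt, P)
    print(filt[-1], smooth[-1])
```

The filter only looks backward in time, which is why it suits real-time use, while the smoother revises each date's probabilities using the rest of the sample.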

The State Process in MS Models: One Example

o In finance, we operate in real time and focus on forecasting, so we tend to care more about filtered probabilities than smoothed ones
o The two concepts coincide by construction at the end of the sample
o In our example concerning monthly Japanese excess stock returns:
[Figure not reproduced; panels: Filtered and smoothed probs., Predicted probs.]

Appendix A: Model Selection for Japanese Data
