State Space Time Series Analysis · 2011-02-01 · Kalman Filter • The Kalman ﬁlter calculates the mean and variance of the unobserved state, given the observations. • The state

State Space Time Series Analysis

Siem Jan Koopman

http://staff.feweb.vu.nl/koopman

Department of Econometrics

VU University Amsterdam

Tinbergen Institute

2011

State Space Time Series Analysis – p. 1

Classical Decomposition

A basic model for representing a time series is the additive model

yt = µt + γt + εt, t = 1, . . . , n,

also known as the Classical Decomposition.

yt = observation,

µt = slowly changing component (trend),

γt = periodic component (seasonal),

εt = irregular component (disturbance).

In a Structural Time Series Model (STSM) or Unobserved ComponentsModel (UCM), the RHS components are modelled explicitly asstochastic processes.


Local Level Model

• Components can be deterministic functions of time (e.g.polynomials), or stochastic processes;

• Deterministic example: yt = µ+ εt with εt ∼ NID(0, σ2ε).

• Stochastic example: the Random Walk plus Noise, orLocal Level model:

yt = µt + εt, εt ∼ NID(0, σ2ε)

µt+1 = µt + ηt, ηt ∼ NID(0, σ2η),

• The disturbances εt, ηs are independent for all s, t;• The model is incomplete without a specification for µ1 (note the

non-stationarity):µ1 ∼ N (a, P )


Local Level Model

yt = µt + εt, εt ∼ NID(0, σ2ε)

µt+1 = µt + ηt, ηt ∼ NID(0, σ2η),

µ1 ∼ N (a, P )

• The level µt and the irregular εt are unobserved;

• Parameters: σ2ε , σ

2η;

• Trivial special cases:◦ σ2

η = 0 =⇒ yt ∼ NID(µ1, σ2ε) (WN with constant level);

◦ σ2ε = 0 =⇒ yt+1 = yt + ηt (pure RW);

• Local Level is a model representation for EWMA forecasting.


Local Linear Trend Model

The LLT model extends the LL model with a slope:

yt = µt + εt, εt ∼ NID(0, σ2ε),

µt+1 = βt + µt + ηt, ηt ∼ NID(0, σ2η),

βt+1 = βt + ξt, ξt ∼ NID(0, σ2ξ ).

• All disturbances are independent at all lags and leads;• Initial distributions β1, µ1 need to specified;

• If σ2ξ = 0 the trend is a random walk with constant drift β1; (For

β1 = 0 the model reduces to a LL model.)

• If additionally σ2η = 0 the trend is a straight line with slope β1 and

intercept µ1;

• If σ2ξ > 0 but σ2

η = 0, the trend is a smooth curve, or an IntegratedRandom Walk;


Trend and Slope in LLT Model

0 10 20 30 40 50 60 70 80 90 100

−2.5

0.0

2.5

5.0µ

0 10 20 30 40 50 60 70 80 90 100

−0.25

0.00

0.25

0.50

0.75 β


Trend and Slope in Integrated Random Walk Model

0 10 20 30 40 50 60 70 80 90 100

0

5

10 µ

0 10 20 30 40 50 60 70 80 90 100

−0.25

0.00

0.25

0.50

0.75 β


Local Linear Trend Model

• Reduced form of LLT is ARIMA(0,2,2);• LLT provides a model for Holt-Winters forecasting;• Smooth LLT provides a model for spline-fitting;• Smoother trends: higher order Random Walks

∆dµt = ηt


Seasonal Effects

We have seen specifications for µt in the basic model

yt = µt + γt + εt.

Now we will consider the seasonal term γt. Let s denote the number of‘seasons’ in the data:

• s = 12 for monthly data,• s = 4 for quarterly data,• s = 7 for daily data when modelling a weekly pattern.


Dummy Seasonal

The simplest way to model seasonal effects is by using dummyvariables. The effect summed over the seasons should equal zero:

γt+1 = −s−1∑

j=1

γt+1−j.

To allow the pattern to change over time, we introduce a newdisturbance term:

γt+1 = −s−1∑

j=1

γt+1−j + ωt, ωt ∼ NID(0, σ2ω).

The expectation of the sum of the seasonal effects is zero.


Trigonometric Seasonal

Defining γjt as the effect of season j at time t, an alternativespecification for the seasonal pattern is

γt =

[s/2]∑

j=1

γjt,

γj,t+1 = γjt cosλj + γ∗jt sinλj + ωjt,

γ∗j,t+1 = −γjt sinλj + γ∗jt cosλj + ω∗

jt,

ωjt, ω∗

jt ∼ NID(0, σ2ω), λj = 2πj/s.

• Without the disturbance, the trigonometric specification is identicalto the deterministic dummy specification.

• The autocorrelation in the trigonometric specification lasts throughmore lags: changes occur in a smoother way;


Seatbelt Law

70 75 80 85

7.0

7.1

7.2

7.3

7.4

7.5

7.6

7.7

7.8

7.9


State Space Model

Linear Gaussian state space model (LGSSM) is defined in three parts:

→ State equation:

αt+1 = Ttαt +Rtζt, ζt ∼ NID(0, Qt),

→ Observation equation:

yt = Ztαt + εt, εt ∼ NID(0, Ht),

→ Initial state distribution α1 ∼ N (a1, P1).

Notice that• ζt and εs independent for all t, s, and independent from α1;• observation yt can be multivariate;• state vector αt is unobserved;• matrices Tt, Zt, Rt, Qt, Ht determine structure of model.


State Space Model

• state space model is linear and Gaussian: therefore propertiesand results of multivariate normal distribution apply;

• state vector αt evolves as a VAR(1) process;• system matrices usually contain unknown parameters;• estimation has therefore two aspects:

◦ measuring the unobservable state (prediction, filtering andsmoothing);

◦ estimation of unknown parameters (maximum likelihoodestimation);

• state space methods offer a unified approach to a wide range ofmodels and techniques: dynamic regression, ARIMA, UC models,latent variable models, spline-fitting and many ad-hoc filters;

• next, some well-known model specifications in state space form ...


Regression with Time Varying Coefficients

General state space model:


yt = Ztαt + εt, εt ∼ NID(0, Ht).

Put regressors in Zt,Tt = I, Rt = I,

Result is regression model with coefficient αt following a random walk.


ARMA in State Space Form

Example: AR(2) model yt+1 = φ1yt + φ2yt−1 + ζt, in state space:



with 2× 1 state vector αt and system matrices:

Zt =[

1 0]

, Ht = 0

Tt =

[

φ1 1

φ2 0

]

, Rt =

[

1

0

]

, Qt = σ2

• Zt and Ht = 0 imply that α1t = yt;• First state equation implies yt+1 = φ1yt + α2t + ζt withζt ∼ NID(0, σ2);

• Second state equation implies α2,t+1 = φ2yt;



Example: MA(1) model yt+1 = ζt + θζt−1, in state space:



with 2× 1 state vector αt and system matrices:

Zt =[

1 0]

, Ht = 0

Tt =

[

0 1

0 0

]

, Rt =

[

1

θ

]

, Qt = σ2

• Zt and Ht = 0 imply that α1t = yt;

• First state equation implies yt+1 = α2t + ζt with ζt ∼ NID(0, σ2);

• Second state equation implies α2,t+1 = θζt;



Example: ARMA(2,1) model

yt = φ1yt−1 + φ2yt−2 + ζt + θζt−1

in state space form

αt =

[

yt

φ2yt−1 + θζt

]

Zt =[

1 0]

, Ht = 0,

Tt =

[

φ1 1

φ2 0

]

, Rt =

[

1

θ

]

, Qt = σ2

All ARIMA(p, d, q) models have a (non-unique) state spacerepresentation.


UC models in State Space Form

State space model: αt+1 = Ttαt +Rtζt, yt = Ztαt + εt.

LL model ∆µt+1 = ηt and yt = µt + εt:

αt = µt, Tt = 1, Rt = 1, Qt = σ2η,

Zt = 1, Ht = σ2ε .

LLT model ∆µt+1 = βt + ηt, ∆βt+1 = ξt and yt = µt + εt:

αt =

[

µt

βt

]

, Tt =

[

1 1

0 1

]

, Rt =

[

1 0

0 1

]

, Qt =

[

σ2η 0

0 σ2ξ

]

,

Zt =[

1 0]

, Ht = σ2ε .


UC models in State Space Form

State space model: αt+1 = Ttαt + Rtζt, yt = Ztαt + εt.

LLT model with season: ∆µt+1 = βt + ηt, ∆βt+1 = ξt,S(L)γt+1 = ωt and yt = µt + γt + εt:

αt =[

µt βt γt γt−1 γt−2

]

′

,

Tt =

1 1 0 0 0

0 1 0 0 0

0 0 −1 −1 −1

0 0 1 0 0

0 0 0 1 0

, Qt =

σ2η 0 0

0 σ2ξ 0

0 0 σ2ω

, Rt =

1 0 0

0 1 0

0 0 1

0 0 0

0 0 0

,

Zt =[

1 0 1 0 0]

, Ht = σ2ε .


Kalman Filter

• The Kalman filter calculates the mean and variance of theunobserved state, given the observations.

• The state is Gaussian: the complete distribution is characterizedby the mean and variance.

• The filter is a recursive algorithm; the current best estimate isupdated whenever a new observation is obtained.

• To start the recursion, we need a1 and P1, which we assumedgiven.

• There are various ways to initialize when a1 and P1 are unknown,which we will not discuss here.


Kalman Filter

The unobserved state αt can be estimated from the observations withthe Kalman filter :

vt = yt − Ztat,

Ft = ZtPtZ′

t +Ht,

Kt = TtPtZ′

tF−1t ,

at+1 = Ttat +Ktvt,

Pt+1 = TtPtT′

t +RtQtR′

t −KtFtK′

t,

for t = 1, . . . , n and starting with given values for a1 and P1.

• Writing Yt = {y1, . . . , yt},

at+1 = E(αt+1|Yt), Pt+1 = var(αt+1|Yt).


Kalman Filter

State space model: αt+1 = Ttαt +Rtζt, yt = Ztαt + εt.

• Writing Yt = {y1, . . . , yt}, define

at+1 = E(αt+1|Yt), Pt+1 = var(αt+1|Yt);

• The prediction error is

vt = yt − E(yt|Yt−1)

= yt − E(Ztαt + εt|Yt−1)

= yt − Zt E(αt|Yt−1)

= yt − Ztat;

• It follows that vt = Zt(αt − at) + εt and E(vt) = 0;

• The prediction error variance is Ft = var(vt) = ZtPtZ′

t +Ht.


Lemma

The proof of the Kalman filter uses a lemma from multivariate Normalregression theory.

Lemma Suppose x, y and z are jointly Normally distributed vectorswith E(z) = 0 and Σyz = 0. Then

E(x|y, z) = E(x|y) + ΣxzΣ−1zz z,

var(x|y, z) = var(x|y)− ΣxzΣ−1zz Σ

′

xz ,


Multivariate local level model

Seemingly Unrelated Time Series Equations model:

yt = µt + εt, εt ∼ NID(0,Σε),

µt+1 = µt + ηt, ηt ∼ NID(0,Ση).

• Observations are p× 1 vectors;• The disturbances εt, ηs are independent for all s, t;• The p different time series are related through correlations in the

disturbances.

For a full discussion, see Harvey and Koopman (1997).

A pdf version (scanned) at http://staff.feweb.vu.nl/koopmanunder section “Publications” and subsection “Published articles ascontributions to books”.


Multivariate LL Model

The multivariate LL model is given by


µt+1 = µt + ηt, ηt ∼ NID(0,Ση).

• First difference∆yt = ηt−1 +∆εt

is stationary;• Reduced form: ∆yt is VMA(1) or VAR(∞);


Multivariate LL Model

• Stochastic properties are multivariate analogous of univariatecase:

Γ0 = E(∆yt ∆y′t) = Ση + 2Σε

Γ1 = E(∆yt ∆y′t−1) = −Σε

Γτ = E(∆yt ∆y′t−τ ) = 0, τ ≥ 2,

• The unrestricted vector MA(1) process has p2 + p(p+ 1)/2parameters, the SUTSE has p× (p+ 1);

• Such multivariate reduced form representations can also beestablished for general models.


Homogeneous Multivariate LL Model

The homogeneous multivariate LL model is given by


µt+1 = µt + ηt, ηt ∼ NID(0, qΣε),

where q is a non-negative scalar. This implies that Ση = qΣε.

• The model is restricted, all series in yt have the same dynamicproperties (the same acf).

• Not so relevant in practical work apart from forecasting. It is themodel representation for exponentially weighted moving average(EWMA) forecasting of multiple time series.

• This can be generalised to more general components models.• Easy to estimate, only a set of univariate Kalman filters are

required.


Common Levels

The common local level model is given by


µt+1 = µt + ηt, ηt ∼ NID(0,Ση),

where rank(Ση) = r < p.

• the model can be described by r underlying level components, thecommon levels;

•

Ση = AΣcA′,

A is p× r, Σc is r × r of full rank;• interpretation of A: factor loading matrix.


Common Levels

The common local level model


µt+1 = µt + ηt, ηt ∼ NID(0, AΣcA′),

can be rewritten in terms of underlying levels:

yt = a+Aµct + εt,

µct+1 = µc

t + ηct , ηct ∼ NID(0,Σc),

so thatµt = a+Aµc

t , ηt = Aηct .


Common Levels

For the common local level model


µt+1 = µt + ηt, ηt ∼ NID(0, AΣcA′),

notice that• decomposition Ση = AΣcA

′ is not unique;

• identification restrictions: Σc is diagonal, Choleski decomposition,principal compoments (based on economic theory);

• more interesting interpretation can be obtained by factor rotations;• can be interpreted as dynamic factor analysis, see later.


Common components

Common dynamic factors:• are useful for interpretation → cointegration;• have consequence for inference and forecasting (dimension of

parameter space reduces as a result).• common local level model can be generally represented as a

VAR(∞) or VECM models, details can be provided upon request.


Multivariate components

• So far, we have concentrated on multivariate variants of the locallevel model;

• Similar considerations can be applied to other components suchas the slope of the trend, seasonal and cycle components andother time-varying features in the multiple time series.

• Harvey and Koopman (1997) review such extensions.• In particular, they define the similar cycle component, see

Exercises.


Common and idiosyncratic factors

Multiple trends can also be decomposed into a one common factor andmultiple idiosyncratic factors:


µt+1 = µt + ηt, ηt ∼ NID(0,Ση),

where Ση = δδ′ +Dη with vector δ and diagonal matrix Dη. Thisimplies that the level can be represented by

µt = δµct + µ∗

t , ηt = δηct + η∗t

with common level (scalar) µct and ”independent” level µ∗

it generated by

∆µct+1 = ηct ∼ NID(0, 1), ∆µ∗

t+1 = η∗t ∼ NID(0, Dη).


Mulitvariate Kalman filter

The Kalman filter is valid for the general multivariate state spacemodel.

Computationally it is not convenient when p becomes large, verylarge.

Each step of the Kalman filter requires the inversion of the p× pmatrix Ft. This is no problem when p = 1 (univariate) but whenp > 20, say, it will slow down the Kalman filter considerably.

However, we can treat each element in the p× 1 observation vector ytas a single realisation. In other words, we can "update" eachsingle element of yt within the Kalman filter.

The arguments are given in DK book §6.4.

The same applies to smoothing.


Univariate treatment of Kalman filter

• Consider standard model: yt = Ztαt + εt and αt+1 = Ttαt +Rtηtwhere Var(εt) = Ht is diagonal.

• Observation vector yt = (yt,1, . . . , yt,pt)′ is treated and we view

observation model as a set of pt separate equations.• We then have, yt,i = Zt,iαt,i + εt,i with αt,i = αt for i = 1, . . . , pt.

• The associated transition equations become αt,i+1 = αt,i fori = 1, . . . , pt and αt+1,1 = Ttαt,pt

+Rtηt for t = 1, . . . , n.

• This disentangled model can be treated by the Kalman filter andsmoother equations straightforwardly.

• Innovations are now relative to the past and the “previous”observations inside yt,pt

!

• Non-diagonal matrix Ht can be treated by data-transformation orby including εt in the state vector αt.

• More details in DK book §6.4.


Exercise 1

Consider the common trends model of Harvey and Koopman (1997,§§9.4.1 and 9.4.2).

1. Put the common trends model with (possibly common) stochasticslopes and based on equations (21)-(23) in state space form.

2. Put the common trends model with (possibly common) stochasticslopes and based on equations (24)-(26) in state space form.Define all vectors and matrices precisely.

3. Discuss the generalisation of Ση 6= 0 and the consequences forthe state space formulation of the model as in 2.


Exercise 2

Consider the multivariate trend model of Harvey and Koopman (1997).

1. Consider a multiple data set of N time series yt. The aim is todecompose the time series into trend and stationary components.It is further required that the multiple trend can be decomposedinto a common single trend (common to all N time series) andidiosyncratic trends (specific to the individual time series).

• Formulate a model for such a decomposition.• Discuss the identification of the different trends.• Express the model in state space form.

2. Once multiple trend models are expressed in state space form,we need to estimate the parameter coefficients of the model.Please describe shortly some relevant issues of maximumlikelihood estimation. Is it feasible ? What problems can youexpect ? Any recommendations for a successful implementation ?


Exercise 3

Consider the similar cycle model of Harvey and Koopman (1997) withobservation equation

yt = ψt + εt, εt ∼ N(0,Σε),

where yt is a 3× 1 observation vector. Cycle ψt represents a commonsimilar cycle component of rank 2.

1. Please provide the state space representation of this model.

2. Comment on the restrictive nature of the similar cycle model.

3. How would you modify the similar cycle model so that each timeseries in yt has a different cycle frequency λ.

4. Can you apply the univariate Kalman filter of DK §6.4 in case Σε

is diagonal ? What if Σε is not diagonal ? Give details.


Documents

State Space Time Series Analysis · 2011-02-01 · Kalman Filter • The Kalman ﬁlter calculates the mean and variance of the unobserved state, given the observations. • The state