Lecture on Parameter Estimation for Stochastic Differential Equations
Erik Lindström
FMS161/MASM18 Financial Statistics
Recap

We are interested in the parameters θ in the stochastic integral equation

X(t) = X(0) + \int_0^t \mu_\theta(s, X(s)) \, ds + \int_0^t \sigma_\theta(s, X(s)) \, dW(s)    (1)

Why?
- Model validation
- Risk management
- Advanced hedging (Greeks, 9.2.2, and quadratic hedging, 9.2.2.1 (P/Q))
Some asymptotics

Consider the arithmetic Brownian motion

dX(t) = \mu \, dt + \sigma \, dW(t)    (2)

The drift is estimated by computing the mean and compensating for the sampling interval \delta = t_{n+1} - t_n:

\hat{\mu} = \frac{1}{\delta N} \sum_{n=0}^{N-1} X(t_{n+1}) - X(t_n).    (3)

Expanding this expression reveals that the MLE is given by

\hat{\mu} = \frac{X(t_N) - X(t_0)}{t_N - t_0} = \mu + \sigma \frac{W(t_N) - W(t_0)}{t_N - t_0}.    (4)

The MLE for the diffusion (\sigma) parameter is given by

\hat{\sigma}^2 = \frac{1}{\delta (N-1)} \sum_{n=0}^{N-1} \left( X(t_{n+1}) - X(t_n) - \hat{\mu}\delta \right)^2 \xrightarrow{d} \sigma^2 \, \frac{\chi^2(N-1)}{N-1}    (5)
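As a sanity check, the estimators (3)-(5) can be run on simulated data. The sketch below (variable names are my own) draws exact Gaussian increments of the arithmetic Brownian motion and applies both formulas:

```python
import numpy as np

rng = np.random.default_rng(0)

# True parameters of dX = mu dt + sigma dW, sampled at interval delta
mu, sigma, delta, N = 0.5, 0.2, 0.01, 100_000

# Exact increments: X(t_{n+1}) - X(t_n) ~ N(mu*delta, sigma^2*delta)
dX = mu * delta + sigma * np.sqrt(delta) * rng.standard_normal(N)

# Drift MLE, eq. (3); the sum telescopes to (X(t_N) - X(t_0)) / (t_N - t_0), eq. (4)
mu_hat = dX.sum() / (delta * N)

# Diffusion MLE, eq. (5)
sigma2_hat = ((dX - mu_hat * delta) ** 2).sum() / (delta * (N - 1))

print(mu_hat, sigma2_hat)
```

Note the asymmetry that (4) makes explicit: the drift estimate depends only on the endpoints, so its accuracy is governed by the total observation window t_N - t_0, while the diffusion estimate (5) improves with the number of observations N.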
A simple method

Many data sets are sampled at high frequency, making the bias due to discretizing the SDE with one of the schemes in Chapter 12 acceptable. The simplest discretization, the explicit Euler method, would for the stochastic differential equation

dX(t) = \mu(t, X(t)) \, dt + \sigma(t, X(t)) \, dW(t)    (6)

correspond to the Discretized Maximum Likelihood (DML) estimator given by

\hat{\theta}^{DML} = \arg\max_{\theta \in \Theta} \sum_{n=1}^{N-1} \log \phi\left( X(t_{n+1}), X(t_n) + \mu(t_n, X(t_n))\Delta, \Sigma(t_n, X(t_n))\Delta \right)    (7)

where \phi(x, m, P) is the density of a multivariate normal distribution with argument x, mean m and covariance P, and

\Sigma(t, X(t)) = \sigma(t, X(t)) \sigma(t, X(t))^T.    (8)
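A minimal sketch of the DML estimator (7), applied to an Ornstein-Uhlenbeck model dX = -aX dt + s dW (the model choice, starting values, and names are illustrative, not from the lecture):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)

# Simulate an Ornstein-Uhlenbeck path dX = -a*X dt + s dW with the exact scheme
a_true, s_true, dt, N = 1.0, 0.5, 0.05, 5000
x = np.empty(N + 1)
x[0] = 0.0
e = np.exp(-a_true * dt)
sd = s_true * np.sqrt((1 - e**2) / (2 * a_true))
for n in range(N):
    x[n + 1] = x[n] * e + sd * rng.standard_normal()

def neg_dml(params):
    a, s = params
    if a <= 0 or s <= 0:
        return np.inf
    # Euler one-step approximation, eq. (7): mean x_n + mu(x_n)*dt, variance s^2*dt
    m = x[:-1] - a * x[:-1] * dt
    return -norm.logpdf(x[1:], loc=m, scale=np.sqrt(s**2 * dt)).sum()

res = minimize(neg_dml, x0=[0.5, 0.3], method="Nelder-Mead")
print(res.x)
```

The recovered parameters are close to, but not exactly, the truth: the Euler transition density is only an approximation, which is the discretization bias behind the consistency warning below.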
Consistency

- The DMLE is generally NOT consistent.
- Approximate ML estimators (13.5) are, provided enough computational resources are allocated:
  - Simulation based estimators
  - Fokker-Planck based estimators
  - Series expansions.
- GMM-type estimators (13.6) are consistent if the moments are correctly specified (which is a non-trivial problem!)
Simulation based estimators

- Discretely observed SDEs are Markov processes.
- Then it follows that

  p_\theta(x_t | x_s) = E_\theta\left[ p_\theta(x_t | x_\tau) \mid \mathcal{F}(s) \right], \quad t > \tau > s    (9)

  This is the Pedersen algorithm.
- Improved by Durham-Gallant (2002) and Lindström (2012).
- Works very well for multivariate models!
- ...and is easily (...) extended to Lévy driven SDEs.
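The identity (9) suggests a Monte Carlo estimate of the transition density: simulate paths from x_s up to the intermediate time τ, then average the one-step density from X(τ) to x_t. A simplified sketch of this idea (function name and tuning constants are my own) is:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

def pedersen_density(x_s, x_t, dt, mu, sigma, K=8, M=5000):
    """Monte Carlo estimate of p(x_t | x_s) via eq. (9): Euler-simulate M paths
    to tau = dt*(K-1)/K, then average the last one-step Gaussian density."""
    h = dt / K
    x = np.full(M, x_s, dtype=float)
    for _ in range(K - 1):                       # Euler steps up to tau
        x += mu(x) * h + sigma(x) * np.sqrt(h) * rng.standard_normal(M)
    # one-step Euler transition density from X(tau) to x_t, averaged over paths
    return norm.pdf(x_t, loc=x + mu(x) * h, scale=sigma(x) * np.sqrt(h)).mean()

# Check on arithmetic Brownian motion, where the exact density is Gaussian
mu_, sig_ = 0.1, 0.3
est = pedersen_density(0.0, 0.05, 1.0,
                       lambda x: mu_ + 0 * x, lambda x: sig_ + 0 * x)
exact = norm.pdf(0.05, loc=mu_, scale=sig_)
print(est, exact)
```

For the arithmetic Brownian motion the Euler scheme is exact, so the estimate is unbiased and only Monte Carlo noise remains; for general models the inner Euler steps introduce a bias that vanishes as K grows.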
Some key points

- A naive implementation only provides a pointwise estimate; use CRNs or importance sampling.
- Variance reduction helps (antithetic variates, control variates).
- The near-optimal importance sampler is a bridge process, as it reduces variance AND improves the asymptotics.
- There is a version that is completely bias free, albeit somewhat restrictive in terms of the class of feasible models.
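The antithetic-variates point can be illustrated on a toy expectation (the example is mine, not from the lecture): pairing each Gaussian draw with its negation and averaging the pair estimates the same quantity with visibly smaller variance.

```python
import numpy as np

rng = np.random.default_rng(3)
M = 100_000
z = rng.standard_normal(M)

# Plain Monte Carlo estimate of E[exp(W(1))] = exp(1/2)
plain = np.exp(z)

# Antithetic variates: average each draw with its mirrored counterpart -z
anti = 0.5 * (np.exp(z) + np.exp(-z))

print(plain.mean(), plain.var())
print(anti.mean(), anti.var())   # same target, noticeably smaller variance
```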
Fokker-Planck

Consider the expectation

E[h(x(t)) | \mathcal{F}(0)] = \int h(x(t)) \, p(x(t) | x(0)) \, dx(t)    (10)

and then

\frac{\partial}{\partial t} E[h(x(t)) | \mathcal{F}(0)]    (11)

There are two possible ways to compute this: directly, and using the Itô formula. Equating these yields

\frac{\partial p}{\partial t}(x(t) | x(0)) = \mathcal{A}^\star p(x(t) | x(0))    (12)

where

\mathcal{A}^\star p(x(t)) = -\frac{\partial}{\partial x(t)} \left( \mu(\cdot) \, p(x(t)) \right) + \frac{1}{2} \frac{\partial^2}{\partial x(t)^2} \left( \sigma^2(\cdot) \, p(x(t)) \right).    (13)
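Equations (12)-(13) can be solved numerically. A rough explicit finite-difference sketch for constant μ and σ, where the exact transition density is Gaussian so the result can be checked (grid sizes and the smoothed initial condition are my own choices, not from the lecture):

```python
import numpy as np

mu, sigma, T = 0.2, 0.3, 0.5
L, nx = 4.0, 401                          # x-grid on [-L, L]
x = np.linspace(-L, L, nx)
dx = x[1] - x[0]
dt = 0.2 * dx**2 / sigma**2               # small step keeps the explicit scheme stable

# Initial condition: narrow Gaussian standing in for the point mass at x(0) = 0
v0 = 0.005
p = np.exp(-x**2 / (2 * v0)) / np.sqrt(2 * np.pi * v0)

t = 0.0
while t < T:
    dpdx = (np.roll(p, -1) - np.roll(p, 1)) / (2 * dx)       # d/dx p (mu constant)
    d2p = (np.roll(p, -1) - 2 * p + np.roll(p, 1)) / dx**2   # d^2/dx^2 p
    p = p + dt * (-mu * dpdx + 0.5 * sigma**2 * d2p)         # step eq. (12)-(13)
    p[0] = p[-1] = 0.0                    # boundaries sit far from the probability mass
    t += dt

# For constant mu, sigma the exact density is N(mu*T, sigma^2*T + v0)
vT = sigma**2 * T + v0
exact = np.exp(-(x - mu * T)**2 / (2 * vT)) / np.sqrt(2 * np.pi * vT)
err = np.abs(p - exact).max()
print(err)
```

In practice implicit or Padé-based time stepping (as in the comparison figure below) is preferred, since the explicit scheme forces dt to shrink quadratically with dx.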
Example of the Fokker-Planck equation

From (Lindström, 2007).

Figure: Fokker-Planck equation (the transition density p(s, x_s; t, x_t) as a surface over the time grid) computed for the CKLS process.
Comments on the PDE approach

Generally better than the Monte Carlo method in low dimensional problems.

Figure: Comparing Monte Carlo (Durham-Gallant, Poulsen) against 2nd order (Padé(1,1)) and 4th order (Padé(2,2)) numerical approximations of the Fokker-Planck equation; MAE plotted against computing time.
Discussion

- Fokker-Planck is the preferred method if the state space is non-trivial (see Pedersen et al., 2011).
- Successfully used in 1-d and 2-d problems,
- but the "curse of dimensionality" will eventually make the method infeasible.
Series expansion

- The solution to the Fokker-Planck equation when

  dX(t) = \mu \, dt + \sigma \, dW(t)    (14)

  is p(x(t) | x(0)) = N(x(t); x(0) + \mu t, \sigma^2 t).
- Hermite polynomials are the orthogonal polynomial basis when using a Gaussian as weight function.
- This is used in the 'series expansion approach', see e.g. (Aït-Sahalia, 2002, 2008).
Key ideas

Transform from X → Y → Z where Z is approximately standard Gaussian. We assume that

dX(t) = \mu(X(t)) \, dt + \sigma(X(t)) \, dW(t)    (15)

First step (the Lamperti transform):

Y(t) = \int^{X(t)} \frac{du}{\sigma(u)}    (16)

It then follows that

dY(t) = \mu_Y(Y(t)) \, dt + dW(t)    (17)

Second step: transform

Z(t_k) = \frac{Y(t_k) - Y(t_{k-1})}{\sqrt{t_k - t_{k-1}}}.    (18)
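The transform (16) can be checked symbolically. For the CIR-type diffusion term σ(u) = σ√u (my choice of example, not from the lecture) the transform works out to Y = 2√X/σ, and by Itô's formula the transformed diffusion coefficient σ(x)·Y'(x) is indeed 1, as (17) requires:

```python
import sympy as sp

x, u, sigma = sp.symbols("x u sigma", positive=True)

# CIR-type diffusion term sigma(u) = sigma*sqrt(u); transform (16)
Y = sp.integrate(1 / (sigma * sp.sqrt(u)), u).subs(u, x)
print(Y)                                   # 2*sqrt(x)/sigma

# By Ito's formula the transformed diffusion coefficient is sigma(x) * Y'(x)
new_diff = sp.simplify(sigma * sp.sqrt(x) * sp.diff(Y, x))
print(new_diff)                            # 1
```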
Expansion

A Hermite expansion for the density p_Z at order J is given by

p_Z^J(z | y(0), t_k - t_{k-1}) = \phi(z) \sum_{j=0}^{J} \eta_j(t_k - t_{k-1}, y_0) H_j(z)    (19)

where

H_j(z) = e^{z^2/2} \frac{d^j}{dz^j} e^{-z^2/2}.    (20)

The coefficients are computed by projecting the density onto the basis functions H_j(z) (recall Hilbert space theory):

\eta_j(t, y_0) = \frac{1}{j!} \int H_j(z) \, p_Z^J(z | y(0), t_k - t_{k-1}) \, dz    (21)
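Definition (20) can be verified directly (it agrees with the usual probabilists' Hermite polynomials up to a factor (-1)^j). The polynomials are orthogonal under the Gaussian weight φ(z) with squared norm j!, which is exactly what makes the projection (21) valid:

```python
import sympy as sp

z = sp.symbols("z", real=True)
phi = sp.exp(-z**2 / 2) / sp.sqrt(2 * sp.pi)   # standard Gaussian density

def H(j):
    # Eq. (20): H_j(z) = e^{z^2/2} d^j/dz^j e^{-z^2/2}
    return sp.simplify(sp.exp(z**2 / 2) * sp.diff(sp.exp(-z**2 / 2), z, j))

def inner(i, j):
    # L^2(phi) inner product used in the projection (21)
    return sp.integrate(phi * H(i) * H(j), (z, -sp.oo, sp.oo))

print(H(1), H(2))
print(inner(1, 2), inner(2, 2))   # 0 (orthogonal) and 2 = 2!
```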
Practical concerns

- The series expansion can be extremely accurate.
- The standard approach is to compute η_j by Taylor expansion up to order (t_k - t_{k-1})^K = (Δt)^K.
- There are some restrictions ('so-called reducible diffusions') when using the method for multivariate diffusions.
Other alternatives - GMM/EF

What about non-likelihood methods?

- The model is governed by some p-dimensional parameter.
- Suppose some set of features are important, h_l(x), l = 1, ..., q ≥ p.
- Compute

  f(x(t); \theta) = \begin{pmatrix} h_1(x(t)) - E_\theta[h_1(X_t)] \\ \vdots \\ h_q(x(t)) - E_\theta[h_q(X_t)] \end{pmatrix}    (22)

- and form

  J_N(\theta) = \left( \frac{1}{N} \sum_{n=1}^{N} f(x(n); \theta) \right)^T W \left( \frac{1}{N} \sum_{n=1}^{N} f(x(n); \theta) \right)    (23)
GMM

The Generalized Method of Moments (GMM) estimator is then given by

\hat{\theta} = \arg\min J_N(\theta)    (24)

It can be shown that

\sqrt{N} \left( \hat{\theta}_N - \theta_0 \right) \to N(0, \Sigma)    (25)

where

\Sigma = \left( \Gamma_N^T \Omega_N^{-1} \Gamma_N \right)^{-1}    (26)

and \Gamma_N and \Omega_N are estimates of

\Gamma = E\left[ \frac{\partial f(x, \theta)}{\partial \theta^T} \right], \quad \Omega = \mathrm{Var}\left[ f(x, \theta) \right].    (27)
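A minimal sketch of (22)-(24) for the arithmetic Brownian motion, using the first two moments of the increments as features h_l and W = I (an exactly identified toy case; the setup and names are mine, not from the lecture):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)

# Observed increments of arithmetic Brownian motion: dX ~ N(mu*delta, sigma^2*delta)
mu0, sig0, delta, N = 0.5, 0.2, 0.1, 20_000
dx = mu0 * delta + sig0 * np.sqrt(delta) * rng.standard_normal(N)

W = np.eye(2)                      # identity weight matrix

def J(params):
    mu, sig = params
    # Moment functions, eq. (22): first two moments of the increments
    f = np.stack([dx - mu * delta,
                  dx**2 - (sig**2 * delta + (mu * delta) ** 2)])
    fbar = f.mean(axis=1)
    return fbar @ W @ fbar         # objective, eq. (23)

res = minimize(J, x0=[0.0, 0.5], method="Nelder-Mead")   # eq. (24)
print(res.x)
```

With q = p the choice of W does not matter and J_N can be driven to (numerically) zero; with q > p the optimal weight is the two-step or iterated estimate of Ω^{-1}, which is where (26)-(27) enter.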