Introduction to Particle Filters for Data Assimilation

Mike Dowd
Dept of Mathematics & Statistics (and Dept of Oceanography)
Dalhousie University, Halifax, Canada

STATMOS Summer School in Data Assimilation, Boulder, Colorado, May 18-25, 2015
Problem Statement

• Noisy time series observations on the system state: $y_{1:T} = (y_1, \ldots, y_T)$
• A discrete-time stochastic dynamic model describing the time evolution of the system state, e.g. a bivariate AR(1) process: $x_t = \theta x_{t-1} + n_t$
• We want to estimate the system state (and sometimes parameters)
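As a toy illustration of this setup (not from the slides), an R sketch simulating a bivariate AR(1) state and noisy observations of its first component; the coefficient matrix A and the noise levels are assumed values:

library(mvtnorm)  # for rmvnorm (multivariate normal draws)
set.seed(1)
T <- 200
A <- matrix(c(0.9, 0.1, -0.1, 0.9), 2, 2)  # assumed AR(1) coefficient matrix (plays the role of theta)
Q <- diag(0.1, 2)                          # assumed process noise covariance
x <- matrix(0, T, 2)
for (t in 2:T) x[t, ] <- as.vector(A %*% x[t - 1, ]) + as.vector(rmvnorm(1, sigma = Q))
y <- x[, 1] + rnorm(T, sd = 0.7)           # noisy time series observations y_{1:T}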
Noisy Observations of System State

[Figure: observations of the system state plotted against time (state vs. time, t = 0 to 200).]
Stochastic Dynamics governing System State

[Figure: 10 realizations of the system state plotted against time.]
Estimation of System State by Particle Filters

[Figure: the particle filter estimate of the state plotted against time.]
Target: Evolution of the Probability Distribution of the State through Time

[Figure: the probability distribution of the state evolving through time (state vs. time).]
Data Assimilation: A Statistical Framework

State Space Model, for $t = 1, \ldots, T$:

$$x_t = d(x_{t-1}, \theta, e_t), \qquad y_t = h(x_t, \phi, \nu_t)$$

or, equivalently,

$$x_t \sim p(x_t, \theta \mid x_{t-1}), \qquad y_t \mid x_t \sim p(y_t, \phi \mid x_t)$$

• The general goal is to make inferences about the state $x_t$ and sometimes the static parameters* $\theta$ and $\phi$**.
• We use explicit priors on the process noise and observation errors (i.e. parametric distributions for $e_t$ and $\nu_t$). Assume the $y_t$ are conditionally independent given $x_t$, and that $e_t$ and $\nu_t$ are independent.
• For DA, we want to do space-time problems: space is incorporated in $x_t$.
→ Solutions involve sequential Monte Carlo methods for nonlinear and non-Gaussian state space models.

* Dynamic parameters are part of the state.
** We drop $\phi$ hereafter, and defer $\theta$.

Hierarchical Bayesian Model: A General Framework for DA

$$\underbrace{p(x_{1:T}, \theta \mid y_{1:T})}_{\text{target (posterior)}} \propto \underbrace{p(y_{1:T} \mid x_{1:T}, \theta)}_{\text{measurement}} \cdot \underbrace{p(x_{1:T} \mid \theta)}_{\text{dynamics}} \cdot \underbrace{p(\theta)}_{\text{prior}}$$

Special Cases in DA
• Parameter estimation (no model error)
• Filtering solution for the state
• Smoothing solution for the state
• Joint state and parameter estimation
• Inference on dynamic model structure
Problem Classes

We want to estimate the system state $x_t$ at time t, given the observations up to time s, $y_{1:s} = (y_1, y_2, \ldots, y_s)$:

• s = t → Filtering (nowcast)
• s > t → Prediction (forecast)
• s < t → Smoothing (hindcast)

[Figure: timelines showing the estimation time relative to the data available for (A) Filtering (Nowcast), (B) Forecasting, and (C) Smoothing (Hindcast).]
Sequential Estimation

Single-stage transition of the system from time t-1 to time t:

$$\underbrace{p(x_{t-1}, \theta \mid y_{1:t-1})}_{\text{nowcast at time } t-1} \;\xrightarrow[\text{prediction}]{\;x_t = d(x_{t-1},\, \theta,\, e_t)\;}\; \underbrace{p(x_t, \theta \mid y_{1:t-1})}_{\text{forecast}} \;\xrightarrow[\text{measurement update}]{\;y_t\;}\; \underbrace{p(x_t, \theta \mid y_{1:t})}_{\text{nowcast at time } t}$$

- We'll drop the parameters θ ... for now.
Probabilistic Solution for Filtering

A single-stage transition of the system from time t-1 to t involves:

Prediction:
$$p(x_t \mid y_{1:t-1}) = \int p(x_t \mid x_{t-1}) \cdot p(x_{t-1} \mid y_{1:t-1})\, dx_{t-1}$$

Measurement update:
$$p(x_t \mid y_{1:t}) = \frac{p(y_t \mid x_t) \cdot p(x_t \mid y_{1:t-1})}{p(y_t \mid y_{1:t-1})}$$

→ General Filtering Solution: given $p(x_0)$ and $y_{1:T} = (y_1, y_2, \ldots, y_T)$, do for $t = 1, \ldots, T$:

$$p(x_t \mid y_{1:t}) \propto p(y_t \mid x_t) \int p(x_t \mid x_{t-1}) \cdot p(x_{t-1} \mid y_{1:t-1})\, dx_{t-1}$$
Sampling Based Estimation

Let $\{x_{t|t}^{(i)}, w_t^{(i)}\},\ i = 1, \ldots, n$ be a random sample* drawn from the target distribution (the filter density) $p(x_t \mid y_{1:t})$, where the $x_{t|t}^{(i)}$ are the support points and the $w_t^{(i)}$ are the normalized weights.

The filter density can then be approximated as

$$p(x_t \mid y_{1:t}) \approx \sum_{i=1}^{n} w_t^{(i)}\, \delta(x_{t|t} - x_{t|t}^{(i)})$$

* Referred to as an ensemble, made up of a collection of particles (which are the sample members).
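As a small numerical illustration (an assumption-laden sketch, not from the slides), this weighted-ensemble representation lets posterior moments be computed as weighted sums over the support points:

set.seed(2)
n  <- 250
xi <- rnorm(n)                            # support points drawn from a proposal
w  <- dnorm(xi, mean = 1)                 # weights from a hypothetical likelihood centered at 1
w  <- w / sum(w)                          # normalize
post_mean <- sum(w * xi)                  # E[x_t | y_{1:t}] approximated as sum_i w_i x_i
post_var  <- sum(w * (xi - post_mean)^2)  # Var[x_t | y_{1:t}] approximated the same way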
[Figure: sampling-based estimation examples, showing probability vs. x for (a) an unweighted ensemble, n = 25; (b) a weighted ensemble, n = 25; (c) an unweighted ensemble, n = 250; (d) a weighted ensemble, n = 250.]
Sequential Importance Sampling (SIS)

At time t-1 we have:
• $\{x_{t-1|t-1}^{(i)}, w_{t-1}^{(i)}\}$, draws from $p(x_{t-1} \mid y_{1:t-1})$
• $y_t$, an observation

If a candidate $x_{t|t}^{(i)}$ is drawn from a proposal $q(x_{t|t})$, then

$$w_t^{(i)} \propto \frac{p(x_{t|t}^{(i)} \mid y_t)}{q(x_{t|t}^{(i)} \mid y_t)} \qquad \text{or} \qquad w_t^{(i)} \propto w_{t-1}^{(i)}\, \frac{p(y_t \mid x_{t|t}^{(i)}) \cdot p(x_{t|t}^{(i)} \mid x_{t-1|t-1}^{(i)})}{q(x_{t|t}^{(i)})}$$

A draw from $p(x_t \mid y_{1:t})$ is then available as $\{x_{t|t}^{(i)}, w_t^{(i)}\}$ (see Arulampalam et al. 2002).
SIS Remarks

• A common simplification is to let the proposal be the prior (the predictive density), $q(x_{t|t}) = p(x_t \mid x_{t-1})$, leading to the weight update (see the sketch below):
$$w_t^{(i)} \propto w_{t-1}^{(i)}\, p(y_t \mid x_{t|t-1}^{(i)})$$
• The main practical problem with SIS is "weight collapse", wherein after a few time steps all the weight becomes concentrated on a few particles. How to fix?
• SIS solves the filtering problem for the nonlinear, non-Gaussian state space model via an algorithm for sequential "online" estimation of the system state, relying on a proposal and a weight update.
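A minimal sketch of one SIS step under the bootstrap proposal, for a scalar AR(1) model with Gaussian observation error (all names and values here are assumptions for illustration):

set.seed(3)
np <- 500
x  <- rnorm(np)                          # particles approximating p(x_{t-1} | y_{1:t-1})
w  <- rep(1 / np, np)                    # equal incoming weights
yt <- 1.2                                # a scalar observation at time t
x  <- 0.9 * x + rnorm(np, sd = 0.3)      # draw from the proposal q = p(x_t | x_{t-1})
w  <- w * dnorm(yt, mean = x, sd = 0.5)  # weight update: w_t proportional to w_{t-1} p(y_t | x_t)
w  <- w / sum(w)                         # renormalize; note there is no resampling in plain SIS
ess <- 1 / sum(w^2)                      # effective sample size: ess << np signals weight collapse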
Sequential Importance Resampling (SIR)

• The major breakthrough in sequential Monte Carlo methods (Gordon et al. 1993) was to incorporate a resampling step to obviate weight collapse.
• Specifically, for each time t, $\{x_{t|t-1}^{(i)}, w_t^{(i)}\}$ is a weighted sample from $p(x_t \mid y_{1:t-1})$.
• We can resample (with replacement) the $x_{t|t-1}^{(i)}$ with probability proportional to $w_t^{(i)}$ (i.e. a weighted bootstrap), as sketched below.
• This yields a modified sample $\{x_{t|t}^{(i)}\}$, wherein particles were both killed and replicated, and the weights are now equal, i.e. $w_t^{(i)} = 1/n\ \forall i$.

Remaining issue: sample impoverishment, where some particles are replicated often in the unweighted sample. Less severe than weight collapse (but of course related).
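The resampling step itself is just a weighted bootstrap; a tiny self-contained sketch (values hypothetical) showing how particles are killed and replicated:

set.seed(6)
np <- 10
x  <- rnorm(np)                   # predicted particles x_{t|t-1}^{(i)}
w  <- runif(np); w <- w / sum(w)  # normalized importance weights
m  <- sample(1:np, size = np, prob = w, replace = TRUE)
xnew <- x[m]                      # equally weighted sample {x_{t|t}^{(i)}}
table(m)                          # indices that are absent were killed; repeated ones were replicated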
Basic Particle Filter Algorithm

Given: (1) dynamical model, (2) observations, (3) initial conditions

for t = 1 to T
  (a) Prediction: $\{x_{t-1|t-1}^{(i)}\} \to \{x_{t|t-1}^{(i)}\}$ as $x_{t|t-1}^{(i)} = d(x_{t-1|t-1}^{(i)}, e_t^{(i)})$ for $i = 1, \ldots, n$
  (b) Measurement: $\{x_{t|t-1}^{(i)}\} \to \{x_{t|t}^{(i)}\}$ as
      • $w_t^{(i)} \propto p(y_t \mid x_{t|t-1}^{(i)})$ for $i = 1, \ldots, n$
      • resample with replacement $\{x_{t|t-1}^{(i)}\}$ using the weights $w_t^{(i)}$ → yields $\{x_{t|t}^{(i)}\}$
end (for t)
Particle Filter Schematic

[Figure: particle filter schematic.]

Interpretation: Particle History

[Figure: particle histories (state vs. time).]
R-code for a Basic Particle Filter

library(mvtnorm)  # provides dmvnorm for the Gaussian likelihood

for (t in 2:T) {
  xf <- t(apply(xa, 1, dynamicmodel))           # prediction step from t-1 to t
  yobs <- matrix(y[t, ], 1, ncol(y))            # extract measurement at time t
  w <- apply(matrix(xf[, 1], np, 1), 1,         # assign weights: likelihood of yobs
             dmvnorm, x = yobs, sigma = R)      #   given each predicted particle
  m <- sample(1:np, size = np, prob = w, replace = TRUE)  # weighted resampling
  xa <- xf[m, ]                                 # new analysis ensemble
}
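For completeness, a minimal setup sketch under which the loop above runs; the model, noise levels, and ensemble size are assumptions, and y is stored as a T × 1 matrix to match the indexing in the loop:

library(mvtnorm)
set.seed(1)
T  <- 200; np <- 500
A  <- matrix(c(0.9, 0.1, -0.1, 0.9), 2, 2)  # assumed bivariate AR(1) dynamics
Q  <- diag(0.1, 2)                          # process noise covariance
R  <- matrix(0.5, 1, 1)                     # measurement error covariance (component 1 observed)
dynamicmodel <- function(x) as.vector(A %*% x) + as.vector(rmvnorm(1, sigma = Q))
xtrue <- matrix(0, T, 2)
for (t in 2:T) xtrue[t, ] <- dynamicmodel(xtrue[t - 1, ])
y  <- matrix(xtrue[, 1] + rnorm(T, sd = sqrt(R[1, 1])), T, 1)  # observations
xa <- rmvnorm(np, sigma = Q)                # initial analysis ensemble (np x 2)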
Element 1. Prediction Step

$\{x_{t-1|t-1}^{(i)}\}$, a draw from the filter density at time t-1, $p(x_{t-1} \mid y_{1:t-1})$
  → via $x_{t|t-1}^{(i)} = d(x_{t-1|t-1}^{(i)}, e_t^{(i)})$ for $i = 1, \ldots, n$ →
$\{x_{t|t-1}^{(i)}\}$, a draw from the predictive density at time t, $p(x_t \mid y_{1:t-1})$

Role: acts as the proposal distribution for the particle filter.

Remarks:
- computationally expensive to generate large samples for big GFD models
- hard to effectively populate a large-dimensional state space with particles
Element 2. Measurement Update

$\{x_{t|t-1}^{(i)}\}$, a draw from the predictive density at time t, $p(x_t \mid y_{1:t-1})$
  → via the observation $y_t$ →
$\{x_{t|t}^{(i)}\}$, a draw from the filter density at time t, $p(x_t \mid y_{1:t})$

Uses Bayes' formula: $p(x_{t|t} \mid y_{1:t}) \propto p(y_t \mid x_{t|t-1}) \cdot p(x_{t|t-1})$

Approaches:
• PF/SIR: weighted resample of $\{x_{t|t-1}^{(i)}, w_t^{(i)}\} \to \{x_{t|t}^{(i)}\}$
• Metropolis-Hastings MCMC
• Ensemble Kalman filter (approximate)
• plus many others (auxiliary, unscented, ...) that modify proposal distributions
Sequential Metropolis-Hastings

Construct an (independence) chain to yield a sample from the target filter distribution $p(x_t \mid y_{1:t})$.

for k = 1 to K
  1. Generate a candidate from the predictive density: draw $x_{t|t-1}^{*}$ from $p(x_t \mid y_{1:t-1})$
  2. Evaluate the acceptance probability
     $$\alpha = \min\left(1,\ \frac{p(y_t \mid x_{t|t-1}^{*})}{p(y_t \mid x_{t|t}^{(k)})}\right)$$
  3. Let $x_{t|t}^{*}$ enter the posterior sample $\{x_{t|t}^{(i)}\}$ with probability $\alpha$; otherwise retain $x_{t|t}^{(k)}$
end (for k) → yields $\{x_{t|t}^{(i)}\}$, a sample from $p(x_t \mid y_{1:t})$

• flexible, configurable and efficient (+ a parallel with SIR)
• can alleviate sample impoverishment (resample-move)
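A minimal sketch of this update for a scalar state with a Gaussian likelihood; the predictive draws, observation, and noise level are assumptions for illustration:

set.seed(4)
np <- 500
yt <- 1.2                                         # observation at time t (hypothetical)
xf <- rnorm(np, mean = 0.5, sd = 1)               # draws from the predictive density p(x_t | y_{1:t-1})
lik <- function(x) dnorm(yt, mean = x, sd = 0.5)  # p(y_t | x)
xa <- numeric(np)
xa[1] <- xf[1]
for (k in 2:np) {
  xstar <- sample(xf, 1)                          # candidate drawn from the predictive density
  alpha <- min(1, lik(xstar) / lik(xa[k - 1]))    # acceptance probability (likelihood ratio)
  xa[k] <- if (runif(1) < alpha) xstar else xa[k - 1]
}
# xa is then (approximately) a sample from the filter density p(x_t | y_{1:t})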
Limitations

Theory: convergence to the target marginal distribution holds under weak assumptions (e.g. Kunsch 2005).

Practical problem: standard particle filtering (SIR), when used in DA, suffers from sample impoverishment (recall path degeneracy). Typically, the prediction ensemble falls far from the observed state (due to small ensemble size and a high-dimensional state space) → the likelihood is poorly represented and estimation is compromised.

What modifications are made to improve particle filters?
Some Fixes

• Add "jitter": overdisperse the sample of the predictive density (inflate the process error).
• Make a Gaussian approximation, or a transformation/anamorphosis: can then use the Kalman filter updating equation.
• Approximate the likelihood function: change its functional form, or inflate or alter the measurement error.
• Error subspace: confine stochasticity to the parameters only. Dimension reduction.
• Use fixed-lag smoothing or batch processing: incorporate observations from multiple times into the observation update.
• Clever proposal distributions and look-ahead filters: move beyond using the "prior" (predictive density) as the proposal.
What about Parameter Estimation?

1. State Augmentation: append the parameters to the state,

$$\tilde{x}_t = \begin{pmatrix} x_t \\ \theta_t \end{pmatrix}$$

Can use the particle filter machinery to solve (Kitagawa 1998). Also, see iterated filtering (Ionides 2006).
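A brief sketch of the augmented prediction step for a hypothetical scalar AR(1) model; the small jitter on θ is the artificial parameter evolution commonly used to keep the parameter particles diverse:

set.seed(7)
np <- 500
z  <- cbind(x = rnorm(np), theta = runif(np, 0.5, 1.0))  # augmented particles (x_t, theta_t)
augmented_step <- function(z) {
  theta <- unname(z["theta"]) + rnorm(1, sd = 0.01)      # jitter so theta can evolve
  x     <- theta * unname(z["x"]) + rnorm(1, sd = 0.3)   # dynamics use the particle's own theta
  c(x = x, theta = theta)
}
z <- t(apply(z, 1, augmented_step))  # one prediction step; the measurement update proceeds as before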
2. Likelihood Methods: use sample-based likelihoods,

$$[y_{1:T} \mid \theta] = L(\theta \mid y_{1:T}) = \prod_{t=1}^{T} \int [y_t \mid x_t, \theta]\, [x_t \mid y_{1:t-1}, \theta]\, dx_t \;\approx\; \prod_{t=1}^{T} \frac{1}{n} \sum_{i=1}^{n} p(y_t \mid x_{t|t-1}^{(i)})$$
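As a sketch of how this approximation is accumulated inside a bootstrap filter (hypothetical scalar model; all names are assumed):

set.seed(5)
pf_loglik <- function(theta, y, np = 500) {
  x  <- rnorm(np)                                  # initial ensemble
  ll <- 0
  for (t in seq_along(y)) {
    x  <- theta * x + rnorm(np, sd = 0.3)          # prediction: draw from p(x_t | x_{t-1}, theta)
    w  <- dnorm(y[t], mean = x, sd = 0.5)          # p(y_t | x_{t|t-1}^{(i)})
    ll <- ll + log(mean(w))                        # adds log of (1/n) sum_i p(y_t | x_{t|t-1}^{(i)})
    x  <- sample(x, np, prob = w, replace = TRUE)  # SIR resampling
  }
  ll
}
# e.g. sapply(seq(0.5, 1.0, by = 0.05), pf_loglik, y = y) profiles the likelihood over theta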
Summary

1. The state space model is the statistical framework for considering dynamic models and measurements.
2. Importance sampling provides the theoretical basis for particle filters.
3. Sequential importance resampling (SIR) yields a straightforward and workable algorithm for the filtering problem.
4. For realistic problems in data assimilation, particle filters must be modified, either via approximations or via clever use of proposal distributions.
Some Key References

Jazwinski AH. 1970. Stochastic Processes and Filtering Theory. Academic Press: New York, 376pp.
Gordon NJ, Salmond DJ, Smith AFM. 1993. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proceedings-F 140(2):107-113.
Kitagawa G. 1996. Monte Carlo filter and smoother for non-Gaussian nonlinear state space models. Journal of Computational and Graphical Statistics 5:1-25.
Kitagawa G. 1998. A self-organizing state space model. Journal of the American Statistical Association 93(443):1203-1215.
Arulampalam MS, Maskell S, Gordon N, Clapp T. 2002. A tutorial on particle filters for online nonlinear non-Gaussian Bayesian tracking. IEEE Transactions on Signal Processing 50(2):174-188.
Kunsch HR. 2005. Recursive Monte Carlo filters: algorithms and theoretical analysis. The Annals of Statistics 33(5):1983-2021.
Godsill SJ, Doucet A, West M. 2004. Monte Carlo smoothing for nonlinear time series. Journal of the American Statistical Association 99(465):156-168.
Ionides EL, Breto C, King AA. 2006. Inference for nonlinear dynamical systems. Proceedings of the National Academy of Sciences 103:18438-18443.