Algorithm Report

8/8/2019 Algorithm Report

1/35

Matlab programs for complete andincomplete data exact VARMA

likelihood and its gradient

Kristjn Jnasson

Report VH-02-2006Reykjavk, September 2006


2/35

Report VH-02-2006, Reykjavk September 2006

Kristjn Jnasson. Matlab programs for complete and incomplete data exact VARMA likelihood and its gradient.,Engineering Research Institute, University of Iceland, Technical report VHI-02-2006, September 2006

Kristjn Jnasson, Department of Computer Science, University of Iceland, Hjardarhagi 4, IS-107 Reykjavik,Iceland. Email: [email protected].

The author is responsible for the opinions expressed in this report. These opinions do not necessarily represent theposition of the Engineering Research Institute or the University of Iceland.

Engineering Research Institute, University of Iceland, and the author.

Engineering Research Institute, University of Iceland, Hjararhagi 4, IS-107 Reykjavk, Iceland


3/35

CONTENTS

1. INTRODUCTION 3

2. FUNCTION VALUE AND GRADIENT OF LOG-LIKELIHOOD 3 2.1 VAR models 3 2.2 Complete data VARMA models 4 2.3 Missing value VARMA models 5 2.4 Model parameters given by a function 5

3. SIMULATION 5

4. TESTING 6

APPENDIX A. MAIN MATLAB PROGRAMS AND SUBFUNCTIONS 7

APPENDIX B. TEST PROGRAMS 21

REFERENCES 33


4/35

GRIP

Matlab fll til a reikna t nkvmt log-sennileikafall fyrir VAR- og VARMA-tmaraalkn eru sett fram oglst. Fllin ra vi gt mlirunum. au reikna ennfremur stigul sennileikafallsins, en hann ntist vi stikamatme tlulegri hmrkun sennileikafallsins. Fllin bja upp sparna msum srtilvikum t.d. vi mat rsta-bundnum lknum. Fallasafninu fylgir forrit til a herma VARMA-tmarair, og hafa hermdu rairnar rttadreifingu strax fr fyrsta li. Fallasafninu fylgja prfunarforrit, bi til a stafestingar villuleysi og til nota vifrekari run safnsins.

ABSTRACT

Matlab functions for the evaluation of the exact log-likelihood of VAR and VARMA time series models arepresented (vector autoregressive moving average). The functions accept incomplete data, and calculate analyticalgradients, which may be used in parameter estimation with numerical likelihood maximization. Allowance ismade for possible savings when estimating seasonal, structural or distributed lag models. Also provided is afunction for creating simulated VARMA time series that have an accurate distribution from term one (they arespin-up free). The functions are accompanied by a test suite for verifying their correctness and aid furtherdevelopment.


5/35

3

1. INTRODUCTION

Several Matlab functions to aid in the analysis of multivariate time series models have been written and

are available at the authors web page, www.hi.is/~jonasson. In addition, the source code is listed in Ap-penxix A below. There are three functions for evaluating exact log-likelihood, a function for simulatingtime series, and a suite of test functions for verifying the correctness of the other functions. The threelog-likelihood evaluation functions are varma_llc for VARMA (vector autoregressive moving aver-age) models with complete data, varma_llm for VARMA models with missing values and var_ll for VAR models with or without missing values. All three functions can optionally calculate the gradi-ent of the log-likelihood, estimates of missing values, and estimates of associated residual or shock se-ries.

The simulation function is named varma_sim , and it may be used to generate a single VARMA orVAR series, or several series at a time sharing the same parameters. One of the novelties of

varma_sim is that the initial values are drawn from the appropriate distribution, so that throwingaway the first part of the series to avoid spin-up effects is not needed.

The programs implement the methods described in the companion report [Jonasson and Ferrando 2006]and they follow very closely the notation used there (including variable names). The companion reportalso describes numerical experiments carried out with the Matlab functions.

The development of the software was inspired to an extent by the testing philosophy popularized withextreme programming : write test programs for everything, preferably before writing the actual algo-rithms being implemented; see for example [George and Williams 2004]. Thus, the confidence in thecorrectness of the coded algorithms is fairly high.

Previously published programs for VARMA and VAR likelihood evaluation are the Fortran programsof Shea [1989] and Mauricio [1997] and the Matlab programs of Schneider and Neumaier [2001]. At-tention should also be drawn to E 4 by J. Terceiro and others. It is a collection of Matlab functions forstate-space estimation of econometric models. E

4is distributed under the GNU license and available on

the web along with a user manual: www.ucm.es/info/icae/e4. The software and manual have not beenpublished, but there are some related publications listed on this web page, including [Terceiro 1990].

2. FUNCTION VALUE AND GRADIENT OF LOG-LIKELIHOOD

There are three functions for likelihood evaluation supplied: var_ll implements the savings described insection 3.3 of the companion report for both the complete data and missing value cases, varma_llc im-plements the method of section 2.2 and varma_llm the method of section 3.1 in the companion report.The functions can also find gradients, residual estimates and missing value estimates, c.f. sections 3.2and 4 of the companion report. When observations are missing, the functions accept the mean-vector asa proper model parameter (c.f. the introduction to the companion report), but the complete data callsassume a zero-mean model. To fit a non-zero-mean time-series, the mean-vector of the observationsmay be subtracted from each xt at the outset. Help for all three functions is obtained by giving the com-mand help function at the Matlab prompt (e.g. help var_ll ).

2.1 VAR models

A zero-mean VAR model describing a time series of values xt r , t = 1,, n, is given by:


6/35

4

1

p

t j t j t j

A =

= +x x (1)

where the A js are r r parameter matrices and the t s are r- variate N (0, ) uncorrelated in time. Toevaluate the exact log-likelihood function associated with (1) when no observations are missing use theMatlab call:

[ll,ok] = var_ll(X,A,Sig)

where A is an r rp matrix containing 1[ ] p A A , Sig is r r and symmetric containing , and X is anr n matrix with xt in its t -th column. Let

n R with 2 ( 1) 2n pr r r = + + denote the vector of parame-ters, i.e. the elements of A1,, A p (column by column) and the lower triangle of . If they describe astationary model, ll will return a scalar with the value of the log-likelihood function ( ) l and the logi-cal variable ok will return true, but if the model is non-stationary ll will be 0 and ok will be false. Tocalculate the 1 n gradient ( )l in ll or the (maximum likelihood) estimate of the residuals (orshocks) t in res use the calls:

[ll,ok,lld] = var_ll(X,A,Sig)

[ll,ok,res] = var_ll(X,A,Sig,'res') .

A non-zero-mean VAR model may be written:

1

( ) p

t j t j t j

A =

= +x x (2)

If X, A and Sig are as before, mu contains and miss is an r n logical matrix, which is true in locationsof missing values, the Matlab call:

[ll,ok] = var_ll(X,A,Sig,mu,miss)

will return the log-likelihood value in ll (or zero ll and false ok if the model is non-stationary). Tocalculate the gradient, residuals, or residuals and maximum likelihood estimates of missing values usethe calls

[ll,ok,lld] = var_ll(X,A,Sig,mu,miss)[ll,ok,res] = var_ll(X,A,Sig,mu,miss,'res')[ll,ok,res,xm] = var_ll(X,A,Sig,mu,miss,'res_miss') .

( is placed at the end of and n is now2 ( 3) 2 pr r r + + ).

2.2 Complete data VARMA models

A zero-mean VARMA model for xt r , t = 1,, n, is given by

1

p

t j t j t j

A =

= +x x y (3)

where 1q

t t j t j j B == + y , the A js and the B js are r r matrices, and the t s are r- variate N (0, ) un-correlated in time. If X, A and Sig are as in section 2.1 and B contains 1[ ]q B B , the Matlab call

[ll,ok] = varma_llc(X,A,B,Sig)

will return the likelihood function value in ll and ok will be true unless the model is non-stationary,then ok will be false. The calls

[ll,ok,lld] = varma_llc(X,A,Sig)

[ll,ok,res] = varma_llc(X,A,Sig,'res')


7/35

5

return in addition the 1 2( ) ( 1) 2 p q r r r + + + gradient ( )l in lld or the residual estimates in res ,where consists of the parameters: the columns of the A js followed by the columns of the B js fol-lowed by the columns of the lower triangle of .

2.3 Missing value VARMA models

Let be the expected value of xt r

and let other model parameters as well as yt and t be as in sec-tion 2.2. A non-zero-mean VARMA model for xt , t = 1,, n, is given by

1

( ) p

t j t j t j

A =

= +x x y (4)

If X, A, B and Sig are as in section 2.2, mu contains and miss is an r n logical matrix which is truein locations of missing values then the Matlab call

[ll,ok] = varma_llm(X,A,B,Sig,mu,miss)

will return the likelihood function value in ll and ok will be true unless the model is non-stationary.The calls

[ll,ok,lld] = varma_llm(X,A,B,Sig,mu,miss)[ll,ok,res] = varma_llm(X,A,B,Sig,mu,miss,'res')[ll,ok,res,xm] = varma_llm(X,A,B,Sig,mu,miss,'res_miss')

return in addition the 1 2( ) ( 3) 2 p q r r r + + + gradient ( )l in lld , residual estimates in res , and thelast one returns maximum likelihood estimates of missing values in xm, where has been appended tothe of section 2.2.

2.4 Model parameters given by a function

Let ( )g = wheren R and let J g be the n n Jacobian of g as in section 4.3 of the companion re-

port. To realize the savings discussed there, use one of the calls[ll,ok,lld] = var_ll(X,A,Sig,J)[ll,ok,lld] = var_ll(X,A,Sig,mu,miss,J)[ll,ok,lld] = varma_llc(X,A,Sig,J)[ll,ok,lld] = varma_llm(X,A,B,Sig,mu,miss,J)

where J contains J g. A partial variable change is also possible; c.f. the help text of the functions. To takean example, assume a distributed lags model,

3

1t j t j t

j

b =

= +x x

where the b js are fixed constants and the model parameters consist of the r r matrix C together withthe shock covariance matrix . To evaluate the likelihood and its gradient efficiently the following Mat-lab code may be used:

A = [b(1)*C b(2)*C b(3)*C];I = eye(3*r);JC = [I; I; I];JSig = eye(r*(r+1)/2);J = blkdiag(JC, JSig);[ll,ok,lld] = var_ll(X,A,Sig,J);

3. SIMULATIONThe Matlab function varma_sim will generate a random VARMA time series for a specified model. If A, B and Sig are as above, then the calls


8/35

6

X = varma_sim(A,[],Sig,n)X = varma_sim(A,B,Sig,n)

generate a single zero-mean n-term series modelled by (1) or (3) in the r n matrix X. The calls

X = varma_sim(A,[],Sig,n,[],M)X = varma_sim(A,B,Sig,n,[],M)

will create Msuch series. When r = 1 X will be n M and when r > 1 it will be r n M . To generatenon-zero-mean series as modelled by (2) or (4) use the calls

X = varma_sim(A,B,Sig,n,mu)X = varma_sim(A,B,Sig,n,mu,M) ,

possibly with empty B. The series are started using the procedure described in the second paragraph of Appendix D in the companion report, and when moving average terms are present the initial shocks arealso drawn as described there.

It is also possible to specify terms to start the series using

X = varma_sim(A,B,Sig,n,mu,M,X0)

where X0 has r rows and at least max( , ) p q columns. All the generated series will begin with the lastmax( , ) p q columns of X0 and the corresponding shocks are drawn as explained above. As before, A, B and/or mu may be empty.

The shocks used for the generation may be obtained by specifying a second return parameter: [X,eps]= varma_sim( ) . The dimension of eps will be same as that of X.

4. TESTING

The programs in the test suite are of two types: primary tests, for testing the four main functions dis-cussed above, and secondary tests, that test individual components (subfunctions, helper functions) of the main functions. To verify the correctness of the main functions it is only necessary to examine andrun the primary tests. The secondary tests were written as an aid in developing the program suite. Theyare included for completeness, and as an aid for possible future development and changes. The primarytests are:

test_varma_llctest_varma_llc_derivtest_var_lltest_var_ll_jactest_varma_llmtest_varma_jactest_varma_llm_derivtest_varma_sim

The correctness of varma_llc is checked against direct likelihood evaluation with equation (2.3) of the companion report. The value is also compared with that from Algorithm AS311 of Mauricio [1997](a Matlab mex-file for that program is included in the test-suite). The function varma_llm is checkedagainst varma_llc for complete data, and against direct evaluation with equation (2.4) of the compan-ion report for missing values, and var_ll is simply compared with varma_llm . Gradient calculationand the Jacobian feature (see section 2.4) are checked by comparing with numerical differentiation. Alltests are carried out for several different test cases with a range of values of p , q and r . Finally the test-ing of varma_sim is accomplished by comparing data expectations and covariances of generated series

with theoretical ones. All the primary tests may be run via the Matlab script TEST_PRIMARY, and withTEST_ALL the secondary tests are also run.


9/35

7

APPENDIX A. MAIN MATLAB PROGRAMS AND SUBFUNCTIONS % VARMA_LLC VARMA likelihood and its derivative for complete data % % [LL, OK] = VARMA_LLC(X, A, B, Sig) sets LL to the exact log-likelihood % function value for n complete observations of the r-variate zero-mean VARMA % time series: % % x(t) = A1x(t-1) + ... + Apx(t-p) + y(t) % where % y(t) = eps(t) + B1eps(t-1) + ... + Bqeps(t-q), % % x(t) is r-dimensional and eps(t) is r-variate normal with mean 0 and % covariance Sig. Sig and the Ai's and Bi's are rr matrices. On entry A % should be the r rp matrix [A1 A2...Ap], B should be [B1 B2...Bq] and X % should have x(t) in its t-th column for t = 1,...,n. OK = true indicates % success, but is false if the model is non-stationary. % % [LL, OK, LLD] = VARMA_LLC(X, A, B, Sig) returns the log-likelihood function % value in LL and its gradient in LLD. % % [LL, OK, EPS] = VARMA_LLC(X, A, B, Sig, 'res') returns the maximum % likelihood estimate of the residuals in EPS. % % [LL, OK, LLD] = VARMA_LLC(X, A, B, Sig, J) may be used to speed up the % gradient calculation when A, B and/or Sig depend on a smaller set of % independent variables, and only the gradient w.r.t. this smaller set is % sought (for instance to fit structural models, distributed lags, and other % models with constraints on the parameter matrices). J gives the Jacobian of % the change of variables and can have r^2p, r^2(p+q) or r^2(p+q) + % r(r+1)/2 rows, for when A, A and B, or A, B and Sig depend on a smaller % set, respectively (the column count of J equals the number of variables in % the smaller set). Thus the possible variable changes are: theta --> A, theta % --> [A B], or, theta --> [A B vech(Sig)]. % % The so-called "Cholesky decomposition" method is used, see [1] and [2]. % % [1] K Jonasson and SE Ferrando 2006. Efficient likelihood evaluation for % VARMA processes with missing values. Report VHI-01-2006, Engineering % Research Institute, University of Iceland. % % [2] K Jonasson 2006. Matlab programs for complete and incomplete data exact % VARMA likelihood and its gradient. Report VHI-02-2006, Engineering% Research Institute, University of Iceland. % % Kristjn Jnasson, Dept. of Computer Science, University of Iceland, 2006. % [email protected].

function [ll, ok, out3] = varma_llc(X, A, B, Sig, J_code)code = '' ; J = [];if nargin > 4, if ischar(J_code), code = J_code; else J = J_code; end , end FINDRES = isequal(code, 'res' );DIFF = nargout == 3 && ~FINDRES;CHGVAR = nargin>=5 && ~FINDRES;[p, q, r, n] = get_dimensions(A, B, Sig, X);ko = 0:r:r*n;x = X(:);ll = 0; out3 = 0;PLU = vyw_factorize(A);vyw_ok = isempty(PLU) || isempty(PLU{1}) || PLU{1}(1)~=0;ok = vyw_ok && is_stationary(A, Sig, PLU);if ~ok,

if nargout


10/35

8

logdetO = omega_logdet(Luo, Llo, p, q, ko);logdetLR = sum(log(diag(LR)));logdetLQ = sum(log(diag(LQ)));logdetSm = 2*sum(log(diag(chol(Sm))));logdetSo = logdetO + 2*logdetLR + 2*logdetLQ - 2*logdetSm;% ll = -0.5*(nObs*log(2*pi) + logdetSo + xSxo);if FINDRES

[eps,xm] = res_miss(A,C,Luo,Llo,wohat,LR,LQ,K,u,v,Laomh,Vhat,Sm,miss);varargout{1} = eps;if nargout>=4, varargout{2} = xm + mubar(miss); end

end else % FIND ALSO GRADIENT

if MEAN, xod = xmu_deriv (nPar + r, miss);else xod = zeros (nObs, 1, nPar); end [wo, wod] = lambda_multiply (A, xo, miss, xod);[CCd, GGd, WWd] = find_CGW_deriv (A, B, Sig);[C, Cd] = der2array (CCd);[G, Gd] = der2array (GGd);[W, Wd] = der2array (WWd);S = vyw_solve (A, PLU, GGd);RHS = vyw_deriv_rhs (A, GGd, S);Sd = vyw_solve (A, PLU, RHS);[Gneg, Gnegd] = find_Gneg (A, B, C, n, Cd);[Scol, Scold] = S_extend (A, G, S, n, Gd, Sd);[Acol, Acold] = find_Acol (A, r, nPar);if CHNGVAR

wod = chng_var(wod, J);Gd = chng_var(Gd, J);Wd = chng_var(Wd, J);Sd = chng_var(Sd, J);Gnegd = chng_var(Gnegd, J);Scold = chng_var(Scold, J);Acold = chng_var(Acold, J);

end [Su, Olow] = omega_build (S, G, W, p, n);[Sud, Olowd] = omega_build_deriv (Sd, Gd, Wd, p, n);[Su,Olow,Sud,Olowd] = omega_remove_miss (Su, Olow, miss, Sud, Olowd);[Lu,Ll,flg,Lud,Lld] = omega_ltl (Su, Olow, p, q, ko, Sud, Olowd);clear Su Sud Olow Olowd [Laom, Laomd] = find_lambda_om (Acol, miss, Acold); % Lambda_om [Laomh, Laomhd] = profile_back_sub (Lu, Ll, Laom, ko, km, p, q, p ...

, Lud, Lld, Laomd);clear Laom Laomd [V, Vd] = find_V (G, Gneg, Scol, miss, Gd,Gnegd,Scold);[Vhat, Vhatd] = profile_back_sub (Lu, Ll, V, ko, km, p, q, q ...

, Lud, Lld, Vd);clear V Vd [RLam, RLamd] = profile_ata (Laomh, ko, km, p, p, Laomhd);[P, Pd] = profile_mult (Laomh, Vhat, ko, km, p, p, q ...

, Laomhd, Vhatd);if MEAN

Laomhd = add_mu_deriv (r, Laomhd);[Lud, Lld] = add_mu_deriv (r, Lud, Lld);

end % [woh, wohd] = omega_back_sub (Lu, Ll, wo, p, q, ko, Lud, Lld, wod);[Lw, Lwd] = profile_mult (Laomh, woh, ko,km,p,p,n,Laomhd,wohd);clear Laomh Laomhd [RV,RVd] = profile_ata (Vhat, ko, km, p, q, Vhatd);if MEAN

Vhatd = add_mu_deriv (r, Vhatd); end [Vw, Vwd] = profile_mult (Vhat, woh, ko, km, p,q,n,Vhatd,wohd);clear Vhat Vhatd [Sm, Smd] = find_Sm (Scol, miss, Scold);[SmP, SmPd] = atb_deriv (Sm, Smd, P, Pd); clear P Pd R = Sm + RV - SmP - SmP';Rd = Smd + RVd - SmPd - permute(SmPd,[2,1,3]);[R, Rd] = atba_c (Sm, RLam, R, Smd ...

, RLamd, Rd); clear RLam RLamd [LR, LRd] = chol_deriv (R, Rd); clear R Rd [K, Kd] = forward_sub_deriv (LR, LRd, RV-SmP ...

, RVd-SmPd); clear SmP SmPd Q = Sm - RV + K'*K; clear RV Qd = ata_deriv (K, Kd) + Smd - RVd; clear RVd [LQ, LQd] = chol_deriv (Q, Qd); clear Q Qd if MEAN

[Smd,LRd,LQd,Kd] = add_mu_deriv (r, Smd, LRd, LQd, Kd);end VwSmLw = Vw - Sm'*Lw;VwSmLwd = Vwd - atb_deriv (Sm, Smd, Lw, Lwd);[u, ud] = forward_sub_deriv (LR, LRd, VwSmLw, VwSmLwd);VwKu = Vw - K'*u;VwKud = Vwd - atb_deriv (K, Kd, u, ud);[v, vd] = forward_sub_deriv (LQ, LQd, VwKu, VwKud);% xSxo = woh'*woh - u'*u + v'*v;xSxod = 2*(woh'*Jac(wohd) - u'*Jac(ud) + v'*Jac(vd));% [ldO, ldOd] = omega_logdet(Lu, Ll, p, q, ko, Lud, Lld);[ldLR, ldLRd] = logdet_L(LR, LRd);[ldLQ, ldLQd] = logdet_L(LQ, LQd);[LSm, LSmd] = chol_deriv(Sm, Smd);[ldSm, ldSmd] = logdet_L(LSm, LSmd);ldSo = ldO + 2*ldLR + 2*ldLQ - 4*ldSm;ldSod = ldOd + 2*ldLRd + 2*ldLQd - 4*ldSmd;% ll = -0.5*(nObs*log(2*pi) + ldSo + xSxo); lld = -0.5*(ldSod + xSxod);varargout{1} = lld;

end end

function J = Jac(xd) % Change 3-dim derivative to Jacobian matrix J = permute(xd, [1,3,2]);

end

% VAR_LL Vector-autoregressive likelihood and optionally its derivative % % [LL, OK] = VAR_LL(X, A, Sig) sets LL to the exact log-likelihood function % value for n observations of the r-variate zero-mean VAR time series: % % x(t) = A1x(t-1) + ... + Apx(t-p) + eps(t) % % where x(t) is r-dimensional and eps(t) is r-variate normal with mean 0 and % covariance Sig. Sig and the Ai's are rr matrices. A should contain the r % rp matrix [A1 A2...Ap] and X should have x(t) in its t-th column for t = % 1,...,n. To use this likelihood routine to fit a non-zero-mean time-series, % the mean-vector of the observations may be subtracted from each x(t). OK = % true indicates success, but is false if the model is non-stationary. % % [LL, OK] = VAR_LL(X, A, Sig, mu, miss) is used when there are missing % values. In that case it is appropriate to let the mean-vector of the series, % mu, be a parameter, and use the model: % % x(t) - mu = A1(x(t-1) - mu) + ... + Ap(x(t-p) - mu) + eps(t) % % The rn logical array miss should be true in missing locations. An empty mu % is equivalent to a zero mu. % % [LL, OK, EPS] = VAR_LL(..., 'res') returns the maximum likelihood estimate % of the residuals in EPS and [LL, OK, EPS, XM] = VAR_LL(...,'res_miss') finds % in addition the maximum likelihood estimate of the missing values in XM. % % [LL, OK, LLD] = VAR_LL(X, A, Sig) and [LL, OK, LLD] = VAR_LL(X, A, Sig, mu, % miss) return the log-likelihood function value in LL and its gradient in % LLD. If mu is empty (or not present), LL is as if mu were zero, but LLD is % without mu-derivatives. % % [LL,OK,LLD] = VAR_LL(...,J) may be used to speed up the gradient calculation % when A and/or Sig depend on a smaller set of independent variables and only % the gradient w.r.t. this smaller set is sought. J gives the Jacobian of the % variable change, and should have either r^2p rows or r^2p + r(r+1)/2 % rows. In the latter case both A and Sig depend on a smaller set, but in the % former case it is only A that does. Thus the possible variable changes are % theta --> A and theta --> [A vech(Sig)]. See also help text of varma_llc. % % The so-called "Cholesky decomposition" method is used, see [1] and [2]. % % [1] K Jonasson and SE Ferrando 2006. Efficient likelihood evaluation for % VARMA processes with missing values. Report VHI-01-2006, Engineering % Research Institute, University of Iceland. % % [2] K Jonasson 2006. Matlab programs for complete and incomplete data exact % VARMA likelihood and its gradient. Report VHI-02-2006, Engineering% Research Institute, University of Iceland. % % Kristjn Jnasson, Dept. of Computer Science, University of Iceland, 2006. % [email protected].

function [ll, ok, varargout] = var_ll(X, A, Sig, varargin)if nargin>3 && ischar(varargin{end}), code = varargin{end}; else code = '' ; end MISS = nargin >=5 && ~iscell(varargin{1}) && islogical(varargin{2});FINDRES = any(strmatch(code, { 'res' , 'res_miss' }, 'exact' ));FNDGRAD = nargout == 3 && ~FINDRES; % find gradient CHNGVAR = FNDGRAD && (MISS && nargin>=6 || ~MISS && nargin>=4);LowTri.LT = true;[p, q, r, n] = get_dimensions(A, [], Sig, X); % assert(p>0); nPar = r^2*p + r*(r+1)/2;if ~MISS

if nargin > 3 && islogical(varargin{1}), error( 'wrong arguments' ), end MEAN = false;mu = zeros(r,1);miss = false(r,n);if CHNGVAR, J = varargin{1}; end

else [mu, miss] = deal(varargin{1:2});mu=mu(:);MEAN = ~isempty(mu);if ~MEAN, mu = zeros(r,1); end if CHNGVAR, J = varargin{3}; end

end [ko,ro,km] = find_missing_info(miss);mubar = repmat(mu, n, 1);X = X(:);xo = X(~miss) - mubar(~miss);nObs = length(xo);if nObs == n*r, MISS = false; end ll = 0;varargout = {0};PLU = vyw_factorize(A);vyw_ok = isempty(PLU) || isempty(PLU{1}) || PLU{1}(1)~=0;ok = vyw_ok && is_stationary(A, Sig, PLU);if ~ok, if nargout


11/35

9

K = linsolve (LR, RV - SmP, LowTri); clear SmP Q = Sm - RV + K'*K; clear RV LQ = chol (Q')'; clear Q u = linsolve (LR, Vw - Sm*Lw, LowTri);v = linsolve (LQ, Vw - K'*u, LowTri);xSxo = wohat'*wohat - u'*u + v'*v;logdetLR = sum (log(diag(LR)));logdetLQ = sum (log(diag(LQ)));logdetSm = 2*sum (log(diag(chol(Sm))));logdetSo = logdetO + 2*logdetLR + 2*logdetLQ - 2*logdetSm;

else xSxo = wohat'*wohat;logdetSo = logdetO;LR = []; LQ = []; K = []; u = []; v = []; Laomh = []; Vhat = []; Sm = [];

end ll = -0.5*(nObs*log(2*pi) + logdetSo + xSxo);if FINDRES && MISS

[eps,xm]=res_miss_ar(A,Sig,Lu,LSig,wohat,LR,LQ,K,u,v,Laomh,Vhat,Sm,miss);varargout = {eps, xm + mubar(miss)};

elseif FINDRES[eps,xm] = res_miss_ar(A,Sig,Lu,LSig,wohat);varargout = {eps, xm};

end

else % FIND ALSO GRADIENT if MEAN

xod = xmu_deriv(nPar + r, miss);else

xod = zeros(nObs, 1, nPar);end [wo, wod] = lambda_multiply (A, xo, miss, xod);SigS = mds_set_parmat (p+1, Sig, p+1);Sigd = der2array (SigS);[C,G,Cd,Gd] = deal ({Sig}, {Sig}, Sigd, Sigd);S = vyw_solve (A, PLU, {Sig});RHS = vyw_deriv_rhs (A, SigS, S);Sd = vyw_solve (A, PLU, RHS);if MISS

[Gneg, Gnegd] = find_Gneg (A, [], C, n, Cd);[Scol, Scold] = S_extend (A, G, S, n, Gd, Sd);[Acol, Acold] = find_Acol (A, r, nPar);

end if CHNGVAR

wod = chng_var(wod, J);Sigd = chng_var(Sigd, J);Gd = chng_var(Gd, J);Sd = chng_var(Sd, J);if MISS

Gnegd = chng_var(Gnegd, J);Scold = chng_var(Scold, J);Acold = chng_var(Acold, J);

end end [Lu,LSig,jpat,Lud,LSigd] = omega_ar_factor(S, Sig, miss, Sd, Sigd{1});if MISS

[Laom, Laomd] = find_lambda_om (Acol, miss, Acold); % Lambda_om [Laomh,Laomhd] = profile_ar_trisolve (Lu, LSig, Laom, jpat, ko, km, p ...

, p, 'N' , Lud, LSigd, Laomd);clear Laom Laomd [V, Vd] = find_V (G, Gneg,Scol,miss,Gd,Gnegd,Scold);[Vhat, Vhatd] = profile_ar_trisolve (Lu, LSig, V, jpat, ko, km, p, q ...

, 'N' , Lud, LSigd, Vd);clear V Vd

end if MEAN, [Lud,LSigd] = add_mu_deriv (r, Lud, LSigd); end [woh, wohd] = omega_ar_trisolve (Lu,LSig,wo,jpat,ko, 'N' ,Lud,LSigd,wod);[ldO, ldOd] = omega_ar_logdet (Lu, LSig, jpat, Lud, LSigd);if MISS

[Rlam, Rlamd] = profile_ata (Laomh, ko, km, p, p, Laomhd);[P, Pd] = profile_mult (Laomh, Vhat,ko,km,p,p,q,Laomhd,Vhatd);if MEAN

Laomhd = add_mu_deriv (r, Laomhd);end [Lw, Lwd] = profile_mult (Laomh, woh, ko, km,p,p,n,Laomhd,wohd);clear ( 'Laomh' , 'Laomhd' )[RV, RVd] = profile_ata (Vhat, ko, km, p, q, Vhatd);if MEAN

Vhatd = add_mu_deriv (r, Vhatd);end[Vw, Vwd] = profile_mult (Vhat, woh, ko, km, p, q,n,Vhatd,wohd);clear ( 'Vhat' , 'Vhatd' )[Sm, Smd] = find_Sm (Scol, miss, Scold);[SmP, SmPd] = atb_deriv (Sm, Smd, P, Pd); clear P Pd R = Sm + RV - SmP - SmP';Rd = Smd + RVd - SmPd - permute (SmPd,[2,1,3]);[R, Rd] = atba_c (Sm, Rlam, R, Smd ...

, Rlamd, Rd); clear RLam RLamd [LR, LRd] = chol_deriv (R, Rd);[K, Kd] = forward_sub_deriv (LR, LRd, RV-SmP ...

, RVd-SmPd); clear SmP SmPd Q = Sm - RV + K'*K; clear RV Qd = ata_deriv (K, Kd) + Smd - RVd; clear RVd [LQ, LQd] = chol_deriv (Q, Qd); clear Q Qd if MEAN

[Smd, Kd] = add_mu_deriv (r, Smd, Kd);[LRd, LQd] = add_mu_deriv (r, LRd, LQd);

end VwSmLw = Vw - Sm'*Lw;VwSmLwd = Vwd - atb_deriv (Sm, Smd, Lw, Lwd);[u, ud] = forward_sub_deriv (LR, LRd, VwSmLw, VwSmLwd);VwKu = Vw - K'*u;VwKud = Vwd - atb_deriv (K, Kd, u, ud);[v, vd] = forward_sub_deriv (LQ, LQd, VwKu, VwKud);% xSxo = woh'*woh - u'*u + v'*v;xSxod = 2*(woh'*Jac(wohd) - u'*Jac(ud) + v'*Jac(vd));% [ldLR, ldLRd] = logdet_L (LR, LRd);[ldLQ, ldLQd] = logdet_L (LQ, LQd);

[LSm, LSmd] = chol_deriv (Sm, Smd);[ldSm, ldSmd] = logdet_L (LSm, LSmd);ldSo = ldO + 2*ldLR + 2*ldLQ - 4*ldSm;ldSod = ldOd + 2*ldLRd + 2*ldLQd - 4*ldSmd;

else xSxo = woh'*woh;xSxod = 2*(woh'*Jac(wohd));ldSo = ldO;ldSod = ldOd;

end

ll = -0.5*(nObs*log(2*pi) + ldSo + xSxo); lld = -0.5*(ldSod + xSxod);varargout{1} = lld;

end end

function J = Jac(xd) % Change 3-dim derivative to Jacobian matrix J = permute(xd, [1,3,2]);

end

% VARMA_SIM Simulate an ARMA or a VARMA time series model %% ARMA MODELS: % X = VARMA_SIM(A,B,sigma,n) with scalar sigma generates a zero-mean time % series of length n using the ARMA(p,q) model: %% x(t) = A1x(t-1) + ... + Apx(t-p) + y(t) (1a) % where: % y(t) = eps(t) + B1eps(t-1) + ... + Bqeps(t-q) (1b) %% and eps(t) is N(0,sigma). A should be the row vector [A1,...,Ap] and B % should be [B1,...,Bq]. X returns the series as a row vector. To start the % simulation, x(1),...,x(g),eps(1),...,eps(g) are drawn from N(0,SCE) where % SCE = [SS CC; CC' EE] is obtained by solving the modified Yule-Walker % equations (see vyw_factorize, vyw_solve and S_build), so there is no need % to throw away the first values to avoid spin-up effects. The model must be % stationary. %% X = VARMA_SIM(A,B,sigma,n,mu) returns a time series with mean mu using % y(t) as above, and x(t) given by: %% x(t) - mu = A1(x(t-1) - mu) + ... + Ap(x(t-p) - mu) + y(t) (2) %% In this case x and eps are drawn from N([mu; 0], SCE). %% X = VARMA_SIM(A,B,sigma,mu,M) generates M sequences simultaneously and % returns the i-th one in the i-th row of X. Use mu=[] to obtain zero-mean. %% X = VARMA_SIM(A,B,sigma,n,mu,M,X0) sets x(1)...x(g) to X0(end-p+1:end) - % mu and draws eps(1)...eps(g) from the conditional distribution of eps(1), % ..., eps(g) given x(1),...,x(g)). For this option the model need not be % stationary. %% [X,EPS] = VARMA_SIM(...) returns also the shock series, eps(t), in the % t-th row of EPS. %% VARMA MODELS: % X = VARMA_SIM(A,B,Cov,n), where Cov is an rr matrix uses a VARMA(p,q) % model given by (1), but now x(t) is r-dimensional, eps(t) is r-variate % normal with mean 0 and covariance Cov, and Cov and the Ai's and Bi's are % rr matrices. A and B should contain A = [A1 A2...Ap] (an r rp matrix) % and B = [B1 B2 ... Bq] (an r rq matrix). The rn matrix X returns x(t) % in its t-th column. As in the scalar case the simulated series is spin-up % free, the starting values x(1),...,x(g) and eps(1),...,eps(g) being drawn % from the correct distribution. The model must be stationary. %% X = VARMA_SIM(...,mu) uses (2) for x(t) instead of (1a). %% X = VARMA_SIM(...,mu,M) returns M sequences in an rnM multidimensional X % (when r > 1) with the i-th sequence in X(1:r, 1:n, i). Use mu = [] to % obtain zero-mean. %% X = VARMA_SIM(...,mu,M,X0) initializes x(1),...,x(h) with the last h % columns of X0 instead of drawing from N(mu,SS). The model need not be % stationary. %% [X,EPS] = VARMA_SIM(...) returns also the shock series; the i-th eps(t) is % returned in EPS(:, t, i). %% For both ARMA and VARMA, use VARMA_SIM(A,[],...) for a pure autoregressive% model, and VARMA_SIM([],B,...) for a pure moving average model. %% The method used is described in [1] and [2]. % % [1] K Jonasson and SE Ferrando 2006. Efficient likelihood evaluation for % VARMA processes with missing values. Report VHI-01-2006, Engineering % Research Institute, University of Iceland. % % [2] K Jonasson 2006. Matlab programs for complete and incomplete data exact % VARMA likelihood and its gradient. Report VHI-02-2006, Engineering% Research Institute, University of Iceland. % % Kristjn Jnasson, Dept. of Computer Science, University of Iceland, 2006. % [email protected].

function [x, eps] = varma_sim(A, B, Sig, n, mu, M, x0)r = size(Sig, 1);if isempty(A), A = zeros(r,0); end if isempty(B), B = zeros(r,0); end p = size(A, 2)/r;q = size(B, 2)/r;Aflp = reshape(flipdim(reshape(A,r,r,p), 3), r, r*p); % [Ap...A2 A1] Bflp = reshape(flipdim(reshape(B,r,r,q), 3), r, r*q); % [Bq...B2 B1] h = max(p,q);if n


12/35

10

assert(length(x0)>=h);x(I,:) = repmat(reshape(x0(:,end-h+1:end),[],1) - mup, 1, M);

end eps(I,:) = randnm(M, EE - CC'*(SS\CC))' + CC'*(SS\x(I,:));eps(r*h+1:end,:) = reshape(randnm((n-h)*M,Sig)', [], M); % shocks % eps = reshape(randnm(n*M,Sig)', [], M); % for testing Ip = r*(h-p)+1:r*h;Iq = r*(h-q)+1:r*h;I = r*h+1:r*h+r;for t=h+1:n % Generate the series

x(I,:) = eps(I,:) + Aflp*x(Ip,:) + Bflp*eps(Iq,:);I = I+r;Ip = Ip+r;Iq = Iq+r;

end if r==1 && M==1 % one ARMA sequence:

x = reshape(x,1,n) + mu;eps = reshape(eps,1,n);

elseif r==1 % several ARMA sequences: x = reshape(x,n,M) + mu;eps = reshape(eps,n,M);

else % one or more VARMA sequences in rnM array: x = reshape(x,r,n,M) + repmat(mu,[1,n,M]);eps = reshape(eps,r,n,M);

end end

function x = randnm(n,Sig,mu);% RANDNM Multivariate normal random vectors % X = RANDNM(N,SIG,MU) generates N random vectors from an r-variate normal % distribution with mean MU and covariance matrix SIG. MU should be an % r-vector. SIG is assumed to be positive semidefinite. It is factorized % using [R,p] = chol(SIG) and thus no check is made for that fact. X will % be Nr. X = RANDNM(N,S) uses mean 0. % % NOTE: The formula for del was found empirically to be approximately the % minimum necessary to ensure that Cholesky factorization of Sig+del*I % works. The while loop is to further ensure that. r=size(Sig,1);if nargin 3;if ~DIFF

L = tril(B);for i=1:size(B,1), L(i,i) = L(i,i)/2; end ALA = A'*L*A;D = ALA + ALA' + C;

else [m,n,nPar] = size(Bd);L = tril(B);Ld = Bd;for i=1:m

L(i,i,:) = L(i,i,:)/2;Ld(i,i,:) = Ld(i,i,:)/2; Ld(1:i-1,i,:) = 0;

end [LA, LAd] = atb_deriv(L,Ld,A,Ad);[ALA,ALAd] = atb_deriv(A,Ad,LA,LAd);D = ALA + ALA' + C;Dd = Cd + ALAd + permute(ALAd,[2,1,3]);

end end

% ATB_C Calculate A' B + C and optionally its derivatives % % D = ATB_C(A,B,C) calculates D = A' B + C. % % [D,Dd] = ATB_C(A,B,C,Ad,Bd,Cd) calculates also the derivative w.r.t. % parameter theta(i) in Dd(:,:,i), i = 1,...,nPar. Ad(:,:,i), Bd(:,:,i) and % Cd(:,:,i) should be the derivatives of A, B and C w.r.t. theta(i). % % Dd = ATB_C(A,B,Ad,Bd,Cd) calculates only the derivatives. % % When C is zero use ATB_C(A,B), and ATB_C(A,B,Ad,Bd) % % METHOD: Dd(:,:,i) = Ad(:,:,i)'*B + A'*Bd(:,:,i) + Cd.

function [varargout] = atb_c(A, B, varargin) DIFF = nargin > 3;if DIFF

switch nargincase 4, [Ad,Bd] = deal(varargin{:}); C = []; Cd = [];case 5, [Ad,Bd,Cd] = deal(varargin{:}); C = [];case 6, [C,Ad,Bd,Cd] = deal(varargin{:});otherwise error( 'illegal call' );

end [k,m] = size(A);n = size(B,2);nPar = size(Ad,3);Dd = reshape(A'*reshape(Bd,k,n*nPar), m, n, nPar) + ...

permute(reshape(B'*reshape(Ad,k,m*nPar), n, m, nPar),[2,1,3]);if ~isempty(Cd), Dd = Dd + Cd; end varargout{nargout} = Dd;

end if ~DIFF || nargout==2

if nargin==3 || nargin==6, C = varargin{1}; else C = []; end D = A'*B;if ~isempty(C), D = D + C; end varargout{1} = D;

end end

% ATB_DERIV Calculate A' B and its derivative% % [C, Cd] = ATB_DERIV(A, Ad, B, Bd) calculates C = A' B and the derivative of C % with respect to parameter theta(i) in Cd(:,:,i), i = 1,...,nPar. Ad(:,:,i) % and Bd(:,:,i) should be the derivatives of A and B w.r.t. theta(i). % % Use Cd = ATB_DERIV(A, Ad, B, Bd) when only the derivative is needed. % % METHOD: Cd(:,:,i) = Ad(:,:,i)'*B + A'*Bd(:,:,i).

function [varargout] = atb_deriv(A, Ad, B, Bd)[k,m] = size(A);n = size(B,2);nPar = size(Ad,3);BA = reshape(B'*reshape(Ad,k,m*nPar), n, m, nPar);BA = permute(BA, [2,1,3]);Cd = reshape(A'*reshape(Bd,k,n*nPar), m, n, nPar) + BA;if nargout == 1,

varargout{1} = Cd;else

[varargout{1:2}] = deal(A'*B, Cd);end

end

% BACK_SUB_DERIV Derivative of back substitution % % Xd = BACK_SUB_DERIV(L, Ld, X, Yd) finds the derivative of the solution X to % an upper triangular system L' X = Y. Ld and Yd should have derivatives of L % and Y, and Xd returns the derivative of X. % % METHOD: From L' X = Y it follows that Ld(:,:,i)' X + L' Xd(:,:,i) = Yd, and % thus Xd may be obtained with back substitution.

function Xd = back_sub_deriv(L, Ld, X, Yd)LTT.LT = true; LTT.TRANSA = true;[n, m] = size(X);nPar = size(Ld,3);LdX = permute(reshape(reshape(Ld, [n,n*nPar])'*X, [n,nPar,m]), [1,3,2]);RHS = reshape(Yd - LdX, [n,m*nPar]);Xd = reshape(linsolve(L,RHS,LTT), [n,m,nPar]);

end


13/35


14/35

12

function [C,G,W] = find_CGW(A,B,Sig)[p,q,r] = get_dimensions(A,B,Sig);I = eye(r);A = makecell(A);B = [{I} makecell(B)];C = cell(1,q+1);G = cell(1,q+1);W = cell(1,q+1);BSig = cell(1,q+1);C{1} = Sig;for j=0:q

BSig{j+1} = B{j+1}*Sig;C{j+1} = BSig{j+1};for i=1:min(j,p)

C{j+1} = C{j+1} + A{i}*C{j-i+1};end

end for j = 0:q

G{j+1} = BSig{j+1};W{j+1} = BSig{j+1};for i = j+1:q

G{j+1} = G{j+1} + B{i+1}*C{i-j+1}';W{j+1} = W{j+1} + BSig{i+1}*B{i-j+1}';

end end

end

% FIND_CGW_DERIV Determine G, W and their derivatives % % [CCd, GGd, WWd] = find_CGW_deriv(A, B, Sig) returns three struct arrays, CCd % = [C0...Cq], GGd = [G0...Gq] and WWd = [W0...Wq] where each element is a % structure of a matrix and its derivatives w.r.t. all the parameters as % described in mds_add_prod. Thus CCd(t).mat contains Ct and CCd(t).der{k}{l,c} % contains the derivatives of Ct w.r.t. the (l,k) element of the k-th parameter % matrix (of A1..Ap B1..Bq Sig).

function [C, G, W] = find_CGW_deriv(A, B, Sig)[p, q, r] = get_dimensions(A, B, Sig);A = makecell(A);B = makecell(B);np = p+q+1;for j=1:q

BSig(j) = mds_set_zero(np, r, r);BSig(j) = mds_add_prod(BSig(j), B{j}, Sig, p+j, p+q+1);C(j) = BSig(j);for i=1:min(j-1, p)

C(j) = mds_add_prod(C(j), A{i}, C(j-i), i);end if j 1;r = size(C{1},1);p = size(A,2)/r;q = size(B,2)/r;m = n-p+q;% CHANGE A FROM [A1..Ap] to [Ap..A1] A = cell2mat(fliplr(makecell(A))); if isempty(A), A = zeros(r,0); end if isempty(B), B = zeros(r,0); end % FIND Ccol = [C0;...; C(q); C(q+1);...; C(n-p+q)] Ccol = cat(1, C{:}, zeros(r*(n-p-1),r));Gcol = zeros(r*(n-p-1), r);if DIFF

nPar = size(Cd{1},3);Ccold = cat(1, Cd{:}, zeros(r*(n-p-1), r, nPar));Gcold = zeros(r*(n-p-1), r, nPar);

end for j = q+1:m-1

J = j*r+1 : (j+1)*r;K = max(0,j-p)*r+1 : j*r;if j


15/35

13

SUd = reshape(Scold,r,n,r,nPar);SUd = flipdim(permute(SUd, [3,2,1,4]), 2);SSd = cat(1, reshape(SUd,r*n,r,nPar), Scold(r+1:end,:,:));Smd = zeros(M, M, nPar);

end % % CONSTRUCT Sm Sm = zeros(M, M);ib = r*(n-1)+1; ie = r*(2*n-1);for t=1:n

if rm(t)~=0K = km(t)+1 : km(t+1);I = ib:ie;J = miss(:,t);Sm(:,K) = SS(I(miss), J);if DIFF, Smd(:,K,:) = SSd(I(miss), J, :); end

end ib = ib - r;ie = ie - r;

end end

% FIND_V Find V-matrix (and its derivative) for missing value likelihood % % V = FIND_V(G, Gcol, Scol, miss) finds the matrix V as the observed rows % and missing columns of the matrix Lam S which is given by: % % S0 S1'... Sn-1' S0 S1'... Sn-1' % S1 S0 : S1 S0 :% : ... S1' : ... S1' % Sp-1... S0 Sp-1... S0% ---------------------- or ------------------- % Gq ... Gp-n+1 Gp ... Gp-n+1 % Gq : : : % .. Gq% Gq...G0 .. % Gq... G0 % % (depending on whether p or q is bigger). G, Gneg and Scol should be % {G0...Gq} (from find_CGW), [G(p-n+1);...;G(-1)] (from find_Gneg) and % [S0;...; S(n-1)] (from vyw_solve and S_extend). miss(i,t) is false if X(i,t) % is missing. % % [V, Vd] = FIND_V(G, Gneg, Scol, miss, Gd, Gnegd, Scold) finds also the % derivative of V. Gd and Scold should be (cell) arrays with the derivatives % of G and Scol.

function [V,Vd] = find_V(G, Gneg, Scol, miss, Gd, Gnegd, Scold)DIFF = nargout > 1;[r,n] = size(miss);obs = ~miss;[ko,ro,km,rm] = find_missing_info(miss);N = ko(n+1);M = km(n+1);p = n - size(Gneg,1)/r - 1;q = length(G) - 1;mV = findmaxnz(rm, ko, p, q, n);V = zeros(mV, M);if DIFF

nPar = size(Scold,3);Vd = zeros(mV, M, nPar);

end % Make SS = [S(n-1)';...; S2'; S1'; S0;... S(p-1)]: SU = reshape(Scol,r,n,r); SU = flipdim(permute(SU, [3,2,1]), 2);SS = [reshape(SU,r*n,r); Scol(r+1:r*p,:)]; GG = cat(1,Gneg,G{:});if DIFF

SUd = reshape(Scold,r,n,r,nPar);SUd = flipdim(permute(SUd, [3,2,1,4]), 2);SSd = cat(1, reshape(SUd,r*n,r,nPar), Scold(r+1:r*p,:,:));GGd = cat(1,Gnegd,Gd{:});

end for t=1:n

if rm(t)~=0LamS = [

SS((n-t)*r+1 : (n-t+p)*r, miss(:,t));GG((n-t)*r+1 : end, miss(:,t))];

tmax = min(n,max(p,t+q));mY = ko(tmax+1);K = km(t)+1:km(t+1);V(1:mY, K) = LamS(obs(:,1:tmax), :);if DIFF

LamSd = cat(1, ... SSd((n-t)*r+1 : (n-t+p)*r, miss(:,t), :), ... GGd((n-t)*r+1 : end, miss(:,t), :));

Vd(1:mY, K, :) = LamSd(obs(:,1:tmax), :, :);end

end end

end

function maxnz = findmaxnz(rm, ko, p, q, n)% return maximum non-zero row-number of V tmax = find(rm,1, 'last' ); % last missing time if isempty(tmax), tmax=0; end ; % in case nothing is missing trow = min(n, max(p, tmax+q)); % time of last nonzero row maxnz = ko(trow + 1); % number of observations until that time

end

% FORWARD_SUB_DERIV Derivative of forward substitution % % [X, Xd] = FORWARD_SUB_DERIV(L, Ld, Y, Yd) solves a lower triangular system % L X=Y and returns the derivative of X in Xd. % % Xd = FORWARD_SUB_DERIV(L, Ld, X, Yd) finds the derivative if X is given. % % The parameters are:

% L nn A lower triangular matrix % Ld nnnPar Ld (:,:,i) contains derivatives of L with respect to a % parameter theta(i), i=1...,nPar % X nm Solution to L X = Y % Yd nmnPar Derivatives of Y. May also be nnPar when m=1 % Xd nmnPar Returns derivatives of X. Is nnPar if Yd is nnPar. % % METHOD: % From L X = Y it follows that Ld(:,:,i) X + L Xd(:,:,i) = Yd, and thus Xd % may be obtained with forward substitution.

function [varargout] = forward_sub_deriv(L, Ld, varargin)LT.LT = true;if nargout == 2

[Y, Yd] = deal(varargin{:});X = linsolve(L, Y, LT);

else [X, Yd] = deal(varargin{:});

end [n, m] = size(X);nPar = size(Ld, 3);LdX= reshape(reshape(permute(Ld, [1,3,2]), n*nPar, n)*X, n,nPar, m);RHS = reshape(Yd - permute(LdX, [1,3,2]), [n,m*nPar]);Xd = reshape(linsolve(L,RHS,LT), [n,m,nPar]);if nargout == 2, varargout = {X, Xd}; else varargout = {Xd}; end

end

% DIMENSIONS Dimensions of VARMA model % % [p,q,r] = GET_DIMENSIONS(A, B, Sig) returns the dimensions of the VARMA % model with parameters A = {A1...Ap}, B = {B1...Bp} and Sig (Ai are the AR % coefficients, Bi are the MA coefficients and Sig is the covariance matrix of % the shock series. % % [p,q,r,n] = GET_DIMENSIONS(A, B, Sig, X) returns also the length of the % time series (the number of rows in X) in n.

function [p, q, r, n] = get_dimensions(A, B, Sig, X)r = size(Sig,1);if iscell(A), p = length(A); else p = size(A,2)/r; end if iscell(B), q = length(B); else q = size(B,2)/r; end if nargin>3

assert(size(X,1)==r);n = size(X,2);

end end

% IS_STATIONARY Check VARMA model for stationarity% % IS_STATIONARY(A, Sig) returns true if the VARMA-process x(t) = A1 x(t-1) + % ... + Ap x(t-p) + y(t) is stationary, false otherwise. A is the r r p % matrix [A1...Ap]. The time series y(t) is given by a moving average process, % y(t) = B1 y(t-1) + ... + Bq y(t-q) + eps(t), where eps(t) is N(0,Sig). % Whether x is stationary or not does not depend on B, so it need not be a % parameter. % % IS_STATIONARY(A, Sig, PLU) uses PLU from a previous call to vyw_factorize. % % The test is based on solving the modified Yule-Walker equations for the AR % process x(t)=A1 x(t-1)+...+Ap x(t-p)+eps(t) with eps(t) ~ N(0,I), but the % solution to these equations gives a postive definite Su matrix (see % omega_build) if and only if all the roots of the polynomial fi(b) = I - A1 b % - A2 b^2 - ... - Ap B^p are outside the unit circle, which characterizes a % stationary VARMA process.

function stat = is_stationary(A, Sig, PLU)[R,pp] = chol(Sig);if pp>0, stat=false; return , end ;if isempty(A), stat=true; return , end r = size(A,1);p = length(A)/r;if nargin==2

PLU = vyw_factorize(A);if ~isempty(PLU) && ~isempty(PLU{1}) && PLU{1}(1) == 0

stat=false; return

end end I = eye(r);S = vyw_solve(A, PLU, {I});Su = omega_build(S, {I}, {I}, p, p);[R, pp] = chol(Su');stat = pp==0;

end

% LAMBDA_BUILD Build the matrix Lambda % % LAMBDA = LAMBDA_BUILD(A, r, n) builds the n r n r matrix: % % Lambda = I % I % I % I % -Ap ... -A2 -A1 I % -Ap -A2 -A1 I % ... .. ..% -Ap ... -A1 I % % where each Ai is rr and A is a block row vector of the Ai, A = [A1...Ap]. % % [LAM, LAMD] = LAMBDA_BUILD(A, r, n, nPar) builds also the derivative of % Lambda. % % This function is useful for testing purposes.

function [Lam, Lamd] = lambda_build(A, r, n, nPar)A = makecell(A);p = length(A);Lam = mat2cell(eye(r*n), repmat(r,n,1), repmat(r,n,1));for i=p+1:n

for j=1:pLam{i,i-j} = -A{j};

end end Lam = cell2mat(Lam);if nargout > 1

assert(nPar >= r*r*p); % at least the elements of A Lamd = zeros(r, n, r, n, r, r, p);for i=p+1:n

for j=1:p

for l=1:rfor c=1:rLamd(l,i,c,i-j,l,c,j) = -1;

end end

end end Lamd = cat(3, reshape(Lamd, r*n, r*n, r*r*p), zeros(r*n,r*n,nPar-r*r*p));

end end


16/35

14

% LAMBDA_MULTIPLY Multiply with Lambda % % Y = LAMBDA_MULTIPLY(A, X, false(r,n)) returns Lam X in Y, where Lam is an n r % n r lower block-band matrix given by: % % Lam = I % I % I % I % -Ap ... -A2 -A1 I % -Ap -A2 -A1 I % ... .. .. % -Ap ... -A1 I % % and X is an (n r)-element column vector. A should be [A1...Ap]. % % Y = LAMBDA_MULTIPLY(A, X, miss) where miss is r n and true in missing % locations multiplies with Lam(obs,obs) where obs are the observed indices in % {1,2,...,r

n}. X should be an N-vector, where N is the number of observed % values. This call is used for multiplying with the matrix Lam_o. % % [Y,Yd] = LAMBDA_MULTIPLY(A,X,miss,Xd) returns Lam X in Y and the derivatives % of it in Yd, when Xd contains the derivatives of X w.r.t. all the parameters. % Yd is an N 1 nPar 3-dimensional array.

function [Y, Yd] = lambda_multiply(A, X, miss, Xd)DIFF = nargout > 1;[r, n] = size(miss);MISS = any(miss(:));p = size(A,2)/r;Ac = makecell(A);if MISS

obs = ~miss;Xf = zeros(r*n,1);Xf(obs) = X;X = reshape(Xf, r, n);[ko, ro, km] = find_missing_info(miss);N = ko(n+1);

else X = reshape(X, r, n);

end Y = X;if DIFF

nPar = size(Xd,3);if MISS

Xfd = zeros(n*r, nPar);Xfd(obs,:) = Xd;

else Xfd = Xd;

end I = any(Xfd,1); % nonzeros in nPar direction nI = sum(I);Xfd = reshape(Xfd, r, n, nPar);Zd = permute(Xfd(:,:,I),[1,3,2]);Xd = reshape(Zd, r*nI, n);L = p*nPar+1 : n*nPar;

end L = p+1:n;for k = 1:p

K = p+1-k:n-k;Y(:, L) = Y(:, L) - Ac{k}*X(:, K);if (DIFF)

YdIL = reshape(Zd(:,:,L), r, nI*(n-p));YdIL = YdIL - Ac{k}*reshape(Xd(:,K), r, nI*(n-p));Zd(:,:,L) = reshape(YdIL, r, nI, n-p);

end end Y = Y(:);if MISS, Y = Y(obs); end if (DIFF)

Yd = zeros(r, n, nPar);Yd(:,:,I) = permute(reshape(Zd, r, nI, n), [1,3,2]);for k=1:p

j0 = r^2*(k-1);for j = 1:r

J = j0 + (j:r:r^2); lJ = length(J);Yd(j, p+1:n, J) = Yd(j, p+1:n, J) - reshape(X(:,p-k+1:n-k)', 1, n-p,lJ);

end end Yd = reshape(Yd, n*r, 1, nPar);if MISS, Yd = Yd(obs, 1, :); end

end end

% LAMBDA_T_MULTIPLY Multiply vector with transpose of Lambda % % Y = LAMBDA_T_MULTIPLY(A, X, r, n) returns Lam' X in Y, where Lam is an % n rn r lower block-band matrix given by: % % Lam = I % I % I % I % -Ap ... -A2 -A1 I % -Ap -A2 -A1 I % ... .. ..% -Ap ... -A1 I % % and X is a vector with compatible dimensions. A should be {A1...Ap}.% % Y = LAMBDA_T_MULTIPLY(A, X, r, n, miss) multiplies with Lam(obs,obs)' where % obs are the observed indices in {1,2,...,r n}. This call is used for % multiplying with the matrix Lam_o'.function Y = lambda_T_multiply(A, X, r, n, miss)

if nargin < 5, miss = false(r,n); end A = makecell(A);p = length(A);ko = find_missing_info(miss);Y = X;

for t=p+1:nK = find(~miss(:,t));K1 = ko(t)+1 : ko(t+1);for k = 1:p

L = find(~miss(:,t-k)); L1 = ko(t-k)+1 : ko(t-k+1);Y(L1) = Y(L1) - A{k}(K,L)'*X(K1);

end end

end

% LOGDET_L Logarithm of determinant of a lower triangular matrix % % ld = LOGDET_L(L) returns log(det(L)) and [ld, ldd] = LOGDET_L(L, Ld) finds in % addition the derivative of ld with respect to several parameters. If L is % nn, Ld should be nnnPar and contain the derivative of L (nPar is the % number of parameters)

function [ld, ldd] = logdet_L(L, Ld);ld = sum(log(diag(L)));if nargin>1

nPar = size(Ld,3);ldd = zeros(1,nPar);for j=1:nPar

ldd(j) = sum(diag(Ld(:,:,j))./diag(L));end

end end

% MAKECELL Change to cell array % % C = MAKECELL(C) changes C from [C1 C2...Cm] to {C1 C2...Cm} where each Ci is % square. If C is empty the empty cell array, {}, is returned.

function Cout = makecell(C)if isempty(C), Cout = {}; return , end r = size(C,1);m = size(C,2)/r;Cout = cell(1, m);for i=1:m

Cout{i} = C(:, (i-1)*r+1:i*r);end

end

% MDS_ADD_PROD Add product to matrix-function and update derivative % % FSNEW = MDS_ADD_PROD(FS,X,Y,kX,kY) calculates F + X Y and its derivatives % with respect to all the parameter matrices of a VARMA model, [A1...Ap B1...Bq % Sig]. X should be the kX-th parameter matrix and Y the kY-th one. FS is a % structure with the following fields: % mat The rr matrix F% der A (p+q+1)-element cell-array with the derivatives of F w.r.t. to the % parameter matrices; der{k} is an rr cell matrix with derivatives % with respect to the k-th parameter matrix, P, and der{k}{l,c} is an % rr matrix with the derivative of F with respect to P(l,c). % spcod Vector of sparsity codes. The sparsity code has one of the values: % 'z' FS.der{k} is all zeros % 'e' FS.der{k}{i,j}(i,j) is the only nonzero in FS.der{k}{i,j} % 'r' The nonzeros of FS.der{k}{i,j} are all in its i-th row % 'c' The nonzeros of FS.der{k}{i,j} are all in its j-th column % 'x' The nonzeros of FS.der{k}{i,j} are all in its i-th column % 'f' All cells in FS.der{k} are full matrices % F + X Y and its derivative are returned in FSNEW which is a structure as FS. % % FSNEW = MDS_ADD_PROD(FS,X,Y,'T',kX,kY) calculates F + X Y' and its % derivatives. FS, X and Y are as described above. % % FSNEW = MDS_ADD_PROD(FS,X,A,kX) calculates F + X A and its derivaties for a % constant matrix A. In this case A need not have r columns. % % FSNEW = MDS_ADD_PROD(FS,X,GS,kX) calculates F + X G and its derivatives. FS % and GS are structures as described above and X is the kX-th parameter matrix. % G need not have r columns (and if it does not it must have sparsity code 'z', % 'r' or 'f', and now der{k} is rsize(G,2)). % % FSNEW = MDS_ADD_PROD(FS,X,GS,'T',kX) calculates F + X G' and its derivatives. % FS, GS and X are as above. G need not have r rows, and if it does not, must % have sparsity code 'f' or 'z', and now der{k} is rsize(G,1).

function FSNEW = mds_add_prod(FS,X,Y,varargin)N = length(FS.der);if ischar(varargin{1}), transp= 'T' ; varargin(1) = []; else transp= '' ; end kX = varargin{1};if isstruct(Y)

GS = Y;% global cover % for k=1:N % i = 2 + (k==kX) + 2*isequal(transp,'T'); % j = 2 + strfind('zercxf',GS.spcod(k)); % cover(i,j)='x'; % end cod = [ 'FpXG' transp];

elseif nargin >= 5cod = [ 'FpXY' transp]; kY = varargin{2};assert(kX~=kY);

else cod = 'FpXA' ;kY = 0;

end r = size(X,1);der = FS.der;spcod = FS.spcod;% fprintf('Cod=%s kX=%d\n', cod, kX); switch cod

case 'FpXA' F = FS.mat + X*Y;for c = 1:r

for l = 1:rder{kX}{l,c}(l,:) = der{kX}{l,c}(l,:) + Y(c,:);

end end spcod(kX) = setsp(spcod(kX), 'r' );

case 'FpXY' F = FS.mat + X*Y;for c = 1:r

for l = 1:rder{kX}{l,c}(l,:) = der{kX}{l,c}(l,:) + Y(c,:);der{kY}{l,c}(:,c) = der{kY}{l,c}(:,c) + X(:,l);

end end spcod(kX) = setsp(spcod(kX), 'r' );spcod(kY) = setsp(spcod(kY), 'c' );

case 'FpXYT' F = FS.mat + X*Y';for c = 1:r


17/35

15

for l = 1:rder{kX}{l,c}(l,:) = der{kX}{l,c}(l,:) + Y(:,c)';der{kY}{l,c}(:,l) = der{kY}{l,c}(:,l) + X(:,c);

end end spcod(kX) = setsp(spcod(kX), 'r' );spcod(kY) = setsp(spcod(kY), 'x' );

case 'FpXG' F = FS.mat + X*GS.mat;Gd = GS.der;for k = 1:N

% fprintf('GS.spcod(%d)=%s\n',k,GS.spcod(k)) for c = 1:r

for l = 1:rswitch GS.spcod(k)

case 'f' , der{k}{l,c} = der{k}{l,c} + X*Gd{k}{l,c};case 'r' , der{k}{l,c} = der{k}{l,c} + X(:,l)*Gd{k}{l,c}(l,:);case 'c' , der{k}{l,c}(:,c) = der{k}{l,c}(:,c) + X*Gd{k}{l,c}(:,c);case 'x' , der{k}{l,c}(:,l) = der{k}{l,c}(:,l) + X*Gd{k}{l,c}(:,l);case 'e' , der{k}{l,c}(:,c) = der{k}{l,c}(:,c) + X(:,l);case 'z' % do nothing

end end

end switch GS.spcod(k)

case { 'f' , 'r' }, spcod(k) = setsp(spcod(k), 'f' );case { 'c' , 'e' }, spcod(k) = setsp(spcod(k), 'c' );case { 'x' }, spcod(k) = setsp(spcod(k), 'x' );

end end for c = 1:r

for l = 1:rder{kX}{l,c}(l,:) = der{kX}{l,c}(l,:) + GS.mat(c,:);


case 'FpXGT' F = FS.mat + X*GS.mat';Gd = GS.der;for k = 1:N

% fprintf('GS.spcod(%d)=%s\n',k,GS.spcod(k)) for c = 1:r

for l = 1:rswitch GS.spcod(k)

case 'f' , der{k}{l,c} = der{k}{l,c} + X*Gd{k}{l,c}';case 'c' , der{k}{l,c} = der{k}{l,c} + X(:,c)*Gd{k}{l,c}(:,c)';case 'x' , der{k}{l,c} = der{k}{l,c} + X(:,l)*Gd{k}{l,c}(:,l)';case 'r' , der{k}{l,c}(:,l) = der{k}{l,c}(:,l)+X*Gd{k}{l,c}(l,:)';case 'e' , der{k}{l,c}(:,l) = der{k}{l,c}(:,l)+X(:,c);case 'z' % do nothing

end end

end switch GS.spcod(k)

case { 'r' , 'e' }, spcod(k) = setsp(spcod(k), 'x' );case { 'f' , 'c' , 'x' }, spcod(k) = setsp(spcod(k), 'f' );case { 'z' }, % do nothing

end end for c = 1:r

for l = 1:rder{kX}{l,c}(l,:) = der{kX}{l,c}(l,:) + GS.mat(:,c)';


end FSNEW.mat = F;FSNEW.der = der;FSNEW.spcod = spcod;

end

function spcod = setsp(spcod, s)% If s~=spcod and either both are in rcx or both are in ex then spcod:='f'.% Otherwise if s is >spcod in the ordering z


18/35

16

function [Y,Yd] = omega_ar_trisolve(Lu, LSig, Y, jpat, ko, code, Lud, LSigd, Yd) % Note: Xd and Xd are stored in Y and Yd OPTS.LT = true;if code(1)== 'T' , OPTS.TRANSA = true; end n = length(ko)-1;p = n - length(jpat);mY = size(Y,1);tmax = find(ko == mY, 1, 'first' ) - 1; assert(~isempty(tmax));e = size(Lu,1);[mY,nY] = size(Y);Y(1:e, :) = linsolve(Lu, Y(1:e, :), OPTS);for t=p+1:tmax

K = ko(t)+1 : min(mY, ko(t+1));Y(K,:) = linsolve(LSig{jpat(t-p)}, Y(K,:), OPTS);

end if nargout > 1,

nPar = size(Yd, 3);Lud = reshape(permute(Lud, [1,3,2]), e*nPar, e);Yd(1:e,:,:) = Yd(1:e,:,:)-permute(reshape(Lud*Y(1:e,:),e,nPar,nY), [1,3,2]);if isempty(Yd), Xd = Yd; return ; end for t=p+1 : tmax

K = ko(t)+1 : min(mY, ko(t+1));ro = ko(t+1) - ko(t);Lj = reshape(permute(LSigd{jpat(t-p)}, [1,3,2]), ro*nPar, ro);if code(1)== 'T'

Yd(K,:,:) = Yd(K,:,:) - permute(reshape(Lj'*Y(K,:),ro,nPar,nY),[1,3,2]);else

Yd(K,:,:) = Yd(K,:,:) - permute(reshape(Lj*Y(K,:),ro,nPar,nY),[1,3,2]);end

end Yd(1:e, :) = linsolve(Lu, Yd(1:e, :), OPTS);for t=p+1:tmax

K = ko(t)+1 : min(mY,ko(t+1));Yd(K,:) = linsolve(LSig{jpat(t-p)}, Yd(K,:), OPTS);

end Yd = reshape(Yd, mY, nY, nPar);

end end

% OMEGA_BACK_SUB Back substitution with Cholesky factor of Omega or Omega_o % % X = OMEGA_BACK_SUB(Lu, Ll, Y, p, q, ko) solves L_o' X = Y where p and q are % the dimensions of the ARMA model and Lo is the Cholesky factor of Omega_o, % stored in Lu and Ll (as returned by omega_factor). ko(t) is the number of % observed values before time t.% % Y may have fewer rows than L_o (which has N rows, N being the total number of % observations) and then L_o(1:k,1:k) X = Y is solved, where k is the row count % of Y. This will save computations. There is the restriction that Y must end

% on a block boundary (i.e. k must be equal to ko(j) for some j), and that it % must have at least ko(p+1) rows. % % [X,Xd] = OMEGA_BACK_SUB(Lu, Ll, Y, p, q, ko, Lud, Lld, Yd) finds in addition % the derivative of X. Again Y (and Yd) may have less than N rows.

function [X,Xd] = omega_back_sub(Lu, Ll, Y, p, q, ko, Lud, Lld, Yd)DIFF = nargout>1;N = ko(end);k = size(Y,1);tmax = find(ko == k, 1, 'first' ) - 1; assert(~isempty(tmax));X = Y;if DIFF

nPar = size(Lld,3);Xd = Yd;

end if isempty(X) || tmax 0, je = je-r; end K = K + r;

end end

% OMEGA_BUILD_DERIV Build derivatives of matrix Omega %% [Sud,Olowd] = OMEGA_BUILD_DERIV(Sd,Gd,Wd,n) returns an h rh rnPar array Sud % and an (n-h)r(q+1)rnPar array Olowd with the derivatives of Su and Olow % with respect to all parameters. Sd, Gd and Wd are the derivatives of the % Si, Gi and Wi matrices.

function [Sud, Olowd] = omega_build_deriv(Sd,Gd,Wd,p,n)q = length(Gd) - 1;h = max(p,q);r = size(Gd{1},1);Sd = cell2mat(fliplr(Sd(1:p)));Gd = cell2mat(fliplr(Gd));Wd = cell2mat(fliplr(Wd));nPar = size(Gd,3); Sud = zeros(r*h, r*h, nPar);Olowd = zeros(r*(n-h), r*(q+1), nPar);K = 1:r;for t = 1:p

Sud(K, 1:t*r, :) = Sd(:, end-t*r+1:end, :);K = K + r;

end J = (q-p)*r+1 : q*r;for t = p+1:h

Sud(K, 1:p*r, :) = Gd(:, J, :);Sud(K, p*r+1:t*r, :) = Wd(:, end-(t-p)*r+1:end, :);J = J - r;K = K + r;

end K = 1:r;je = min(p,q)*r;for t = 1:n-h

Olowd(K, 1:je, :) = Gd(:, 1:je, :);Olowd(K, je+1:end, :) = Wd(:, je+1:end, :);if je > 0, je = je-r; end K = K + r;

end end


19/35

17

% OMEGA_FACTOR Cholesky factorization of Omega % % [Lu,Ll,info] = omega_factor(Su,Olow,p,q,ko) calculates the Cholesky % factorization Omega = L L' of Omega which is stored in two parts, a full % upper left partition, Su, and a block-band lower partition, Olow, as returned % by omega_build. Omega is symmetric, only the lower triangle of Su is % populated, and Olow only stores diagonal and subdiagonal blocks. On exit, L = % [L1; L2] with L1 = [Lu 0] and L2 is stored in block-band-storage in Ll. Info % is 0 on success, otherwise the loop index resulting in a negative number % square root. P and q are the dimensions of the problem and ko is a vector % with ko(t) = number of observed values before time t. % % In the complete data case ko should be 0:r:n*r. For missing values, Su and % Olow are the upper left and lower partitions of Omega_o = Omega with missing % rows and columns removed. In this case Lu and Ll return L_o, the Cholesky % factor of Omega_o.

function [Lu,Ll,info] = omega_factor(Su,Olow,p,q,ko)n = length(ko)-1;h = max(p,q);ro = diff(ko);Ll = zeros(size(Olow));[Lu,info] = chol(Su'); % upper left partition if info>0, return ; end Lu = Lu';e = ko(h+1); % order of Su for t = h+1 : n % loop over block-lines in Olow

K = ko(t)+1 : ko(t+1); KL = K - ko(t-q);JL = 1 : ko(t)-ko(t-q);tmin = t-q; tmax = t-1;Ll(K-e, JL) = omega_forward(Lu, Ll, Olow(K-e,JL)', p, q, ko, tmin, tmax)';[Ltt, info] = chol(Olow(K-e,KL) - Ll(K-e,JL)*Ll(K-e,JL)');if info>0, info = info + ko(t); return ; end Ll(K-e, KL) = Ltt';

end end

% OMEGA_FACTOR_DERIV Derivative of Cholesky factorization of Omega % % [Lud,Lld] = OMEGA_FACTOR_DERIV(Sud, Olowd, Lu, Ll, p, q, ko) calculates the % derivative of the Cholesky factorization of Omega. On entry Sud should be an % r hr hnPar array with the derivatives of Su and Olowd should be an % r (n-h)r (q+1) array with the derivatives of Olow where Su and Olow are the % upper left and lower partitions of Omega as returned by omega_build. % Moreover, Lu and Ll should be as returned by omega_factor and p and q are the % dimensions of the model. ko is an (n+1)-vector with ko(t) = number of % observed values before time t. On exit Lud and Lld return the derivatives of % Lu and Ll. %% In the missing value case, Sud and Olowd are from omega_remove_miss.

function [Lud, Lld] = omega_factor_deriv(Sud, Olowd, Lu, Ll, p, q, ko)n = length(ko) - 1;h = max(p, q);Lud = chol_deriv(Lu, Sud); % Derivative of upper left partition: % % Find derivative of lower part: e = size(Lu, 1);Lld = zeros(size(Olowd));for t = h+1 : n % loop over block-lines in Olow

K = ko(t)+1 : ko(t+1); KL = K - ko(t-q);JL = 1 : ko(t)-ko(t-q);U = Ll(K-e, JL);Yd = permute(Olowd(K-e, JL, :), [2,1,3]);Ud = omega_forward_deriv(Lu, Ll, Lud, Lld, U', Yd, p, q, ko, t-q, t-1);Ud = permute(Ud, [2,1,3]);Lld(K-e, JL, :) = Ud;UUdT = aat_deriv(U, Ud);Lld(K-e, KL, :) = chol_deriv(Ll(K-e, KL), Olowd(K-e, KL, :) - UUdT);

end end

% OMEGA_FORWARD Forward substitution with Cholesky factor of Omega or Omega_o % % X = OMEGA_FORWARD(Lu, Ll, Y, p, q, ko) solves L X = Y where p, q and r are % the dimensions of the ARMA model and L is the Cholesky factor of Omega, % stored in Lu and Ll (as returned by omega_factor). ko(t) is the number of % observed values before time t. In the missing value case, Lu and Ll store the % matrix L_o (see omega_factor). % % When Y = [O; Y1] where O is a zero matrix with ko(tmin) rows, the calculation % may be speeded using the call X1 = OMEGA_FORWARD(Lu, Ll, Y1, p, q, ko, tmin). % The X1 returned will contain rows ko(tmin)+1...ko(n+1) of X (the first % ko(tmin) rows being known to be zero). Moreover, to use only the first tmax % block rows of L, use X1 = OMEGA_FORWARD(Lu, Ll, Y1, p, q, ko, tmin, tmax), % with Y1 containing block-rows tmin to tmax. This call syntax is used in % omega_factor.

function X = omega_forward(Lu, Ll, Y, p, q, ko, tmin, tmax)LT.LT = true;if nargin


20/35

18

% OLOW. To find LU use the partitioning of A: % % |LU' M' | . |LU | = |SU B'| % | L2'| |M L2| |B A2| % % which implies that LU' LU = SU - M' M.

function [Lu,A,info,Lud,Ad] = omega_ltl(Su,Olow,p,q,ko,Sud,Olowd)DIFF = nargin>5;LTT.LT = true; LTT.TRANSA = true;n = length(ko)-1;h = max(p,q);e = ko(h+1); % order of Su Lu = Su;A = Olow;if DIFF

nPar = size(Sud,3);Lud = Sud;Ad = Olowd;

end for k=n:-1:h+1

K = (ko(k)+1 : ko(k+1)) - e;c = ko(k-q);for i = min(n,k+q):-1:k

b = ko(i-q);JD = K + e - c;JU = (b+1 : ko(k)) - c;J = K + e - b;if i>k

I = (ko(i)+1 : ko(i+1)) - e;JUD = (b+1 : ko(k+1)) - c;JMY = 1 : ko(k+1)-b;A(K,JUD) = A(K,JUD) - A(I,J)'*A(I,JMY);if (DIFF)

JM = 1 : ko(k)-b;Ad(K,JD,:) = Ad(K,JD,:) - ata_deriv(A(I,J), Ad(I,J,:));Ad(K,JU,:)=Ad(K,JU,:)-atb_deriv(A(I,J),Ad(I,J,:),A(I,JM),Ad(I,JM,:));

end else

[A(K,JD), info] = ltl(A(K,JD));if (info~=0) return ; end A(K,JU) = linsolve(A(K,JD), A(K,JU), LTT);if (DIFF)

Ad(K,JD,:) = ltl_deriv(A(K,JD), Ad(K,JD,:));Ad(K,JU,:) = back_sub_deriv(A(K,JD),Ad(K,JD,:),A(K,JU),Ad(K,JU,:));

end end

end end for i = h+1:min(n,h+q)

b = ko(i-q);M = b+1:e;I = (ko(i)+1 : ko(i+1)) - e;JM = 1 : ko(k)-b;Lu(M,M) = Lu(M,M) - A(I,JM)'*A(I,JM);if (DIFF) Lud(M,M,:) = Lud(M,M,:) - ata_deriv(A(I,JM), Ad(I,JM,:)); end

end [Lu, info] = ltl(Lu);if (DIFF) Lud = ltl_deriv(Lu, Lud); end

end

function [L,info] = ltl(A)% Calculates LTL-factorization of A. Only uses lower triangle of A. Returns % info=0 on success. n = size(A,1);[L,info] = chol(A(n:-1:1,n:-1:1));if info~=0, L=A; return ; end L = L(n:-1:1,n:-1:1);

end

function Ld = ltl_deriv(L, Ad)% Returns the derivative of the LTL-factorization of A in Ld. L should be the % LTL-factor of A and Ad the derivative of A. n = size(L,1);Ld = chol_deriv(L(n:-1:1, n:-1:1)', permute(Ad(n:-1:1, n:-1:1, :), [2,1,3]));Ld = permute(Ld(n:-1:1, n:-1:1, :), [2,1,3]);

end

% OMEGA_REMOVE_MISS Remove rows and columns of missing values from Omega % % [Suo, Olowo] = OMEGA_REMOVE_MISS(Su, Olow, miss) removes rows and columns % given by the vector miss(:) from Omega, keeping only rows and columns % corresponding to observed values. Omega is stored in two parts, Su and Olow % as returned by omega_build, and the updated Omega is similarly returned in % two parts, a full upper left partition Suo and a sparse block-band lower part % Olowo (which will in general have variable sized blocks). The rn logical % matrix miss should be true in missing locations, thus specifiying the % structure of Olowo. % % [Suo,Olowo,Suod,Olowod] = OMEGA_REMOVE_MISS(Su,Olow,miss,Sud,Olowd) also % updates the derivative of Omega, stored in the three dimensional arrays Sud % and Olowd as calculated by omega_build_deriv.

function [Suo, Olowo, varargout] = omega_remove_miss(Su, Olow, miss, Sud, Olowd)[r, n] = size(miss);ko = find_missing_info(miss);DIFF = nargout > 2;h = size(Su,1)/r;q = size(Olow,2)/r - 1;observed = ~miss;obs1 = observed(:,1:h);obs2 = observed(:,h+1:end);Suo = Su(obs1, obs1); % copy Su-partit. (also OK because of how Matlab works) if DIFF

Suod = Sud(obs1, obs1, :);Olowod = zeros(size(Olowd));

end % REMOVE COLUMNS FROM LOWER PART(S):

ro = diff(ko);J = 1 : r;K = h+1-q : h+1;Olowo = zeros(size(Olow));for t = 1:n-h

obsk = observed(:,K);m = sum(ro(K));Olowo(J, 1:m) = Olow(J, obsk(:));if DIFF, Olowod(J, 1:m, :) = Olowd(J, obsk(:), :); end J = J+r;

K = K+1;end Olowo = Olowo(obs2, :); % remove rows if DIFF

Olowod = Olowod(obs2, :, :);[varargout{1:2}] = deal(Suod, Olowod);

end end

% PROFILE_AR_TRISOLVE Pure AR Omega triangular solve for profile-sparse matrix % % X = PROFILE_AR_TRISOLVE(Lu, LSig, Y, JPAT, ko, km, p, g, 'N') solves Lo X=Y % exploiting known sparsity in Y, which is such that Y = Yf(obs,miss) and the % (i,j)-block of Yf is known to be zero for i > max(p,j+g). This function is % used to form Vhat (with g=0) and Laomh (with g=p) in the pure autoregressive % case. Lu, LSig and JPAT come from omega_ar_factor and give the ltl-factor of % Omega_o, ko(t) is the number of observed and km(t) the number of missing % values before time t. X = PROFILE_AR_TRISOLVE(...,'T') solves Lo'

X=Y. % % [X,Xd] = PROFILE_AR_TRISOLVE(...,Lud,Lld,Yd) finds also the derivative of X. % % NOTE: The routine is implemented using a blocked algorithm for speed

function [X,Xd] = profile_ar_trisolve(Lu,LSig,Y,JPAT,ko,km,p,g,cod,Lud,LSigd,Yd) DIFF = nargout > 1;n = length(ko) - 1;M = km(n+1);blk = ceil(25*n/(M+1));X = Y;if DIFF, Xd = Yd; end mY = size(Y,1);for t1=1:blk:n

t = min(n,t1+blk-1);if km(t+1) > km(t1)

tmax = min(n,max(p,t+g));I = 1 : min(mY, ko(tmax+1));K = km(t1)+1:km(t+1);if ~DIFF

X(I,K) = omega_ar_trisolve(Lu, LSig, Y(I,K), JPAT, ko, cod);else

[X(I,K),Xd(I,K,:)] = ... omega_ar_trisolve(Lu, LSig, Y(I,K), JPAT, ko,cod,Lud,LSigd,Yd(I,K,:));

end end

end end

% PROFILE_ATA Multiply A transposed with A where A is a profile-sparse matrix % % C = PROFILE_ATA(A, ko, km, p, g) sets C to A' A making use of known sparsity % in A which is such that A = Af(obs,miss) and the (i,j)-block of Af is known % to be zero for i > max(p,j+g). This function is used to form Vhat' Vhat % (with g=q) and Laomh' Laomh (with g=p) efficiently. The vectors ko and km % specify the sparsity structure, ko(t) is the number of observed and km(t) % the number of missing values before time t. % % [C,Cd] = PROFILE_ATA(A, ko, km, p, g, Ad) finds also the derivative of C.

function [C,Cd] = profile_ata(A, ko, km, p, g, Ad);n = length(ko)-1;DIFF = nargout>1;mA = size(A,1);M = size(A,2);C = zeros(M,M);if DIFF, nPar = size(Ad,3); Cd = zeros(M,M,nPar); end blk = ceil(25*n/(M+1));for t1=1:blk:n

t = min(n,t1+blk-1);if km(t+1)>km(t1)

tmax = min(n,max(p,t+g));I = 1 : min(mA, ko(tmax+1));K = km(t1)+1:km(t+1);L = 1 : km(t+1);J = 1 : km(t1);C(K,L) = C(K,L) + A(I,K)'*A(I,L);if DIFF

if ~isempty(J)Cd(K,J,:) = Cd(K,J,:) + atb_deriv(A(I,K),Ad(I,K,:), A(I,J),Ad(I,J,:));

end Cd(K,K,:) = Cd(K,K,:) + ata_deriv(A(I,K), Ad(I,K,:));

end end

end % Make upper triangular part: C = tril(C)+tril(C,-1)';if DIFF

for k=1:size(Cd,3), Cd(:,:,k) = tril(Cd(:,:,k))+tril(Cd(:,:,k),-1)'; end end

end

% PROFILE_BACK_SUB Omega back substitution for a profile-sparse matrix % % X = PROFILE_BACK_SUB(Lu, Ll, Y, ko, km, p, q, g) solves Lo' X = Y exploiting % known sparsity in Y, which is such that Y = Yf(obs,miss) and the (i,j)-block % of Yf is known to be zero for i > max(p,j+g). This function is used to form % Vhat (with g=q) and Laomh (with g=p). Lo is the ltl-factor of Omega_o (from % omega_ltl), stored in Lu and Ll, and ko(t) is the number of observed and % km(t) the number of missing values before time t.. % % [X,Xd] = PROFILE_BACK_SUB(Lu, Ll, Y, ko, km, p, q, g, Lud, Lld, Yd) finds % also the derivative of X.

% NOTE: The routine is implemented using a blocked algorithm for speed

function [X,Xd] = profile_back_sub(Lu, Ll, Y, ko, km, p, q, g, Lud, Lld, Yd)DIFF = nargout > 1;

n = length(ko) - 1;M = km(n+1);blk = ceil(25*n/(M+1));X = Y;if DIFF, Xd = Yd; end mY = size(Y,1);for t1=1:blk:n

t = min(n,t1+blk-1);if km(t+1) > km(t1)

tmax = min(n,max(p,t+g));


21/35

19

I = 1 : min(mY, ko(tmax+1));K = km(t1)+1:km(t+1);if ~DIFF

X(I,K) = omega_back_sub(Lu, Ll, Y(I,K), p, q, ko);else

[X(I,K),Xd(I,K,:)] = ... omega_back_sub(Lu, Ll, Y(I,K), p, q, ko, Lud, Lld,Yd(I,K,:));

end end

end end

% PROFILE_MULT Multiply with a profile-sparse matrix % % C = PROFILE_MULT(A, B, ko, km, p, g, h) sets C to A'*B making use of known % sparsity in A and B which is such that A = Af(obs,miss) and the (i,j)-block % of Af is known to be zero for i > max(p,j+g) and B = Bf(obs,miss) with the % (i,j)-block of Bf zero for i > max(p,j+h). When B is full set h = n. This % function is used to multiply efficiently with Vhat' (g=q) and Laomh' (g=p). % The matrix P = Laomh' Vhat is formed with g=p, h=q. ko(t) is the number of % observed and km(t) the number of missing values before time t. % % [C,Cd] = PROFILE_MULT(A,B, ko, km, p, g, h, Ad,Bd) finds also the derivative % of C.

function [C,Cd] = profile_mult(A, B, ko, km, p, g, h, Ad, Bd);n = length(ko)-1;DIFF = nargout>1;[k1,l] = size(A); [k2,M] = size(B);m = min(k1, k2);C = zeros(l,M);if DIFF, nPar = size(Ad,3); Cd = zeros(l,M,nPar); end blk = ceil(25*n/(M+1));if h km(t1)

tmax = min(n,max(p,t+g));I = 1 : min(m, ko(tmax+1));K = km(t1)+1:km(t+1);C(K,:) = C(K,:) + A(I,K)'*B(I,:);if DIFF

XX = atb_deriv(A(I,K),Ad(I,K,:), B(I,:),Bd(I,:,:));Cd(K,:,:) = Cd(K,:,:) + XX;

end end

end end

end

% RES_MISS Estimate residuals and missing values%% EPS = RES_MISS(A, C, Lu, Ll, w) returns the maximum likelihood % estimate of the residuals of the VARMA model: % % x(t) - mu = A1 (x(t-1) - mu) + ... + Ap (x(t-p) - mu) + y(t) % with % y(t) = eps(t) + B1 eps(t-1) + ... + Bq eps(t-q), % % where eps(t) is N(mu, Sig), mu is an r-vector, Sig is rr and the A's and % B's are the coefficients of the model. Lu, Ll, C and w are as in varma_llc. % % [EPS, XM] = RES_MISS(A,C,Luo,Llo,wohat,LR,LQ,K,u,v,Laomh,Vhat,Sm,miss) % is used for the missing value case. The paramteres are as in varma_llm, and % in addition to the EPS estimate, the maximum likelihood estimate of the % missing values is returned in XM.

function [eps,xm] = res_miss(varargin);if nargin==5

[A, C, Lu, Ll, w] = deal(varargin{:});r = size(C{1}, 1);p = length(A)/r;q = length(C) - 1;n = length(w)/r;what = omega_forward(Lu, Ll, w, p, q, 0:r:r*n);

y = omega_back_sub(Lu, Ll, what, p, q, 0:r:r*n);y = lambda_T_multiply(A, y, r, n);eps = C_multiply(C, y, n);xm = [];

else [A, C, Luo, Llo, wohat, LR,LQ,K,u,v,Laomh,Vhat,Sm,miss] = deal(varargin{:});[r, n] = size(miss);p = length(A)/r;q = length(C) - 1;ko = find_missing_info(miss);

LTT.LT = true; LTT.TRANSA = true;v1 = linsolve(LQ, v, LTT);v2 = linsolve(LR, u + K*v1, LTT);% Following is because Vhat and Laomh are not stored in full if there are % no missing values at the end of the time series. k1 = size(Vhat,1);k2 = size(Laomh,1);y = wohat;y(1:k1) = y(1:k1) + Vhat*(v1 - v2);y(1:k2) = y(1:k2) + Laomh*(Sm*v2);y = omega_forward(Luo, Llo, y, p, q, ko);y = lambda_T_multiply(A, y, r, n, miss);eps = C_multiply(C, y, n, miss);xm = Sm*v2;

end eps = reshape(eps, r, n);

end

% RES_MISS_AR Estimate residuals and missing values for AR-model%% [EPS] = RES_MISS_AR(A,Sig,Lu,LSig,z) returns the maximum likelihood estimate % of the residuals of the VAR model: % % x(t) - mu = A1 (x(t-1) - mu) + ... + Ap (x(t-p) - mu) + eps(t) % % where eps(t) is N(mu, Sig), mu is an r-vector, Sig is rr and the A's are % the coefficients of the model (p>0). The input parameters are the variables % with the same names in var_llc. % % [EPS] = RES_MISS_AR(A,Sig,Lu,LSig,woh,LR,LQ,K,u,v,Laomh,Vhat,Sm,miss) is % used for the missing value case. [EPS, XM] = FIND_AR_RES_MISS(...) returns % also the maximum likelihood estimate of missing values.

function [eps,xm] = res_miss_ar(A,Sig,Lu,LSig,z,LR,LQ,K,u,v,Laomh,Vhat,Sm,miss);LTT.LT = true; LTT.TRANSA = true;C = find_CGW(A, [], Sig);if nargin == 5

r = size(Sig, 1);p = length(A)/r;n = length(z)/r;jpat = ones(1,n-p);y = omega_ar_trisolve(Lu, LSig, z, jpat, 0:r:n*r, 'T' );y = lambda_T_multiply(A, y, r, n);miss = zeros(r, n);if nargout>1, xm = []; end

else [r, n] = size(miss);p = size(A,2)/r;ko = find_missing_info(miss);[pat, ipat, jpat] = unique(miss(:, p+1:end)', 'rows' );ko = find_missing_info(miss);v1 = linsolve(LQ, v, LTT);v2 = linsolve(LR, u + K*v1, LTT);% Following is because Vhat and Laomh are not stored in full if there are % no missing values at the end of the time series. k1 = size(Vhat,1);k2 = size(Laomh,1);y = z;y(1:k1) = y(1:k1) + Vhat*(v1 - v2);y(1:k2) = y(1:k2) + Laomh*(Sm*v2);y = omega_ar_trisolve(Lu, LSig, y, jpat, ko, 'T' );y = lambda_T_multiply(A, y, r, n, miss);if nargout>1, xm = Sm*v2; end

end eps = C_multiply(C, y, n, miss);eps = reshape(eps, r, n);

end

% S_BUILD Calculate covariance of x % % SS = S_BUILD(S,A,G,n) finds the covariance matrix of the vector x = % [x1'...xn']' of all the values of a VARMA time series, % % x(t) = A1 x(t-1) + ... + Ap x(t-p) + y(t) % where % y(t) = eps(t) + B1 eps(t-1) + ... + Bq eps(t-q), % % and x(t), y(t) and eps(t) are r-dimensional with eps(t) N(0,Sig). S =% {S0,S1...Sp}, with Sj = Cov(x(t),x(t-j)) and G = {G0 G1...Gq} with Gj = % Cov(y(t),x(t-j)) should be previously calculated with find_CGW. A should % contain [A1...Ap]. SS is the matrix: % % S0 S1' S2'...Sn-1' % S1 S0 S1'...Sn-2' % S2 : % : : % Sn-1 ...... S1 S0 % % and the Sj are found with the recurrence relation: % % Sj = A1*S(j-1) + A2*S(j-2) + ........ + Ap*S(j-p) + Gj % % with Gj = 0 for j > q.

function SS = S_build(S, A, G, n)r = size(G{1},1);p = length(S) - 1;q = length(G) - 1;S = [S cell(1,n-p-1)];A = makecell(A);for j=p+1:n-1

if j


22/35

20

SS(i+1:end,i) = Sc(1:n-i);end SS = cell2mat(SS);

end

% S_EXTEND Extend S0...Sp from Yule-Walker equations to S0...S(n-1) % % Scol = S_EXTEND(A, G, S, n) when A = [A0...Ap], G = {G0,...,Gq} and S = % {S0,...,Sp} calculates S(p+1)...S(n-1) and returns them together with % S0,..., Sp in a block column vector Scol = [S0; S1;...; S(n-1)]. Thus Scol % will be n r r. % % [Scol, Scold] = S_EXTEND(A, G, S, n, Gd, Sd) returns also the n r r nPar % derivatives of Scol.

function [Scol,Scold] = S_extend(A, G, S, n, Gd, Sd)DIFF = nargout > 1;r = size(G{1}, 1);p = size(A,2)/r;q = length(G)-1;A = cell2mat(fliplr(makecell(A))); % change A from [A1..Ap] to [Ap..A1] if isempty(A), A = zeros(r,0); end Scol = cat(1, S{:}, zeros(r*(n-p-1),r));if DIFF

nPar = size(Sd{1},3);Scold = cat(1, Sd{:}, zeros(r*(n-p-1),r,nPar));

end pp = p+1;K = r+1:r*pp;I = r*pp+1:r*(pp+1);for j = p+1:n-1

if j p

S{p+1} = Y{p+1};


23/35

21

else S{p+1} = zeros(r,r,nrhs);

end for k=1:nrhs

for i=1:pS{p+1}(:,:,k) = S{p+1}(:,:,k) + A{i}*S{p-i+1}(:,:,k);

end end

end

% XMU_DERIV Derivatives of observed components of x minus mu % % XMUD = XMU_DERIV(NPAR, MISS) returns derivatives of the observed components % of x - repmat(mu,n,1) w.r.t. all NPAR parameters, A1...Ap, B1...Bq, Sig, mu % (with matrices taken column by column using only the lower triangle of Sig). % NPAR should be (p+q+1) r^2 + r (r+3)/2 and MISS(i,t) should be true if the % value X(i,t) is missing. Only the derivatives w.r.t. mu are nonzero.

function xmud = xmu_deriv(nPar, miss)[r, n] = size(miss);[ko, ro] = find_missing_info(miss);obs = ~miss;nobs = sum(obs(:));xmud = zeros(nobs, 1, nPar);for t=1:n

J = find(~miss(:,t));for i=1:ro(t)

xmud(ko(t)+i, 1, nPar-r+J(i)) = -1;end

end end

APPENDIX B. TEST PROGRAMS

% TEST_PRIMARY Script to run primary tests of Vauto package:

disp 'TESTING PRIMARY VAUTO FUNCTIONS VAR_LL, VARMA_LLC, VARMA_LLM ANDVARMA_SIM' test_varma_llctest_varma_llc_derivtest_var_lltest_var_ll_jactest_varma_llmtest_varma_jactest_varma_llm_derivtest_varma_sim quiet disp( 'ALL PRIMARY TESTS SUCCEEDED' )

% TEST_ALL Script to run all tests for Vauto package:

disp 'RUNNING PRIMARY AND SECONDARY TESTS FOR VAUTO PACKAGE' % TEST ROUTINES NEEDED FOR COMPLETE-DATA LOG-LIKELIHOOD: test_der2array_mds_settest_find_CGWtest_vyw quiet test_omega_buildingtest_omega_forwardtest_omega quiet test_varma_llctest_parmatprod quiet % Derivative routines: test_find_CGW_derivtest_vyw_deriv quiet test_trisolve_derivtest_chol_deriv quiet test_omega_deriv quiet test_varma_llc_deriv% % TEST MISSING VALUE ROUTINES: test_lambda_multiply quiet

test_find_Smtest_find_lambda_omtest_find_Vhattest_product_derivtest_omega_building miss test_atba_ctest_profiletest_profile_artest_C_multiplytest_omega_artest_var_lltest_var_ll_jactest_varma_llmtest_varma_jactest_varma_llm_deriv% % MISC TESTS test_varma_sim quiet % disp( 'ALL TESTS SUCCEEDED' )

% ALMOSTEQUAL Check if two arrays are almost equal

% % ALMOSTEQUAL(X, Y) returns true if X and Y are arrays of the same size and % with the same elements except for differences that may be attributed to % rounding errors. The maximum allowed relative difference is 5 10^(-14) and % the maximum allowed absolute difference is also 5 10^(-14). The relative % difference is measured relative to the element in X or Y with largest % absolute value.

function close = almosteqal(x,y)if iscell(x)

if ~iscell(y), close=false; return , end x = cell2mat(x);y = cell2mat(y);

end M = max([1; abs(x(:)); abs(y(:))]);close = isequal(size(x),size(y)) && all(abs(x(:)-y(:)) < 5e-14*M);

end

% DIFF_TEST Check analytic derivative against numerical derivative % % DMAX = DIFF_TEST(@FUN, X, PAR1, PAR2,...) checks the derivative of FUN at X. % FUN should be a function with header [F, G] = FUN(X, PAR1, PAR2,...) where

% X is an m-vector, F is an n-dimensional function value and G is an nm % matrix with the Jacobian of FUN at X. DMAX returns the maximum relative % difference between G(i,j) and a numerical derivative calculated with the % central difference formula. % % [DMAX,G,GNUM] = DIFF_TEST(...) returns also the G from FUN and the numerical % Jacobian. % % For efficiency it is possible to test a FUN that returns several functions

% and gradients in cell arrays F = {f1...fn} and G = {g1...gn} where fi and gi % are the function value and Jacobian of the i-th function. With this syntax % DMAX is a vector of the discrepancies.

function [dmax,g,gnum] = diff_test(fun, x, varargin)[f,g] = fun(x, varargin{:});if ~iscell(f), f={f}; g={g}; end Nf = length(f);n = length(x);eps = norm(x)*1e-6;xp = x;xm = x;for i=1:n

xp(i) = x(i)+eps;

xm(i) = x(i)-eps;fp = fun(xp, varargin{:});fm = fun(xm, varargin{:});if ~iscell(fp), fp = {fp}; fm = {fm}; end for j=1:Nf

gnum{j}(:,i) = (fp{j} - fm{j})/(2*eps);end xp(i) = x(i);xm(i) = x(i);

end for j=1:Nf

assert(isequal(size(g{j}), size(gnum{j})));assert(all(isfinite(g{j}(:))));assert(all(isfinite(gnum{j}(:))));gmax = max(max(abs(g{j})));if isempty(gnum{j}) && isempty(g{j})

dmax(j) = 0;elseif gmax


24/35

22

function [args,varargout] = getflags(args,varargin)assert(nargout==nargin || nargout == 3 && nargin ==2);n = length(args);IC = false(n,1);for i=1:n, IC(i) = ischar(args{i}); end ;Keep = true(n,1);Jc = find(IC);for k=1:length(varargin)

Iq = strmatch(varargin{k},args(IC), 'exact' );Keep(Jc(Iq)) = false;varargout{k} = ~isempty(Iq);if nargout > nargin

if ~isempty(Iq) && length(args) > Jc(Iq),varargout{2} = args{Jc(Iq)+1};Keep(Jc(Iq)+1) = false;

else varargout{2} = [];

end

end end args = args(Keep);

end function [ll,lld] = loglik(theta, X, p, q, r, miss)

[A, B, Sig, mu] = theta2mat(theta, p, q, r);if nargin=0 i% of the values (chosen at random) missing (i


25/35


26/35


27/35

25

nPar = r^2*(p+q) + r*(r+1)/2;% % Call the function once (useful for debug): miss = false(r, n); miss([1 3]) = true;[f,g] = sommfun(mat2theta(A, B, Sig), p, q, r, n, miss);% % First try with nothing missing: Sm = find_Sm(Scol, false(r,n));assert(isempty(Sm));% % Check sizes of derivatives: [CCd, GGd, WWd] = find_CGW_deriv(A, B, Sig);RHS = vyw_deriv_rhs(A, GGd, S);Sd = vyw_solve(A, LUvyw, RHS);[G, Gd] = der2array(GGd);[Scol,Scold] = S_extend(A, G, S, n, Gd, Sd);miss = false(r,n); miss([1,3,4]) = true;[Sm, Smd] = find_Sm(Scol, miss, Scold);

assert(isequal(size(Smd), [3, 3, nPar]));% % Now two cases with ca. 25% missing: for k = 1:2

miss = false(r,n);miss(unique(floor(r*n*rand(round(r*n/4),1)) + 1)) = true;Sm = find_Sm(Scol, miss);assert(almostequal(Sm, SS(miss,miss)));

end % % Finally compare derivatives with numerical ones: [d,g,gnum] = diff_test(@sommfun, mat2theta(A, B, Sig), p, q, r, n, miss);assert(d


28/35

26

fprintf(out, 'Derivative test (1): %.1e\n' , d);assert(d


29/35

27

disp( ' OK' );end

function [f,g] = fun(theta,p,r,miss)nPar = r^2*p + r*(r+1)/2;[A, B, Sig] = theta2mat(theta, p, 0, r);PLU = vyw_factorize(A);S = vyw_solve(A, PLU, {Sig});[Lu, LSig, jpat] = omega_ar_factor(S, Sig, miss);f = vech(Lu);for i=1:length(LSig), f = [f; vech(LSig{i})]; end if nargout > 1

SigS = mds_set_parmat(p+1, Sig, p+1);RHS = vyw_deriv_rhs(A, SigS, S);Sd = vyw_solve(A, PLU, RHS);Sigd = der2array(SigS);[Lu,LSig,jpat,Lud,LSigd] = omega_ar_factor(S,Sig,miss,Sd,Sigd{1});g = vech(Lud);

for i=1:length(LSig), g = [g; vech(LSigd{i})]; end end end

% TEST_OMEGA_BUILDING Test the functions omega_build and omega_remove_miss % % TEST_OMEGA_BUILDING prints out OK (three times) if omega_build works ok for % three test cases, all with r=2, p=3, n=7 but q is set to 2, 3 and 4 % respectively. It may also be useful for debugging. % % TEST_OMEGA_BUILDING QUIET runs more quietly % % TEST_OMEGA_BUILDING MISS tests also omega_remove_miss

function test_omega_building(varargin)[varargin,quiet] = getflags(varargin, 'quiet' );[varargin,miss] = getflags(varargin, 'miss' );out = double(~quiet && ~miss);S0=[3 4; 8 8];S1=[3 4; 1 1];S2=[3 4; 2 2];S3=[3 4; 3 3];

G0=[4 5; 8 8];G1=[4 5; 1 1];G2=[4 5; 2 2];G3=[4 5; 3 3];G4=[4 5; 4 4];W0=[5 6; 8 8];W1=[5 6; 1 1];W2=[5 6; 2 2];W3=[5 6; 3 3];W4=[5 6; 4 4];S={S0 S1 S2 S3};G={G0 G1 G2};W={W0 W1 W2};r=2; p=3; q=2; n = 7;if ~miss, fprintf( 'TESTING OMEGA_BUILD...' );else fprintf( 'TESTING OMEGA_BUILD AND OMEGA_REMOVE_MISS...' ); end fprintf(out, '\n test q0, x(1:N1, k, :) = reshape(Olow(K, 1:q*r, :), N1,1,[]); end x(N1+1:end, k, :) = vech(Olow(K, q*r+1:end, :));

end

x = reshape(x, N2*m, M);end

% TEST_OMEGA_FORWARD Check function omega_forward % % TEST_OMEGA_FORWARD calls omega_forward with a few testcases. Further testing % of omega_forward comes with test_omega.


30/35

28

function test_omega_forwardfprintf( 'PRELIMINARY TEST OF OMEGA_FORWARD AND OMEGA_FORWARD_DERIV...' )% TEST SIMPLE CASES WITH r=1 AND (p,q) = (1,0), (1,1), (0,1). % % TEST CASE 1: L = [5 0 0; 0 2 0; 0 0 2];Lu = 5;Ll = [2;2];p = 1;q = 0;nPar = 2;nY = 2;ko = [0 1 2 3];y = rand(3,nY);x = omega_forward(Lu, Ll, y, p, q, ko);assert(almostequal(x, L\y))Lud = ones(1,1,nPar);

Lld = ones(2,1,nPar);Lld(:,:,1)=Lld(:,:,1)/2;yd = ones(3,nY,nPar);xd = omega_forward_deriv(Lu, Ll, Lud, Lld, x, yd, p, q, ko);for i=1:nPar

Ldi = diag([Lud(1,1,i);Lld(:,1,i)]);assert(almostequal(L*xd(:,:,i), (yd(:,:,i) - Ldi*x)));

end if exist( 'omega_forward_deriv' )==3 % It is a dll-file, so try transpose also

ydt = permute(yd,[2,1,3]);tmin=1; tmax=3;xdt = omega_forward_deriv(Lu, Ll, Lud, Lld, x', ydt, p, q,ko,tmin,tmax, 'T' );diff = xd - permute(xdt,[2,1,3]);assert(norm(diff(:)) < 1e-14);

end % TEST CASE 2: q = 1;Ll = [1 2;1 2];L = [5 0 0; 1 2 0; 0 1 2];x = omega_forward(Lu, Ll, y, p, q, ko);assert(almostequal(x, L\y))% p = 0;

x = omega_forward(Lu, Ll, y, p, q, ko);assert(almostequal(L*x, y))% % TEST KMIN-KMAX FEATURE: y = [1;1];x = omega_forward(Lu, Ll, y, p, q, ko, 2);assert(almostequal([0;x], L\[0;y]));x = omega_forward(Lu, Ll, y, p, q, ko, 1, 2);assert(almostequal(x, L(1:2,1:2)\y))y = 3;x = omega_forward(Lu, Ll, y, p, q, ko, 1, 1);assert(almostequal(x, L(1,1)\y));x = omega_forward(Lu, Ll, y, p, q, ko, 2, 2);assert(almostequal(x, L(2,2)\y))x = omega_forward(Lu, Ll, y, p, q, ko, 3, 3);assert(almostequal(x, L(3,3)\y))% % NOW LET r=2 r = 2;O = zeros(r,r);Lu = chol(hilb(r))';L1 = ones(r,r);L0 = chol(gallery( 'minij' ,r))';

Ll = [L1 L0; L1 L0];ko = 0:r:3*r;L = [Lu O O; L1 L0 O; O L1 L0];nY = 1;Y = ones(3*r,nY);X = omega_forward(Lu, Ll, Y, p, q, ko); % p=1, q=1 assert(almostequal(X, L\Y))

% Check derivatives with r=2: nPar = 1;Lud = tril(ones(size(Lu)));L0d = tril(ones(r,r));L1d = ones(r,r);Lld = [L1d L0d; L1d L0d];Ld = [Lud O O; L1d L0d O; O L1d L0d];if nPar==2

Lud = cat(3, Lud, Lud/2);Lld = cat(3, Lld, Lld/2);Ld = cat(3,Ld,Ld/2);

end Yd = ones(3*r,nY,nPar);Xd = omega_forward_deriv(Lu, Ll, Lud, Lld, X, Yd, p, q, ko);for i=1:nPar

assert(almostequal(L*Xd(:,:,i), (Yd(:,:,i) - Ld(:,:,i)*X)));end

disp( 'OK' );end

% TEST_PARMATPROD Test parameter matrix product derivative functions % % TEST_PARMATPROD tests mds_set_zero, mds_set_parmat and mds_add_prod. Use % TEST_PARMATPROD QUIET to reduce output. %% For mds_add_prod(FS,X,GS,kX) and mds_add_prod(FS,X,GS,'T',kX) the tests are % designed to cover all possible combinations of spcod and transpose on G both % for derivative w.r.t. X and w.r.t. another parameter. To ascertain this,% uncomment the block referring to cover both in mds_add_prod and here.

function diffout = test_parmatprod(verbosity)% global cover % cover = [ % ' zercxf' % 'A ' % 'X ' % 'AT '

% 'XT ']; if nargin


31/35

29

function FS = setprod(nparmat,X,Y,varargin)r = size(X,1);if ~isstruct(Y), n=size(Y,2); else , n=size(Y.mat,2); end FS = mds_set_zero(nparmat,r,n);FS = mds_add_prod(FS,X,Y,varargin{:});

end

% TEST_PRODUCT_DERIV Test matrix product derivative functions % % TEST_PRODUCT_DERIV checks that aat_deriv, ata_deriv and atb_deriv, which% differentiate matrix products, work correctly.

function test_product_derivfprintf( 'TESTING ATB_DERIV, ATA_DERIV AND AAT_DERIV...' );A1 = rand(3,4);A2 = rand(3,4);

B1 = rand(4,5);B2 = rand(4,5);x = [0.74; 0.33];d = diff_test(@fun, x, A1, A2, B1, B2, 2);assert(all(d


32/35

30

m1 = r*r*p; % A m2 = r*r*q; % B m3 = r*(r+1)/2; % Sig n1 = ceil(m1/5); % thA n2 = ceil(m2/5); % thB n3 = 2; % thSig S1 = gallery( 'lehmer' ,r);S2 = gallery( 'minij' ,r);th = (rand(n1 + n2 + n3, 1))/(p+q)/r/r;Ja = rand(r^2*p, n1);Jb = rand(r^2*q, n2);

Documents

Algorithm Report