Clustering of financial time series

This paper addresses the classification of financial time series in a fuzzy framework and proposes two fuzzy clustering models, both based on GARCH models.

• Two distance measures
• Two cluster models based on GARCH models

GARCH models and their autoregressive (AR) representation

$y_t = \mu_t + \varepsilon_t$: a time series composed of a stochastic process $\mu_t$ and a zero-mean heteroskedastic univariate process $\varepsilon_t = u_t \sqrt{h_t}$

$u_t$: a univariate white noise process with zero mean and unit variance

$h_t$: a univariate stochastic process independent of $u_t$

The GARCH(p, q) process is defined as:

$$h_t = \gamma + \sum_{i=1}^{p} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{q} \beta_j h_{t-j}$$
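To make the definition concrete, here is a minimal simulation sketch for the GARCH(1, 1) case. It assumes the usual recursion $h_t = \gamma + \alpha \varepsilon_{t-1}^2 + \beta h_{t-1}$ with $\varepsilon_t = u_t \sqrt{h_t}$ and Gaussian $u_t$. The function name `simulate_garch11`, the name `gamma` for the intercept, and the starting value are illustrative assumptions, not part of the paper.

```python
import math
import random


def simulate_garch11(gamma, alpha, beta, n, seed=0):
    """Simulate n disturbances from a GARCH(1, 1) process (illustrative sketch).

    Assumes alpha + beta < 1 so that the unconditional variance exists.
    """
    rng = random.Random(seed)
    # Assumption: start the recursion at the unconditional variance.
    h = gamma / (1.0 - alpha - beta)
    disturbances, variances = [], []
    for _ in range(n):
        e = rng.gauss(0.0, 1.0) * math.sqrt(h)  # eps_t = u_t * sqrt(h_t)
        disturbances.append(e)
        variances.append(h)
        h = gamma + alpha * e * e + beta * h    # next conditional variance
    return disturbances, variances
```

Calling `simulate_garch11(0.05, 0.1, 0.8, 500)` returns 500 simulated disturbances together with their conditional variances, which can be used to try out the distance measures on synthetic data.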

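After simple algebra, the squared disturbances $\varepsilon_t^2$ can be represented as:

$$\varepsilon_t^2 = \gamma + \sum_{i=1}^{p^*} (\alpha_i + \beta_i)\,\varepsilon_{t-i}^2 - \sum_{j=1}^{q} \beta_j \eta_{t-j} + \eta_t$$

where $\gamma$ is the intercept of the conditional variance equation, $p^* = \max(p, q)$, $\alpha_i = 0$ for $i > p$, $\beta_i = 0$ for $i > q$, and $\eta_t = \varepsilon_t^2 - h_t$ is a zero-mean error. Hence $\varepsilon_t^2$ can be described as an ARMA($p^*$, $q$) process.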

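Given the usual stationarity and invertibility restrictions on the roots of $1 - (\alpha_1 + \beta_1)z - \cdots - (\alpha_i + \beta_i)z^i - \cdots - (\alpha_{p^*} + \beta_{p^*})z^{p^*}$ and $1 - \beta_1 z - \cdots - \beta_j z^j - \cdots - \beta_q z^q$, $\varepsilon_t^2$ can be expressed as an infinite autoregressive AR(∞) model with coefficients $\pi_k$:

$$\varepsilon_t^2 = \frac{\gamma}{1 - \sum_{j=1}^{q} \beta_j} + \sum_{k=1}^{\infty} \pi_k\,\varepsilon_{t-k}^2 + \eta_t$$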
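The AR coefficients can be computed recursively. Dividing the first polynomial by the second and matching powers of $z$ gives $\pi_k = \alpha_k + \sum_{j=1}^{\min(k-1, q)} \beta_j \pi_{k-j}$, with $\alpha_k = 0$ for $k > p$. The sketch below implements this recursion; the function name `garch_ar_coefficients` and its argument layout are illustrative assumptions.

```python
def garch_ar_coefficients(alpha, beta, n_terms):
    """First n_terms AR coefficients pi_1..pi_n of the squared disturbances.

    alpha: ARCH coefficients [alpha_1, ..., alpha_p]
    beta:  GARCH coefficients [beta_1, ..., beta_q]
    """
    pi = []
    for k in range(1, n_terms + 1):
        # alpha_k is zero beyond the ARCH order p.
        value = alpha[k - 1] if k <= len(alpha) else 0.0
        # Add beta_j * pi_{k-j} for every lag j with k - j >= 1.
        for j in range(1, min(k - 1, len(beta)) + 1):
            value += beta[j - 1] * pi[k - j - 1]
        pi.append(value)
    return pi
```

For a GARCH(1, 1) process the recursion reduces to $\pi_k = \alpha \beta^{k-1}$, so the coefficients decay geometrically and a finite truncation is a reasonable approximation.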

AR-based distance measure for comparing time series

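For each pair of time series, $z_{tk}$ and $z_{tk'}$, let

$$\hat{\pi}_k = (\hat{\pi}_{1k}, \ldots, \hat{\pi}_{R_k k})' \quad \text{and} \quad \hat{\pi}_{k'} = (\hat{\pi}_{1k'}, \ldots, \hat{\pi}_{R_{k'} k'})'$$

be the vectors of the estimated parameters of their finite AR representations, i.e. AR($R_k$) and AR($R_{k'}$) respectively.

The squared Euclidean distance measure is:

$$d^2_{AR}(z_{tk}, z_{tk'}) = \sum_{r=1}^{R} (\hat{\pi}_{rk} - \hat{\pi}_{rk'})^2, \qquad R = \max(R_k, R_{k'})$$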

In this way, the AR coefficients replace the time series in the comparative assessment.

When the AR coefficient vectors representing the two time series have unequal lengths, we can adopt zero-padding: zeros are appended to the shorter vector so that it has the same length as the longer one.
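As a small sketch of the AR-based distance with zero-padding, the function below pads the shorter coefficient vector with zeros and sums the squared differences. The name `ar_distance` is an illustrative assumption.

```python
def ar_distance(pi_a, pi_b):
    """Squared Euclidean distance between two AR coefficient vectors.

    The shorter vector is zero-padded to the length of the longer one.
    """
    n = max(len(pi_a), len(pi_b))
    a = list(pi_a) + [0.0] * (n - len(pi_a))
    b = list(pi_b) + [0.0] * (n - len(pi_b))
    return sum((x - y) ** 2 for x, y in zip(a, b))
```

For example, comparing an AR(2) vector `[0.5, 0.2]` with an AR(1) vector `[0.4]` treats the missing second coefficient as zero.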

GARCH-based distance measure for comparing time series

• This distance metric, which takes into account the volatility, is based on both the estimated GARCH parameters and their estimated covariances.

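• Define the vectors of the estimated parameters:

$$L_x = (\hat{\alpha}_{1x}, \ldots, \hat{\alpha}_{px}, \hat{\beta}_{1x}, \ldots, \hat{\beta}_{qx})' \quad \text{and} \quad L_y = (\hat{\alpha}_{1y}, \ldots, \hat{\alpha}_{py}, \hat{\beta}_{1y}, \ldots, \hat{\beta}_{qy})'$$

and the matrices $V_x$ and $V_y$ of estimated covariances of the GARCH(p, q) representations of each pair of time series.

The distance measure is:

$$d_{GARCH}(x_t, y_t) = (L_x - L_y)'\,(V_x + V_y)^{-1}\,(L_x - L_y)$$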

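Consider a pair of GARCH(1, 1) processes $X_t$ and $Y_t$, with $L_x = (\hat{\alpha}_x, \hat{\beta}_x)'$ and $L_y = (\hat{\alpha}_y, \hat{\beta}_y)'$, and their estimated covariance matrices:

$$V_x = \begin{pmatrix} \widehat{\mathrm{var}}(\hat{\alpha}_x) & \widehat{\mathrm{cov}}(\hat{\alpha}_x, \hat{\beta}_x) \\ \widehat{\mathrm{cov}}(\hat{\alpha}_x, \hat{\beta}_x) & \widehat{\mathrm{var}}(\hat{\beta}_x) \end{pmatrix}, \qquad V_y = \begin{pmatrix} \widehat{\mathrm{var}}(\hat{\alpha}_y) & \widehat{\mathrm{cov}}(\hat{\alpha}_y, \hat{\beta}_y) \\ \widehat{\mathrm{cov}}(\hat{\alpha}_y, \hat{\beta}_y) & \widehat{\mathrm{var}}(\hat{\beta}_y) \end{pmatrix}$$

Substituting into the formula $d_{GARCH} = (L_x - L_y)'(V_x + V_y)^{-1}(L_x - L_y)$, and writing $\delta_\alpha = \hat{\alpha}_x - \hat{\alpha}_y$, $\delta_\beta = \hat{\beta}_x - \hat{\beta}_y$ and $\omega_{\alpha\alpha}$, $\omega_{\alpha\beta}$, $\omega_{\beta\beta}$ for the entries of $V_x + V_y$, we obtain:

$$d_{GARCH}(X_t, Y_t) = \frac{\omega_{\beta\beta}\,\delta_\alpha^2 - 2\,\omega_{\alpha\beta}\,\delta_\alpha \delta_\beta + \omega_{\alpha\alpha}\,\delta_\beta^2}{\omega_{\alpha\alpha}\,\omega_{\beta\beta} - \omega_{\alpha\beta}^2}$$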
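The covariance-weighted distance can be sketched in a few lines. The code below assumes the quadratic form $(L_x - L_y)'(V_x + V_y)^{-1}(L_x - L_y)$ and solves the linear system by Gaussian elimination rather than inverting the matrix. The names `solve_linear` and `caiado_distance` are illustrative assumptions, and the inputs are assumed to be already estimated by some GARCH fitting routine.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("singular weighting matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def caiado_distance(l_x, v_x, l_y, v_y):
    """Quadratic-form distance between two sets of GARCH estimates.

    l_x, l_y: estimated parameter vectors; v_x, v_y: their covariance matrices.
    """
    diff = [a - b for a, b in zip(l_x, l_y)]
    omega = [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(v_x, v_y)]
    weighted = solve_linear(omega, diff)  # (V_x + V_y)^{-1} (L_x - L_y)
    return sum(d * w for d, w in zip(diff, weighted))
```

Because the difference is weighted by the inverse of the summed covariances, a gap in a poorly estimated parameter counts for less than the same gap in a precisely estimated one.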

GARCH-based fuzzy C-medoids clustering model (GARCH-FCMdC model)

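Let $Z = \{z_{t1}, z_{t2}, \ldots, z_{ti}, \ldots, z_{tI}\}$ (t = 1, …, T) denote a set of I univariate financial time series. Considering the GARCH(p, q) representations of the series, the squared disturbances $\varepsilon_{ti}^2$ can be represented as a finite AR process with

$$\hat{\pi}_i = (\hat{\pi}_{1i}, \ldots, \hat{\pi}_{ri}, \ldots, \hat{\pi}_{Ri})'$$

the vector of the corresponding autoregressive coefficients.

Also, consider a subset of Z with cardinality C:

$$\tilde{Z} = \{\tilde{z}_{t1}, \tilde{z}_{t2}, \ldots, \tilde{z}_{tc}, \ldots, \tilde{z}_{tC}\}$$

and the corresponding autoregressive coefficients $\tilde{\pi}_c = (\tilde{\pi}_{1c}, \ldots, \tilde{\pi}_{rc}, \ldots, \tilde{\pi}_{Rc})'$ of the squared disturbances. Based on the AR-based distance we define the GARCH-based Fuzzy C-Medoids Clustering (GARCH-FCMdC) model for clustering financial time series:

$$\min \sum_{i=1}^{I} \sum_{c=1}^{C} u_{ic}^m\, d_{ic}^2 = \sum_{i=1}^{I} \sum_{c=1}^{C} u_{ic}^m \sum_{r=1}^{R} (\hat{\pi}_{ri} - \tilde{\pi}_{rc})^2 \qquad \text{s.t.} \quad \sum_{c=1}^{C} u_{ic} = 1, \quad u_{ic} \ge 0$$

where $u_{ic}$ represents the fuzzy membership of the i-th AR process in the c-th cluster;

$$d_{ic}^2 = \sum_{r=1}^{R} (\hat{\pi}_{ri} - \tilde{\pi}_{rc})^2$$

is the squared Euclidean distance between the i-th AR process and the c-th medoid AR process;

m > 1 is a weighting exponent that controls the fuzziness of the partition: as m increases, the membership degrees become fuzzier.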

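By means of the Lagrangian multiplier method, we get the local optimal solution for the memberships, expressed through the squared distances $d_{ic}^2$ to the medoids:

$$u_{ic} = \frac{1}{\sum_{c'=1}^{C} \left[ \dfrac{d_{ic}^2}{d_{ic'}^2} \right]^{\frac{1}{m-1}}}$$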
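The resulting procedure can be sketched as an alternating scheme: update the memberships for fixed medoids, then pick, for each cluster, the series that minimizes the membership-weighted sum of distances. This is a generic fuzzy C-medoids iteration written against a precomputed matrix of squared distances, not necessarily the authors' exact implementation. The names `fuzzy_memberships` and `fuzzy_c_medoids`, the stopping rule, and the handling of zero distances are assumptions.

```python
def fuzzy_memberships(sq_dist_to_medoids, m):
    """Membership degrees from squared distances to the C medoids."""
    memberships = []
    for row in sq_dist_to_medoids:
        zeros = [c for c, d in enumerate(row) if d == 0.0]
        if zeros:
            # A series that coincides with a medoid belongs fully to it.
            memberships.append([1.0 / len(zeros) if c in zeros else 0.0
                                for c in range(len(row))])
            continue
        inv = [d ** (-1.0 / (m - 1.0)) for d in row]
        total = sum(inv)
        memberships.append([v / total for v in inv])
    return memberships


def fuzzy_c_medoids(sq_dist, initial_medoids, m=1.5, max_iter=100):
    """Alternate membership and medoid updates until the medoids stop changing.

    sq_dist: full I x I matrix of squared distances between the series.
    initial_medoids: list of C starting indices into the series.
    """
    n = len(sq_dist)
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        u = fuzzy_memberships(
            [[sq_dist[i][c] for c in medoids] for i in range(n)], m)
        updated = []
        for c in range(len(medoids)):
            def cost(j, c=c):
                # Weighted sum of distances if series j were the medoid.
                return sum((u[i][c] ** m) * sq_dist[i][j] for i in range(n))
            updated.append(min(range(n), key=cost))
        if updated == medoids:
            break
        medoids = updated
    # Recompute memberships so they match the returned medoids.
    u = fuzzy_memberships(
        [[sq_dist[i][c] for c in medoids] for i in range(n)], m)
    return medoids, u
```

Since the iteration only reaches a local optimum, running it from several initial medoid sets and keeping the solution with the lowest objective value is a common precaution.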

Two further issues concern the choice of both the optimal number of clusters C and the fuzziness parameter m. In particular, in our application we use the Fuzzy Silhouette index to select C, and we set the value of m in the interval (1, 1.5].
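For reference, a common formulation of the Fuzzy Silhouette index weights the crisp silhouette $s_i$ of each series by the gap between its two largest memberships, raised to a power $\alpha$. The sketch below follows that formulation; whether it matches every detail of the index used in the application is an assumption, as are the function name `fuzzy_silhouette` and the default `alpha=1.0`.

```python
def fuzzy_silhouette(dist, u, alpha=1.0):
    """Fuzzy Silhouette of a fuzzy partition (requires at least 2 clusters).

    dist: I x I matrix of pairwise distances; u: I x C membership matrix.
    """
    n = len(dist)
    n_clusters = len(u[0])
    # Crisp assignment: each series goes to its highest-membership cluster.
    labels = [max(range(n_clusters), key=lambda c: u[i][c]) for i in range(n)]
    num = den = 0.0
    for i in range(n):
        own = [j for j in range(n) if labels[j] == labels[i] and j != i]
        others = []
        for c in range(n_clusters):
            members = [j for j in range(n) if labels[j] == c]
            if c != labels[i] and members:
                others.append(sum(dist[i][j] for j in members) / len(members))
        if not own or not others:
            s_i = 0.0  # convention for singletons or a single non-empty cluster
        else:
            a_i = sum(dist[i][j] for j in own) / len(own)
            b_i = min(others)
            s_i = (b_i - a_i) / max(a_i, b_i) if max(a_i, b_i) > 0 else 0.0
        top = sorted(u[i], reverse=True)
        weight = (top[0] - top[1]) ** alpha
        num += weight * s_i
        den += weight
    return num / den if den > 0 else 0.0
```

Evaluating the index for each candidate C and keeping the value of C with the highest score is the usual way to apply it.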

A Prototypical Case of AR-based Distance and Clustering Model

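Let $X_t \sim$ GARCH(1, 1) and $Y_t \sim$ GARCH(1, 1) be two stochastic processes. The AR coefficients corresponding to the processes are:

$$\pi_{kx} = \alpha_x \beta_x^{k-1}, \qquad \pi_{ky} = \alpha_y \beta_y^{k-1}, \qquad k = 1, 2, \ldots$$

The AR distance becomes:

$$d^2_{AR}(X_t, Y_t) = \sum_{k=1}^{\infty} \left( \alpha_x \beta_x^{k-1} - \alpha_y \beta_y^{k-1} \right)^2 = \alpha_x^2 \sum_{k=1}^{\infty} \beta_x^{2(k-1)} + \alpha_y^2 \sum_{k=1}^{\infty} \beta_y^{2(k-1)} - 2 \alpha_x \alpha_y \sum_{k=1}^{\infty} (\beta_x \beta_y)^{k-1}$$

which involves three convergent geometric series, since $0 \le \beta_x, \beta_y < 1$.

Therefore, we obtain:

$$d^2_{AR}(X_t, Y_t) = \frac{\alpha_x^2}{1 - \beta_x^2} + \frac{\alpha_y^2}{1 - \beta_y^2} - \frac{2 \alpha_x \alpha_y}{1 - \beta_x \beta_y}$$

The GARCH-FCMdC model can be rewritten as:

$$\min \sum_{i=1}^{I} \sum_{c=1}^{C} u_{ic}^m \left[ \frac{\alpha_i^2}{1 - \beta_i^2} + \frac{\tilde{\alpha}_c^2}{1 - \tilde{\beta}_c^2} - \frac{2 \alpha_i \tilde{\alpha}_c}{1 - \beta_i \tilde{\beta}_c} \right] \qquad \text{s.t.} \quad \sum_{c=1}^{C} u_{ic} = 1, \quad u_{ic} \ge 0$$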
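As a quick numerical check of the GARCH(1, 1) case, the sketch below compares a closed-form distance, obtained by summing the three geometric series under the assumption that the AR coefficients are $\alpha \beta^{k-1}$, with a long truncated sum of squared coefficient differences. Both function names are illustrative assumptions.

```python
def ar_distance_garch11(alpha_x, beta_x, alpha_y, beta_y):
    """Closed-form AR distance between two GARCH(1, 1) processes."""
    return (alpha_x ** 2 / (1.0 - beta_x ** 2)
            + alpha_y ** 2 / (1.0 - beta_y ** 2)
            - 2.0 * alpha_x * alpha_y / (1.0 - beta_x * beta_y))


def ar_distance_truncated(alpha_x, beta_x, alpha_y, beta_y, n_terms=2000):
    """Same distance, approximated by truncating the infinite sum."""
    return sum(
        (alpha_x * beta_x ** (k - 1) - alpha_y * beta_y ** (k - 1)) ** 2
        for k in range(1, n_terms + 1)
    )
```

With the closed form, the distance between two GARCH(1, 1) series needs only the four estimated coefficients, so no truncation order has to be chosen.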

GARCH-based fuzzy C-medoids clustering model with Caiado distance (GARCH-FCMdCC model)

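Let $Z = \{z_{t1}, z_{t2}, \ldots, z_{ti}, \ldots, z_{tI}\}$ (t = 1, …, T) be a set of I univariate financial time series, $L = \{L_1, L_2, \ldots, L_i, \ldots, L_I\}$ be the corresponding vectors of estimated parameters of their GARCH(p, q) representations, and $V = \{V_1, V_2, \ldots, V_i, \ldots, V_I\}$ be the set of the estimated covariance matrices, with $V_i$ the estimated covariance matrix of $L_i$.

Consider a subset of Z:

$$\tilde{Z} = \{\tilde{z}_{t1}, \tilde{z}_{t2}, \ldots, \tilde{z}_{tc}, \ldots, \tilde{z}_{tC}\}$$

with estimated parameters $\tilde{L} = \{\tilde{L}_1, \ldots, \tilde{L}_c, \ldots, \tilde{L}_C\}$ and covariance matrices $\tilde{V} = \{\tilde{V}_1, \ldots, \tilde{V}_c, \ldots, \tilde{V}_C\}$ of the GARCH(p, q) representations.

Then the GARCH-based fuzzy C-medoids clustering model with Caiado distance (GARCH-FCMdCC) can be written as:

$$\min \sum_{i=1}^{I} \sum_{c=1}^{C} u_{ic}^m\, d_{ic} \qquad \text{s.t.} \quad \sum_{c=1}^{C} u_{ic} = 1, \quad u_{ic} \ge 0$$

where

$$d_{ic} = (L_i - \tilde{L}_c)'\,(V_i + \tilde{V}_c)^{-1}\,(L_i - \tilde{L}_c)$$

is the Caiado distance between the i-th financial time series and the c-th medoid financial time series.

The local optimal solution is:

$$u_{ic} = \frac{1}{\sum_{c'=1}^{C} \left[ \dfrac{d_{ic}}{d_{ic'}} \right]^{\frac{1}{m-1}}}$$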

Application to daily returns of Euro exchange rates

• We present and discuss the results of an empirical application of the proposed GARCH-based fuzzy C-medoids clustering models to the volatility of daily Euro exchange rates against 29 international currencies.

• The aim of the analysis is to identify the exchange rates that show similar fluctuations in the volatility of daily returns, and thus to classify Euro exchange rates against major international currencies according to their stability.