
A Comparison of Neural Network, ARIMA and GARCH Forecasting of Exchange Rate Data

    Namrata Devi Jory

    UNIVERSITY OF MAURITIUS

Dissertation submitted to the Department of Mathematics, Faculty of Science, University of Mauritius, in partial fulfilment of the requirements for the degree of BSc (Hons) Mathematics with Computer Science.

    April 2012


Contents

List of Figures
List of Tables
Acknowledgements
Abstract

1 Introduction
  1.1 Forecasting Models
  1.2 Aims and Organization of the Project

2 Artificial Neural Networks
  2.1 Introduction
  2.2 Feedforward Network
  2.3 Backpropagation Learning Algorithm
  2.4 Training set and Testing set

3 Linear and Non-linear Time Series
  3.1 Stationary Processes
    3.1.1 Autoregressive Process
    3.1.2 Moving Average Process
    3.1.3 Random Walk
  3.2 ARIMA(p, d, q) Processes
    3.2.1 Autocorrelation and Partial Autocorrelation Functions
  3.3 GARCH Model
    3.3.1 ARCH Model
    3.3.2 GARCH Model

4 Analysis of Exchange Rate Data
  4.1 Analysis of Period I
    4.1.1 Analysis of Period II data
  4.2 Forecasting Accuracy
  4.3 Fitting Neural Network Model to the Exchange Rates Data
  4.4 Fitting ARIMA Model to the Foreign Exchange Rates Data
  4.5 Fitting the GARCH Model
  4.6 Empirical Findings
    4.6.1 Forecasts performance of Period I
    4.6.2 Results and Discussions for Period I
    4.6.3 Forecasts performance of Period II
    4.6.4 Results and Discussions for Period II

5 Conclusion

Bibliography


List of Figures

2.1 Three-Layer feedforward Network
2.2 The neuron weight adjustment
4.1 Daily MUR/USD data and its corresponding first differences of the logs
4.2 Daily MUR/EU data and its corresponding first differences of the logs
4.3 Daily MUR/GBP data and its corresponding first differences of the logs
4.4 Histogram and Kernel Density estimate of daily log return of MUR/USD
4.5 Histogram and Kernel Density estimate of daily log return of MUR/GBP
4.6 Histogram and Kernel Density estimate of daily log return of MUR/EU
4.7 Scatter plot of y_t against y_{t-1} for the MUR/USD log return data
4.8 Scatter plot of y_t against y_{t-1} for the MUR/EU log return data
4.9 Scatter plot of y_t against y_{t-1} for the MUR/GBP log return data
4.10 Daily MUR/USD Jan 03 - Dec 11
4.11 Daily MUR/EU Jan 03 - Dec 11
4.12 Daily MUR/GBP Jan 02 - Dec 11
4.13 In-sample and out-of-sample forecast for MUR/USD
4.14 In-sample and out-of-sample forecast for MUR/EU
4.15 In-sample and out-of-sample forecast for MUR/GBP


List of Tables

3.1 Behaviour of ACF and PACF
4.1 Summary statistics for the daily exchange rates: log first difference
4.2 Summary statistics of log first difference daily exchange rate
4.3 In-sample performance of MUR/USD for data Jan 2003-Dec 2008
4.4 In-sample performance of MUR/EU for data Jan 2003-Dec 2008
4.5 In-sample performance of MUR/GBP for data Jan 2002-Dec 2008
4.6 Out-of-sample performance of MUR/USD for data Jan 2003-Dec 2008
4.7 Out-of-sample performance of MUR/EU for data Jan 2003-Dec 2008
4.8 Out-of-sample performance of MUR/GBP for data Jan 2002-Dec 2008
4.9 First 10 forecast values of MUR/GBP for ANN, ARIMA and GARCH models
4.10 In-sample performance MUR/USD data
4.11 In-sample performance MUR/EU data
4.12 In-sample performance MUR/GBP data
4.13 Out-of-sample performance MUR/USD data
4.14 Out-of-sample performance of MUR/EU data
4.15 Out-of-sample performance of MUR/GBP data
4.16 In-sample and Out-of-sample forecasts of MUR/USD
4.17 In-sample and Out-of-sample forecasts of MUR/EU
4.18 In-sample and Out-of-sample forecasts of MUR/GBP


    Acknowledgements

I am deeply indebted to my supervisor, Professor Muddun Bhuruth, for his exceptional guidance and encouragement, which helped me carry out this research and write this dissertation. It was a wonderful experience to work with such a mentor, and I am very grateful for all the constructive criticism that improved my work each time. I am also thankful to my co-supervisor, Associate Professor Ravindra Boojhawon, who helped and motivated me throughout the project.

I am very grateful to my parents for their love and for always being by my side in good and bad times, and special thanks to my dearest sister for her understanding and help. I would like to thank my friends Khush, Prish and Bho for always being there for me.

I also extend my gratitude to all those who helped me directly or indirectly to make this work a success.

Finally, I thank GOD for his blessings from the bottom of my heart.


    Abstract

We consider linear and nonlinear models for forecasting exchange rates of the Mauritian Rupee against the US Dollar, the Euro and the British Pound. The linear models considered are the ARIMA processes; the nonlinear models are the Artificial Neural Networks and the GARCH model. Since no guidelines were available for choosing the parameters of the neural network, they were chosen through extensive experimentation. Two periods of analysis were carried out, the first from January 2002 to December 2008 and the second from January 2002 to December 2011, and in-sample and out-of-sample forecasts were produced. The reason for this choice is that we wanted to test the ability of our forecasting procedures during the financial crisis.

Using three forecast evaluation criteria, RMSE, MAE and MAPE, we found that the ARMA-GARCH model performs slightly better than the ARIMA model. These two models perform better than the ANN model for the in-sample forecasts of the first-period data. However, the ANN was found to outperform the ARIMA and ARMA-GARCH models in out-of-sample forecasting for both periods.


    Chapter 1

    Introduction

Foreign exchange rates are among the most important determinants of the economic health of a country. They describe the price of one currency in terms of another, and they play a vital role in the trading relationships between countries, which in turn affect the world economy. For this reason, they are the most watched, analysed and governmentally manipulated economic measures. Understanding the evolution of exchange rates is important for many essential issues in international economics and finance, such as international trade, capital flows, international portfolio management, currency option pricing, and foreign exchange risk management. The foreign exchange market has experienced many unexpected periods of growth and decline over the last few decades. The dynamics of the exchange market depend entirely on the exchange rates, so appropriate prediction of exchange rates is a crucial factor in the success of many businesses in the global market. The exchange market is itself well known for being extremely unpredictable and volatile. A volatile exchange market makes international trade and investment decisions more difficult, because volatility increases exchange rate risk, which may result in a potential loss due to a change in the rates.

Forecasting any time series accurately is difficult, and exchange rate prediction is one of the most challenging applications of modern time series forecasting. Exchange rates are generally noisy, non-stationary and deterministically chaotic (Yaser & Atiya 1996), which suggests that no information from past behaviour can establish a relation between the past and the future behaviour. Despite all these constraints, numerous techniques have been devised by researchers to forecast exchange rates, and the search for a reliable model to predict exchange rates is still ongoing.

    1.1 Forecasting Models

Forecasting foreign exchange rates is a very important issue in the economic world. Over the past years, different models have been developed in both linear and nonlinear frameworks. A linear model predicts future values of exchange rates by identifying and exploiting the linear structure in the data. The most commonly used linear models for exchange rate forecasting are the Box and Jenkins Autoregressive Integrated Moving Average (ARIMA) models. One attractive feature of the Box and Jenkins approach to forecasting is its very rich class of possible models, which usually makes it possible to find a process that provides an adequate description of the data. The ARIMA model is also a very powerful instrument for constructing accurate forecasts over short forecast horizons. Since it is extremely popular, this model is used as a benchmark for evaluating new modelling approaches. ARIMA models are very effective forecasting techniques when the dynamics of the time series are linear and stationary (Cao & Tay 2001). However, there is considerable evidence of nonlinearities in exchange rates, so approximation by ARIMA models may not be adequate, since they cannot capture nonlinear patterns in the exchange rate data.

Ever since the inadequacy of linear models was observed (Racine 2001), there has been considerable development in modeling nonlinearities. Nonlinear models such as the Autoregressive Conditional Heteroscedasticity (ARCH) model and the Generalized Autoregressive Conditional Heteroscedasticity (GARCH) model were thus developed. They have been found useful in capturing certain nonlinear features of financial time series, such as clusters of outliers. Bollerslev & Ghysels (1996) demonstrated the successful application of the GARCH model in describing the dynamic behaviour of exchange rates. However, there exists no unified theory that can be applied to all such nonlinear models, as each requires specific assumptions concerning the precise form of nonlinearity. Moreover, a particular data set may contain too many possible nonlinear patterns for a specific nonlinear model to capture all of them.

In response to these limitations of the linear and nonlinear models, artificial neural networks (ANNs) have been used to forecast exchange rates. ANNs resemble and operate in the same way as our biological neural system. Owing to their unique non-parametric, assumption-free, noise-tolerant and adaptive properties (Haoffi & Han 2007), ANNs can deal better with non-stationary and volatile data. ANNs provide flexible nonlinear function mappings that can approximate any continuous measurable function with arbitrary desired accuracy, and they have been found to have an upper hand on various traditional linear and nonlinear models in exchange rate forecasting. Many researchers, such as Wang & Leu (1996), Tang & Fishwich (1993) and T. Hill & Remus (1996), have shown that ANNs perform better than (linear) ARIMA models, especially for more irregular series and for multiple-period-ahead forecasting. Gencay (1999) and R. K. Bissoondeeal & Mootanah (2008) find that forecasts generated by neural networks are superior to those of ARIMA and GARCH models. Panda & Narasimhan (2007) have shown that the performance of ANNs is better than the linear autoregressive model and the random walk model for one-step-ahead prediction, suggesting that there always exists a possibility of forecasting exchange rates. However, other studies have reported inconsistent results: W. R. Foster & Ungar (1992) have shown that ANNs are inferior to linear regression, and Meade (2002) finds no evidence that foreign exchange rate behaviour is better represented by ANNs than by a linear model.

    1.2 Aims and Organization of the Project

This project studies neural network methods for modeling the Mauritian Rupee (MUR) against the three most important currencies: the US Dollar, the Euro and the British Pound. We focus on various neural network models and assess their performance for in-sample and out-of-sample forecasts, and the results are compared against forecasts produced by ARIMA and GARCH models.

The organization of the project is as follows. In Chapter 2, we discuss how an artificial neural network works and the learning algorithm used. Chapter 3 describes linear and nonlinear time series models. In Chapter 4, we analyse our data and compare the forecasting accuracies of the different models.


    Chapter 2

    Artificial Neural Networks

    2.1 Introduction

Artificial neural networks, which imitate the human brain's ability to classify patterns and make predictions based on past experience, have found applications in different areas such as financial forecasting, medical diagnostics, flight control and product inspection. Artificial neural networks have been widely used in applied forecasting due to their ability to model complex relationships between input and output variables, and also because of the presence of nonlinearities in many time series.

    2.2 Feedforward Network

The feedforward neural network is the most commonly used network in applied work due to its capability of resolving a large number of problems. It consists of a considerable number of simple processing units, known as neurons, which are organised in layers. A feedforward neural network begins with an input layer which is connected to a hidden layer. The hidden layer can then be connected to another hidden layer or directly to the output layer. In this architecture, data enters at the input layer and passes through the network layer by layer until it arrives at the output layer. The hidden layer is a very important layer in the neural network, since it is responsible for approximating a continuous function to the desired accuracy. Since a single hidden layer is generally sufficient for this, we work with three-layer networks of the kind shown in Figure 2.1, with input vector X = (x_1, x_2, ..., x_n)^T and weights w_{i,j} from input i to hidden neuron j.

Figure 2.1: Three-Layer feedforward Network

The hidden layer vector is Z = (z_1, z_2, ..., z_k)^T. Each hidden neuron has a bias α which is added to the weighted sum of all the inputs to form a net input. The bias may be viewed as simply shifting the function to the left by an amount α; it is much like a weight, except that it has a constant input of 1:

\[ \text{net input} = \sum_{i=1}^{n} w_{i,j} x_i \tag{2.2} \]

The net input is the argument of the transfer function f, which is applied to construct the output of a specific neuron:

\[ z_j = f\Big(\alpha + \sum_{i=1}^{n} w_{i,j} x_i\Big), \qquad j = 1, 2, \ldots, k \tag{2.3} \]

In the output layer, the output neuron receives the weighted sum of the processed signals obtained from the hidden layer neurons. Another function φ is applied to produce the final output:

\[ y = \varphi\Big(\beta + \sum_{j=1}^{k} \mu_j z_j\Big) \tag{2.4} \]

where φ is the transfer function, β is the bias unit and μ_j is the weight from hidden neuron j to the output unit. Substituting z_j into equation (2.4), we get

\[ y = \varphi\Big(\beta + \sum_{j=1}^{k} \mu_j f\Big(\alpha + \sum_{i=1}^{n} w_{i,j} x_i\Big)\Big) \tag{2.5} \]
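To make the composition in equation (2.5) concrete, here is a minimal R sketch of the forward pass for a single-hidden-layer network with a tanh hidden layer and a linear output; the weights, biases and inputs are illustrative values of my own, not fitted ones.

    # Minimal sketch of the forward pass in equation (2.5):
    # y = phi(beta + sum_j mu_j * f(alpha_j + sum_i w_ij * x_i))
    forward <- function(x, W, alpha, mu, beta) {
      net_hidden <- alpha + as.vector(t(W) %*% x)  # net input of each hidden neuron (2.2)
      z <- tanh(net_hidden)                        # hidden outputs, f = tanh (2.3)
      beta + sum(mu * z)                           # linear output, phi(u) = u (2.4)
    }

    n <- 3; k <- 2                                 # 3 inputs, 2 hidden neurons (made-up sizes)
    W     <- matrix(runif(n * k, -1, 1), n, k)     # w_ij: input i -> hidden neuron j
    alpha <- runif(k, -1, 1)                       # hidden biases
    mu    <- runif(k, -1, 1)                       # hidden -> output weights
    beta  <- 0.1                                   # output bias
    forward(c(0.5, -0.2, 0.8), W, alpha, mu, beta)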

In our feedforward network, a hyperbolic tangent sigmoid transfer function is used in the hidden layer and a linear transfer function is used in the output layer, because these two transfer functions have proved successful in earlier work. The hyperbolic tangent sigmoid is given by f(x) = tanh(x) and the linear transfer function is f(x) = x. A node's transfer function serves the purpose of controlling the output signal strength for the node; the hyperbolic tangent sets the output signal strength between -1.0 and 1.0. It has also been noticed that the hyperbolic tangent sigmoid function can accelerate learning for some models and can have an impact on predictive accuracy. The learning rule commonly used in this type of network is the backpropagation algorithm.

    2.3 Backpropagation Learning Algorithm

The process of learning is implemented by modifying the weights iteratively until the desired response is achieved at the output node. Backpropagation is the most popular supervised learning algorithm. A supervised learning algorithm is one which accepts input values, computes the output values, compares them with the desired output values, and then adjusts the weights to minimize the deviation. This process is carried out until the network cannot further reduce the error.

Figure 2.2: The neuron weight adjustment

Backpropagation learning updates the network weights and biases in the direction in which the performance function decreases the most. The gradient is computed and the weights are updated after each input in incremental mode. Backpropagation starts at the output layer with the following equation:

\[ w_{ij} = w'_{ij} + l \, e_j \, x_i \tag{2.6} \]

where w_{ij} is the weight of the i-th input to the j-th neuron, w'_{ij} is the previous weight value, l is the learning rate, e_j is the error term and x_i is the i-th input.

The backpropagation algorithm searches for the minimum of the error function in weight space using the method of gradient descent. In gradient descent, weights are changed in proportion to the negative of the error derivative with respect to each weight. The network thus follows the curvature of the error surface, with weight updates moving in the direction of steepest descent. However, there is a high probability that the network will not reach the global minimum, since it may become stuck in a local minimum which does not represent an optimal solution.

Momentum is a technique that can help the network out of local minima. It is an extension of the backpropagation algorithm which can help speed up convergence and avoid local minima. With momentum, if the weights are moving in a particular direction, they tend to continue in the same direction; momentum also smooths the weight changes. The momentum factor determines the effect of past changes on the current change of the weights and increases the effective speed of learning; the value commonly used is very close to 1. The ratio which influences the speed and quality of learning is called the learning rate. It plays a very important role in the learning process of a network, as it controls the size of the weight changes at each iteration. The right choice of learning rate is therefore extremely important, since too small or too large a change in the weights will affect the result of the network. A learning rate between 0.05 and 0.5 was found to provide good results in many practical cases.
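As an illustration of the update rule just described, the following sketch combines the gradient-descent step of equation (2.6) with a momentum term; the function and variable names are mine, and the toy error function is only for demonstration.

    # Gradient-descent weight update with momentum (cf. equation (2.6)).
    # delta_prev carries the previous weight change; a momentum factor close
    # to 1 keeps the weights moving in their current direction.
    update_weights <- function(w, grad, delta_prev, lrate = 0.25, momentum = 0.8) {
      delta <- -lrate * grad + momentum * delta_prev  # new weight change
      list(w = w + delta, delta = delta)              # updated weights + change to reuse
    }

    # Usage: one step on a toy quadratic error E(w) = sum(w^2), whose gradient is 2w.
    w <- c(0.4, -0.7); delta <- c(0, 0)
    step <- update_weights(w, grad = 2 * w, delta_prev = delta)
    step$w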

    2.4 Training set and Testing set

In neural network forecasting, we divide our data into a training set and a test set. The training set consists of the input data and the target data which are presented to the network. Some transfer functions, however, need the input and target data to be scaled so that they fall within a specified range. To meet this requirement, we pre-process our data by normalising the inputs and targets so that they fall in the interval [-1, 1]. When the input data are presented to the network, the network makes a guess at the correct answer and compares it with the target data. The network goes through the data again and again, depending on the number of epochs used, adjusting the weight values so as to reach values close to the targets. The training set is used to build the model, whereas the test set, which is independent of the training set, is used to measure the performance of the model; more precisely, it is used to evaluate the out-of-sample performance. After the forecasts are obtained from the network, we need to convert the data back to the original scale by the process called post-processing. Moreover, we find from previous studies that there is no precise rule on the optimum sizes of the two data sets.
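A minimal sketch of the pre- and post-processing steps described above: min-max scaling of a series to the interval [-1, 1] and the inverse transform used to return forecasts to the original scale (the function names are hypothetical).

    # Pre-processing: scale a series to [-1, 1] for the tanh hidden layer.
    to_range <- function(x, lo = min(x), hi = max(x)) {
      list(scaled = 2 * (x - lo) / (hi - lo) - 1, lo = lo, hi = hi)
    }
    # Post-processing: map network outputs back to the original scale.
    from_range <- function(s, lo, hi) (s + 1) / 2 * (hi - lo) + lo

    rates <- c(28.1, 28.4, 28.2, 29.0, 28.7)            # illustrative levels, not real data
    p <- to_range(rates)
    all.equal(from_range(p$scaled, p$lo, p$hi), rates)  # TRUE: inverse recovers the data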


    Chapter 3

    Linear and Non-linear Time Series

In this chapter we consider widely used time series models for forecasting. We first briefly describe the Autoregressive Integrated Moving Average (ARIMA) model, which is a linear model, and we then consider the Generalized Autoregressive Conditional Heteroscedasticity (GARCH) model as the non-linear model.

    3.1 Stationary Processes

A stochastic process {X_t, t ∈ Z} is a family of random variables defined on a probability space. The joint cumulative distribution function of X_t is defined as

\[ F_{t_1, t_2, \ldots, t_n}(x_1, x_2, \ldots, x_n) = P(X_{t_1} \le x_1, X_{t_2} \le x_2, \ldots, X_{t_n} \le x_n) \tag{3.1} \]

and the process X_t is said to be strictly stationary if

\[ F_{t_1+s, t_2+s, \ldots, t_n+s}(x_1, x_2, \ldots, x_n) = F_{t_1, t_2, \ldots, t_n}(x_1, x_2, \ldots, x_n) \tag{3.2} \]

for all n-tuples (x_1, x_2, ..., x_n), (t_1, t_2, ..., t_n) and for any s. The mean function of X_t is given by

\[ \mu_t = E(x_t) = \int x \, dF_t(x) \tag{3.3} \]

and the autocovariance function is given by

\[ \gamma(t, \ell) = E[(x_t - \mu_t)(x_{t-\ell} - \mu_{t-\ell})] \tag{3.4} \]

The process is said to be weakly stationary if μ_t = μ for all t and γ(t, ℓ) = γ_ℓ depends only on the lag ℓ. In this case we have γ_{-ℓ} = γ_ℓ.

A white noise process ε_t is a process such that μ = 0, γ_0 = σ² and γ_ℓ = 0 for ℓ ≠ 0. We denote such a process by ε_t ~ WN(0, σ²).

    3.1.1 Autoregressive process

The autoregressive process of order p is denoted by AR(p) and is defined as follows:

\[ X_t = \sum_{r=1}^{p} \phi_r X_{t-r} + \varepsilon_t \tag{3.5} \]

where ε_t ~ WN(0, σ²), or φ(B)X_t = ε_t, where B is the backshift operator and

\[ \phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p \tag{3.6} \]

The process is invertible since Σ_{r=1}^{p} |φ_r| < ∞, and for the process to be stationary the roots of φ(B) = 0 must lie outside the unit circle.

    3.1.2 Moving Average Process

The moving average process of order q is denoted by MA(q) and is defined by

\[ X_t = \sum_{s=0}^{q} \theta_s \varepsilon_{t-s} \tag{3.7} \]

The process can also be written as X_t = θ(B)ε_t, where

\[ \theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q \tag{3.8} \]

Because MA processes consist of a finite sum of stationary white noise terms, they are stationary, and hence they have mean zero. The process is invertible when the roots of θ(B) = 0 all exceed unity in absolute value.

    3.1.3 Random Walk

In a random walk model, at each point in time the series moves randomly away from its current position. The model can be written as

\[ X_t = X_{t-1} + \varepsilon_t \tag{3.9} \]

We see that the random walk model has the same form as an AR(1) process but, since φ = 1, it is not stationary. Repeatedly substituting for past values gives

\[ X_t = X_0 + \sum_{j=0}^{t-1} \varepsilon_{t-j} \tag{3.10} \]

We find that the first difference of the random walk is stationary, as it is just white noise:

\[ X_t - X_{t-1} = \varepsilon_t \tag{3.11} \]
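The point of equations (3.9)-(3.11) can be checked directly by simulation: generate white noise, accumulate it into a random walk, and verify that the first difference recovers the noise. A short R sketch:

    # Simulate a random walk X_t = X_{t-1} + eps_t and difference it (eqs. 3.9-3.11).
    set.seed(1)
    eps <- rnorm(500)       # white noise, WN(0, 1)
    x   <- cumsum(eps)      # random walk: X_t = X_0 + sum of shocks (eq. 3.10), X_0 = 0
    d   <- diff(x)          # first difference X_t - X_{t-1}
    all.equal(d, eps[-1])   # TRUE: differencing recovers the white noise (eq. 3.11)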

3.2 ARIMA(p, d, q) Processes

An ARIMA model is a combination of the autoregressive (AR), differencing and moving average (MA) processes. If the original process X_t is not stationary, we can look at the first order difference process

\[ Y_t = X_t - X_{t-1} \tag{3.12} \]

or at second order differences, and so on. A general ARIMA model of order (p, d, q) can be represented as follows:

\[ \phi(B) \nabla^d X_t = \theta(B) \varepsilon_t \tag{3.13} \]

where φ(B) is the AR operator of order p, θ(B) is the MA operator of order q, X_t is the observed value at time t, ε_t ~ WN(0, σ²) and d is the number of times the data series must be differenced to produce a stationary time series. Fitting an ARIMA model to the raw data involves a four-step iterative cycle: model identification (choice of p, d, q), parameter estimation, diagnostic checking and the forecasting process.

    3.2.1 Autocorrelation and Partial Autocorrelation Functions

The autocorrelation function (ACF) can be used to detect non-randomness in data and also to identify an appropriate time series model if the data are not random. Given the data x_1, x_2, ..., x_N at times t_1, t_2, ..., t_N, the lag-k autocorrelation function is defined as

\[ \gamma_k = \frac{\sum_{i=1}^{N-k} (x_i - \bar{x})(x_{i+k} - \bar{x})}{\sum_{i=1}^{N} (x_i - \bar{x})^2} \tag{3.14} \]

The observations are assumed to be equi-spaced, so the time variable t is not used in the formula for the autocorrelation. The correlation is between two values of the same variable at times t_i and t_{i+k}. When the autocorrelation is used to identify an ARIMA model, the autocorrelations are plotted for many lags.

The partial autocorrelation function (PACF) was introduced to time series modelling so that one can easily determine the appropriate lag p in an AR(p) model, or in the extended ARIMA(p, d, q) model, by simply plotting the PACF. The PACF is the conditional correlation

\[ \mathrm{Corr}(x_t, x_{t+k} \mid x_{t+1}, \ldots, x_{t+k-1}) \tag{3.15} \]

that is, the partial autocorrelation at lag k is the autocorrelation between x_t and x_{t+k} with the linear dependence of x_{t+1} through to x_{t+k-1} removed.
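In practice the ACF and PACF are read from sample plots rather than computed by hand. As a sketch (in R, which the project uses elsewhere for ARIMA fitting), the following generates an AR(1) series and plots both functions; their shapes should match the behaviour summarised in Table 3.1 below.

    # Sample ACF and PACF plots used at the identification stage.
    set.seed(2)
    y <- arima.sim(model = list(ar = 0.6), n = 300)  # illustrative AR(1) series
    acf(y,  lag.max = 30, main = "Sample ACF")       # dies down for an AR process
    pacf(y, lag.max = 30, main = "Sample PACF")      # cuts off after lag p = 1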


Process      | ACF                                      | PACF
AR(p)        | Dies down exponentially or sinusoidally  | Cuts off after lag p
MA(q)        | Cuts off after lag q                     | Dies down exponentially or sinusoidally
ARMA(p, q)   | Dies down exponentially or sinusoidally  | Dies down exponentially or sinusoidally

Table 3.1: Behaviour of ACF and PACF

    3.3 GARCH Model

The GARCH model, which is a generalisation of the autoregressive conditional heteroskedasticity (ARCH) model, was introduced by (Bollerslev 1986) and has been used by many researchers in modeling financial time series. A wide range of financial data has been found to exhibit time-varying volatility clustering, the property that there are periods of high and of low variance. In response to this, (Engle 1982) suggested the ARCH model as an alternative to the usual time series processes.

    3.3.1 ARCH Model

The autoregressive conditional heteroskedasticity (ARCH) model is the very first model of conditional heteroskedasticity. It forecasts the error variance at time t on the basis of information known at time t - 1, expressing it as a moving average of the past squared error terms. The following conditional variance defines an ARCH model of order q:

\[ \sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 \tag{3.16} \]

with α_0 ≥ 0 and α_i ≥ 0, where the α_i must be estimated from the data. The error term has the form

\[ \varepsilon_t = \sigma_t z_t \tag{3.17} \]

where z_t is a sequence of independent and identically distributed random variables with zero mean and unit variance.


    3.3.2 GARCH model

The GARCH model is an extended version of the ARCH model. In the GARCH(p, q) model the variance is a linear function of its own lags and has the form

\[ \sigma_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \cdots + \alpha_q \varepsilon_{t-q}^2 + \beta_1 \sigma_{t-1}^2 + \cdots + \beta_p \sigma_{t-p}^2 \tag{3.18} \]

\[ \phantom{\sigma_t^2} = \alpha_0 + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2 \tag{3.19} \]

The rate of decay of the ARCH model is considered too fast compared with typical financial series unless the value of q in (3.16) is large. The GARCH model is therefore preferred, since it enables very complicated heteroscedasticity patterns to be modeled at low orders of p and q. The most popular GARCH model in applications has been the GARCH(1, 1) model, that is, p = q = 1 in (3.18).
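As a concrete reading of equation (3.18) in the GARCH(1, 1) case, the sketch below computes the conditional variance recursively from a series of shocks; the parameter values are illustrative, not estimates from the exchange rate data.

    # Conditional variance of a GARCH(1,1):
    # sigma2_t = a0 + a1 * eps_{t-1}^2 + b1 * sigma2_{t-1}   (eq. 3.18 with p = q = 1)
    garch11_var <- function(eps, a0, a1, b1) {
      n  <- length(eps)
      s2 <- numeric(n)
      s2[1] <- a0 / (1 - a1 - b1)        # start at the unconditional variance (needs a1 + b1 < 1)
      for (t in 2:n)
        s2[t] <- a0 + a1 * eps[t - 1]^2 + b1 * s2[t - 1]
      s2
    }

    set.seed(3)
    eps <- rnorm(1000, sd = 0.006)       # stand-in for exchange-rate return shocks
    head(garch11_var(eps, a0 = 1e-6, a1 = 0.05, b1 = 0.90))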


    Chapter 4

    Analysis of Exchange Rate Data

We study the daily rates of the MUR against the US dollar (MUR/USD), the Euro (MUR/EU) and the British Pound (MUR/GBP). The data sets were obtained from the Bank of Mauritius for the period January 2003 to December 2011 for MUR/USD and MUR/EU, and for the period January 2002 to December 2011 for MUR/GBP.

For our analysis, we choose two periods. The first period runs from January 2003 to December 2008 for MUR/USD and MUR/EU and from January 2002 to December 2008 for MUR/GBP, representing the data up to the financial crisis; for the second period, we take the full data sets as obtained.

The daily returns are calculated as the log differences of the levels. Let x_t be a given exchange rate time series; then the exchange rate return series y_t is given by

\[ y_t = \ln\!\left(\frac{x_t}{x_{t-1}}\right) \tag{4.1} \]
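Equation (4.1) amounts to differencing the logged series, which is a one-line computation, for example in R (the levels below are made-up, not Bank of Mauritius data):

    # Daily log returns y_t = ln(x_t / x_{t-1}) from a vector of exchange-rate levels.
    x <- c(29.45, 29.50, 29.48, 29.61, 29.58)  # illustrative MUR/USD levels
    y <- diff(log(x))                          # equation (4.1)
    y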

    4.1 Analysis of Period I

Figures 4.1, 4.2 and 4.3 show the daily exchange rate data and the corresponding returns of the MUR against the USD, EU and GBP:


Figure 4.1: Daily MUR/USD data and its corresponding first differences of the logs

Figure 4.2: Daily MUR/EU data and its corresponding first differences of the logs


Figure 4.3: Daily MUR/GBP data and its corresponding first differences of the logs

In each of the three daily log return series we observe large random fluctuations. However, all three series appear to be stationary, which means that the random variation is constant over time. There is volatility clustering in each of the log return series, since we can see periods of high and of low variation.

           Mean       Median     Max      Min       S.D      Skewness   Kurtosis
MUR/USD    6.46e-05   0.0000     0.0119   -0.0101   0.0018    1.0889    11.9486
MUR/EU     2.70e-04   1.32e-04   0.0311   -0.0279   0.0061    0.0698     5.1469
MUR/GBP    3.80e-05   0.0000     0.0462   -0.0416   0.0058   -0.1038     9.8907

Table 4.1: Summary statistics for the daily exchange rates: log first difference

The standard deviations indicate that the MUR/EU return data are more volatile than the MUR/USD and MUR/GBP return series. The skewness coefficients for MUR/USD and MUR/EU are positive, which indicates that the right tail is longer than the left; for MUR/GBP, the bulk of the values lie to the right of the mean. Kurtosis measures whether the data are peaked or flat relative to a normal distribution. For all three return series the kurtosis is larger than that of the normal distribution (which is equal to 3), indicating leptokurtosis. The leptokurtosis indicates that the series are clustered during certain periods and that the volatility changes at a relatively low rate, that is, large changes tend to be followed by large changes and small changes by small changes.

Figure 4.4: Histogram and Kernel Density estimate of daily log return of MUR/USD (left: histogram with normal curve; right: kernel density estimate, N = 1500, bandwidth = 0.0001054)

Figure 4.5: Histogram and Kernel Density estimate of daily log return of MUR/GBP (left: histogram with normal curve; right: kernel density estimate, N = 1750, bandwidth = 0.00095)

Figure 4.6: Histogram and Kernel Density estimate of daily log return of MUR/EU (left: histogram with normal curve; right: kernel density estimate, N = 1500, bandwidth = 0.001075)


The plots indicate that the normality assumption is questionable for all three daily log return series.

Figure 4.7: Scatter plot of y_t against y_{t-1} for the MUR/USD log return data.

Figure 4.8: Scatter plot of y_t against y_{t-1} for the MUR/EU log return data.

Figure 4.9: Scatter plot of y_t against y_{t-1} for the MUR/GBP log return data.

In the scatter plots shown above, the data at time t are plotted against the values at time t - 1; this is one way of showing the degree of correlation in the data.


    4.1.1 Analysis of Period II data

           Mean       Median     Max      Min       S.D      Skewness   Kurtosis
MUR/USD    8.91e-06   0.0000     0.0122   -0.0101   0.0021    0.7574    7.871
MUR/EU     1.07e-04   1.04e-04   0.0311   -0.0279   0.0061    0.0843    5.044
MUR/GBP    2.02e-05   0.0000     0.0462   -0.0416   0.0060   -0.2152    8.379

Table 4.2: Summary statistics of log first difference daily exchange rate

We find that the data sets for the period January 2003 to December 2011 for MUR/USD and MUR/EU, and for the period January 2002 to December 2011 for MUR/GBP, produce the same characteristics as described above for the Period I data sets. We observe that the minimum and maximum values remain the same for the two periods, showing that in the Period I data sets the three series had already attained their maximum and minimum values.

Figure 4.10: Daily MUR/USD Jan 03 - Dec 11


Figure 4.11: Daily MUR/EU Jan 03 - Dec 11

Figure 4.12: Daily MUR/GBP Jan 02 - Dec 11

    4.2 Forecasting Accuracy

For comparing forecasts produced by different models, the root mean square error, the mean absolute error and the mean absolute percentage error are calculated. Let

\[ e_t = Y_t - \hat{Y}_t \]

denote the forecast error, where Y_t is the observed value and Ŷ_t is the forecasted value.

1. Root Mean Square Error:

\[ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} e_t^2} \]

2. Mean Absolute Error:

\[ \mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} |e_t| \]

3. Mean Absolute Percentage Error:

\[ \mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{e_t}{Y_t} \right| \times 100 \]

The MSE, the most common measure of forecasting accuracy, indicates the degree of spread, but it gives additional weight to large errors. The RMSE is considered because the forecast error is then expressed in the same units as the actual and forecast values themselves; it is found to be most informative for errors with a near-normal distribution. The mean absolute error is a very popular measure of forecast error, since it compares forecasts with their eventual outcomes. However, previous research has emphasized that the MAPE is the most useful measure for comparing the accuracy of forecasts, since it measures relative performance.
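The three criteria are straightforward to implement; a sketch with hypothetical function names:

    # Forecast accuracy criteria of Section 4.2; y = observed, yhat = forecasts.
    rmse <- function(y, yhat) sqrt(mean((y - yhat)^2))
    mae  <- function(y, yhat) mean(abs(y - yhat))
    mape <- function(y, yhat) mean(abs((y - yhat) / y)) * 100  # relative error, in percent

    y    <- c(57.14, 57.12, 57.18)   # illustrative actual values
    yhat <- c(57.47, 57.20, 57.11)   # illustrative forecasts
    c(RMSE = rmse(y, yhat), MAE = mae(y, yhat), MAPE = mape(y, yhat))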

    4.3 Fitting Neural Network Model to the Exchange Rates Data

The choice of an appropriate architecture is in itself a difficult task and, in addition, there is a large number of parameters to estimate.

For our neural network model we used the feedforward neural network with a single hidden layer. The number of neurons in the hidden layer was varied between 1 and 5. The activation function in the hidden layer neurons is the tan-sigmoid (tansig) and that in the output layer is the linear function (purelin). To set the learning rate, we ran the network with a large number of different learning rates between 0.05 and 0.5 before settling on 0.25, which gave us the best results. The momentum value was varied between 0.1 and 0.9, and we found that 0.4 and 0.8 gave better results; we use 0.8 in the experiments, since it is common to choose a momentum value close to 1. The training algorithm traingdm was used, since it provides faster convergence in our feedforward network. The number of epochs used during training was between 30000 and 50000, depending on the performance graph, which shows the point at which the network was sufficiently trained.

Having estimated these parameters, we focus our case study on two further issues: first, the number of input variables, and second, the number of neurons in the hidden layer. All the computations regarding the neural network were carried out in Matlab.

The number of input variables is based on the number of lagged past observations. The first input set consists of training the data using only the first previous value, Lag(1): Y_t is the target value, so we use Y_{t-1} as the input. When we have 2 input variables we use Lag(1, 2), that is, Y_{t-1} and Y_{t-2} as inputs and Y_t as the target. The experiment is carried out using Lag(1) to Lag(1-12).

Using the daily data of MUR/USD, MUR/EU and MUR/GBP, we divide the data into a training set (in-sample) and a testing set (out-of-sample). The test set for all the data sets consists of the values for the whole of 2008, which we shall try to forecast. For each input variable set we experiment with 1-5 hidden nodes only. A sketch of this setup is given below.
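The computations in the thesis were done in Matlab with the traingdm algorithm. Purely as an illustration of the same design, lagged returns as inputs to a small single-hidden-layer network, here is a rough R analogue using the nnet package; note that nnet trains by BFGS with logistic hidden units rather than gradient descent with momentum and tansig units, so this is an analogue, not a reproduction.

    library(nnet)

    # Build a lagged design matrix: columns y_{t-1}, ..., y_{t-p}; target y_t.
    make_lags <- function(y, p) {
      n <- length(y)
      X <- sapply(1:p, function(k) y[(p - k + 1):(n - k)])
      list(X = X, target = y[(p + 1):n])
    }

    set.seed(4)
    y <- diff(log(cumprod(1 + rnorm(500, 0, 0.002)) * 29))  # stand-in return series, not MUR data
    d <- make_lags(y, p = 3)                                # Lag(1-3) inputs

    # Single hidden layer, 2 neurons, linear output (nnet's hidden units are
    # logistic sigmoid, not tansig as in the Matlab setup).
    fit <- nnet(d$X, d$target, size = 2, linout = TRUE, trace = FALSE, maxit = 2000)
    preds <- predict(fit, d$X)                              # in-sample fit
    sqrt(mean((d$target - preds)^2))                        # in-sample RMSE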

    4.4 Fitting ARIMA Model to the Foreign Exchange Rates Data

ARIMA modeling consists of three stages: identification, estimation and diagnostic checking, and forecasting.

During the identification stage, we convert our data into a time series and find the ACF and PACF. Stationarity tests can be performed to determine whether differencing is needed. The analysis of the ACF and PACF graphs usually suggests one or more tentative models that can be fitted to our data.

In the estimation and diagnostic checking stage, the diagnostic statistics help us judge the adequacy of the models found in the identification stage. The goodness-of-fit statistics aid in comparing a model with others; the model with the smallest Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) is retained. After inspecting the ACF and PACF to identify the best model for ARIMA forecasting, different values of p, d and q are fitted and compared, and the model that proves best is carried forward to the forecasting stage. Moreover, there exists an auto.arima function in R which can help us find a fitted model for the data set.
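A sketch of this identification-estimation-forecasting cycle in R, using the forecast package's auto.arima mentioned above; the series here is a stand-in, not the MUR data.

    library(forecast)

    set.seed(5)
    x <- cumsum(rnorm(300, 0, 0.05)) + 29   # stand-in exchange-rate levels
    y <- ts(x)

    acf(diff(y)); pacf(diff(y))             # identification: inspect ACF/PACF of the differences
    fit <- auto.arima(y, ic = "aic")        # search over (p, d, q), smallest AIC retained
    summary(fit)                            # estimates plus AIC/BIC for comparison
    checkresiduals(fit)                     # diagnostic checking (residual ACF, Ljung-Box test)
    fc <- forecast(fit, h = 20)             # forecasting stage: 20 steps ahead
    plot(fc)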

    4.5 Fitting the GARCH model

The first step in fitting a GARCH model is to select the parameters of the specific model; before that, however, we must check for an ARCH effect in the exchange rate data. From the log return plots of MUR/USD, MUR/EU and MUR/GBP, we find that there are heavy fluctuations in the data along with the presence of volatility clustering, which hints that the data may not be identically and independently distributed (iid). The ACF and PACF of the log returns, the absolute log returns and the squared log returns are taken; if they show significant autocorrelation, then the data are not iid and we can say that an ARCH effect exists.

The ARMA-GARCH model is fitted to the data, with the orders of the model equations chosen from the ACF, PACF and EACF plots; the parameter values are then varied and different models are fitted to the in-sample data. Both the in-sample and out-of-sample forecast performances are produced.
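A sketch of the two steps just described, using the fGarch package (an assumption on my part; the thesis does not name its GARCH software): first the informal ARCH-effect check on the absolute and squared log returns, then an AR(1)-GARCH(1, 1) fit.

    library(fGarch)

    set.seed(6)
    y <- rnorm(1000, 0, 0.006)       # stand-in for a log-return series, not the MUR data

    # ARCH-effect check: autocorrelation in |y| and y^2 signals non-iid data.
    acf(y); acf(abs(y)); acf(y^2)

    # AR(1)-GARCH(1,1): ARMA(1,0) mean equation with GARCH(1,1) errors.
    fit <- garchFit(~ arma(1, 0) + garch(1, 1), data = y, trace = FALSE)
    summary(fit)                     # coefficient estimates and residual diagnostics
    predict(fit, n.ahead = 10)       # out-of-sample forecasts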



    4.6 Empirical findings

    4.6.1 Forecasts performance of Period I

Lags              No of epochs  No of neurons  RMSE    MAE     MAPE
ARIMA(3,1,0)      -             -              0.0273  0.0172  0.0586
ARIMA(2,1,3)      -             -              0.0270  0.0172  0.0587
AR(1)-GARCH(1,1)  -             -              0.0271  0.0172  0.0587
1                 50000         2              0.0384  0.0249  0.1000
1-2               50000         2              0.0321  0.0219  0.0744
1-3               50000         2              0.0308  0.0194  0.0658
1-4               50000         4              0.0303  0.0199  0.0673
1-5               50000         5              0.0322  0.0207  0.0706
1-6               50000         3              0.0327  0.0216  0.0735
1-7               50000         4              0.0322  0.0210  0.0712
1-8               50000         2              0.0338  0.0224  0.0760
1-9               50000         3              0.0310  0.0197  0.0672
1-10              50000         4              0.0306  0.0197  0.0670
1-11              50000         2              0.0310  0.0203  0.0689
1-12              50000         1              0.0354  0.0264  0.0894

Table 4.3: In-sample performance of MUR/USD for data Jan 2003-Dec 2008


Lags              No of epochs  No of neurons  RMSE    MAE     MAPE
ARIMA(1,0,0)      -             -              0.1993  0.1553  0.4219
ARIMA(0,1,0)      -             -              0.1993  0.1552  0.4214
AR(1)-GARCH(1,2)  -             -              0.1990  0.1551  0.4213
1                 40000         1              0.2026  0.1595  0.4336
1-2               50000         2              0.2007  0.1569  0.4260
1-3               50000         4              0.2008  0.1579  0.4282
1-4               50000         3              0.2007  0.1573  0.4273
1-5               50000         2              0.1996  0.1558  0.4230
1-6               50000         4              0.1995  0.1559  0.4233
1-7               50000         4              0.2010  0.1574  0.4269
1-8               40000         1              0.2029  0.1599  0.4343
1-9               50000         4              0.2020  0.1578  0.4278
1-10              50000         5              0.1996  0.1563  0.4235
1-11              50000         4              0.2006  0.1563  0.4320
1-12              50000         5              0.2001  0.1564  0.4243

Table 4.4: In-sample performance of MUR/EU for data Jan 2003-Dec 2008


Lags              No of epochs  No of neurons  RMSE    MAE     MAPE
ARIMA(1,0,0)      -             -              0.2710  0.2032  0.3828
ARIMA(0,1,0)      -             -              0.2709  0.2030  0.3823
AR(1)-GARCH(1,1)  -             -              0.2700  0.2026  0.3943
1                 40000         1              0.2776  0.2097  0.3960
1-2               50000         2              0.2733  0.2050  0.3868
1-3               40000         1              0.2732  0.2052  0.3872
1-4               50000         2              0.2739  0.2052  0.3872
1-5               40000         1              0.2773  0.2096  0.3957
1-6               50000         3              0.2750  0.2067  0.3888
1-7               50000         2              0.2716  0.2057  0.3879
1-8               50000         2              0.2710  0.2041  0.3848
1-9               50000         5              0.2752  0.2095  0.3945
1-10              50000         3              0.2736  0.2058  0.3865
1-11              50000         4              0.2725  0.2056  0.3865
1-12              50000         2              0.2718  0.2057  0.3874

Table 4.5: In-sample performance of MUR/GBP for data Jan 2002-Dec 2008


Lags              No of epochs  No of neurons  RMSE    MAE     MAPE
ARIMA(3,1,0)      -             -              2.011   1.588   5.331
ARIMA(2,1,3)      -             -              2.331   1.615   5.250
AR(1)-GARCH(1,1)  -             -              2.161   1.562   5.131
1                 50000         2              0.0979  0.0682  0.2338
1-2               50000         2              0.0704  0.0493  0.1687
1-3               30000         1              0.0754  0.0548  0.1871
1-4               50000         2              0.0760  0.0537  0.1835
1-5               30000         1              0.0746  0.0554  0.1896
1-6               30000         1              0.0864  0.0639  0.2188
1-7               30000         1              0.0843  0.0621  0.2121
1-8               50000         5              0.0786  0.0576  0.1970
1-9               30000         1              0.0791  0.0589  0.2014
1-10              50000         2              0.0815  0.0570  0.1942
1-11              50000         3              0.0864  0.0638  0.2178
1-12              50000         2              0.0784  0.0574  0.1964

Table 4.6: Out-of-sample performance of MUR/USD for data Jan 2003-Dec 2008


Lags              No of epochs  No of neurons  RMSE    MAE     MAPE
ARIMA(1,0,0)      -             -              1.271   0.8503  1.955
ARIMA(0,1,0)      -             -              1.119   0.8037  1.868
AR(1)-GARCH(1,2)  -             -              1.265   0.8439  1.940
1                 30000         1              0.3646  0.2642  0.6196
1-2               50000         2              0.3640  0.2654  0.6229
1-3               30000         1              0.3647  0.2643  0.6200
1-4               30000         1              0.3637  0.2636  0.6182
1-5               50000         2              0.3634  0.2632  0.6174
1-6               30000         1              0.3642  0.2642  0.6197
1-7               30000         1              0.3650  0.2649  0.6219
1-8               50000         3              0.3631  0.2646  0.6213
1-9               30000         1              0.3650  0.2653  0.6224
1-10              50000         2              0.3638  0.2658  0.6237
1-11              30000         1              0.3638  0.2657  0.6236
1-12              30000         1              0.3631  0.2655  0.6230

Table 4.7: Out-of-sample performance of MUR/EU for data Jan 2003-Dec 2008


Lags                  No of epochs  No of neurons  RMSE    MAE     MAPE
ARIMA(1,0,0)          -             -              4.429   3.755   7.244
ARIMA(0,1,0)          -             -              4.537   3.853   7.431
ARMA(1,1)-GARCH(1,1)  -             -              4.137   3.477   6.708
1                     30000         1              0.4796  0.3431  0.6497
1-2                   30000         1              0.4772  0.3410  0.6456
1-3                   50000         2              0.4772  0.3405  0.6447
1-4                   30000         1              0.4794  0.3402  0.6443
1-5                   50000         2              0.4776  0.3438  0.6498
1-6                   30000         1              0.4845  0.3433  0.6501
1-7                   50000         2              0.4838  0.3477  0.6581
1-8                   50000         2              0.4825  0.3450  0.6530
1-9                   30000         1              0.4841  0.3431  0.6498
1-10                  30000         1              0.4848  0.3435  0.6506
1-11                  50000         5              0.4807  0.3430  0.6495
1-12                  30000         1              0.4850  0.3439  0.6512

Table 4.8: Out-of-sample performance of MUR/GBP for data Jan 2002-Dec 2008

Year/daily  Actual values  ANN Forecasts  ANN Errors  ARIMA Forecasts  ARIMA Errors  GARCH Forecasts  GARCH Errors
2008/1      57.1351        57.4654        -0.3303     57.2970          -0.1619       57.3049          -0.1698
2008/2      57.1221        57.2047        -0.0826     57.2961          -0.174        57.3041          -0.182
2008/3      57.1823        57.1134         0.0689     57.2953          -0.113        57.3032          -0.1209
2008/4      57.3500        57.1434         0.2066     57.2944           0.0556       57.3024           0.0476
2008/5      56.9235        57.253         -0.3295     57.2936          -0.3701       57.3106          -0.3871
2008/6      57.2858        56.8477         0.4381     57.2928          -0.007        57.3007          -0.0149
2008/7      57.2978        57.2938         0.004      57.2919           0.0059       57.2999          -0.0021
2008/8      57.2383        57.1821         0.0562     57.2911          -0.0528       57.2991          -0.0608
2008/9      57.3006        57.2182         0.0824     57.2903           0.0103       57.2982           0.0024
2008/10     57.4102        57.235          0.1752     57.2894           0.1208       57.2974           0.1128

Table 4.9: First 10 forecast values of MUR/GBP for ANN, ARIMA and GARCH models


    4.6.2 Results and Discussions for Period I

Tables 4.3 to 4.8 show the minimum error obtained when the number of neurons in the hidden layer is varied (between 1 and 5) for each of Lag(1) to Lag(1-12). We observe from our data that as the number of neurons in the hidden layer increases, the number of epochs needed to train the network must also increase in order to reach a point where the forecast values no longer change. Increasing the number of epochs once the network has already reached the global minimum is therefore useless and very time-consuming. We conclude that the number of epochs required depends on the number of neurons in the hidden layer. Moreover, we also notice that the performance of the network changes as the number of neurons increases. We tend to achieve the best forecasts with 2 or 3 neurons in the hidden layer; as the number of neurons increases beyond 5, the forecast values tend to move away from the target values, since large increases in the RMSE, MAE and MAPE values are observed. A sketch of this search procedure is given below.
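To make the search concrete, the following is a minimal, self-contained sketch of a grid sweep over lags, hidden neurons and epochs with a one-hidden-layer feedforward network trained by batch backpropagation. It is illustrative only, not the code used in this project: the helper names (make_patterns, fit_mlp, best_setting) are hypothetical, and the series is assumed to be scaled (e.g. to [-1, 1]) before training, since tanh saturates for large inputs.

    import numpy as np

    def make_patterns(series, n_lags):
        """Build (input, target) pairs: n_lags past values predict the next one."""
        X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
        y = np.array(series[n_lags:])
        return X, y

    def fit_mlp(X, y, n_hidden, n_epochs, lr=0.01, seed=0):
        """Train a one-hidden-layer net (tanh hidden, linear output) by backprop."""
        rng = np.random.default_rng(seed)
        W1 = rng.normal(scale=0.1, size=(X.shape[1], n_hidden))
        W2 = rng.normal(scale=0.1, size=n_hidden)
        for _ in range(n_epochs):
            h = np.tanh(X @ W1)                     # hidden-layer activations
            err = h @ W2 - y                        # output-layer error
            W2 -= lr * h.T @ err / len(y)           # gradient step, output weights
            dh = np.outer(err, W2) * (1.0 - h**2)   # error backpropagated to hidden layer
            W1 -= lr * X.T @ dh / len(y)            # gradient step, hidden weights
        return W1, W2

    def best_setting(series, max_lag=12, neurons=range(1, 6),
                     epochs=(30000, 40000, 50000)):
        """Return the (lags, neurons, epochs) triple with the lowest RMSE."""
        best = None
        for n_lags in range(1, max_lag + 1):
            X, y = make_patterns(series, n_lags)
            for n_hidden in neurons:
                for n_epochs in epochs:
                    W1, W2 = fit_mlp(X, y, n_hidden, n_epochs)
                    rmse = np.sqrt(np.mean((np.tanh(X @ W1) @ W2 - y) ** 2))
                    if best is None or rmse < best[0]:
                        best = (rmse, n_lags, n_hidden, n_epochs)
        return best

The candidate values for the epochs and neurons mirror those reported in the tables; the row retained for each lag is the setting with the smallest error.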

In Tables 4.3 to 4.8 the in-sample performance is presented first, followed by the out-of-sample performance. Here the in-sample data set for MUR/USD and MUR/EU is taken from January 2003 to December 2007, and that for MUR/GBP from January 2002 to December 2007. The out-of-sample data set in all three cases covers January 2008 to December 2008.

As for the number of inputs presented to the network, we cannot find any trend in how it affects the performance. In our experiments, for both the in-sample and out-of-sample forecasts, we observe that each data set has a different number of inputs at which the neural network works best.

Considering the in-sample forecasts, we find that for MUR/USD the ARIMA(2,1,3) model outperforms the ANN models; the AR(1)-GARCH(1,1) model, however, gives almost the same forecast accuracy. For the MUR/EU and MUR/GBP data the GARCH models perform much better than the other models. Thus we can conclude that for in-sample forecasts the GARCH model produces better forecasts than the ARIMA and ANN models.

For out-of-sample forecasts, the GARCH models used for both MUR/USD and MUR/GBP give better results than the ARIMA models when the RMSE, MAE and MAPE are taken as evaluation criteria. When compared with the ANN models, however, their forecast performance is found to be far behind.
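For reference, the three evaluation criteria can be computed as follows (a straightforward sketch, with MAPE expressed in percent, consistent with the tables above):

    import numpy as np

    def accuracy(actual, forecast):
        """RMSE, MAE and MAPE (in percent) for a set of forecasts."""
        actual = np.asarray(actual, dtype=float)
        e = actual - np.asarray(forecast, dtype=float)   # forecast errors
        rmse = np.sqrt(np.mean(e ** 2))                  # root mean squared error
        mae = np.mean(np.abs(e))                         # mean absolute error
        mape = 100.0 * np.mean(np.abs(e / actual))       # mean absolute percentage error
        return rmse, mae, mape

Applied to the actual and forecast columns of Table 4.9 below, the error term e reproduces the error columns shown there.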

Table 4.9 shows the first 10 out-of-sample forecasts of MUR/GBP for the ANN, ARIMA and GARCH models. We notice that all three models perform very accurately over


these first 10 forecasts; however, since the MAPE values for the 2008 forecasts of the ARIMA and GARCH models are considerably higher than that of the ANN model, we can say that as the ANN continues to predict further values it outperforms the other two models.
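The text does not spell out here how the successive 2008 forecasts are generated; one common scheme, shown below as a hedged sketch, is iterated one-step-ahead forecasting, where each prediction is fed back as an input for the next step. The weights W1 and W2 are assumed to come from an already-trained one-hidden-layer tanh network such as the one sketched earlier; this is not necessarily the exact procedure used for Table 4.9.

    import numpy as np

    def iterate_forecasts(history, W1, W2, n_steps):
        """Produce n_steps recursive forecasts from the last observed values."""
        n_lags = W1.shape[0]
        window = list(history[-n_lags:])
        forecasts = []
        for _ in range(n_steps):
            x = np.array(window[-n_lags:], dtype=float)
            y_hat = float(np.tanh(x @ W1) @ W2)   # one-step-ahead prediction
            forecasts.append(y_hat)
            window.append(y_hat)                  # feed the forecast back in
        return forecasts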

Figures 4.13, 4.14 and 4.15 show the in-sample and out-of-sample forecasts produced by the ANN models:

    Figure 4.13: In-sample and out-of-sample forecast for MUR/USD

    Figure 4.14: In-sample and out-of-sample forecast for MUR/EU


    Figure 4.15: In-sample and out-of-sample forecast for MUR/GBP

    4.6.3 Forecasts performance of Period II

Lags                   No. of epochs   No. of neurons   RMSE     MAE      MAPE
ARIMA(5,1,0)           -               -                0.0489   0.0313   0.1025
ARIMA(2,1,3)           -               -                0.0489   0.0314   0.1026
AR(1)-GARCH(1,1)       -               -                0.0499   0.0314   0.1031
1                      50000           1                0.0671   0.0454   0.1505
1-2                    50000           2                0.0547   0.0368   0.1214
1-3                    50000           2                0.0551   0.0364   0.1200
1-4                    50000           2                0.0537   0.0363   0.1198
1-5                    50000           3                0.0544   0.0374   0.1236
1-6                    40000           1                0.0562   0.0396   0.1312
1-7                    50000           2                0.0536   0.0363   0.1200
1-8                    50000           2                0.0534   0.0359   0.1186
1-9                    40000           1                0.0561   0.0392   0.1299
1-10                   50000           2                0.0544   0.0362   0.1193
1-11                   50000           3                0.0581   0.0389   0.1284
1-12                   50000           5                0.0536   0.0357   0.1177

Table 4.10: In-sample performance of MUR/USD data


Lags                   No. of epochs   No. of neurons   RMSE     MAE      MAPE
ARIMA(1,0,0)           -               -                0.2445   0.1806   0.4571
ARIMA(0,1,0)           -               -                0.2445   0.1805   0.4568
AR(1)-GARCH(1,1)       -               -                0.2245   0.1806   0.4571
1                      50000           2                0.2442   0.1817   0.4587
1-2                    50000           3                0.2445   0.1818   0.4591
1-3                    50000           2                0.2453   0.1833   0.4629
1-4                    50000           3                0.2461   0.1841   0.4646
1-5                    50000           2                0.2436   0.1807   0.4560
1-6                    50000           2                0.2448   0.1815   0.4581
1-7                    50000           5                0.2437   0.1813   0.4575
1-8                    50000           4                0.2431   0.1805   0.4553
1-9                    50000           3                0.2445   0.1820   0.4589
1-10                   50000           2                0.2439   0.1811   0.4565
1-11                   50000           5                0.2442   0.1809   0.4560
1-12                   50000           2                0.2452   0.1826   0.4611

Table 4.11: In-sample performance of MUR/EU data


Lags                   No. of epochs   No. of neurons   RMSE     MAE      MAPE
ARIMA(1,0,0)           -               -                0.3181   0.2318   0.4443
ARIMA(0,1,0)           -               -                0.3181   0.2318   0.4441
AR(1)-GARCH(1,1)       -               -                0.2704   0.2026   0.3942
1                      50000           2                0.3120   0.2270   0.4383
1-2                    50000           2                0.3118   0.2273   0.4358
1-3                    30000           1                0.3098   0.2262   0.4263
1-4                    50000           3                0.3113   0.2284   0.4405
1-5                    50000           4                0.3088   0.2260   0.4363
1-6                    50000           2                0.3115   0.2287   0.4410
1-7                    50000           4                0.3110   0.2275   0.4391
1-8                    50000           2                0.3092   0.2251   0.4392
1-9                    50000           2                0.3143   0.2316   0.4477
1-10                   50000           3                0.3113   0.2278   0.4396
1-11                   50000           4                0.3105   0.2278   0.4398
1-12                   50000           2                0.3098   0.2271   0.4381

Table 4.12: In-sample performance of MUR/GBP data


Lags                   No. of epochs   No. of neurons   RMSE     MAE      MAPE
ARIMA(5,1,0)           -               -                1.234    1.220    4.082
ARIMA(2,1,3)           -               -                1.233    1.220    4.080
AR(1)-GARCH(1,2)       -               -                1.233    1.221    4.082
1                      30000           1                0.0501   0.0356   0.1189
1-2                    30000           2                0.0539   0.0422   0.1441
1-3                    30000           1                0.0502   0.0404   0.1349
1-4                    30000           1                0.0494   0.0394   0.1317
1-5                    50000           2                0.0495   0.0386   0.1291
1-6                    30000           1                0.0501   0.0387   0.1295
1-7                    50000           2                0.0512   0.0410   0.1371
1-8                    30000           1                0.0505   0.0396   0.1324
1-9                    50000           2                0.0512   0.0406   0.1357
1-10                   30000           1                0.0472   0.0368   0.1229
1-11                   30000           1                0.0518   0.0405   0.1352
1-12                   30000           1                0.0222   0.0166   0.0553

Table 4.13: Out-of-sample performance of MUR/USD data


Lags                   No. of epochs   No. of neurons   RMSE     MAE      MAPE
ARIMA(1,0,0)           -               -                1.772    1.735    4.408
ARIMA(0,1,0)           -               -                1.780    1.742    4.426
AR(1)-GARCH(1,1)       -               -                1.776    1.739    4.419
1                      30000           1                0.1427   0.1099   0.2783
1-2                    30000           2                0.1424   0.1103   0.2791
1-3                    50000           4                0.1381   0.1047   0.2645
1-4                    30000           1                0.1424   0.1102   0.2790
1-5                    30000           1                0.1423   0.1107   0.2802
1-6                    30000           1                0.1437   0.1122   0.2841
1-7                    50000           2                0.1433   0.1120   0.2835
1-8                    30000           1                0.1452   0.1147   0.2904
1-9                    50000           2                0.1590   0.1237   0.3133
1-10                   50000           2                0.1470   0.1105   0.2798
1-11                   30000           1                0.1410   0.1107   0.2803
1-11                   50000           2                0.1394   0.1170   0.2961
1-12                   30000           1                0.1389   0.1079   0.2730

Table 4.14: Out-of-sample performance of MUR/EU data


Lags                   No. of epochs   No. of neurons   RMSE     MAE      MAPE
ARIMA(1,0,0)           -               -                1.993    1.981    4.249
ARIMA(0,1,0)           -               -                1.934    1.921    4.119
AR(1)-GARCH(1,2)       -               -                1.996    1.985    4.257
1                      30000           1                0.2396   0.1807   0.3694
1-2                    30000           1                0.2440   0.1831   0.3745
1-3                    50000           3                0.1985   0.1624   0.3332
1-4                    30000           1                0.2048   0.1648   0.3381
1-5                    50000           5                0.1990   0.1463   0.3003
1-6                    30000           1                0.1998   0.1470   0.3007
1-7                    50000           3                0.1994   0.1506   0.3089
1-8                    50000           3                0.2025   0.1593   0.3270
1-9                    50000           4                0.1955   0.1535   0.3149
1-10                   30000           1                0.2141   0.1689   0.3466
1-11                   50000           2                0.2142   0.1711   0.3511
1-12                   50000           2                0.2029   0.1506   0.3087

Table 4.15: Out-of-sample performance of MUR/GBP data

    4.6.4 Results and Discussions for Period II

In this experiment the in-sample set for MUR/USD and MUR/EU consists of the values from January 2003 to November 2011, whereas the in-sample set for MUR/GBP consists of the values from January 2002 to November 2011. The out-of-sample set for each of the three series consists of the values of December 2011.
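As a concrete illustration of this split (a sketch only, assuming the observations are held in a pandas Series with a DatetimeIndex; the variable names are illustrative):

    import pandas as pd

    def split_period_ii(rates: pd.Series):
        """Period II split: in-sample through Nov 2011, out-of-sample Dec 2011."""
        in_sample = rates.loc[:"2011-11-30"]       # Jan 2002/2003 - Nov 2011
        out_of_sample = rates.loc["2011-12-01":]   # Dec 2011
        return in_sample, out_of_sample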

Considering the in-sample forecasts of the MUR/USD data, we find that the ARIMA model outperforms all the other models. For the MUR/EU data, the ANN model using Lag(1-8) produces superior results, since its RMSE, MAE and MAPE values are small compared to those of the random walk, ARIMA and GARCH models. For the MUR/GBP forecasts the GARCH model produces better forecasts.

For the out-of-sample forecasts we find that the ANN models outperform the ARIMA, GARCH and random walk models for all three series.


                             Period I                        Period II
                      ANN       ARIMA     GARCH      ANN       ARIMA     GARCH
In-sample      RMSE   0.0308    0.0273    0.0271     0.0536    0.0489    0.0499
               MAE    0.0194    0.0172    0.0172     0.0357    0.0313    0.0314
               MAPE   0.0658    0.0586    0.0587     0.1177    0.1025    0.1031
Out-of-sample  RMSE   0.0704    2.011     2.161      0.0222    1.233     1.233
               MAE    0.0493    1.588     1.562      0.0166    1.220     1.221
               MAPE   0.1687    5.331     5.131      0.0553    4.080     4.082

Table 4.16: In-sample and out-of-sample forecasts of MUR/USD

                             Period I                        Period II
                      ANN       ARIMA     GARCH      ANN       ARIMA     GARCH
In-sample      RMSE   0.1996    0.1993    0.1990     0.2436    0.2445    0.2245
               MAE    0.1563    0.1553    0.1551     0.1807    0.1806    0.1806
               MAPE   0.4235    0.4219    0.4213     0.4560    0.4571    0.4571
Out-of-sample  RMSE   0.3634    1.271     1.265      0.1389    1.772     1.776
               MAE    0.2632    0.8503    0.8439     0.1079    1.735     1.739
               MAPE   0.6174    1.955     1.940      0.2730    4.408     4.419

Table 4.17: In-sample and out-of-sample forecasts of MUR/EU

                             Period I                        Period II
                      ANN       ARIMA     GARCH      ANN       ARIMA     GARCH
In-sample      RMSE   0.2710    0.2710    0.2700     0.3098    0.3181    0.2704
               MAE    0.2041    0.2032    0.2026     0.2262    0.2318    0.2026
               MAPE   0.3848    0.3828    0.3843     0.4263    0.4443    0.3942
Out-of-sample  RMSE   0.4794    4.429     4.137      0.1990    1.993     1.996
               MAE    0.3402    3.755     3.477      0.1463    1.981     1.985
               MAPE   0.6443    7.244     6.708      0.3003    4.249     4.257

Table 4.18: In-sample and out-of-sample forecasts of MUR/GBP


    Chapter 5

    Conclusion

In this project, linear and nonlinear models were used to forecast the exchange rates of the Mauritius Rupee against three foreign currencies: the US Dollar, the Euro and the British Pound. The accuracy of the forecasting models was assessed over two periods of data, the first from January 2002 to December 2008 and the second from January 2002 to December 2011. The reason for this choice is that we wanted to test the ability of our forecasting procedures to provide accurate out-of-sample forecasts during the financial crisis. The artificial neural network (ANN), which is a nonlinear model, was used as an alternative to the linear ARIMA processes and the nonlinear GARCH models. The empirical results show that the ANN models have a superior out-of-sample forecasting performance for both data periods when compared to the other models. For the in-sample forecasts we observed that the ARIMA and ARMA-GARCH models provided a better goodness-of-fit than the ANN models.

One of the reasons behind this is that there are still no well-defined guidelines for building an ANN model to solve a specific problem. Thus, to obtain the best possible ANN forecasting model, rigorous experiments had to be carried out to determine the different parameters used to build the model. Considering the out-of-sample performance of the ANN models, we can say that they proved to be successful when fitted to the foreign exchange rate data, provided that extreme care is taken in designing the network. We can therefore conclude that the ANN model can be used as a complementary tool to different time-series models, and that forecast accuracies can be further improved.


    Bibliography

Bollerslev, T. (1986), ‘Generalized autoregressive conditional heteroscedasticity’, Journal of Econometrics 31, 307–327.

Bollerslev, T. & Ghysels, E. (1996), ‘Periodic autoregressive conditional heteroscedasticity’, Journal of Business & Economic Statistics 14, 139–151.

Cao, L. & Tay, F. (2001), ‘Financial forecasting using support vector machines’, Neural Computing & Applications 10, 184–192.

Engle, R. F. (1982), ‘Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation’, Econometrica 50, 987–1007.

Gencay, R. (1999), ‘Linear, non-linear and essential foreign exchange rate prediction with simple technical trading rules’, Journal of International Economics 47, 91–107.

Haofei, Z., Guoping, X., Fangting, Y. & Han, Y. (2007), ‘A neural network model based on the multi-stage optimization approach for short-term food price forecasting in China’, Expert Systems with Applications 33, 347–356.

    Meade, N. (2002), ‘A comparison of the accuracy of short term foreign exchange forecasting

    methods’, International Journal of Forecasting 18, 67–83.

Panda, C. & Narasimhan, V. (2007), ‘Forecasting exchange rate better with artificial neural network’, Journal of Policy Modeling 29, 227–236.

Bissoondeeal, R. K., Binner, J. M., Bhuruth, M., Gazely, A. & Mootanah, V. P. (2008), ‘Forecasting exchange rates with linear and nonlinear models’, Global Business and Economics Review 10, 414–429.

Racine, J. (2001), ‘On the nonlinear predictability of stock returns using financial and economic variables’, Journal of Business & Economic Statistics 19, 380–382.


Hill, T., O’Connor, M. & Remus, W. (1996), ‘Neural network models for time series forecasts’, Management Science 42, 1082–1092.

Tang, Z. & Fishwick, P. A. (1993), ‘Backpropagation neural nets as models for time series forecasting’, ORSA Journal on Computing 5, 374–385.

Foster, W. R., Collopy, F. & Ungar, L. H. (1992), ‘Neural network forecasting of short, noisy time series’, Computers and Chemical Engineering 16, 293–297.

Wang, J. H. & Leu, J. Y. (1996), ‘Stock market trend prediction using ARIMA-based neural networks’, Proc. of IEEE Int. Conf. on Neural Networks 4, 2160–2165.

Abu-Mostafa, Y. S. & Atiya, A. F. (1996), ‘Introduction to financial forecasting’, Applied Intelligence 6, 205–213.