Pavel Filonov, Kaspersky Lab, Deep learning and feature extraction in forecasting

Deep learning and feature extraction for timeseries forecasting

Pavel Filonov

27 May 2016


Outline

- Motivation
  - Cyber Physical Security
- Problem formulation
  - Anomaly detection
  - Time series forecasting
- Artificial Neural Networks
  - Basic model
  - RNN on raw data
  - Feature engineering
  - RNN on extracted features
  - Quasi-periodic timeseries
- Conclusions

Cyber Physical Security

Image from: http://www.wallpaperup.com/455976/STUXNET_virus_iran_nuclear_computer_political_anarchy_windows_microsoft_cyber_hacker_hacking.html

"Pipeline" stand

Signal timeseries

Anomaly detection

Time series forecasting

Forecasting models

- Auto-regression models and EMA (ARMA, ARIMA, GARCH)
- Neural networks
- Adaptive short term forecasting
- Adaptive auto-regression
- Adaptive model selection
- Adaptive model composition
- Density forecast
- Quantile regression
- ...

Neural networks for timeseries forecasting

- Feed forward NN on window [1]
- Recurrent NN
  - Hopfield networks
  - Elman networks
  - Long short term memory [2]
  - Gated Recurrent Unit [3]

[1] https://www.cs.cmu.edu/afs/cs/academic/class/15782-f06/slides/timeseries.pdf
[2] http://colah.github.io/posts/2015-08-Understanding-LSTMs/
[3] http://arxiv.org/pdf/1406.1078v3.pdf

Neuron model

- $x_i$: inputs
- $b$: bias
- $f$: activation function
  - $\sigma(t) = \frac{1}{1 + e^{-t}}$
  - $\tanh(t) = \frac{e^{2t} - 1}{e^{2t} + 1}$
  - $f(t) = t$
  - $f(t) = H(t)$
- $y$: output

Figure: Single neuron
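The neuron model above can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: it assumes the usual weighted-sum form $y = f(\sum_i w_i x_i + b)$, where the weights `w` are not listed on the slide, and it assumes $H$ denotes the Heaviside step function with the convention $H(0) = 1$. The input values and weights are made up for the example.

```python
import math

def sigmoid(t):
    """Logistic function: 1 / (1 + e^-t)."""
    return 1.0 / (1.0 + math.exp(-t))

def identity(t):
    """Linear activation: f(t) = t."""
    return t

def heaviside(t):
    """Step activation; H(0) = 1 is a convention chosen for this sketch."""
    return 1.0 if t >= 0 else 0.0

def neuron(x, w, b, f):
    """Single neuron: activation f applied to the weighted sum of inputs plus bias."""
    return f(sum(wi * xi for wi, xi in zip(w, x)) + b)

x = [0.5, -1.0, 2.0]   # inputs x_i (made-up values)
w = [0.1, 0.4, -0.2]   # weights w_i (assumed, not shown on the slide)
b = 0.3                # bias

for f in (sigmoid, math.tanh, identity, heaviside):
    print(f.__name__, neuron(x, w, b, f))
```

A multilayer feedforward network stacks such neurons in layers, feeding the outputs of one layer to the next as inputs.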

Figure: Multilayer feedforward neural network

LSTM

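$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$

$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$

$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$

$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$

$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$

$h_t = o_t * \tanh(C_t)$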

Picture from: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
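To make the six LSTM equations concrete, here is a minimal pure-Python sketch of one time step. It follows the equations term by term, reading the gate products as element-wise; the helper names (`affine`, `lstm_step`, `init_params`), the small random initialisation and the toy dimensions are assumptions made for illustration only. In practice a library such as Keras would be used instead.

```python
import math
import random

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def affine(W, v, b):
    """Return W · v + b, with W given as a list of rows."""
    return [sum(wij * vj for wij, vj in zip(row, v)) + bi
            for row, bi in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step; p holds the matrices W* and biases b* for gates f, i, C, o."""
    z = h_prev + x_t                                   # concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(p["Wf"], z, p["bf"])]       # forget gate
    i_t = [sigmoid(a) for a in affine(p["Wi"], z, p["bi"])]       # input gate
    c_tilde = [math.tanh(a) for a in affine(p["WC"], z, p["bC"])] # candidate state
    c_t = [f * c + i * g                               # element-wise state update
           for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    o_t = [sigmoid(a) for a in affine(p["Wo"], z, p["bo"])]       # output gate
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

def init_params(n_in, n_hidden, seed=0):
    """Hypothetical initialisation: small uniform weights, zero biases."""
    rng = random.Random(seed)
    p = {}
    for gate in "fiCo":
        p["W" + gate] = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden + n_in)]
                         for _ in range(n_hidden)]
        p["b" + gate] = [0.0] * n_hidden
    return p

params = init_params(n_in=3, n_hidden=2)
h, c = [0.0, 0.0], [0.0, 0.0]
for x_t in ([1.0, 0.0, -1.0], [0.5, 0.5, 0.5]):       # toy input sequence
    h, c = lstm_step(x_t, h, c, params)
print(h, c)
```

The cell state `c` is carried from step to step alongside the hidden state `h`, which is what lets the network retain information over long sequences.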


RNN on raw data

NN topology: 722 input → 64 LSTM + Dropout(0.2) → 722 Linear

Forecast horizon: 5 minutes
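As a rough size check of this topology, the sketch below counts trainable parameters. It assumes the standard LSTM formulation from the LSTM slide (four weight matrices acting on $[h_{t-1}, x_t]$, each with its own bias) and a fully connected linear output layer with bias; the function names are hypothetical. Dropout contributes no parameters.

```python
def lstm_params(n_in, n_hidden):
    """Four gates (f, i, C, o), each with an n_hidden x (n_hidden + n_in) matrix and a bias."""
    return 4 * (n_hidden * (n_hidden + n_in) + n_hidden)

def linear_params(n_in, n_out):
    """Fully connected layer: weight matrix plus bias vector."""
    return n_in * n_out + n_out

n_lstm = lstm_params(n_in=722, n_hidden=64)
n_linear = linear_params(n_in=64, n_out=722)
print("LSTM:", n_lstm, "Linear:", n_linear, "Total:", n_lstm + n_linear)
```

Under these assumptions the network has roughly a quarter of a million weights, most of them in the LSTM layer because of the 722-dimensional input.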

Timeseries segmentation

Figure: processing pipeline.

Stages: Segmentation → Features extraction → Clustering → ...

Intermediate results: signal segments → Features matrix → Clusters → Sequence of labels
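The pipeline (segmentation, feature extraction, clustering, label sequence) can be sketched as follows. The slides do not specify the segmentation rule, the feature set, or the clustering algorithm, so everything concrete here is an assumption: fixed-length non-overlapping segments, three simple statistics per segment, and a small k-means with deterministic farthest-point initialisation. The toy signal is made up.

```python
import math

def segment(signal, length):
    """Split into consecutive non-overlapping segments; an incomplete tail is dropped."""
    return [signal[i:i + length]
            for i in range(0, len(signal) - length + 1, length)]

def features(seg):
    """Hypothetical feature vector: mean, standard deviation, peak-to-peak range."""
    mean = sum(seg) / len(seg)
    std = math.sqrt(sum((v - mean) ** 2 for v in seg) / len(seg))
    return [mean, std, max(seg) - min(seg)]

def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def kmeans(rows, k, iters=20):
    """Plain k-means; returns the cluster labels from the last assignment step."""
    # Farthest-point initialisation keeps this toy example deterministic.
    centers = [rows[0]]
    while len(centers) < k:
        centers.append(max(rows, key=lambda r: min(sq_dist(r, c) for c in centers)))
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(r, centers[j])) for r in rows]
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy signal: flat stretches interleaved with oscillating stretches.
flat = [1.0] * 10
wave = [3 * math.sin(2 * math.pi * i / 10) for i in range(10)]
signal = flat + wave + flat + wave + wave + flat

segments = segment(signal, length=10)            # signal segments
matrix = [features(s) for s in segments]         # features matrix
labels = kmeans(matrix, k=2)                     # clusters -> sequence of labels
print(labels)
```

The resulting label sequence is a much shorter, symbolic description of the signal, which is what the next network takes as input.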

RNN on extracted features

Let n be the number of clusters.

NN structure: n inputs → 10n LSTM → n SoftMax

Forecast horizon: 20 segments
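One plausible reading of "n inputs" and "n SoftMax" is that each cluster label is fed to the network as a one-hot vector of length n, and that the output layer gives a probability for each possible next label. The slide does not state the encoding, so the sketch below is an assumption; the label sequence and the output activations are made-up values.

```python
import math

def one_hot(label, n):
    """Encode a cluster label 0..n-1 as a vector with a single 1."""
    v = [0.0] * n
    v[label] = 1.0
    return v

def softmax(z):
    """Numerically stable softmax: exponentiate shifted values, then normalise."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

n = 3                                   # number of clusters
labels = [0, 1, 0, 2, 1]                # label sequence from clustering (made up)
inputs = [one_hot(lab, n) for lab in labels]

probs = softmax([2.0, 0.5, -1.0])       # hypothetical output-layer activations
predicted = max(range(n), key=lambda j: probs[j])
print(inputs[3], probs, predicted)
```

With this encoding the forecast is a classification over cluster labels rather than a regression on raw signal values.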

Quasi-periodic timeseries

RNN on Quasi-periodic timeseries

NN structure: 61 → 32 LSTM + Dropout(0.2) → 64 LSTM + Dropout(0.2) → 1 Linear

Forecast horizon: 1 minute
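Training such a network requires pairs of input windows and future target values. The sketch below builds them with a sliding window, assuming (the slide does not say so) that the 61 inputs are the 61 most recent samples and that the single linear output is the value a fixed number of steps ahead. The sampling rate is not given, so the horizon of 5 steps and the toy series are placeholders, and `make_windows` is a hypothetical helper.

```python
def make_windows(series, window, horizon):
    """Return (inputs, target) pairs: `window` past samples and the value
    `horizon` steps after the end of the window."""
    pairs = []
    for start in range(len(series) - window - horizon + 1):
        end = start + window
        pairs.append((series[start:end], series[end + horizon - 1]))
    return pairs

series = list(range(100))                       # toy series: 0, 1, ..., 99
pairs = make_windows(series, window=61, horizon=5)

first_inputs, first_target = pairs[0]
print(len(pairs), first_inputs[0], first_inputs[-1], first_target)
```

To express the horizon in minutes, `horizon` would be set to the number of samples per minute of the actual signal.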


Conclusions

Picture from: http://www.simpsonscreative.co.uk/kiss-the-first-law-of-successful-copywriting/

References

- http://keras.io/
- https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2015-56.pdf
- Keras recurrent tutorial: https://github.com/Vict0rSch/deep_learning/tree/master/keras/recurrent
- https://github.com/aurotripathy/lstm-anomaly-detect
- https://github.com/aurotripathy/lstm-ecg-wave-anomaly-detect
- http://simaaron.github.io/Estimating-rainfall-from-weather-radar-readings-using-recurrent-neural-networks/
- http://danielhnyk.cz/predicting-sequences-vectors-keras-using-rnn-lstm/

Software
