Anomaly detection & Recommend Engine, Mario Cho (조만석), [email protected], Onsudang Machine Learning, Week 12

EMT machine learning 12th weeks : Anomaly detection


Page 1: EMT machine learning 12th weeks : Anomaly detection

Anomaly detection & Recommend Engine
Mario Cho (조만석)

[email protected]

Onsudang Machine Learning, Week 12

Page 2: EMT machine learning 12th weeks : Anomaly detection

Who am I?

Development Experience
◆ Image Recognition using Neural Network
◆ Bio-Medical Data Processing
◆ Human Brain Mapping on High Performance Computing
◆ Medical Image Reconstruction (Computed Tomography)
◆ Enterprise System
◆ Open Source Software Developer

Open Source Software Developer
◆ Linux Kernel & LLVM
◆ OPNFV (NFV & SDN) & OpenStack
◆ Machine Learning (TensorFlow)

Book
◆ Unix V6 Kernel

Korea Open Source Software Lab.
Mario Cho

[email protected]

Page 3: EMT machine learning 12th weeks : Anomaly detection

Problem Motivation
• Just like in other learning problems, we are given a dataset.
• We want to know whether a given example is abnormal/anomalous or not.
• Define a "model" p(x):
  - it tells us the probability that the example is not anomalous;
  - we also use a threshold ε (epsilon) as a dividing line, so we can say which examples are anomalous and which are not.

Page 4: EMT machine learning 12th weeks : Anomaly detection

Example of Anomaly detection
• Aircraft engine features:
• Dataset: { x(1), x(2), x(3), ..., x(m) }
• New engine: xtest
• Features
  - x1 = heat generated
  - x2 = vibration intensity
  - x3 = …
  - ...
  - xm = ...

Page 5: EMT machine learning 12th weeks : Anomaly detection

Example of Anomaly detection
• Aircraft engine features:
• Features
  - x1 = heat generated
  - x2 = vibration intensity

Page 6: EMT machine learning 12th weeks : Anomaly detection

Example of Anomaly detection
• Density estimation
• Dataset: { x(1), x(2), x(3), ..., x(m) }
• Is the new engine xtest anomalous?

For the model p(x):

p(xtest) < ε → flag as anomaly

p(xtest) >= ε → not an anomaly (normal)

Page 7: EMT machine learning 12th weeks : Anomaly detection

Monitoring computers in a data center

Page 8: EMT machine learning 12th weeks : Anomaly detection

Anomaly detection process

Page 9: EMT machine learning 12th weeks : Anomaly detection

Anomaly detection example
• Fraud detection
  • x(i) = features of user i's activities
  • Model p(x) from data
  • Identify unusual users by checking which have p(x) < ε
• Manufacturing
  • x(i) = features of process i
  • Model p(x) from measured data
  • Identify unusual products by checking which have p(x) < ε
• Monitoring computers in a data center
  • x(i) = features of machine i
  • x1 = memory use
  • x2 = number of disk accesses / sec
  • x3 = CPU load
  • Identify unusual status by checking which have p(x) < ε

Page 10: EMT machine learning 12th weeks : Anomaly detection

Gaussian (Normal) distribution

Page 11: EMT machine learning 12th weeks : Anomaly detection

Gaussian distribution
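
For reference, the standard univariate Gaussian density with mean μ and variance σ² can be written as follows (LaTeX notation; the symbols μ and σ² are the usual ones and are assumed here):

```latex
x \sim \mathcal{N}(\mu, \sigma^2), \qquad
p(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma}
  \exp\!\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)
```

μ sets the center of the bell curve and σ its width; the area under the curve is always 1, so a smaller σ gives a taller, narrower peak.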

Page 12: EMT machine learning 12th weeks : Anomaly detection

Parameter estimation
• Dataset: { x(1), x(2), x(3), ..., x(m) }
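
Assuming each x(i) is a real number drawn from a Gaussian, the usual maximum-likelihood estimates of its two parameters are sketched below. The 1/m normalization is an assumption; some texts use 1/(m−1), which makes little difference for large m.

```latex
\mu = \frac{1}{m} \sum_{i=1}^{m} x^{(i)}, \qquad
\sigma^2 = \frac{1}{m} \sum_{i=1}^{m} \left( x^{(i)} - \mu \right)^2
```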

Page 13: EMT machine learning 12th weeks : Anomaly detection

Density estimation
• Training set: { x(1), x(2), x(3), ..., x(m) }
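
A common way to build p(x) from such a training set, assuming each example has n features that are modeled as independent Gaussians (an assumption of this sketch), is to multiply the per-feature densities:

```latex
p(x) = \prod_{j=1}^{n} p\!\left(x_j; \mu_j, \sigma_j^2\right)
     = \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_j}
       \exp\!\left( -\frac{(x_j - \mu_j)^2}{2\sigma_j^2} \right)
```

Here μ_j and σ_j² are the mean and variance of feature j estimated from the training set.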

Page 14: EMT machine learning 12th weeks : Anomaly detection

Anomaly detection algorithm
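
The algorithm can be sketched in Python as three steps: fit a mean and variance per feature, compute p(x) as a product of per-feature Gaussian densities, and flag x when p(x) < ε. This is a minimal sketch that assumes independent, roughly Gaussian features; the function names (`fit`, `density`, `is_anomaly`) and the toy data are hypothetical and only for illustration.

```python
import math


def fit(X):
    """Estimate the per-feature mean and variance (dividing by m) from rows of X."""
    m, n = len(X), len(X[0])
    mu = [sum(row[j] for row in X) / m for j in range(n)]
    sigma2 = [sum((row[j] - mu[j]) ** 2 for row in X) / m for j in range(n)]
    return mu, sigma2


def gaussian_pdf(x, mu, sigma2):
    """Univariate Gaussian density with mean mu and variance sigma2."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


def density(x, mu, sigma2):
    """p(x): product of the per-feature Gaussian densities."""
    prob = 1.0
    for xj, mj, sj in zip(x, mu, sigma2):
        prob *= gaussian_pdf(xj, mj, sj)
    return prob


def is_anomaly(x, mu, sigma2, epsilon):
    """Flag x as anomalous when its density falls below the threshold epsilon."""
    return density(x, mu, sigma2) < epsilon


if __name__ == "__main__":
    # Hypothetical training data: two features per example.
    X_train = [[1.0, 10.0], [2.0, 12.0], [3.0, 11.0], [2.0, 11.0]]
    mu, sigma2 = fit(X_train)
    for x in ([2.0, 11.0], [10.0, 30.0]):
        print(x, "anomaly" if is_anomaly(x, mu, sigma2, epsilon=0.01) else "normal")
```

For many features the product of small densities can underflow; summing log-densities and comparing against log ε is a common variant.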

Page 15: EMT machine learning 12th weeks : Anomaly detection

Example of Anomaly detection

p(xtest(1)) = 0.0426

p(xtest(1)) >= ε (0.02)

xtest(1) : normal

p(xtest(2)) = 0.0021

p(xtest(2)) < ε (0.02)

xtest(2) : anomalous
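
The decision for the two test points can be reproduced with a small Python sketch. The density values and the threshold 0.02 are the ones given on the slide; the helper name `label` is hypothetical.

```python
def label(p_x, epsilon):
    """Return 'anomalous' when the density is below the threshold, else 'normal'."""
    return "anomalous" if p_x < epsilon else "normal"


EPSILON = 0.02  # threshold from the slide

# Density values from the slide for the two test examples.
for name, p_x in [("xtest(1)", 0.0426), ("xtest(2)", 0.0021)]:
    print(name, label(p_x, EPSILON))
```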

Page 16: EMT machine learning 12th weeks : Anomaly detection

The importance of real-number evaluation

Page 17: EMT machine learning 12th weeks : Anomaly detection

Aircraft engines motivating example

Page 18: EMT machine learning 12th weeks : Anomaly detection

Algorithm evaluation
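
One common way to evaluate an anomaly detector, assumed here rather than taken from the slides, is to hold out a small labeled cross-validation set (y = 1 for anomalies), score it with a single number such as F1, and pick the ε that scores best. Accuracy is usually a poor choice because anomalies are rare. The names `f1_score` and `select_epsilon` and the candidate list are hypothetical.

```python
def f1_score(y_true, y_pred):
    """F1 score for binary labels, where 1 marks an anomaly."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision/recall undefined or zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)


def select_epsilon(p_cv, y_cv, candidates):
    """Return (best_epsilon, best_f1) over the candidate thresholds.

    p_cv holds the model density p(x) for each cross-validation example,
    y_cv the corresponding ground-truth labels.
    """
    best_eps, best_f1 = None, -1.0
    for eps in candidates:
        y_pred = [1 if p < eps else 0 for p in p_cv]
        score = f1_score(y_cv, y_pred)
        if score > best_f1:
            best_eps, best_f1 = eps, score
    return best_eps, best_f1
```

The chosen ε would then be checked once on a separate test set, so that the reported score is not biased by the threshold search.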

Page 19: EMT machine learning 12th weeks : Anomaly detection

Anomaly detection vs. Supervised learning

Anomaly detection
• Very small number of positive examples
  • Positive (y = 1): 0~20
  • Negative (y = 0): large
• Many different "types" of anomalies.
• Hard for an algorithm to learn from the few positive examples what anomalies look like.
• Future anomalies may look nothing like any of the anomalous examples we've seen so far.

Supervised learning
• Large numbers of both positive and negative examples
  • Positive (y = 1): large
  • Negative (y = 0): large
• Enough positive examples for the algorithm to get a sense of what positive examples are like.
• Easy to learn from positive examples that resemble one another.
• Future positive examples are likely to be similar to the ones in the training set.

Page 20: EMT machine learning 12th weeks : Anomaly detection

Anomaly detection vs. Supervised learning

Anomaly detection
• Fraud detection
• Manufacturing, e.g.:
  • aircraft engines
  • manufacturing processes
• Monitoring machines

Supervised learning
• Email spam classification
• Weather prediction (sunny / rainy / cloudy)
• Cancer classification

Page 21: EMT machine learning 12th weeks : Anomaly detection

Choosing what features to use

Page 22: EMT machine learning 12th weeks : Anomaly detection

Error analysis for anomaly detection
• Want:
  • p(x) large for normal examples x.
  • p(x) small for anomalous examples x.
• Most common problem:
  • p(x) is comparable (say, both large) for normal and anomalous examples.

Page 23: EMT machine learning 12th weeks : Anomaly detection

Monitoring computers in a data center
• Choose features that might take on unusually large or small values in the event of an anomaly.
• x(i) = features of machine i
  • x1 = memory use
  • x2 = number of disk accesses / sec
  • x3 = CPU load
  • x4 = network traffic

Page 24: EMT machine learning 12th weeks : Anomaly detection

Motivating example: Monitoring machine

Page 25: EMT machine learning 12th weeks : Anomaly detection

Motivating example: Monitoring machine

Page 26: EMT machine learning 12th weeks : Anomaly detection

Motivating example: Monitoring machine

Page 27: EMT machine learning 12th weeks : Anomaly detection

Multivariate Gaussian(normal) distribution
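
For reference, the standard density of an n-dimensional Gaussian with mean vector μ and covariance matrix Σ can be written as follows (LaTeX notation; the symbols are the conventional ones and are assumed here):

```latex
p(x; \mu, \Sigma) =
  \frac{1}{(2\pi)^{n/2}\, |\Sigma|^{1/2}}
  \exp\!\left( -\tfrac{1}{2} (x - \mu)^{\top} \Sigma^{-1} (x - \mu) \right),
\qquad \mu \in \mathbb{R}^{n},\ \Sigma \in \mathbb{R}^{n \times n}
```

The diagonal of Σ controls the spread along each feature, and the off-diagonal entries capture correlation between features, which tilts the density contours away from the axes.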

Page 28: EMT machine learning 12th weeks : Anomaly detection

Multivariate Gaussian(normal) distribution

Page 29: EMT machine learning 12th weeks : Anomaly detection

Multivariate Gaussian(normal) distribution

Page 30: EMT machine learning 12th weeks : Anomaly detection

Multivariate Gaussian(normal) distribution

Page 31: EMT machine learning 12th weeks : Anomaly detection

Multivariate Gaussian(normal) distribution

Page 32: EMT machine learning 12th weeks : Anomaly detection

Multivariate Gaussian(normal) distribution

Page 33: EMT machine learning 12th weeks : Anomaly detection

Multivariate Gaussian(normal) distribution

Page 34: EMT machine learning 12th weeks : Anomaly detection

Multivariate Gaussian(normal) distribution

Page 35: EMT machine learning 12th weeks : Anomaly detection

Anomaly detection with the multivariate Gaussian
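
A minimal pure-Python sketch of this variant: estimate μ and Σ from the training set, evaluate the multivariate density, and flag x when p(x) < ε. It assumes Σ is positive definite (which needs more examples than features and no redundant features) and uses a Cholesky factorization instead of an explicit inverse. All function names and the 1/m normalization are assumptions of this sketch.

```python
import math


def fit_multivariate(X):
    """Return the mean vector and covariance matrix (dividing by m) of rows of X."""
    m, n = len(X), len(X[0])
    mu = [sum(row[j] for row in X) / m for j in range(n)]
    sigma = [[sum((row[a] - mu[a]) * (row[b] - mu[b]) for row in X) / m
              for b in range(n)] for a in range(n)]
    return mu, sigma


def cholesky(A):
    """Lower-triangular L with A = L L^T; A must be symmetric positive definite."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("covariance matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


def multivariate_pdf(x, mu, sigma):
    """Density of the multivariate Gaussian N(mu, sigma) at point x."""
    n = len(x)
    L = cholesky(sigma)
    d = [xi - mi for xi, mi in zip(x, mu)]
    # Forward substitution: solve L y = d, so (x-mu)^T Sigma^{-1} (x-mu) = y . y
    y = []
    for i in range(n):
        y.append((d[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    quad = sum(v * v for v in y)
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))  # log |Sigma|
    return math.exp(-0.5 * (n * math.log(2.0 * math.pi) + log_det + quad))


def is_anomaly(x, mu, sigma, epsilon):
    """Flag x as anomalous when its density falls below the threshold epsilon."""
    return multivariate_pdf(x, mu, sigma) < epsilon
```

With a numerical library available, the same computation is usually delegated to its linear-algebra routines; the hand-written factorization here only keeps the example dependency-free.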

Page 36: EMT machine learning 12th weeks : Anomaly detection

Relationship to original model
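
The link between the two models can be made explicit with a short derivation, assuming the original model is the product of independent per-feature Gaussians: when Σ is diagonal, the multivariate density factorizes into exactly that product.

```latex
\begin{aligned}
\Sigma &= \operatorname{diag}\!\left(\sigma_1^2, \dots, \sigma_n^2\right)
\;\Longrightarrow\;
|\Sigma| = \prod_{j=1}^{n} \sigma_j^2, \qquad
(x - \mu)^{\top} \Sigma^{-1} (x - \mu) = \sum_{j=1}^{n} \frac{(x_j - \mu_j)^2}{\sigma_j^2} \\[4pt]
p(x; \mu, \Sigma)
&= \frac{1}{(2\pi)^{n/2} \prod_{j=1}^{n} \sigma_j}
   \exp\!\left( -\sum_{j=1}^{n} \frac{(x_j - \mu_j)^2}{2\sigma_j^2} \right)
 = \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_j}
   \exp\!\left( -\frac{(x_j - \mu_j)^2}{2\sigma_j^2} \right)
\end{aligned}
```

So the original model is the special case of the multivariate Gaussian whose contours are aligned with the feature axes.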

Page 37: EMT machine learning 12th weeks : Anomaly detection

Original model vs. multivariate Gaussian

Page 38: EMT machine learning 12th weeks : Anomaly detection

Thank you!

Q&A