
Wrangleconf Big Data Malaysia 2016



Overview

● Brief Skymind Intro
● Deep Learning outside research
● Core trends for ROI in deep learning
● Anomaly Detection with deep learning
● Simbox fraud detection for telco
● Network Intrusion
● Fintech securities churn prediction
● Real-time corporate campus security: Detecting dangerous objects


Distributed Deep RL on Spark

We built Deeplearning4j


SKYMIND INTELLIGENCE LAYER (SKIL) REFERENCE ARCHITECTURE


Deep Learning outside research

● Too much hype
● Most companies rarely do machine learning, let alone deep learning
● Beginners try to jump to deep learning after Andrew Ng’s Coursera class without first principles

This is not deep learning.

This is deep learning.


Deep Learning outside research

● Mostly Python and R on Kaggle
● Many learning from Udacity
● Most deep learning is at the research or enthusiast stage
● Salaried engineers doing DL are mostly publishing papers
● Large fight for talent (see Google fellowship)


Deep Learning outside research

● Deep Learning hasn’t penetrated the Fortune 2000
● Fortune 2000 wants ROI, not cat pictures
● Many organizations are only now starting to take software seriously, let alone data science
● Use cases for deep learning are still not widely understood
● Large fight for talent (see Google fellowship)


Core trends for ROI in DL

● Mostly funded by adtech companies
● Companies doing DL have lots of media data (audio, image, video)
● Many companies use DL for ad targeting
● Best use cases target understanding large-scale hidden patterns in data (often cross-domain)
● Time series has largely been ignored


Core trends for ROI in DL

● First attempts at deep learning follow papers (no other examples)

● Many companies end up sticking to simpler techniques after trying DL

● Expectations for DL tend to match hype, not reality

● Some rare cases exist outside this trend (mainly in Asia)


Core trends for ROI in DL

For more trends see: https://www.oreilly.com/ideas/the-current-state-of-machine-intelligence-3-0


Anomaly Detection

● “Find the needle in the haystack”
● “Find the bad guy”
● “The machine’s about to break!”
● “Find the next market rally”
● “Take action on said anomaly”


Anomaly Detection with deep learning

● Both unsupervised and supervised techniques
● LSTMs (time series neural net)
● Autoencoders (unsupervised)
● Expectations for DL tend to match hype, not reality
● Some rare cases exist outside this trend (mainly in Asia)

LSTM

AutoEncoder
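One common way to use an autoencoder for anomaly detection is to flag inputs it reconstructs poorly. The sketch below shows only that scoring step. The trained network is replaced by a stand-in that always reconstructs the column means (`fit_mean_reconstructor`, a hypothetical helper), and the `mean + k * stdev` threshold is an assumption, not something the slides specify.

```python
import statistics

def fit_mean_reconstructor(rows):
    """Stand-in for a trained autoencoder: always reconstructs the column means."""
    means = [statistics.fmean(col) for col in zip(*rows)]
    return lambda row: means

def reconstruction_error(row, reconstructed):
    """Mean squared error between an input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(row, reconstructed)) / len(row)

def flag_anomalies(train_rows, new_rows, reconstruct, k=3.0):
    """Flag rows whose error exceeds mean + k * stdev of the training errors."""
    train_errors = [reconstruction_error(r, reconstruct(r)) for r in train_rows]
    threshold = statistics.fmean(train_errors) + k * statistics.pstdev(train_errors)
    return [reconstruction_error(r, reconstruct(r)) > threshold for r in new_rows]

# Toy data: normal rows cluster around (1, 2); the second new row is far away.
train = [[1.0, 2.0], [1.1, 1.9], [0.9, 2.1], [1.0, 2.0]]
reconstruct = fit_mean_reconstructor(train)
flags = flag_anomalies(train, [[1.0, 2.0], [8.0, -3.0]], reconstruct)
```

With a real model, `reconstruct` would be the network's forward pass and the threshold would be tuned on held-out data.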


Simbox fraud for telco

● Costs telcos over 3 billion yearly
● Route calls for free over a carrier network
● Need to mine raw call detail records (CDRs) to find
● Find and cluster fraudulent CDRs with autoencoders (unsupervised)
● Beats current rule-based and supervised approaches
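Mining raw CDRs typically starts by aggregating them per SIM before any model sees them. A minimal sketch of that step follows. The record keys (`caller`, `callee`, `duration_s`) and the choice of features are assumptions for illustration; the slides do not say which fields or features were used.

```python
from collections import defaultdict

def per_sim_features(cdrs):
    """Aggregate raw CDRs into one feature row per calling SIM.

    Each CDR is a dict with hypothetical keys: caller, callee, duration_s.
    """
    outgoing = defaultdict(int)
    incoming = defaultdict(int)
    callees = defaultdict(set)
    total_duration = defaultdict(float)
    for cdr in cdrs:
        outgoing[cdr["caller"]] += 1
        incoming[cdr["callee"]] += 1
        callees[cdr["caller"]].add(cdr["callee"])
        total_duration[cdr["caller"]] += cdr["duration_s"]
    features = {}
    for sim, n_out in outgoing.items():
        features[sim] = {
            "outgoing_calls": n_out,
            "incoming_calls": incoming.get(sim, 0),
            "distinct_callee_ratio": len(callees[sim]) / n_out,
            "mean_duration_s": total_duration[sim] / n_out,
        }
    return features

cdrs = [
    {"caller": "A", "callee": "B", "duration_s": 60.0},
    {"caller": "A", "callee": "C", "duration_s": 30.0},
    {"caller": "B", "callee": "A", "duration_s": 90.0},
]
features = per_sim_features(cdrs)
```

Rows like these could then be fed to an unsupervised model and clustered.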


Network Intrusion

● Raw web log traffic
● Detect attacks at points of origin
● Typically supervised learning
● Goal: Classify raw time series to find attacks
● Optional: Detect *kind* of attack
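Classifying a raw time series usually means cutting it into fixed-length windows and labeling each one. A minimal sketch of that preparation step is below. The per-second request counts, the window size, and the "any attack step marks the window" labeling rule are all assumptions for illustration.

```python
def sliding_windows(series, size, step=1):
    """Split a sequence into overlapping fixed-length windows."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    return [series[i:i + size] for i in range(0, len(series) - size + 1, step)]

def window_labels(labels, size, step=1):
    """Label a window 1 if any time step inside it is marked as an attack."""
    return [max(w) for w in sliding_windows(labels, size, step)]

# Hypothetical requests-per-second counts and per-second attack labels.
requests_per_second = [3, 4, 2, 50, 48, 3]
is_attack = [0, 0, 0, 1, 1, 0]

windows = sliding_windows(requests_per_second, size=3)
targets = window_labels(is_attack, size=3)
```

Each `(window, target)` pair is then one training example for a supervised sequence classifier; predicting the kind of attack would swap the 0/1 labels for class ids.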


Fintech securities churn prediction

● Predict when a user is going to leave the service
● Using recurrent nets, find likelihood of leaving
● Using lift curves, identify budget for sending discounts to the percentage of users “worth” saving
● Optional: use autoencoders with k-means to identify groups of users wanting to leave
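Lift compares the churn rate among the users the model ranks highest with the churn rate across all users. A minimal sketch of one point on a lift curve is below. The scores and labels are made-up illustration data, and the function assumes at least one user actually churned.

```python
def lift_at(scores, churned, fraction):
    """Lift of the top `fraction` of users ranked by predicted churn score.

    Lift = churn rate among the targeted users / churn rate among all users.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(zip(scores, churned), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:n_top]) / n_top
    base_rate = sum(churned) / len(churned)
    return top_rate / base_rate

# Hypothetical model scores and observed outcomes (1 = user left).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05, 0.03, 0.01]
churned = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

lift_top_20 = lift_at(scores, churned, 0.2)
lift_all = lift_at(scores, churned, 1.0)
```

Evaluating `lift_at` over a range of fractions gives the lift curve; the discount budget can then be cut off where targeting more users stops beating random selection by a useful margin.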


Corporate campus security

● At 30 FPS or more find dangerous objects in a crowd

● Identify a target object and send immediate report

● Uses variants of Convolutional nets

● Imagine hooking this up to a real camera
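The 30 FPS target translates directly into a per-frame time budget of roughly 33 ms. A small sketch of that arithmetic follows. The function names are hypothetical, and it assumes frames are processed one at a time with no batching or pipelining.

```python
def frame_budget_ms(fps):
    """Time available to process one frame at a given frame rate."""
    return 1000.0 / fps

def keeps_up(per_frame_latency_ms, fps=30):
    """True if per-frame processing fits inside the frame budget."""
    return per_frame_latency_ms <= frame_budget_ms(fps)

budget = frame_budget_ms(30)
```

Detection, any post-processing, and sending the report all have to fit inside that budget for the system to stay real time.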


Conclusion

● Deep Learning still young
● Many use cases not being tried
● Research is moving faster every year
● Talent still hard to find
● Will become more common with time
