MACHINE LEARNING IN PRODUCTION Integrating with the Software Stack Angela Bassa • September 2016


MACHINE LEARNING IN PRODUCTION

Integrating with the Software Stack

Angela Bassa • September 2016


TODAY’S AGENDA

Brief Introduction

Learnings in Production

Examples

Questions and Wrap Up

ML in `prd` by @angebassa; probably doesn’t make sense without audio • ai.withthebest.com 2016


HI! I’M ANGELA.

@angebassa

I run Data Science at EnerNOC, where we’re changing the way the world uses energy.


QUICK INTRO TO


• Best-in-class user experience: < 2 seconds response time for 3k peak users per hour

• Reliability: > 99.99% availability for critical apps

• Global architecture: supporting globally deployable applications localized to 15 countries

• Scalability: 150,000 new data streams by 2016, 8TB new monthly data by 2017

• Near real-time: instantaneous access to insights derived from real-time streaming customer data

• DevOps: rapid development of new application capabilities that can be deployed efficiently


Metadata

Monthly Bills

Demand Response

Weather Metrics

Energy Consumption

Utility Partners

© KZawadzki


© Young Entrepreneur Journal

It’s not personal (and it’s not academic either); it’s a business.


RESEARCH & DEVELOPMENT


Knowledge is used to design the blueprint for constructing a program that meets specifications.

© Wikimedia

Knowledge is used to decide the form a program should take:

• Plant the seed (algorithm)

• Feed/water (data)

• Reap the plants (programs)

© Wikimedia

© Kono Designs

© Unknown

MISTAKES WERE MADE

© Unknown

Josh Wills Keynote @ DataEngConf SF16


THE LOVE

• Acceptance Criteria by all parties

• Meta Cost Functions

• The Right Tool for the Job

• Golden Data Sets

• Testing Harnesses

© Charles M. Schulz

LET’S LOOK AT 2 USE CASES:


Flagship: Novel product feature

Under the hood: Anomaly detection and handling


WEATHER NORMALIZATION


• What if we could compare buildings in a portfolio across time and space?

• We have developed a novel methodology that delivers even more granularity than the ‘daily’ industry standard.


PROBLEM DEFINITION

Build a cloud-based application to allow for apples-to-apples comparison of energy consumption (across locations, time frames, building types, etc.) in daily batched transformations at an hourly resolution.

ACCEPTANCE CRITERIA

• Weather normalization is a typical (though burdensome) calculation that ASHRAE Engineers perform—and they know when it “looks right” (e.g. similar days have similar values, hot days display a discount and vice versa, etc.)

• Practical implementation required since the normalized data series are calculated from large native interval datasets.

• Actionable with limited training, meaning that the solution has to be robust to small increments of data since we can’t retrain the generative algorithm every day.


META COST FUNCTION

Cost = 𝑓(Model, Business)

Model:

• Bias/Variance trade-off

• Computational complexity

• Cross validation

• Design optimization

Business:

• Computation costs

• Model explainability

• Product requirements

• Data availability

• Architectural complexity
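One way to read Cost = 𝑓(Model, Business) is as a single score that mixes a model-quality term with business-side penalties. The sketch below is only an illustration under that reading: the `Candidate` fields, the weights, and the numbers are all hypothetical and are not taken from the talk.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    model_error: float     # hypothetical cross-validated error, lower is better
    compute_cost: float    # hypothetical relative cost to run in production
    explainability: float  # hypothetical score: 0 (opaque) to 1 (easy to explain)


def meta_cost(c, w_error=1.0, w_compute=0.5, w_opaque=0.5):
    """Combine model and business terms into one number (assumed weighted sum)."""
    return (
        w_error * c.model_error
        + w_compute * c.compute_cost
        + w_opaque * (1.0 - c.explainability)
    )


candidates = [
    Candidate("linear", model_error=0.20, compute_cost=0.1, explainability=0.9),
    Candidate("ensemble", model_error=0.12, compute_cost=0.8, explainability=0.3),
]

# The candidate with the lowest combined cost wins, not the lowest model error.
best = min(candidates, key=meta_cost)
print(best.name)
```

With these made-up weights the simpler model is selected even though its model error is higher, which is the kind of trade-off a meta cost function is meant to surface.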


GOLDEN DATA SET

Input    Output
𝑎0       𝑎0’
𝑏0       𝑏0’
𝑐0       𝑐0’
…        …

$f(x) = Cx + \varepsilon$
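A golden data set can be sketched as a frozen list of inputs paired with the outputs everyone has agreed are correct, against which a model is checked. The example below is a minimal sketch under assumptions: the values in `GOLDEN`, the coefficient `c=2.0`, and the helper name `check_golden` are all hypothetical.

```python
import math

# Hypothetical golden pairs: (input, agreed-upon expected output).
GOLDEN = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]


def model(x, c=2.0):
    """Stand-in for f(x) = C*x; the coefficient is an assumed value."""
    return c * x


def check_golden(f, golden, tol=1e-6):
    """Return (input, expected, actual) for every pair outside the tolerance."""
    return [
        (x, expected, f(x))
        for x, expected in golden
        if not math.isclose(f(x), expected, rel_tol=0.0, abs_tol=tol)
    ]


failures = check_golden(model, GOLDEN)
print("mismatches:", failures)
```

An empty list means the model reproduces the golden outputs within the tolerance; any entry pinpoints which input drifted and by how much.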


TESTING HARNESSES

Input    Output
𝑎0       𝑎0’
𝑏0       𝑏0’
𝑐0       𝑐0’
…        …

$f(x) = C_i x + \varepsilon$

$f(x) = C_{i+\varepsilon} x + \varepsilon$
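A testing harness in this spirit could run the current model and a candidate revision over the same golden inputs and report where they diverge. This is a sketch only: the inputs, the coefficients, the tolerance, and the function names are assumptions made for illustration.

```python
# Hypothetical golden inputs shared by every model version under test.
GOLDEN_INPUTS = [1.0, 2.0, 3.0]


def make_model(c):
    """Build a stand-in model f(x) = c * x for a given coefficient."""
    return lambda x: c * x


def run_harness(current, candidate, inputs, tolerance):
    """Compare two model versions input by input against a tolerance."""
    report = []
    for x in inputs:
        old, new = current(x), candidate(x)
        report.append(
            {"input": x, "current": old, "candidate": new,
             "ok": abs(new - old) <= tolerance}
        )
    return report


def passes(report):
    """A candidate passes only if every golden input stays within tolerance."""
    return all(row["ok"] for row in report)


current = make_model(2.0)
small_change = make_model(2.01)  # slightly perturbed coefficient
large_change = make_model(2.5)   # clearly different behaviour

small_report = run_harness(current, small_change, GOLDEN_INPUTS, tolerance=0.05)
large_report = run_harness(current, large_change, GOLDEN_INPUTS, tolerance=0.05)
print(passes(small_report), passes(large_report))
```

Keeping the report per input, instead of returning a single boolean, makes it easier to see which cases a new version changed.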


There’s more than one way to skin a cat. But cats are adorable, so why would you want to skin one?!

© Pets4Homes


© The Lion’s Choice

THE RIGHT TOOL FOR THE JOB


LET’S LOOK AT 2 USE CASES:


Flagship: Novel product feature

Under the hood: Anomaly detection and handling


DATA QUALITY


THE THING ABOUT RELIABILITY

© World of HDR

© Left Handed Guitarists

VEE: VERIFICATION, EDITING, AND ESTIMATION

[Figure: two panels, “Before” and “After”]

PROBLEM DEFINITION

Need reliable data to underpin all other analytics on the platform; the solution must run on all incoming time series with minimal latency and near-real-time performance.

ACCEPTANCE CRITERIA

• Has to be supportable by the Production Operations team whenever there’s an issue; could not scale if Data Science were needed at 2am

• Conservative data handling required from a regulatory perspective; overwriting good data with bad data is unacceptable

• Rigid service-level contracts around each step (configuration, flagging, estimation, etc.) so all teams knew what to work on, what was available when, and what to expect.
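The “never overwrite good data” criterion can be illustrated with a small gap-filling routine that copies valid readings unchanged and flags anything it estimates. This is a sketch under stated assumptions: linear interpolation, the flag names, and the function `estimate_gaps` are hypothetical choices for illustration, not the estimation method used on the platform.

```python
def estimate_gaps(readings):
    """Fill interior gaps (None) without touching valid readings.

    Returns (values, flags). Valid readings are copied as-is and flagged
    'valid'. Interior gaps are linearly interpolated (an assumed method)
    and flagged 'estimated'. Gaps at either edge stay None, flagged 'missing'.
    """
    n = len(readings)
    values = list(readings)
    flags = ["valid" if r is not None else "missing" for r in readings]

    i = 0
    while i < n:
        if readings[i] is not None:
            i += 1
            continue
        # Find the end of this run of missing readings: readings[i:j].
        j = i
        while j < n and readings[j] is None:
            j += 1
        # Only estimate when there is a valid neighbour on both sides.
        if i > 0 and j < n:
            left, right = readings[i - 1], readings[j]
            span = j - (i - 1)
            for k in range(i, j):
                values[k] = left + (right - left) * (k - (i - 1)) / span
                flags[k] = "estimated"
        i = j
    return values, flags


raw = [10.0, None, None, 16.0, None]
values, flags = estimate_gaps(raw)
print(values)
print(flags)
```

Because every estimated point carries its own flag and the input list is left untouched, an operations team can audit or revert an estimate without involving the people who wrote the model.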


THE RIGHT TOOL FOR THE JOB

© The Lion’s Choice


© The Lion’s Choice

ONE MORE THING…


© Kono Designs


QUESTIONS?

© Netflix

THANKS!

@AngeBassa • September, 2016