28
1 © 2015 The MathWorks, Inc. 인공지능 및 엔터프라이즈 환경 개발을 위한 빅데이터 개발 프레임워크 성호현

인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

1© 2015 The MathWorks, Inc.

인공지능및엔터프라이즈환경개발을위한빅데이터개발프레임워크

성호현

Page 2: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

2

A path for how your team can

better work with and utilize

data.

Page 3: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

3

Data Science Maturity Levels

Ad-hoc

Individual

Analysis

Generally Useful

Tools for

Analysis

Common

Infrastructure;

Tested and

Documented

Overhead when asking

a new question

Ease of scaling

to more people

Page 4: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

4

Data Science Maturity Levels

Ad-hoc

Individual

Analysis

Common

Infrastructure;

Tested and

Documented

Generally Useful

Tools for

Analysis

• Goal is to be fast: reduce time to insight

Page 5: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

5

Getting Started: Exploring a New Dataset

Page 6: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

6

Getting Started: Exploring a New Dataset

Page 7: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

7

Getting Started: Exploring a New Dataset

Page 8: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

8

Getting Started: Exploring a New Dataset Missing Dataismissing

rmmissing

fillmissing

Outliersisoutlier

rmoutliers

filloutliers

Change Pointsischange

Noisy Datasmoothdata

and more…

https://www.mathworks.com/help/

matlab/preprocessing-data.html

Page 9: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

9

Getting Started: Exploring a New Dataset

Page 10: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

10

Getting Started: Exploring a New Datasetgeoplot

geoscatter

geobubble

geodensityplot

https://www.mathworks.com/help/

matlab/geographic-plots.html

Page 11: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

11

Data Science Maturity Levels

Ad-hoc

Individual

Analysis

Common

Infrastructure;

Tested and

Documented

• Explore and understand data

• Document analysis

• Tools will be re-used in next steps

Generally Useful

Tools for

Analysis

Page 12: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

12

Data Science Maturity Levels

Ad-hoc

Individual

Analysis

Common

Infrastructure;

Tested and

Documented

• Apply to different datasets

• Functions/Scripts

• MATLAB Apps

• Trend: Work with BIG DATA

Generally Useful

Tools for

Analysis

Page 13: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

13

Overview of Flight Data

▪ 35 unique aircraft

▪ 180,000 unique flights

▪ 300 GB of data

▪ Source:

– NASA Dash Link: Sample Flight Data

– https://c3.nasa.gov/dashlink/projects/85/

Page 14: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

14

Big Data Creates Opportunities

Find rare events, then dive deeper

Build and validate test scenarios that match real-world conditions

Perform fleet-wide calculations

Page 15: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

15

Big Data Requires New Tools

Built-In Datastores

General datastore

spreadsheetDatastore

tabularTextDatastore

fileDatastore

Database databaseDatastore

Image imageDatastore

denoisingImageDatastore

randomPatchExtractionDatastore

pixelLabelDatastore

augmentedImageDatastore

Audio audioDatastore

Predictive

Maintenance

fileEnsembleDatastore

simulationEnsembleDatastore

Simulink SimulationDatastore

Automotive mdfDatastore

Page 16: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

16

Big Data Requires New Tools

▪ Customize a datastore to work with

your dataset

▪ Gives you control over how data is

loaded and formatted

▪ MATLAB subclass: “fill-in-the-blanks”

▪ Build a piece of infrastructure, then re-

use it in your analyses

function [data,info] = read(ds)

...

end

function tf = hasdata(ds)

...

end

function reset(ds)

...

end

function p = progress(ds)

...

end

function data = readall(ds)

...

end

Custom Datastore

Page 17: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

17

A Custom Datastore for Flight Data

Page 18: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

18

Find Rare Events, then Dive Deeper

Page 19: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

19

Perform Fleet-Wide Calculations

Page 20: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

20

Data Science Maturity Levels

Ad-hoc

Individual

Analysis

Common

Infrastructure;

Tested and

Documented

• Make it easy to navigate the data

• Re-use each time you analyze the dataset

Generally Useful

Tools for

Analysis

Page 21: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

21

Data Science Maturity Levels

Ad-hoc

Individual

Analysis

Common

Infrastructure;

Tested and

Documented

Generally Useful

Tools for

Analysis

• Collaborate: Work with others on a common code base

• Verify: Write well-tested software

• Share: Build tools for others

Page 22: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

22

MATLAB Projects

Page 23: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

23

Testing

Page 24: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

24

Creating a Toolbox

Page 25: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

25

Data Science Maturity Levels

Ad-hoc

Individual

Analysis

Common

Infrastructure;

Tested and

Documented

• Scale-out to larger group of users

• Easier to maintain and share

Generally Useful

Tools for

Analysis

Page 26: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

26

What’s Next?

Advanced Analytics and Machine Learning

Build and Test Algorithms for

Embedded Systems

Deploy Apps and Analytics to

Enterprise IT Systems

Page 27: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

27

Takeaways

▪ MATLAB has many new tools to help you better work with and utilize

your data

▪ Create tools for you / your team / your organization to explore and

analyze data

▪ Increasing maturity with data science is a journey; we’re here to help

Page 28: 인공지능및엔터프라이즈 환경개발을위한빅데이터 개발프레임워크 · 3 Data Science Maturity Levels Ad-hoc Individual Analysis Generally Useful Tools for Analysis

28© 2015 The MathWorks, Inc.

데모부스와상담부스로질문하시기바랍니다.

감사합니다