© 2015 The MathWorks, Inc.

A Big Data Development Framework for Artificial Intelligence and Enterprise Environment Development

성호현

A path for how your team can better work with and utilize data.

Data Science Maturity Levels

Levels:
• Ad-hoc Individual Analysis
• Generally Useful Tools for Analysis
• Common Infrastructure; Tested and Documented

Diagram axes: overhead when asking a new question; ease of scaling to more people

Data Science Maturity Levels

Ad-hoc Individual Analysis | Generally Useful Tools for Analysis | Common Infrastructure; Tested and Documented

• Goal is to be fast: reduce time to insight

Getting Started: Exploring a New Dataset

Preprocessing functions:
• Missing Data: ismissing, rmmissing, fillmissing
• Outliers: isoutlier, rmoutliers, filloutliers
• Change Points: ischange
• Noisy Data: smoothdata
• and more…

https://www.mathworks.com/help/matlab/preprocessing-data.html
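As a rough illustration of how the preprocessing functions listed above fit together, the following minimal sketch cleans a short numeric trace. The data values and variable names are made up for this example and are not from the presentation.

```matlab
% Hypothetical sensor trace with one gap (NaN) and one spike (made-up values).
x = [10 11 NaN 13 14 100 16 17 18 19];

% Missing data: locate the gap, then fill it by linear interpolation.
missingMask = ismissing(x);
xFilled = fillmissing(x, "linear");

% Outliers: flag them with the default detection method,
% then replace them by linear interpolation between neighbors.
outlierMask = isoutlier(xFilled);
xClean = filloutliers(xFilled, "linear");

% Noisy data: smooth with a 3-point moving average.
xSmooth = smoothdata(xClean, "movmean", 3);
```

The masks are kept separately so that the analysis can report how many samples were repaired before any smoothing is applied.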

Getting Started: Exploring a New Dataset

Geographic plotting functions: geoplot, geoscatter, geobubble, geodensityplot

https://www.mathworks.com/help/matlab/geographic-plots.html
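A minimal sketch of two of the geographic plotting functions named above. The coordinates are made-up illustration values, not data from the presentation.

```matlab
% Hypothetical track: made-up latitude/longitude points in degrees.
lat = [37.57 36.35 35.87 35.18];
lon = [126.98 127.38 128.60 129.08];

% Draw the track as a line on a map.
h = geoplot(lat, lon, "-");

% Overlay the individual positions as markers on the same geographic axes.
hold on
s = geoscatter(lat, lon, 60);
hold off
```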

Data Science Maturity Levels

Ad-hoc Individual Analysis | Generally Useful Tools for Analysis | Common Infrastructure; Tested and Documented

• Explore and understand data
• Document analysis
• Tools will be re-used in next steps

Data Science Maturity Levels

Ad-hoc Individual Analysis | Generally Useful Tools for Analysis | Common Infrastructure; Tested and Documented

• Apply to different datasets
• Functions/Scripts
• MATLAB Apps
• Trend: Work with BIG DATA


Overview of Flight Data

▪ 35 unique aircraft

▪ 180,000 unique flights

▪ 300 GB of data

▪ Source:

– NASA Dash Link: Sample Flight Data

– https://c3.nasa.gov/dashlink/projects/85/


Big Data Creates Opportunities

Find rare events, then dive deeper

Build and validate test scenarios that match real-world conditions

Perform fleet-wide calculations

Big Data Requires New Tools

Built-In Datastores
▪ General: datastore, spreadsheetDatastore, tabularTextDatastore, fileDatastore
▪ Database: databaseDatastore
▪ Image: imageDatastore, denoisingImageDatastore, randomPatchExtractionDatastore, pixelLabelDatastore, augmentedImageDatastore
▪ Audio: audioDatastore
▪ Predictive Maintenance: fileEnsembleDatastore, simulationEnsembleDatastore
▪ Simulink: SimulationDatastore
▪ Automotive: mdfDatastore
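To make the datastore idea concrete, here is a minimal sketch that reads a CSV file in chunks with tabularTextDatastore and keeps a running maximum. The file name, the column names (FlightID, Altitude), and the values are assumptions made up for this example; a real dataset would be far too large to write out like this.

```matlab
% Write a tiny CSV so the example is self-contained (made-up values).
T = table([1; 1; 2; 2], [100; 250; 120; 310], ...
    'VariableNames', {'FlightID', 'Altitude'});
csvFile = fullfile(tempdir, 'flights_example.csv');
writetable(T, csvFile);

% Point a datastore at the file and read it two rows at a time.
ds = tabularTextDatastore(csvFile);
ds.ReadSize = 2;

% Process chunk by chunk so the full file never has to fit in memory.
maxAltitude = -Inf;
while hasdata(ds)
    chunk = read(ds);   % table with up to ReadSize rows
    maxAltitude = max(maxAltitude, max(chunk.Altitude));
end
```

The same loop works unchanged when the datastore points at a folder of many files, which is the main reason to reach for a datastore instead of loading everything at once.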

Big Data Requires New Tools

▪ Customize a datastore to work with your dataset
▪ Gives you control over how data is loaded and formatted
▪ MATLAB subclass: “fill-in-the-blanks”
▪ Build a piece of infrastructure, then re-use it in your analyses

Custom Datastore

function [data,info] = read(ds)
    ...
end

function tf = hasdata(ds)
    ...
end

function reset(ds)
    ...
end

function p = progress(ds)
    ...
end

function data = readall(ds)
    ...
end
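One way the "fill-in-the-blanks" methods might be completed is sketched below. This is not the presenter's implementation: the class name FlightDatastore, its properties, and the one-MAT-file-per-flight layout are all assumptions chosen for illustration. The sketch subclasses matlab.io.Datastore and implements hasdata, read, reset, and progress; readall is left to the base class.

```matlab
classdef FlightDatastore < matlab.io.Datastore
    % Hypothetical sketch: one MAT-file per flight, read one file per call.

    properties (Access = private)
        Files          % string array of MAT-file paths (assumed layout)
        CurrentIndex   % index of the next file to read
    end

    methods
        function ds = FlightDatastore(files)
            % files: list of MAT-file paths supplied by the caller.
            ds.Files = string(files);
            reset(ds);
        end

        function tf = hasdata(ds)
            % True while at least one file has not been read yet.
            tf = ds.CurrentIndex <= numel(ds.Files);
        end

        function [data, info] = read(ds)
            % Load the next file and advance the position.
            if ~hasdata(ds)
                error("FlightDatastore:noMoreData", "No more data to read.");
            end
            info.Filename = ds.Files(ds.CurrentIndex);
            data = load(info.Filename);   % struct of the variables in the file
            ds.CurrentIndex = ds.CurrentIndex + 1;
        end

        function reset(ds)
            % Return to the first file.
            ds.CurrentIndex = 1;
        end
    end

    methods (Hidden = true)
        function frac = progress(ds)
            % Fraction of files already read, between 0 and 1.
            if isempty(ds.Files)
                frac = 1;
            else
                frac = (ds.CurrentIndex - 1) / numel(ds.Files);
            end
        end
    end
end
```

Saved as FlightDatastore.m, the class can be driven with the usual loop: construct it from a list of file paths, then call read while hasdata returns true. Keeping the file-format details inside read means later analyses only ever see the loaded data.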


A Custom Datastore for Flight Data


Find Rare Events, then Dive Deeper


Perform Fleet-Wide Calculations
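The slide does not show how the fleet-wide calculation was done. One possible approach, assumed here rather than taken from the presentation, is to wrap a datastore in a tall array so that a statistic is computed across all files without loading them at once. The file name, column names, and values below are made up for illustration.

```matlab
% Write a tiny CSV so the example is self-contained (made-up values).
T = table([1; 1; 2; 2], [100; 250; 120; 310], ...
    'VariableNames', {'FlightID', 'Altitude'});
csvFile = fullfile(tempdir, 'fleet_example.csv');
writetable(T, csvFile);

% A tall table backed by the datastore; nothing is read yet.
ds = tabularTextDatastore(csvFile);
tt = tall(ds);

% Operations on tall arrays are deferred until gather is called.
meanAltitudeDeferred = mean(tt.Altitude);
meanAltitude = gather(meanAltitudeDeferred);
```

Because evaluation is deferred, several statistics can be queued and then gathered together, which lets MATLAB combine them into fewer passes over the data.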

Data Science Maturity Levels

Ad-hoc Individual Analysis | Generally Useful Tools for Analysis | Common Infrastructure; Tested and Documented

• Make it easy to navigate the data
• Re-use each time you analyze the dataset

Data Science Maturity Levels

Ad-hoc Individual Analysis | Generally Useful Tools for Analysis | Common Infrastructure; Tested and Documented

• Collaborate: Work with others on a common code base
• Verify: Write well-tested software
• Share: Build tools for others


MATLAB Projects


Testing
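The slide itself shows no test code. As a minimal sketch of what well-tested analysis code can look like, here is a function-based unit test written with the MATLAB unit testing framework. The file name fillGapsTest.m and the test names are hypothetical; the code under test is simply the built-in fillmissing.

```matlab
% Contents of fillGapsTest.m (hypothetical file name).
function tests = fillGapsTest
    % Collect every local test function in this file.
    tests = functiontests(localfunctions);
end

function testFillsInteriorGap(testCase)
    % A single NaN between 1 and 3 should be interpolated to 2.
    actual = fillmissing([1 NaN 3], "linear");
    verifyEqual(testCase, actual, [1 2 3], "AbsTol", 1e-12);
end

function testLeavesCompleteDataUnchanged(testCase)
    % Data without gaps should pass through untouched.
    actual = fillmissing([4 5 6], "linear");
    verifyEqual(testCase, actual, [4 5 6]);
end
```

With the file on the MATLAB path, the tests can be run with runtests("fillGapsTest"), which returns one result per test function.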


Creating a Toolbox

Data Science Maturity Levels

Ad-hoc Individual Analysis | Generally Useful Tools for Analysis | Common Infrastructure; Tested and Documented

• Scale-out to larger group of users
• Easier to maintain and share

What’s Next?

▪ Advanced Analytics and Machine Learning
▪ Build and Test Algorithms for Embedded Systems
▪ Deploy Apps and Analytics to Enterprise IT Systems

Takeaways

▪ MATLAB has many new tools to help you better work with and utilize your data
▪ Create tools for you / your team / your organization to explore and analyze data
▪ Increasing maturity with data science is a journey; we’re here to help


Please bring your questions to the demo booth and the consultation booth.

Thank you.