Big Data Architecture & Business Value TAHEREH SAHEB PHD IN STS FROM RPI. NY ASSISTANT PROFESSOR AT TARBIAT MODARES UNIVERSITY [email protected]

Big data Architecture


Page 1: Big data Architecture


Page 2: Big data Architecture

Big data is an enterprise asset and needs to be managed as an integrated element of your current Enterprise Architecture.

Page 3: Big data Architecture

Big data: discovering patterns, predicting the future, and delivering prescriptive analytics

Page 4: Big data Architecture
Page 5: Big data Architecture
Page 6: Big data Architecture
Page 7: Big data Architecture

Big data adds depth to our analysis of events through providing broader and deeper data

Page 8: Big data Architecture
Page 9: Big data Architecture

Big data adds depth to our analysis of events through providing broader and deeper data

• What factors make some roads safer than others?

• How much snowfall makes interstate highways dangerous?

• Regardless of the amount of precipitation, are some highways safer than others?

Page 10: Big data Architecture
Page 11: Big data Architecture

• Are the causes of accidents in the last three years the same as those of the preceding five years?

• What effects on road safety have there been since the maximum speed limit in a particular state was changed four years ago?

• Since two interstate highways were built three years ago, how have overall accident patterns changed compared with the time before they opened?

Page 12: Big data Architecture
Page 13: Big data Architecture

Big data adds depth to our analysis of events through providing broader and deeper data

• Is there any relationship between the number of licensed drivers in the state and road safety? If so, did this change arise after the two new highways opened?

• Is there any relationship between drivers who had accidents in bad weather and their overall driving record?

• Is there any relationship between the distance from home to work and the likelihood of an accident? If so, at what time of day? On which day of the week?

Page 14: Big data Architecture

1 - Envision

2 - Determine Business Drivers & Problems

3 - Draw Business Information Map or Data Flow Diagram

4 - Design Big Data Architecture

Page 15: Big data Architecture

Before designing a big data architecture

• What kind of insight and business value will using big data give your organization?

• Is it possible to upgrade the company's current data warehouse?

• How do I assess the cost of expanding the current environment or of adopting new solutions?

• What skills are needed to understand, analyze, build, and maintain big data projects?

• Does our current enterprise data give us insight into the business?

Page 16: Big data Architecture

To answer these questions, use a structured approach for evaluating the viability of a big data solution according to the dimensions shown in the following figure:

• Business value from the insight that might be gained from analyzing the data

• Governance considerations for the new sources of data and how the data will be used

• People with relevant skills available and commitment of sponsors

• Volume of the data being captured

• Variety of data sources, data types, and data formats

• Velocity at which the data is generated, the speed with which it needs to be acted upon, or the rate at which it is changing

• Veracity of the data, or rather, the uncertainty or trustworthiness of the data

Page 17: Big data Architecture

Phase one: Vision

When we paint a vision of a future architecture, we start with a basic understanding of our current state and we begin to speculate on how it might evolve.

A challenge in many organizations is that, at inception, the technical vision may not be aligned with the business vision.

In fact, business visionaries and potential sponsors must drive the technical vision.

So we must understand current business utilization of our current information architecture and how that could change in the future.

The vision phase is mostly about gathering requirements and exploring the art of the possible.

Page 18: Big data Architecture

Phase 2: Determine Business Drivers & Problems

The next phase of the methodology:

1 - takes a much deeper look at the business drivers and problems

2 - maps the business problem to its big data type

Page 19: Big data Architecture

Business Problem → Big Data Type → Big Data Characteristics

Page 20: Big data Architecture

Big Data Business Drivers

Some of the key business drivers:

• Operational effectiveness

• Delivering better customer service

• Improved innovation

• Improved company growth

Page 21: Big data Architecture

Business Problem: Predicting power consumption

Page 22: Big data Architecture

Mapping the business problem to its big data type

Business problem: Utilities: Predict power consumption
Big data type: Machine-generated data
Description:

• Utility companies have rolled out smart meters to measure the consumption of water, gas, and electricity at regular intervals of one hour or less. These smart meters generate huge volumes of interval data that need to be analyzed.

• Utilities also run big, expensive, and complicated systems to generate power. Each grid includes sophisticated sensors that monitor voltage, frequency, and other important operating characteristics.

• A big data solution can analyze power generation (supply) and power consumption (demand) data using smart meters.

Data sources:

• Smart meters that measure consumption

• Sensors that measure voltage and other operating characteristics

Business driver: Operating efficiency
Business problem: Predicting power consumption
Big data type: Data delivered by the sensors

Page 23: Big data Architecture

Customer Churn!!!!

Page 24: Big data Architecture

Telecommunication

Business problem: Customer churn
Big data types: Web and social data, transaction data
Description:

• Telecommunications operators need to build detailed customer churn models that include social media and transaction data. (Web and social data)

• The value of the churn models depends on the quality of customer attributes (customer master data such as date of birth, gender, location, and income) and the social behavior of customers.

• Telecommunications providers who implement a predictive analytics strategy can manage and predict churn by analyzing the calling patterns of subscribers. (Transaction data)

Page 25: Big data Architecture

Fraud Detection

Page 26: Big data Architecture

Financial Services & Health Care

Business problem: Fraud detection
Big data types: Machine-generated data, transaction data, human-generated data
Description:

• Fraud management predicts the likelihood that a given transaction or customer account is experiencing fraud.

• Solutions analyze transactions in real time and generate recommendations for immediate action. These actions are critical to stopping fraud and the intentional misuse of accounts.

• Solutions are typically designed to detect and prevent myriad fraud and risk types across multiple industries, including:
    • Credit and debit payment card fraud
    • Deposit account fraud
    • Technical fraud
    • Bad debt
    • Healthcare fraud
    • Medicaid and Medicare fraud
    • Property and casualty insurance fraud
    • Worker compensation fraud
    • Insurance fraud
    • Telecommunications fraud

Page 27: Big data Architecture

Classifying business problems by big data type helps us see the characteristics of each kind of data.

These characteristics can help us understand:

1 - How was the data acquired?

2 - How was the data processed into the appropriate format?

3 - How often will new data become available?

Business Problem → Big Data Type → Big Data Characteristics
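The mapping from business problem to big data type used in the three examples above can be written down as a small lookup table. This is only a sketch: the problems and data types come from the slides, while the dictionary layout and the helper name `problems_using` are assumptions made for illustration.

```python
# Business problem -> big data types, as listed on the example slides.
# The dictionary layout is a hypothetical way to record the mapping.
BIG_DATA_TYPES_BY_PROBLEM = {
    "Predict power consumption": ["Machine-generated data"],
    "Customer churn": ["Web and social data", "Transaction data"],
    "Fraud detection": [
        "Machine-generated data",
        "Transaction data",
        "Human-generated data",
    ],
}


def problems_using(data_type):
    """Return, in alphabetical order, the business problems that rely on data_type."""
    return sorted(
        problem
        for problem, types in BIG_DATA_TYPES_BY_PROBLEM.items()
        if data_type in types
    )


print(problems_using("Transaction data"))
```

Looking the table up by data type shows which business problems would share the same ingestion and storage path.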

Page 28: Big data Architecture
Page 29: Big data Architecture

Analysis Type, Processing Methodology

• Analysis type: Whether the data is analyzed in real time or batched for later analysis. A mix of both types may be required by the use case:
    • Fraud detection: analysis must be done in real time or near real time.
    • Trend analysis for strategic business decisions: analysis can be in batch mode.

• Processing methodology: The type of technique to be applied for processing data (e.g., predictive, analytical, ad-hoc query, and reporting).
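As a rough illustration of the analysis-type characteristic, the sketch below records the two use cases named on this slide and checks whether a set of use cases forces a real-time path. The enum, the lookup table, and the function name `needs_real_time` are hypothetical names chosen for this example.

```python
from enum import Enum


class AnalysisType(Enum):
    REAL_TIME = "real time or near real time"
    BATCH = "batch"


# Use cases taken from the slide; the table itself is an illustrative assumption.
ANALYSIS_TYPE_BY_USE_CASE = {
    "fraud detection": AnalysisType.REAL_TIME,
    "trend analysis": AnalysisType.BATCH,
}


def needs_real_time(use_cases):
    """Return True if at least one use case must be analyzed in real time."""
    return any(
        ANALYSIS_TYPE_BY_USE_CASE[name] is AnalysisType.REAL_TIME
        for name in use_cases
    )


print(needs_real_time(["fraud detection", "trend analysis"]))
```

A solution that serves both use cases needs a mix of real-time and batch processing, which is the situation the slide describes.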

Page 30: Big data Architecture

Data Frequency and Size

How much data is expected and at what frequency it arrives. Knowing frequency and size helps determine the storage mechanism, storage format, and the necessary preprocessing tools.

Data frequency and size depend on data sources:

• On demand, as with social media data

• Continuous feed, real-time (weather data, transactional data)

• Time series (time-based data)
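A back-of-the-envelope calculation shows how frequency and size translate into storage. The sketch below assumes a smart-meter feed with one reading per hour. The meter count (1,000,000) and the record size (100 bytes) are hypothetical figures chosen only to make the arithmetic concrete.

```python
def yearly_volume_bytes(sources, readings_per_day, bytes_per_reading):
    """Estimate raw data volume per year for a fixed-interval feed."""
    return sources * readings_per_day * 365 * bytes_per_reading


# Hypothetical inputs: 1,000,000 meters, hourly readings, 100 bytes each.
volume = yearly_volume_bytes(
    sources=1_000_000, readings_per_day=24, bytes_per_reading=100
)
print(f"{volume / 1e9:.0f} GB per year")
```

Moving from hourly to 15-minute intervals multiplies the estimate by four, which is why the arrival frequency has to be known before choosing a storage mechanism.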

Page 31: Big data Architecture

Data Type, Content Format

• Data type: Type of data to be processed, such as transactional, historical, master data, and others. Knowing the data type helps segregate the data in storage.

• Content format: Format of incoming data, which can be structured (CRM, for example), unstructured (audio, video, and images, for example), or semi-structured.

Format determines how the incoming data needs to be processed and is key to choosing tools and techniques and defining a solution from a business perspective.

Page 32: Big data Architecture

Data Sources

• Data source: Sources of data (where the data is generated), such as web and social media, machine-generated, human-generated, etc.

Identifying all the data sources helps determine the scope from a business perspective. The figure shows the most widely used data sources.

Page 33: Big data Architecture

Data Consumers, Hardware

• Data consumers: A list of all of the possible consumers of the processed data:
    • Business processes
    • Business users
    • Enterprise applications
    • Individual people in various business roles
    • Part of the process flows
    • Other data repositories or enterprise applications

• Hardware: The type of hardware on which the big data solution will be implemented, either commodity hardware or state of the art. Understanding the limitations of hardware helps inform the choice of big data solution.

Page 34: Big data Architecture

Phase 3: Business Information Maps: data flow?

Once we understand what data our business analysts need, we must figure out where we should get the data and which business processes need what kind of data.

Page 35: Big data Architecture

Example: fictitious manufacturer of luxury cars, Lux Motor Cars (LMC)

For LMC Lease vehicles in the United States, the following information is uncovered:

• The variance between when a car is due for scheduled service and when it is actually serviced is quite high. In other words, drivers don’t seem to adhere to the service schedule as closely as other LMC owners do.

• The number of miles driven after a “Check Engine” light came on was dramatically higher for LMC Lease vehicles. It appears that lessees weren’t taking the “Check Engine” light very seriously.
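The first finding is a statement about variance, so it can be checked with a simple calculation once service dates are available. The sketch below uses made-up delay values (days between the scheduled and the actual service date). None of these numbers come from LMC, and `summarize` is a hypothetical helper.

```python
from statistics import mean, pvariance

# Hypothetical data: days between scheduled and actual service.
lease_delays = [30, 75, 5, 120, 60]   # leased vehicles
owner_delays = [2, 5, 0, 7, 1]        # owned vehicles


def summarize(delays):
    """Return the mean delay and the population variance of the delays."""
    return {"mean_days": mean(delays), "variance": pvariance(delays)}


lease = summarize(lease_delays)
owner = summarize(owner_delays)
print(lease, owner)
print("Lease variance is higher:", lease["variance"] > owner["variance"])
```

With telematics data, the same comparison could be run on actual service records rather than on sample values.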

Page 36: Big data Architecture

LMC is currently not able to put such a program into place. Much of the data they need to run a program of this type is not currently available, though this is not fully understood in the business. So, we need to build a business information map describing the current state so that all can better understand how this part of the business operates with the data that it has today.

The good news is that LMC is ready to make a telematics investment.

Big Data Initiative: deploying a telematics system

Page 37: Big data Architecture

First Step: Data Sources

Page 38: Big data Architecture

Second Step: Data Flow

Page 39: Big data Architecture

Deploying a telematics system

• Stakeholders will receive from the M&W system the key performance indicators (KPIs) that have been heretofore unavailable.

• Sensors will exchange data with the M&W system, including the interactive diagnostics that cover both driver alerts and on-demand logs (interactive diagnostics).

• Lessees and/or drivers will receive near real-time (RT) driver alerts and exchange messages to deal with service scheduling (near RT driver alerts and scheduling).

Page 40: Big data Architecture

Phase 4: Drawing future state Big Data architecture

We are ready to design our future state technical architecture, and the IT architecture team will engage extensively in this phase.

However, before a more detailed design is started, we might want to first understand the skills we have in our organization and the impact those skills (or the lack of them) might have on the architecture.

We’ll also want to clearly understand the good and bad things about our current state architecture and how we might extend it through the introduction of new software components and systems.

Page 41: Big data Architecture

Current state of Information Architecture

In this example, the Enterprise Data Warehouse (EDW) provides the historic database of record. Data is extracted from multiple OLTP systems (the ERP and CRM systems are pictured). Only structured data!

Data marts surround the EDW. Business analysts access the marts and/or EDW using reporting and ad hoc query and analysis tools.

Page 42: Big data Architecture

In the current state, the sources of data are limited to structured data! But the business needs a deeper analysis of consumers, and therefore needs a new set of external streaming, unstructured, and semi-structured data!

A Business Problem: to better understand the success of promotions and sales efforts.

Page 43: Big data Architecture

Data is gathered in Hadoop after being captured when shoppers 1) buy items, 2) browse the web site, and 3) enter the brick-and-mortar stores. 4) Sentiment data that expresses the shoppers’ opinion of doing business with the company is gathered in Hadoop from social media. The streaming data lands in clusters of NoSQL databases, which can easily be scaled for high-ingestion demands, and is then loaded into Hadoop for analysis.

Page 44: Big data Architecture

We will next determine if there is a need to query and analyze data residing in our traditional data warehouse information architecture and the Hadoop cluster at the same time.

Page 45: Big data Architecture
Page 46: Big data Architecture

You want to be an on-demand Business?

Add a real-time recommendation engine and an event processing engine to guide the shopper by presenting specific products in the web store while they are engaged.

Page 47: Big data Architecture

Example of event processing

For example, if sensors in the brick-and-mortar store begin to detect delays in reaching cashiers and dissatisfied customers abandoning the items they wanted to buy, predefined rules might trigger devices to signal cashiers who are engaged in other activities to open up additional cash registers and alleviate the backup.
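A predefined rule of this kind can be sketched in a few lines. The example below is a minimal illustration, not a real event processing engine: the `StoreEvent` fields, the threshold values, and the one-register-at-a-time policy are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class StoreEvent:
    avg_wait_minutes: float  # sensor-reported average time to reach a cashier
    abandoned_baskets: int   # shoppers who left their items behind in this window
    open_registers: int
    total_registers: int


# Hypothetical thresholds; a real deployment would tune these.
MAX_WAIT_MINUTES = 5.0
MAX_ABANDONED = 3


def registers_to_open(event):
    """Apply the predefined rule and return how many extra registers to open."""
    backed_up = (
        event.avg_wait_minutes > MAX_WAIT_MINUTES
        and event.abandoned_baskets >= MAX_ABANDONED
    )
    if not backed_up:
        return 0
    spare = event.total_registers - event.open_registers
    # Open one register at a time, and only if one is actually free.
    return max(0, min(1, spare))


event = StoreEvent(avg_wait_minutes=8.0, abandoned_baskets=4,
                   open_registers=2, total_registers=6)
print(registers_to_open(event))
```

In practice the rule would be evaluated continuously against the incoming sensor stream, and a non-zero result would trigger the device that signals the cashiers.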

Page 48: Big data Architecture
Page 49: Big data Architecture

Big Data Architecture