Oracle's Solutions for the Pan-Big-Data Era

He Huiqun (David He), Industry Solution Manager for Big Data, Oracle Enterprise Architecture


What Is Big Data?

Definition of Big Data

"Big Data: Techniques and Technologies that Enable Enterprises to Effectively and Economically Analyze All of their Data"

- IDC, Carl Olofson

Schema on Write vs. Schema on Read

Traditional “Schema on Write”

– Required data must first be identified and modelled in a “schema”
– Data is integrated and loaded via ETL
– Value realized only after this is done

Big Data “Schema on Read”

– Required data captured in code for each program accessing the data
– Data is integrated in code in map/reduce framework
– Value realized faster

What differs: time to value from the data, and flexibility of data analysis
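The "schema on read" idea above can be illustrated with a minimal Python sketch. It is an illustration only, not how Hadoop or any Oracle product does it: the raw records, the field names, and both reader functions are hypothetical. The point it shows is that the data is stored as it arrived, and each program applies its own schema when it reads.

```python
import json
from typing import Iterator

# Hypothetical raw events, stored exactly as they arrived (no upfront schema).
RAW_EVENTS = [
    '{"customer": "c1", "amount": 120.5, "channel": "ATM"}',
    '{"customer": "c2", "amount": 40, "note": "first visit"}',
    '{"customer": "c1", "channel": "Mobile"}',
]


def read_amounts(lines: list[str]) -> Iterator[tuple[str, float]]:
    """One reader's schema: (customer, amount). Records without an amount are skipped."""
    for line in lines:
        record = json.loads(line)
        if "amount" in record:
            yield record["customer"], float(record["amount"])


def read_channels(lines: list[str]) -> Iterator[tuple[str, str]]:
    """A second reader applies a different schema to the same raw data."""
    for line in lines:
        record = json.loads(line)
        if "channel" in record:
            yield record["customer"], record["channel"]


if __name__ == "__main__":
    print(list(read_amounts(RAW_EVENTS)))
    print(list(read_channels(RAW_EVENTS)))
```

With schema on write, the missing and extra fields in these records would have to be resolved before loading. Here each reader decides for itself what to do with them.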

Capability Assessment: Hadoop vs. RDBMS

[Radar chart, scale 0 to 5, comparing Hadoop on BDA with Oracle on Exa along these axes: tooling maturity, stringent non-functionals, ACID transactions, security, variety of data formats, data sparsity, ETL complexity, cost-effective storage of low-value data, ingestion rate, STP]

Unified Data Analytics Platform

• Unified Analytics API: SQL, R, MR
• Unified Analytics Processing Platform: Hadoop and RDBMS, linked by IB
• Management Framework and Tools

Big Data and Analytics

Big Data Analytics

• All of the data
• Better decisions
• Faster execution

Big Data Use Cases

• Find unknown relationships
• Correlate different data result sets
• Discover opportunities and reduce costs

Discovery and Insight: The Value of Not Requiring a Pre-defined Schema

Source systems: Customers (System A), Products (System C), Orders (System D), Social Media (System D), Call Center (System E)

• Global dimensions: common across all systems, e.g. Product ID, Period, Location, Themes, Customer ID
• Global metrics: common across some systems, e.g. Cost, Count
• Derived metrics: common across some systems, e.g. Sentiment Score, Avg Resolution Time, Customer Satisfaction
• Unique dimensions or metrics: Customer type, Age, Profitability, Fidelity
• Unique dimensions or metrics: Themes, Competitors, Klout
• Support for jagged data: for diverse structures such as product specs

Table-free = an adaptive, flexible data analysis architecture that does not require over-architecting

Big Data Inspires New Insights

"Correlations and patterns from disparate, linked data sources yield the greatest insights and transformative opportunities" - Gartner

Big Data and Analytics: Distinguishing Reporting from Analytics

Descriptive          Predictive
Reporting            Analytics
Dashboards           Visualization
Hindsight            Insight
What happened?       What will happen?
Shows Results        Predicts Results
Relational / OLAP    Hadoop / NoSQL


Three Big Data Differences

Scale Trumps Smarter

– No More Sampling

– Large Data Sets + Simple Algorithms > Samples + Complex Algorithms

Scale Trumps Better

– The Real World is Messy

– Large Data Sets w/ Bad Data > Small Data Sets w/ No Bad Data

Correlation is More Important Than Exactitude

– We Are After Trends, Not Values

– This is not for your billing system

Suggested Reading: Big Data: A Revolution That Will Transform How We Live, Work, and Think, by Viktor Mayer-Schönberger and Kenneth Cukier

This Requires a Change in How You Think!

Big Data in Financial Services

Usage Patterns for Big Data in Financial Services

IT optimization

– Manage and process data better, faster, more economically, and more sensibly

Big data analytics

– Analyze all data, regardless of volume, structure, or velocity

Business process transformation

– Use big data to improve existing operational processes

IT Optimization

Big Data Usage Pattern: ETL and Batch Processing Workloads on Hadoop

• Scalable
• Flexible
• Cost Effective

[Diagram labels: Web, Mainframe, SQL, SQL, NoSQL; Integrate; DW & BI; Analytics]

Large US Regional Bank

Objectives

• Comply with regulations requiring more data to support stress testing
• Reduce IT costs & streamline processing by eliminating duplicate data stores

Solution

• Single, reliable BDA/Exadata-based ODS supporting all downstream systems
• Landing zone & archival repository for both structured & unstructured data
• Use Exadata as “19th” BDA node

Results & Benefits

• Fast access to 85% more data
• Lower costs, simplified architecture and fast time to value

[Diagram: Operational Data Store. Sources: Mainframe, RDBMS, more. BDA and Exadata tiers: agile business model; all data; de-normalized & partial-normalized; normalized; aggregate data; EDW. Tools: Oracle Enterprise Manager, Oracle Data Integrator. Data Delivery: Master S1, Master S2, Master Sn; SOA/API; CRMS; Other]

Thomson Reuters

Objectives

• Maximize cross-sell opportunities
• Lower cost and complexity

Solution

• Economically capture all customer activity
• Testing 50M events/sec ingest rates into the Oracle Big Data Appliance
• Feeds Exadata EDW for customer profitability & segmentation analysis

“Oracle's engineered systems… are geared toward high performance big data delivery - and that is exactly the type of work we do”
Rick King, Chief Operating Officer for Technology, Thomson Reuters

[Diagram labels: BDA, Exadata, Exalytics; EDW; Sandbox & DR; Event Capture & Store; Interactive Analytics; Research Applications; Upsell/Cross Sell]

Big Data Usage Pattern: Expand Data Warehouse with Granular Data Store

• Online
• Scalable
• Flexible
• Cost Effective

[Diagram labels: Data Factory; Archiving; Data Warehouse; Marts; Business Intelligence]

Tier 1 Global Bank: New Information Management Architecture

Objectives

• End-to-end business information environment that provides accurate, transparent and timely information to shareholders, regulators and management

Solution

• 7 Exadata Racks
• 16 Node Hadoop Cluster – 33TB
• Oracle Loader for Hadoop

Results & Benefits

• Reduce complexity and risk of changes
• Reduce cost of operation
• Increased stability & performance

Big Data Analytics

Big Data Usage Pattern: Scale-out Information Discovery

• Online
• Scalable
• Flexible
• Cost Effective

[Diagram labels: Data Factory; Ad-hoc; Continuous; On-Demand]

Credit Suisse: Increased sales through instant access to information

Objectives

• Enable customers to learn about stocks and increase buying confidence
• Cultivate the advisor-client relationship online and acquire smaller clients

Solution

• Information Discovery on pooled research data sets in multiple unstructured formats
• Oracle powers their internal application that advisors utilize to quickly find information on financial metrics

Results & Benefits

• Incremental sales for Bank based on this application for 5 years.
• Improved customer relationships

Big Data Usage Pattern: Instant Responses based on Historical Analysis

• Online
• Scalable
• Flexible
• Cost Effective

[Diagram labels: Integrate; Business Intelligence; Event; Decisions]

Solution Architecture: Real-time Personalized Offers

[Architecture diagram with the following components]

Front Office

• Channel Systems: Call Center, ATM, Branch, Online, Mobile
• Content Presentment / Disposition: Oracle RTD / Oracle OEP / Oracle Fusion Middleware

Reporting / Analytics

• Oracle Endeca / Oracle Business Intelligence / Oracle R Enterprise / Oracle Exalytics

Customer Database

• Content
• Recommendations

Oracle Big Data Appliance

• Cloudera / Oracle NoSQL / Big Data Connectors

Extract, Transform and Load

• Oracle Data Integrator

Back Office Systems and External Data

• Debit Card Transactions
• Credit Card Transactions
• Customer CRM Data
• Reference Data
• Clickstream Data
• Card Merchant Data
• Social Data

Omni-Channel Offers with 360 View of Customer: Personalized Offers to Any Channel in Real-time

• Real-time profile updates
• Self-learning, closed loop model
• Best-in-class modeling across structured and unstructured data
• Add new dimensions in your recommendation process
• One View of Customer
• Deliver highly personalized offers to any channel in real-time

Channel Systems: Call Center, ATM, Branch, Online, Mobile

New Wholesale Bank Initiatives: Provide Wholesale Merchant Offers & Mobile Payments

• Based on Location and individual customer preferences
• Millions of Customers X Thousands of Merchant Offers
• Protect Payments and Drive Wholesale Deposits
• Become Trustee for your Customer’s Commercial Identity

Real-time Location-Based Offers: Tier 1 Global Bank

Objectives

• Increase revenue through real-time, location based offers

Solution

• Customer profile enrichment with Big Data
• Capture credit card POS and merchant data with event processor
• Determine geo location of POS and nearby bank wholesale customers
• Leverage real-time decision engine to generate offer to mobile device
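The "determine geo location of POS and nearby bank wholesale customers" step can be sketched as a great-circle distance filter. This is a minimal Python illustration under stated assumptions: the haversine formula stands in for whatever spatial engine the bank actually used, and the merchant names, coordinates, and the `nearby_merchants` helper are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby_merchants(
    pos: tuple[float, float],
    merchants: dict[str, tuple[float, float]],
    radius_km: float,
) -> list[str]:
    """Return the names of merchants within radius_km of a point-of-sale location."""
    return [
        name
        for name, (lat, lon) in merchants.items()
        if haversine_km(pos[0], pos[1], lat, lon) <= radius_km
    ]


if __name__ == "__main__":
    # Hypothetical merchant locations around a card swipe at (0.0, 0.0).
    merchants = {"cafe": (0.0, 0.01), "outlet": (1.0, 0.0)}
    print(nearby_merchants((0.0, 0.0), merchants, radius_km=5.0))
```

In a real-time setting the candidate list would then be handed to the decision engine, which picks the offer to push to the mobile device.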

Business Process Transformation

Oracle Financial Services Analytical Applications: Analytical Tools for Banking, Capital Markets and Insurance

Solution areas: Performance Management & Finance; Risk Management; Customer Insight; Governance & Compliance

[Diagram: application map with the following labels]

• Model Risk
• Hedge Management IFRS 9 – IAS 32/39
• ICAAP/Risk Appetite
• Customer Profitability
• Stress Testing
• Loan Loss Forecasting
• Pricing Management
• Risk Adjusted Performance
• Know Your Customer
• Operational Risk & Compliance Mgt.
• Regulatory Compliance (Financial Crime)
• Anti-Money Laundering
• Trading Compliance
• Broker Compliance
• Fraud Detection
• Operational Risk
• Credit Risk
• Institutional Performance
• Retail Performance
• Marketing
• Customer Segmentation
• Capital Management
• Liquidity Risk
• Economic Capital Advanced (Credit Risk)
• Operational Risk Economic Capital
• Balance Sheet Planning
• Profitability
• Asset Liability Management
• Market Risk
• Basel Regulatory Capital
• Retail Portfolio Models and Pooling
• Funds Transfer Pricing
• Reconciliation
• Channel Insight
• Compliance Risk
• Business Continuity Risk
• Counterparty Risk
• Audit
• FSDF

OFSAA – Current Architecture

Oracle Big Data Solutions

Oracle Big Data Solution

• Stream: Oracle Event Processing, Apache Flume, Oracle GoldenGate
• Acquire: Oracle NoSQL Database, Cloudera Hadoop, Oracle R Distribution
• Organize: Oracle Big Data Connectors, Oracle Data Integrator
• Analyze: Oracle Advanced Analytics, Oracle Database, Oracle Spatial & Graph
• Decide: Oracle BI Foundation Suite, Oracle Real-Time Decisions, Endeca Information Discovery

Why Make Big Data a Divided World?

Unified Data Analytics Environment

• Unified Analytics API: SQL, R, MR
• Unified Analytics Processing Platform: Hadoop and RDBMS, linked by IB
• Management Framework and Tools

Joint Analysis of Data Across Oracle and Hadoop Using SQL

SQL Analytics on ALL data

• Expand the data pool for analytics leveraging Hadoop
• Stream Hadoop resident data through Big Data Connectors for SQL processing
• Use the full power of Oracle SQL on all data
• Or use Oracle Loader for Hadoop to integrate data in Oracle Database

[Diagram: SQL over Hadoop and Oracle Database, linked by IB]

Joint Analysis of Data Across Oracle and Hadoop Using R

R Analytics on ALL data

• Expand the data pool for analytics leveraging Hadoop
• Improve scalability and performance for R without changes to your programs
• Dynamically leverage Hadoop through Big Data Connectors to execute R analytics

[Diagram: R over Hadoop and Oracle Database, linked by IB]

Unified Data Analytics Platform

• Real-Time Analytics
• Thousands of Users
• Secure and Available
• All Data Online and Ready to Use
• Large Scale Systems
• Cost Effective

Logical Architecture

Solution Architecture: Real-time Personalized Offers

Oracle Big Data Solution

• Stream: Oracle Event Processing, Apache Flume, Oracle GoldenGate
• Acquire:
  – Oracle NoSQL Database: scalable key-value store
  – Cloudera Hadoop: scalable, low-cost data storage and processing engine
  – Oracle R Distribution: statistical analysis framework
• Organize: Oracle Big Data Connectors, Oracle Data Integrator
• Analyze: Oracle Advanced Analytics, Oracle Database, Oracle Spatial & Graph
• Decide: Oracle BI Foundation Suite, Oracle Real-Time Decisions, Endeca Information Discovery

Hadoop

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. Hadoop is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.

• Framework for distributed processing
• Large Data Sets
• Clusters of Computers
• Simple Computing Models
• Highly Available Service
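The "simple programming model" in question is MapReduce. The sketch below runs the three phases (map, shuffle, reduce) of a word count in a single Python process. It illustrates the model only and does not use the Hadoop API: on a cluster the map and reduce functions run in parallel on many machines, and the framework performs the shuffle. All function names here are illustrative.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(line: str) -> Iterator[tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle: group all emitted values by key, as the framework would between phases."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Data appliance"]))
```

Because each map call sees one line and each reduce call sees one key, the work can be split across machines without the functions knowing about each other.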

Big Data Appliance X3-2

Sun Oracle X3-2L servers, each with:

• 2 * 8 Core Intel Xeon E5 Processors
• 64 GB Memory
• 36TB Disk space

Integrated Software:

• Oracle Linux
• Oracle Java JDK
• Cloudera Distribution of Apache Hadoop (CDH)
• Cloudera Manager
• Oracle R Distribution
• Oracle NoSQL Database

All integrated software (except NoSQL DB CE) is supported as part of Premier Support for Systems and Premier Support for Operating Systems

Big Data Appliance Product Family

• Starter Rack is fully cabled and configured for growth, with 6 servers
• In-Rack Expansion delivers a 6-server modular expansion block
• Full Rack delivers optimal blend of capacity and expansion options
• Grow by adding racks: up to 18 racks without additional switches

Flexible Configuration

• Divide a Full Rack BDA into multiple clusters
• Provide more flexible configurations for customers
• Automatic reconfiguration when expanding the cluster

Example Configuration: 6 Node Cluster and 12 Node Cluster

Oracle Big Data Appliance High-Availability Solution: Cloudera CDH 4.1

• Data stored redundantly in multiple copies
• No NameNode single point of failure
• Automatic NameNode failover
• Metadata synchronized across multiple copies

[Diagram labels: Active Name Node; Passive Name Node]

Engineered for Quicker Time to Value at Lower Cost

ESG believes that a "buy" versus "do-it-yourself" approach will yield roughly one-third faster time-to-market benefit improvement...

[…] nearly 40% cost savings versus IT architecting, designing, procuring, configuring, and implementing its own big data infrastructure.

[Bar chart: Time to Market (Weeks), Oracle Big Data Appliance vs. Build it yourself]

[Bar chart: Cost: Initial Infrastructure/Tasks, Oracle Big Data Appliance vs. Build it yourself]

http://www.oracle.com/us/corporate/analystreports/industries/esg-big-data-wp-1914112.pdf

Mammoth: One-Command Installation and Configuration

BDA’s single-command install, patch and upgrade utility

– Distributes the binaries and installs all BDA software based on a set of configuration specifications
– Sets all optimized parameters for OS, JVM, Hadoop and Oracle SW
– Applies (one-off) patches and updates for:
  • OS, Kernel, JDK and Firmware (switch, HBA, disk controllers etc.)
  • Cloudera Software Stack and required components
  • Oracle NoSQL Database and Oracle Big Data Connectors

Specifically built by Oracle for BDA

Integrated Management Framework: Management Infrastructure Combines EM and Cloudera Manager

Quick view of Hardware and Software status in Oracle Enterprise Manager

Oracle Audit Vault and Database Firewall

Audit sources feeding Audit Vault:

• Databases: Relational Data
• Hadoop: Non-Relational Data
• Operating Systems

One consolidated, secure repository for all audit data

Centralized platform for audit reporting, alerting and policy management

Oracle Big Data Appliance

Oracle Audit Vault monitoring enabled at or after BDA installation

Capture all HDFS access and MapReduce activity

– Who initiated the activity
– What data was accessed
– When the activity took place


Kerberos Integration

Kerberos Pre-Configured upon Install

– Point at an external Key Distribution Center

– Install HA Key Distribution Center on the BDA

Strong authentication for

– All Hadoop services

– Oracle Big Data Connectors

Ensure that users are who they claim to be

Ensure authentication across the enterprise

Automatic Authentication

LDAP and Network Encryption

LDAP

– Link Kerberos to existing LDAP services
– Simplify permissions management for all Hadoop services
– Centrally manage permissions for the enterprise

Network Encryption [1]

– Ensure that data in-motion is protected
– Data moved within Hadoop jobs is encrypted
– Simple installation choice for Big Data Appliance

Secure Transmission, Integrated Authentication

[1] Currently planned for release 2.3.1

Oracle Loader for Hadoop

• Very high-speed data loading: up to 15 TB/hour
• Uses Hadoop's parallelism to reduce the CPU load on the database during loading
• Online and offline modes
• Continuous data input

[Diagram: MAP tasks, SHUFFLE/SORT, and REDUCE tasks within Oracle Loader for Hadoop, feeding the Oracle Data Warehouse]

Oracle Loader for Hadoop: Support for Multiple Data Sources

Input sources:

• Delimited text files
• Hive tables
• Oracle NoSQL Database
• Various data sources, through a user-written input format

[Diagram: MAP tasks, SHUFFLE/SORT, and REDUCE tasks within Oracle Loader for Hadoop, feeding the Oracle Data Warehouse]

Oracle SQL Connector for Hadoop: Direct Standard SQL Access to Hadoop Data from Oracle Database

• Access data on Hadoop through standard, full-featured SQL
• Join data in the database with data on Hadoop in a single query
• A lower-latency solution for accessing Hadoop data

[Diagram labels: HDFS; HDFS client; InfiniBand; DCH; DCH; OSCH; external table; SQL query; Oracle Database]
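The idea of exposing file-resident data as a table and joining it with database tables in one SQL query can be sketched in Python with the standard `sqlite3` module. This is only a stand-in and not the Oracle SQL Connector for Hadoop: the file contents, the table names, and the `join_file_with_table` helper are hypothetical, and the sketch copies the rows into a temporary table, whereas an external table reads the data in place.

```python
import csv
import io
import sqlite3

# Hypothetical delimited file as it might sit in HDFS: customer_id,event
HDFS_FILE = "c1,login\nc2,purchase\nc1,purchase\n"


def join_file_with_table(file_text: str) -> list[tuple[str, int]]:
    """Count file-resident events per customer segment using one SQL join."""
    conn = sqlite3.connect(":memory:")
    # Data that already lives in the database.
    conn.execute("CREATE TABLE customers (id TEXT PRIMARY KEY, segment TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [("c1", "retail"), ("c2", "wholesale")],
    )
    # Stand-in for an external table over the file (copied here for simplicity).
    conn.execute("CREATE TEMP TABLE events_ext (customer_id TEXT, event TEXT)")
    conn.executemany(
        "INSERT INTO events_ext VALUES (?, ?)", csv.reader(io.StringIO(file_text))
    )
    rows = conn.execute(
        """
        SELECT c.segment, COUNT(*)
        FROM events_ext e
        JOIN customers c ON c.id = e.customer_id
        GROUP BY c.segment
        ORDER BY c.segment
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(join_file_with_table(HDFS_FILE))
```

The query itself does not care where the event rows came from, which is the property the slide emphasizes.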

Oracle Data Integrator: Simplify MapReduce

• Automatically generates MapReduce code
• High performance loads into Data Warehouse leveraging both OLH and OSCH
• Manages the process across platforms

[Diagram labels: Oracle Data Integrator; OLH & OSCH]

Oracle Big Data Connectors: Converging Hadoop, NoSQL, and RDBMS

• Oracle Loader for Hadoop
• Oracle SQL Connector for Hadoop
• Oracle Data Integrator Application for Hadoop
• Oracle R Connector for Hadoop

[Diagram labels: Hadoop (HIVE; HDFS; Datafile_part_1 through Datafile_part_x); Oracle NoSQL Database (KVInputFormat, external table); Oracle Loader for Hadoop; Oracle Data Integrator; Oracle SQL Connector for Hadoop (external table); Oracle Database (SQL query; aggregation; relational structured data)]

Oracle XQuery for Hadoop

[Banner: Acquire – Organize – Analyze; Oracle Big Data Connectors; Oracle Data Integrator; Oracle Loader for Hadoop]

• OXH is a transformation engine for Big Data
• XQuery language executed on the Map/Reduce framework

XQuery

    for $ln in text:collection()
    let $f := tokenize($ln)
    where $f[1] = 'x'
    return text:put($f[2])
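For readers less familiar with XQuery, the query on this slide reads each text line, splits it into fields, keeps the lines whose first field is `x`, and writes out the second field. A rough single-process Python equivalent follows. It assumes whitespace-delimited fields, which the slide does not specify, and the function name is illustrative. OXH itself compiles the query into Map/Reduce jobs, which this sketch does not attempt.

```python
from typing import Iterable


def select_second_field(lines: Iterable[str], tag: str = "x") -> list[str]:
    """Keep lines whose first field equals `tag` and return their second field."""
    selected = []
    for line in lines:
        fields = line.split()  # assumption: fields are separated by whitespace
        if len(fields) >= 2 and fields[0] == tag:
            selected.append(fields[1])
    return selected


if __name__ == "__main__":
    sample = ["x alpha", "y beta", "x gamma delta", "x"]
    print(select_second_field(sample))
```

The filter and projection are independent per line, which is why the engine can run them as parallel map tasks.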

[Diagram: the OXH Engine turns the query into a Map/Reduce execution plan of several M/R jobs, which run on the Map/Reduce worker nodes over HDFS]


Oracle XQuery for Hadoop

Ease of Use

Parallel distributed parsing of big XML files

Standard declarative transformation language

Comprehensive support for nested data structures

No schema setup required

Rich built-in function library

Extensible with user-defined Java functions

Multiple output destinations from a single query

Key Features

Oracle XQuery for Hadoop: Input / Output Data Formats

Input

• HDFS
• Oracle NoSQL DB
• Formats: Text, CSV, JSON, Avro, XML

Output

• HDFS
• Oracle NoSQL DB
• Oracle Database
• Map/Reduce Job Counters
• Formats: Text, CSV, JSON, Avro, XML

Oracle R Enterprise

Oracle R Enterprise brings R’s statistical functionality closer to the data

1. Eliminate R’s memory constraint by enabling R to work directly & transparently on database objects
– Allows R to run on very large data sets
2. Architected for Enterprise production infrastructure
– Automatically exploits database parallelism without requiring parallel R programming
– Build and immediately deploy
3. Oracle R leverages the latest R algorithms and packages
– R is an embedded component of the DBMS server
– R will run across your Hadoop cluster *

* Future feature

Oracle Advanced Analytics

Oracle Advanced Analytics extends Oracle Database into a comprehensive analytical platform

– Predictive analytics, data mining, text mining, statistical analysis, advanced numerical computations

• Scalable and parallel: analyze huge volumes of data
• Tightly integrated with SQL: share results of analytics throughout enterprise
• Built for data analysts


Oracle NoSQL Database

Simple Key-Value Data Model

Horizontally Scalable

Highly Available

Simple administration

ACID Transactions at scale

Transparent load balancing

Elastic Configuration

Commercial grade software and support

Scalable, Highly Available, Key-Value Database

[Diagram: applications, each with a NoSQL DB Driver, connect to storage nodes in Datacenter A and Datacenter B]
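How a simple key-value model scales horizontally can be sketched with generic hash partitioning: the key alone determines which storage node holds a record, so adding nodes spreads the data. The Python class below is a toy illustration of that general technique and does not represent Oracle NoSQL Database's driver API or internal design. The class name, the choice of MD5, and the node count are all assumptions.

```python
import hashlib
from typing import Optional


class TinyKVStore:
    """Toy key-value store that spreads keys over in-memory 'storage nodes' by hash."""

    def __init__(self, num_nodes: int) -> None:
        self.nodes: list[dict[str, str]] = [{} for _ in range(num_nodes)]

    def _node_for(self, key: str) -> int:
        # A stable hash of the key selects the node, so every client agrees on placement.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % len(self.nodes)

    def put(self, key: str, value: str) -> None:
        self.nodes[self._node_for(key)][key] = value

    def get(self, key: str) -> Optional[str]:
        return self.nodes[self._node_for(key)].get(key)


if __name__ == "__main__":
    store = TinyKVStore(num_nodes=3)
    store.put("customer/42", "gold")
    print(store.get("customer/42"))
```

A production store adds what this sketch leaves out, such as replication across datacenters, failover, and rebalancing when the node count changes.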
