Data Warehousing & Business Intelligence in the Cloud

Seoul, Korea COEX Convention Centre 24th October 2013

Data Analytics in the Cloud

Blair Layton

Business Development Manager (Databases) – Amazon Web Services (APAC)

The Explosion of Data

Existing Challenges with Analytics

The Cloud

The Explosion of Data

We are constantly producing more data

From all types of industries

Generation

Collection & storage

Analytics & computation

Collaboration & sharing

Take a look at a data processing “pipeline”

Data is available everywhere, contains customer insight and costs little to generate, but...

What has changed in this pipeline

Highly constrained

Everything else has constraints

Big Gap in turning data into actionable information

Existing Challenges with Analytics

Provision all your infrastructure and tools before you get results

Challenge 1: Capex Intensive

Source: Oracle technology global price list 11/1/2012

Cost of your infrastructure dictates what analytics you can perform

Most data never makes it to a data warehouse

Figure: The Data Analysis Gap, plotting Enterprise Data against Data in Warehouse from 1990 to 2020

Enterprise Data is growing at over 50% yearly

Data Warehousing growing at less than 10% yearly

Most data is left on the floor

Sources: Gartner, User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011; IDC, Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares
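To make the gap concrete, the two growth rates quoted above can be compounded year over year. The following is a minimal sketch, not a figure from the talk: it assumes, hypothetically, that all enterprise data sits in the warehouse in year 0 and that the slide's bounds (50% and 10%) hold as constant yearly rates. The function name `warehouse_share` is illustrative.

```python
def warehouse_share(years, data_growth=0.50, warehouse_growth=0.10, initial_share=1.0):
    """Fraction of enterprise data held in the warehouse after `years`.

    Assumes constant yearly growth rates (the slide's bounds) and a
    hypothetical starting share of 100%.
    """
    ratio = (1.0 + warehouse_growth) / (1.0 + data_growth)
    return initial_share * ratio ** years


if __name__ == "__main__":
    for year in (0, 5, 10):
        print(f"year {year:2d}: {warehouse_share(year):.1%} of data in the warehouse")
```

Under these assumptions the warehouse holds less than 5% of enterprise data after a decade, which is the sense in which most data is left on the floor.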

Setup takes months of planning and work

Challenge 2: Hard to set up, manage and scale

Enterprises average between 3 and 4 DBAs per data warehouse

Gartner: Critical factors in calculating the data warehouse TCO, July 2009

Extending your data warehouse can be heavy on time and cost

Managing a data analytics platform requires expensive staff

Complex tuning and management skills required

Very hard to move up the stack

These make it extremely hard to move up the Business Intelligence Maturity Stack

The Cloud

AWS Services

AWS Global Infrastructure

Application Services

Networking

Deployment & Administration

Database Storage Compute

AWS Global Infrastructure

9 Regions

25 Availability Zones

Continuous Expansion

• $5.2B retail business

• 7,800 employees

• A whole lot of servers

Every day, AWS adds enough server capacity to power that whole $5B enterprise

Powering the Most Popular Internet Businesses

Broad ecosystem of consulting partners

We have partners and technologies ready to help

Solving Problems for Organizations Around the World

No Upfront Investment

Replace capital expenditure with variable expense

Low ongoing cost

Customers leverage our economies of scale

Flexible capacity

No need to guess capacity requirements and over-provision

Speed and agility

Infrastructure in minutes not weeks

Focus on business

Not undifferentiated heavy lifting

Global Reach

Go global in minutes and reach a global audience

37 PRICE REDUCTIONS

Value proposition of the AWS cloud

Architected for Enterprise Security Requirements

“The Amazon Virtual Private Cloud [Amazon VPC] was a unique option that offered an additional level of security and an ability to integrate with other aspects of our infrastructure.”

Dr. Michael Miller, Head of HPC for R&D

(August 19, 2013)

Gartner “Magic Quadrant for Cloud Infrastructure as a Service,” Lydia Leong, Douglas Toombs, Bob Gill, Gregor Petri, Tiny Haynes, August 19, 2013. This Magic Quadrant graphic was published by Gartner, Inc. as part of a larger research note and should be evaluated in the context of the entire report. The Gartner report is available upon request from Steven Armstrong (asteven@amazon.com). Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner's research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

Gartner Magic Quadrant for Cloud Infrastructure as a Service

Summarizing the problem and the opportunity

• The Explosion of Data: Data is a competitive edge
• Existing challenges with analytics: Hard and expensive to set up, manage and scale
• The Cloud: Lowers cost and improves agility

Data Analytics in the Cloud

Easy and inexpensive to get started

Easy to set up, scale and manage

Low cost to enable analytics on all your data

Open and flexible

The Solution

Technology Process View

Figure: data source 1 through data source n and unstructured data sources feed an Extract, Transform, Load and Cleanse stage, which loads the data warehouse, which serves analytics.

The diagram above shows the functional architecture components of any data warehousing project.

Source systems

Data Integration

The Data Warehouse

Business Intelligence and Analytics

Data Analytics - Technology Stack

Data Integration

Data Warehouse

Business Intelligence

AWS Cloud

Amazon Redshift

Data warehousing done the AWS way

• Pay as you go, no upfront costs

• Fast, cheap, easy to use

• SQL

• Easy to provision and deploy

Customer quotes

“[Amazon Redshift] took an industry famous for its opaque pricing, high TCO and unreliable results and completely turned it on its head.”

“Redshift is twenty times faster than Hive…The cost saving is even more impressive…Our analysts like [it] so much they don’t want to go back.”

“Team played with Redshift today and concluded it is awesome. Un-indexed complex queries returning in < 10s.”

“Queries that used to take hours came back in seconds. Our analysts are orders of magnitude more productive.”

Amazon Redshift lets you start small and grow big

Extra Large Node (HS1.XL): 3 spindles, 2 TB, 16 GB RAM, 2 cores
• Single Node (2 TB)
• Cluster 2-32 Nodes (4 TB – 64 TB)

Eight Extra Large Node (HS1.8XL): 24 spindles, 16 TB, 128 GB RAM, 16 cores, 10 GigE
• Cluster 2-100 Nodes (32 TB – 1.6 PB)

Note: Nodes not to scale
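The capacity ranges above follow directly from per-node storage multiplied by node count. A small sketch of that arithmetic, using only the figures on the slide; the dictionary layout and the function name `cluster_capacity_tb` are illustrative assumptions, not an AWS API.

```python
# Per-node storage and allowed cluster sizes, as listed on the slide.
NODE_TYPES = {
    "HS1.XL": {"storage_tb": 2, "min_nodes": 1, "max_nodes": 32},
    "HS1.8XL": {"storage_tb": 16, "min_nodes": 2, "max_nodes": 100},
}


def cluster_capacity_tb(node_type, node_count):
    """Return raw cluster storage in TB for a given node type and count."""
    spec = NODE_TYPES[node_type]
    if not spec["min_nodes"] <= node_count <= spec["max_nodes"]:
        raise ValueError(
            f"{node_type} clusters support {spec['min_nodes']} to "
            f"{spec['max_nodes']} nodes, got {node_count}"
        )
    return spec["storage_tb"] * node_count


if __name__ == "__main__":
    print(cluster_capacity_tb("HS1.XL", 32), "TB")    # largest XL cluster
    print(cluster_capacity_tb("HS1.8XL", 100), "TB")  # largest 8XL cluster
```

The largest 8XL cluster comes to 1,600 TB, which is the 1.6 PB upper bound quoted on the slide.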

Amazon Redshift Pricing – Singapore & Sydney

Price Per Hour for XL Node ($US)
• On-Demand: $1.25
• 1 Year Reservation: $0.75
• 3 Year Reservation: $0.45

Simple Pricing

Number of Nodes x Cost per Hour

No charge for Leader Node

Pay as you go

So for example…

• 1 XL node reserved for 3 years:
  = $0.45 x number of hours in a month
  = approximately $340 per month

• 1 XL node cluster gives you:
  • 2 Cores
  • 16 GB RAM
  • 2 TB Disk
  • Plus 2 TB storage in S3 for backups & snapshots
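The "Number of Nodes x Cost per Hour" rule can be written out as a short calculation. This is a sketch using the Singapore and Sydney XL rates from the slide; the 730-hour month (8,760 hours per year divided by 12) is an assumption, and the names `HOURLY_RATE_USD` and `monthly_cost_usd` are illustrative.

```python
# Hourly XL node rates in US dollars, as listed on the pricing slide.
HOURLY_RATE_USD = {
    "on_demand": 1.25,
    "reserved_1yr": 0.75,
    "reserved_3yr": 0.45,
}

# Assumption: an average month of 8,760 / 12 = 730 hours.
HOURS_PER_MONTH = 730


def monthly_cost_usd(node_count, plan, hours=HOURS_PER_MONTH):
    """Number of nodes x cost per hour x hours; the leader node is not charged."""
    return node_count * HOURLY_RATE_USD[plan] * hours


if __name__ == "__main__":
    for plan in HOURLY_RATE_USD:
        print(f"1 XL node, {plan}: ${monthly_cost_usd(1, plan):,.2f} per month")
```

With a 730-hour month the 3-year reserved rate works out to about $328.50, and with a 31-day month (744 hours) to about $334.80, both in line with the rounded monthly figure quoted on the slide.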

Amazon Redshift is easy to use

• Provision in minutes

• Monitor query performance

• Point and click resize

• Built in security

• Automatic backups

Use cases

• Reporting data warehouse behind an OLTP system

• Data Mart to take load off the existing data warehouse

• Log file analysis for clickstream or gaming data (e.g. Advertising, Retail, Gaming)

• Queryable archive for data compliance (e.g. Telco - Call Detail Records)

• Machine generated sensor data analysis (e.g. Utility - smart meters, Resources - equipment failure prediction)

• As a data analytics system for live data (Gaming, Advertising)

Amazon Partner Network

(Technology Partners)

Flexibility & choice are key in the Cloud

Application Services

Compute Storage Database

Networking

AWS Global Infrastructure

Deployment & Administration

Thank you

Extending data integration into the Cloud

Colm Daniel

World Wide Cloud Alliances

Ron Lunasin

Sr. Director – Cloud Product Management

Today’s Agenda

• Informatica Cloud Overview

• Informatica Cloud Amazon Redshift Connector

• Demonstration

• Next Steps

• Q&A

Informatica: The Industry Leader in Cloud Integration

#1 by Customer Count

2000+ companies

#1 by Customers/Analysts

Gartner AppExchange

#1 by Data Processed

+40B transactions/month

#1 by Connectivity

Informatica Cloud Marketplace

Top Right @ the Core: Gartner Magic Quadrants

Employees in 26 Countries… and growing!

Global Presence & Global Perspective

New Cloud Connectors

http://www.informaticacloud.com/connectivity

Cloud Integration Customer Success Stories

App Integration
• Synchronizing Salesforce CRM with Netsuite and other business apps
• 1.5M rows of data synchronized daily

Data Replication
• Decreased operational issues from 70% to 30% of IT workload
• Enabled faster, more accurate decision-making based on timely, trusted data

Data Migration
• Consolidated Smith Barney and Morgan Stanley data on Day 1 of merger
• Managers didn’t lose momentum in ongoing recruiting efforts

Extend PowerCenter
• Hybrid deployment gives integration flexibility and scalability to meet various use cases
• Lowered time and resources needed for integrations by 80%

iPaaS (Build)
• Reduce time to build and distribute connectivity to 3rd party data sources
• Customize cloud integration templates to execute sophisticated integration workflows

Informatica Cloud

The Industry’s Most Comprehensive Cloud Integration and Data Management Solution

• Cloud Integration: Connecting your cloud apps
• Cloud Data Quality and MDM: Delivering the “Single Customer View”
• Cloud Process Automation: Guiding users to work efficiently with the data

Our Mission:

Unleash the Potential Of the Cloud

Cloud Amazon Redshift Connector

Ron Lunasin

Cloud Platform Adoption

Recognition of “The Next Wave” back in 2004

Move to the Cloud…

IT transitions from skeptic to partner to driver

Increasing IT involvement in Cloud decision making

• Pre-2010: LOB Owned (Outside of IT)
• 2010-2012: LOB Led (IT Approved)
• 2012-2013: Business-IT Collaboration
• 2013: Cloud First (IT Led)

Cloud is the Reality in the Enterprise

Driven by IT
• 90% Cloud decisions and operations involve IT (IDC)
• 66% SaaS POs signed by IT (IDC)

Led by Large Enterprises
• 76% enterprises have a formal cloud strategy (Forrester)
• 74% using cloud will increase cloud spend > 20% (IDC)

Large, Accelerating Market
• 4-6x growth rate of on-premise IT
• 20-27% CAGR
• $20-40B market (Forrester, IDC, Gartner, 451Group)
• 84% of net new software is now SaaS (IDC)
• 60% of all companies using SaaS w/in 12 months (Forrester)
• SaaS largest category, PaaS fastest growing (Forrester)

Informatica Cloud and Amazon Redshift:

Enabling cost-effective data warehousing

• Redshift Connector pre-release announced in February

• General availability in August 2013

InformaticaCloud.com/Amazon-Redshift

What did it use to take…

• Budget large capital expenditure

• Schedule a sales meeting with Oracle, IBM, Teradata, etc…

• Formal POC (Proof of Concept)

• Procure software and hardware

• Install and setup

• Start project

What it takes now…

• Go to the web and sign-up

• Start project!

Informatica Cloud Architecture Overview

Figure: architecture diagram with four numbered callouts, showing Your Company, the Secure Agent, the Marketplace and Amazon Redshift

Informatica Cloud Amazon Redshift demonstration

Figure: demonstration flow through the Firewall, the Informatica Cloud Secure Agent and Metadata Mappings, with numbered steps:

1 Build mapping and execute job
2 Retrieve Account Data
3 Put Account Data into Flat File
4 Transfer compressed Flat File to S3
5 Initiate copy from S3
6 Load data into Amazon Redshift
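The flat file, compress, S3 and copy steps above can be sketched in plain Python. This is not the Informatica connector's implementation, only an illustration of the same flow under stated assumptions: the table, bucket, key and IAM role names are hypothetical, the upload itself is left as a comment, and the statement uses Amazon Redshift's `COPY ... FROM 's3://...'` form with the `GZIP` and `CSV` options.

```python
import csv
import gzip
import io


def rows_to_gzip_csv(rows):
    """Steps 3 and 4: write rows to a flat CSV file and gzip-compress it in memory."""
    text = io.StringIO()
    csv.writer(text, lineterminator="\n").writerows(rows)
    return gzip.compress(text.getvalue().encode("utf-8"))


def build_copy_statement(table, bucket, key, iam_role):
    """Steps 5 and 6: build the COPY statement that loads the gzip CSV from S3."""
    return (
        f"COPY {table} FROM 's3://{bucket}/{key}' "
        f"IAM_ROLE '{iam_role}' GZIP CSV;"
    )


if __name__ == "__main__":
    # Hypothetical account rows standing in for the retrieved Account Data.
    payload = rows_to_gzip_csv([["1", "Acme"], ["2", "Globex"]])
    # Assumption: in practice the payload is uploaded to S3 here with an S3
    # client before the COPY statement is run against the cluster.
    statement = build_copy_statement(
        table="account",
        bucket="example-bucket",
        key="loads/account.csv.gz",
        iam_role="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
    )
    print(len(payload), "compressed bytes")
    print(statement)
```

Compressing before transfer keeps the upload small, and issuing a single COPY from S3 lets the cluster load the file rather than inserting row by row.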

Best practices to remember…

• The Amazon S3 bucket that holds the data files must be created in the same region as your cluster
– Files are deleted from Amazon S3 bucket when upload is complete
• Choose a batch size where the number of batches matches the number of slices in your cluster
– Each XL node has 2 slices, each 8XL node has 16
– If you have a 2 node XL cluster and 40,000 rows of data, choose a batch size of 10,000
– The Informatica Cloud Redshift connector can maximize Amazon’s parallel processing capabilities this way
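The batch-size rule above is simple arithmetic: count the slices in the cluster, then divide the row count by that number. A minimal sketch using the slice counts from the slide; the function names are illustrative and are not part of the Informatica connector.

```python
# Slices per node, as stated on the slide.
SLICES_PER_NODE = {"XL": 2, "8XL": 16}


def slices_in_cluster(node_type, node_count):
    """Total number of slices across all nodes in the cluster."""
    return SLICES_PER_NODE[node_type] * node_count


def batch_size_for(total_rows, node_type, node_count):
    """Rows per batch so that the number of batches matches the slice count.

    Uses ceiling division so the rows never spill into more batches
    than there are slices.
    """
    slices = slices_in_cluster(node_type, node_count)
    return -(-total_rows // slices)


if __name__ == "__main__":
    # The slide's example: a 2 node XL cluster and 40,000 rows.
    print(slices_in_cluster("XL", 2), "slices")
    print(batch_size_for(40_000, "XL", 2), "rows per batch")
```

For the slide's example this gives 4 slices and a batch size of 10,000, so each slice loads one batch in parallel.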

Next Steps

• Get started with Amazon Redshift

• Get started with Informatica Cloud

– InformaticaCloud.com

• Learn more about our Redshift Connector

– InformaticaCloud.com/Amazon-Redshift

Q&A Colm Daniel, cdaniel@informatica.com

Ron Lunasin, rlunasin@informatica.com

Thank you

AWS Reporting & Analysis

Ben Connors

Worldwide Head of Alliances - Jaspersoft

• Analysis of Cloud market motivations

• Overview of Cloud trends

• Cloud User category expectations

• How BI/Jaspersoft fits into Cloud strategies

• Demos

• Summary

© 2013 Jaspersoft Corporation 71

Session Overview

Industry Movement to the Cloud

• Cloud Growth:

– Cloud IT spend will grow from 3% to 17% of total (Morgan Stanley)

• Motivations:

– Agility

– Lower cost

– Faster time to value

– Less risk

• Use cases:

– CRM, ERP, HR, Online Gaming, Manufacturing, Expense Reporting, Big Data, Consumer Applications, Etc.

• Workloads:

– Dev/Test

– ‘Spiky’

– High Growth

– Reliable production

• BI usage matches these Cloud trends

Cloud Computing Growth

http://www.forbes.com/sites/louiscolumbus/2013/02/19/gartner-predicts-infrastructure-services-will-accelerate-cloud-computing-growth/

Asia/Pacific Cloud Growth

http://techaisle.com/blog/2012/11/lots-of-clouds-in-the-forecast-and-a-holiday-story/

Top Cloud Applications

Figure: bar chart comparing application types already deployed with those planned in 12 months (axis 0 to 50)

• INTERNAL BUSINESS APPLICATIONS TOP THE LIST; MOBILE SITES NEXT

What kinds of applications have you delivered using a cloud environment? Which do you plan to deliver during the next 12 months?

Source: Forrester Cloud Developer Survey, Q3 2012

2013: Current/future BI Cloud adoption trends

TechTarget 2013 Analytics & Data Warehousing Reader Challenges & Priorities Survey (N = 559)

Does your organization run or plan to run any part of its BI, analytics and data warehousing systems in the cloud?

• Yes, active cloud user: 15%
• Plan to start using the cloud in the next 12 months: 13%
• Considering, but no set plans: 32%
• No: 41%

60% planning, considering, or actively using

• The cloud continues to play a critical role in supporting BI, analytics, and DW initiatives with 3 out of 5 respondents reporting that they are planning, considering or actively using the cloud.

• Business User

– Efficient access to IT resources w/o red tape and delays

• Application Developer

– Platform with dev tools, middleware, capacity, configuration mgt.

• IT Operations

– Elastic capacity, secure, standard, keep users happy

• Management

– Control expenses & risk, delight customers/partners, move fast

Constituents - Cloud Expectations

Example Industry Use Cases for Business Intelligence

Industry: Data Analyzed
Online Gaming: # players vs. time, spend/player, popularity of weapons, scene usage
Education: Student attendance, test scores, teacher performance, spend/student
Telecom: Customer churn, data traffic patterns, billing per service
Government: Crime data, demographics, health trends, economic
Advertising: Click-through rates, conversion rates, regional variation
Retail: Product sales, Profits, Customer traffic, Product correlations
Manufacturing: Inventory, quality, vendor performance, logistics

Current State of Business Intelligence

• Standalone

• Expensive

• Desktop-based

• High Latency

Competing on Time and Information

“The New Factors of Production: Time and Information” Brian Gentile, Jaspersoft

But business users don’t have access to timely, actionable data

Why?

Most don’t spend their day inside a BI tool… nor do they want to!

Embedded BI - Why?

• For Best Decisions, Information Should Be:

– Relevant

– Timely

– Actionable

Embedded BI

• Maintains

– Context/Relevance

– Motivation/Timeliness

– Train of thought/Timeliness

– Actionable/Within application or beyond

– Security

• Broadens User Community

– Executives

– More knowledge workers

– Self-serve, Interactive

4xC Barriers to Embedded BI Adoption

• Cost. NEED: Develop for free. Pay only for what you use when you deploy
• Complex to Deploy. NEED: Deploy with push-button ease or use as a service
• Complex to Embed. NEED: Embed self-service BI through standard APIs
• Complex to Use. NEED: Easy to build and use BI assets

Simple, Low-Cost Embedded BI

3rd Gen Embedded BI Breaks Barriers

• Cost: Free + usage-based pricing
• Complex to Embed: HTML5/CSS + RESTful web services
• Complex to Deploy: Push-button on-premises deployment and Cloud BI service
• Complex to Use: Easy to build for BI Builders on any data and self-serve for BI Consumers on any device

3rd Generation Embedded BI

We Need “Intelligence Inside”

We want information to FIND US, not the other way round

“We need Intelligence Inside the applications and business processes we use every day.”

– Pipeline dashboard inside SaaS CRM app

– Performance report inside partner portal

– Salary data visualizations inside HR intranet

– Portfolio analytics inside client website

– Tickets crosstab inside custom helpdesk app

– Interactive charts inside native mobile app

• Embeddable Architecture: Open web standard architecture makes integration with any app easy to perform
• Cloud Ready: Multi-tenant architecture, 100’s of SaaS customers, top selling BI solution on Amazon
• Affordable: Up to 80% less than traditional BI platforms while delivering significant power & capabilities
• Proven Platform: Millions of users, 380,000 community members, deployed in 130,000+ applications
• Full Self-Service BI Suite: Address all user requirements with interactive reports, dashboards, analysis, and data integration

Jaspersoft: The Intelligence Inside

Product Overview

Jaspersoft Products

• Reporting Engine
• Visual Report Design Environment
• Ad Hoc Reports, Dashboards, In-Memory Analysis Server
• Powerful OLAP Data Analysis
• Studio

Design Any Report . . .

… Dashboard

… or Analytic View

... Using Any Data Type

POJO files

Relational Files Relational Big Data Files

Redshift

… bringing Intelligence to Any App

… with a World-Class BI Platform

Reporting, Dashboards, Visualization, OLAP Analysis

Columnar-Based In-Memory Engine

Data Connectivity to Any Data

100% Web Standards: CSS, .JS, .JSP, Java

Extensive APIs: HTTP, SOAP, REST

HTML5 Browser, Native Mobile Apps

Business Metadata Layer

Data Integration

Data Virtualization Direct

Redshift EMR On-Premises RDS SaaS

Jaspersoft Customers

Software & Technology

Financial Services

Public Sector

Telecommunications

Travel & Transportation

Manufacturing

Healthcare/Pharmaceutical

Jaspersoft AWS Hourly: 500+ Customers in 6 Months!

Jaspersoft/AWS Customer:

BizFlow/Samsung Korea

• Business Process Management (BPM)

• Challenge

– Monitor/Analyze Business Activities

• Solution

– Jaspersoft on Cloud

• Results

– Customers avoid infrastructure

– Increased BizFlow revenue

– Self-service BI

– Higher value analytics

http://www.bizflow.com/business-process-management/samsung-heavy-industries

Jaspersoft/AWS Customer:

Sage Human Capital

• Recruiting Firm for High Tech companies

• Challenge

– Visibility for recruiting process status

• Internal

• External

• Solution

– Jaspersoft on AWS

• Results

– Dashboards set up in two hours

– Disrupting the industry

“Jaspersoft for AWS allows me to have big company analytics for a small business price. With this information, we can be proactive instead of reactive.” - Paul Grewal, CEO Sage Human Capital

Jaspersoft/AWS Customer:

Blue Consulting

• Administration Systems for Schools

• Challenge

– Data from many systems

– Difficult for everyone, including teachers, to access

• Solution

– Jaspersoft on AWS, Amazon Redshift

• Results

– Over 200 schools provide reporting to teachers, even at home

– More informed decisions, educational approaches, resource optimization

“Our users LOVE Jaspersoft ad hoc reporting, and the performance of the system with Redshift.” -Russ Davis, Founder & CEO

Jaspersoft BI for AWS Overview

Jaspersoft 5 Demo

Jaspersoft Integrated with Amazon Redshift

• Jaspersoft is the first BI service that you can buy per hour

– No user limitations, no monthly fee

– Less than $1 per hour

• First BI service to automatically connect to your AWS data

– 10 minutes from launch to visualizing your data in RDS or Redshift

– AWS Security Integration

• Released February, 2013

– Over 500 customers

Jaspersoft Pro on AWS

Thank you
