PowerPoint Presentationdownload.microsoft.com/download/F/3/1/F31FC53E-9216-4921...Hadoop and Spark as a Service on Azure *IDC study “The Business Value and TCO Advantage of Apache


Page 4

Overview of the Microsoft cloud data platform

Page 5

[Diagram: platform layers, from data sources to action]

• Data Sources: Apps; Sensors and devices; Data
• Information Management: Event Hubs; Data Catalog; Data Factory
• Big Data Stores: SQL Data Warehouse; Data Lake Store
• Machine Learning and Analytics: HDInsight (Hadoop, Spark, Storm, HBase managed clusters); Stream Analytics; Data Lake Analytics; Machine Learning
• Intelligence: Cortana; Bot Framework; Cognitive Services
• Dashboards & Visualizations: Power BI
• Action: People; Automated Systems; Apps; Web; Mobile; Bots

Page 6

From data to decisions and actions

• Descriptive [Reports]: What happened?
• Diagnostic [Interactive Dashboards]: Why did it happen?
• Predictive [Machine Learning]: What will happen?
• Prescriptive [Recommendations & Automation]: What should I do?

(Axis label: Insight)

Page 7

Data sources: LOB Applications, Social, Devices, Clickstream, Sensors, Video, Web, Relational

Azure Data Lake Store

A highly scalable, distributed, parallel file system in the cloud specifically designed to work with a variety of big data analytics workloads

HDInsight

• Batch: MapReduce
• Script: Pig
• SQL: Hive
• NoSQL: HBase
• In-Memory: Spark
• Predictive: R Server

ADL Analytics

• Batch: U-SQL

Page 8

About Azure HDInsight

Page 9

Microsoft Hadoop Stack

• Analytics: Azure HDInsight; Machine Learning
• Storage: Local (HDFS) or Cloud (Azure Blob/Azure Data Lake Store)

Page 10

Azure HDInsight

Fully-managed Hadoop and Spark for the cloud

100% Open Source Hortonworks data platform

Clusters up and running in minutes

Supported by Microsoft with industry’s best SLA

Familiar BI tools for analysis

Open source notebooks for interactive data science

63% lower TCO than deploying Hadoop on-premises*

Hadoop and Spark as a Service on Azure

*IDC study “The Business Value and TCO Advantage of Apache Hadoop in the Cloud with Microsoft Azure HDInsight”

Page 12

• Perimeter Level Security: Virtual Network; Network Security (i.e. Firewalls); Gateway Service
• Multi-User Authentication: Kerberos; Azure Active Directory
• Authorization using Apache Ranger: Hive policies; HBase policies; File and Folder level ACLs on ADLS
• Data Security: Encryption at rest supported on both Azure Storage Blob and ADLS

Page 13

HDInsight case studies

Page 16

About HDInsight: Hive

Page 18

Platform Core SQL Engine Connectivity

• Ad-Hoc
• Drill-Down
• BI Tools: Tableau, Excel
• Continuous ingestion from operational DB
• Slowly changing dimensions
• Multidimensional Analytics
• MDX Tools
• Excel

Legend: Existing; Development; Emerging

Page 20

Interactive Hive cluster (new)

SDK, PowerShell

JDBC, ODBC, Visual Studio, Hue, Ambari

Hadoop cluster

Page 21

Demo: HDInsight cluster & Hive

Page 22

Enterprise data warehouse based on HDInsight Hive

Page 26

Pay only for the time the cluster was actually used

Since data and metadata are persisted, the experience is as if the cluster was never deleted

                                             | Always-on cluster (persistent)                 | Cluster as a service (on demand)
Storage choice                               | Local HDFS, Azure Blob, Azure Data Lake Store  | Azure Blob, Azure Data Lake Store
Job scheduling                               | Oozie                                          | Azure Data Factory
Data persistence after cluster deletion      | N/A                                            | Azure Blob, Azure Data Lake Store
Metadata persistence after cluster deletion  | N/A                                            | Azure SQL
Billing                                      | Billing for entire time cluster is up          | Billing per job
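The billing difference between the two models can be illustrated with a rough calculation. This is only a sketch: the hourly rate and the job schedule below are hypothetical numbers, not Azure prices, and the model ignores cluster start-up time and storage charges.

```python
from typing import Sequence


def persistent_cost(hourly_rate: float, hours_up: float) -> float:
    """Always-on cluster: billed for the entire time the cluster is up."""
    return hourly_rate * hours_up


def on_demand_cost(hourly_rate: float, job_hours: Sequence[float]) -> float:
    """On-demand cluster: billed only while a job keeps a cluster alive."""
    return hourly_rate * sum(job_hours)


# Hypothetical inputs chosen for illustration only.
rate = 10.0                # assumed price per cluster-hour
month_hours = 30 * 24      # cluster left running for a 30-day month
nightly_jobs = [2.0] * 30  # one 2-hour batch job per night

always_on = persistent_cost(rate, month_hours)
on_demand = on_demand_cost(rate, nightly_jobs)
print(f"always on: {always_on:.2f}, on demand: {on_demand:.2f}")
```

With these assumed numbers the on-demand model is cheaper because the cluster is idle most of the day; a cluster that is busy around the clock would show no such gap.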

Page 28

Optimization Summary

• Choose from dozens of VM sizes and use scale-out capability to increase parallelism
• Choose the Tez execution engine
• Avoid reading entire partitions by breaking files into pieces
• Use ORC, a columnar format supported by Hive that also allows you to use ACID and LLAP
• Enable vectorization, which lets Hive process 1024 rows at a time to make execution faster
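Several of these optimizations come down to a few HiveQL statements. The sketch below builds such a script as a string. It assumes the columnar format meant above is ORC and that the 1024-row batching refers to Hive's vectorized execution. The settings `hive.execution.engine` and `hive.vectorized.execution.enabled` are standard Hive properties, while the function name, table name, and columns are hypothetical.

```python
from typing import Dict


def hive_optimization_script(table: str,
                             columns: Dict[str, str],
                             partitions: Dict[str, str]) -> str:
    """Return HiveQL that enables Tez and vectorization and creates a
    partitioned ORC table. All table and column names are illustrative."""
    col_defs = ", ".join(f"{name} {typ}" for name, typ in columns.items())
    part_defs = ", ".join(f"{name} {typ}" for name, typ in partitions.items())
    statements = [
        # Run queries on Tez instead of classic MapReduce.
        "SET hive.execution.engine=tez",
        # Process rows in batches rather than one at a time.
        "SET hive.vectorized.execution.enabled=true",
        # Columnar storage plus partitioning to limit the data scanned.
        f"CREATE TABLE IF NOT EXISTS {table} ({col_defs}) "
        f"PARTITIONED BY ({part_defs}) STORED AS ORC",
    ]
    return ";\n".join(statements) + ";"


script = hive_optimization_script(
    "sales_orc",                          # hypothetical table
    {"id": "BIGINT", "amount": "DOUBLE"},
    {"dt": "STRING"},
)
print(script)
```

The generated script could be pasted into any Hive client; whether each setting helps depends on the workload and the cluster version.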

Page 30

Demo: Query Authoring Tools

Demo: 100GB query with Batch

Page 31

Azure China official website: https://www.azure.cn/ (latest product and solution information, technical documentation, and SDK downloads)

Azure Marketplace in China: https://market.azure.cn/

Azure 1 RMB Trial: https://www.azure.cn/pricing/1rmb-trial-full/

WeChat official account "Microsoft 云科技" (Microsoft Cloud Technology); mobile app "Azure 云助手" (Azure Cloud Assistant)

Developer Notes for Azure in China Applications: https://www.azure.cn/dev-notes/ (outlines the differences developers need to be aware of between global Azure and Azure in China)

Page 36

• Top-level project
  • Apache Kylin is the only Apache top-level open source project from China; its core developers and contributors are all based in China
• Industry recognition
  • Won InfoWorld's "Best Open Source Big Data Tools" award two years in a row, this year alongside Google TensorFlow
• User recognition
  • More than 100 large companies in China and abroad use Kylin in production as their big data analytics platform, across many industries

Page 41

Kylin's O(1) algorithm makes query performance independent of dataset size
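One way to read this claim is that queries are answered from precomputed aggregates rather than by scanning raw rows. The toy sketch below illustrates that idea with an in-memory cube. It is not Kylin's implementation, and the function names, dimensions, and data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(rows, dimensions, measure):
    """Precompute the sum of `measure` for every combination of dimensions.
    This pass scans all rows once; later queries never touch the rows."""
    dims_sorted = sorted(dimensions)
    cube = defaultdict(float)
    for row in rows:
        for r in range(len(dims_sorted) + 1):
            for dims in combinations(dims_sorted, r):
                key = (dims, tuple(row[d] for d in dims))
                cube[key] += row[measure]
    return dict(cube)


def query(cube, **filters):
    """Answer an aggregate query with a single dictionary lookup, so the
    cost does not depend on how many rows were loaded into the cube."""
    dims = tuple(sorted(filters))
    values = tuple(filters[d] for d in dims)
    return cube.get((dims, values), 0.0)


# Hypothetical sample data.
rows = [
    {"region": "east", "year": 2016, "sales": 10.0},
    {"region": "east", "year": 2017, "sales": 5.0},
    {"region": "west", "year": 2017, "sales": 7.0},
]
cube = build_cube(rows, ["region", "year"], "sales")
print(query(cube, region="east"))             # total sales for east
print(query(cube, region="west", year=2017))  # one cell of the cube
```

The trade-off in this sketch is that the build step and the cube size grow with the number of dimension combinations, which is the price paid for constant-time lookups.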

Page 42

Very large data, very high performance, very high concurrency

Page 43

Large-scale data analysis, no coding required

Page 46

[Deployment diagram]

• Azure Resource Manager
• Resource Group
• Virtual network
• Kylin server
• Blob Storage
• HDInsight

Page 47

▪ Azure: a mature cloud computing platform

▪ HDInsight: automatic scaling

▪ Power BI: self-service visual BI

▪ Apache Kylin: high performance + high concurrency + standard SQL
