30
大数据和人工智能计算 王绍翾(大沙) 2018.11 2018携程技术峰会 上海

大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

大数据和人工智能计算

王绍翾(大沙)

2018.11

2018携程技术峰会 上海

Page 2: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

博通(Broadcom)

High-Perf Platform

脸书(Facebook)

Social Graph Storage

阿里巴巴

Real-Time Data Infra

北京大学

EECS

美国加州大学圣地亚哥分校

Computer Engineering

负责大数据实时计算平台和算法平台

王绍翾(花名“大沙”)

阿里巴巴资深技术专家

[email protected]

Page 3: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

Changing QueryFixed Data

大数据计算的类型

Applications

历史数据 增量数据

异构数据源

数据湖(Data Lake)

数据仓库(Data Warehouse)

清洗后的数据源

批计算

Page 4: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

Changing QueryFixed Data

Fixed QueryChanging Data

大数据计算的类型

批计算 流计算

Page 5: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

Web Tier

DB Tier MQ

DataHub

Data Pipeline HBase Dashboard

Exactly-Once

Lots of Events

Sub second latency

Highly Available

双十一大屏

Page 6: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

Changing QueryFixed Data

Fixed QueryChanging Data

Changing QueryChanging Data

大数据计算的类型

批计算 流计算 交互式分析

Page 7: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

A/B Testing

Page 8: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

Impression

Click

Transaction

Logs

Parser

Parser

Parser

Filter

Filter

Filter

Join UDF AGG OLAP

Data Processing PipelineOLAPEngine

A/B Testing

Page 9: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

数据计算

提供完整的数据湖/数仓解决方案

Data Lake/Warehouse Platform

Data Lake/Warehouse Solutions

支持视频流解析&结构化处理赋能城市大脑数据处理

城市大脑 ET City Brain

Support video stream and structuring process

Empower ET City Brain data processing

工业大脑

工业大数据实时监控&预测赋能工业大脑解决方案

ET Industrial Brain

Industrial big data real time monitoring and

prediction

Empower ET Industrial Brain Solutions

IOT

支持云上IOT数据处理实现云端计算一体化

Support IOT data process on Cloud

大数据计算的应用场景

Page 10: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

人工智能计算

Data IngestionData Process and Transformation

ModelTraining

Testing and Validation

Online Serving

Change model or parameters

Transformation Training Predication

Page 11: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

Data Engineering

Acquisition

Governance

Data Science / AI

Visualizationand Analysis

Data Process and Transformation

Production

Model Training& Testing

ProductionData Pipelines

Reports,Dashboards

Online Serving

人工智能计算

Change model or parameters

Page 12: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

城市大脑

杭州城市大脑使用阿里云流计算,实时监控交通状况,通过智能调度道路红绿灯、实时更新导航系统,优化城市道路拥堵情况。

Page 13: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high
Page 14: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

大数据智能计算

Business Intelligence

Artificial Intelligence

Big Data Infrastructure

Page 15: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high
Page 16: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

Batch

Stream

AI

OLAP

IoT

BigData & AI

能否用一套计算引擎解决所有问题

Page 17: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

流计算引擎

samza Apex

批计算引擎

Apache Flink

统一的大数据智能计算引擎

用Apache Flink构建大数据智能平台

AI计算引擎OLAP分析引擎

Page 18: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

Introducing Alibaba Blink

Blink1.0: high performance stream compute engine

Apache FlinkAlibaba’s Improvements

Blink2.0: a new unified high performance compute engine for complete data applications(including batch, stream, AI, and IoT)

阿里巴巴Blink

Page 19: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

性能提升

快速的容错

• Incremental Checkpoint 2016• Fain-grained Recovery 2016• Barrier Alignment Improvement 2017

• Async Operator 2016• Credit Based Flow Control 2017• Load Auto Balance 2017

天然的Flink纯流式计算的低延迟

大规模高并发部署的优化 2016

性能提升

资源配置自动化 2018

• 大量的Query Optimization 2017-2018

主导制定 Flink SQL 语义

完善 Flink SQL 功能

• Dynamic Table 2016-2017• Retraction 2016-2017

• Aggregation,Join,Window 2017• 跑通全部TPCH/TPCDS Query 2018

贡献社区

贡献社区

部分贡献社区

部分贡献社区

贡献社区

贡献社区

Blink1.0

Page 20: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

API的统一: same query,same results

Query Processing的统一: unified query optimization and query execution framework

Blink2.0: unified compute engine for complete data applications

应用的统一: switch between batch processing and streaming seamlessly

Page 21: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

real-time

return one final result

correctness

emit results as early as possible

批计算和流计算API的统一

Batch Processing Stream Processing

VS

in stream processing, it emits intermediate results, and keeps refining the results to ensure correctness

VS

Page 22: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

WHAT & HOW: results are calculated

WHEN: to emit a (intermedia) result

HOW: to refine the results

ANSI SQL can Describe Stream Processing

可以用SQL统一流和批的计算

Can be fully described by SQL

Does not affect business logic

Can be solved by SQL engine

批计算和流计算API的统一

Page 23: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

Runtime

DAG API & Operators

Query Processor

Query Optimizer & Query Executor

SQL & Table API

Relational

Local

Single JVM

Cloud

GCE, EC2

Cluster

Standalone, YARN

SQL

& TableAPI

Logical

PlanPhysical

Plan

Execution

DAG

completely same between batch & stream processing

Optimizer

stream processing has some unique design

Same Results

Batch mode

Same SQL Query

Stream Mode

Blink2.0 统一了SQL Engine的QP

Page 24: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

CustomizableJob Scheduling

QueryOptimizations

ExecutionOptimizations

Batch FriendlyFailover

VariousShuffle Service

Optimizations for Batch Processing in Blink2.0

Page 25: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

61.1

476.3

29.5 26.2

63.5

13.0

57.886.1

135.6

43.1

135.6

29.255.4

25.037.3 39.7

117.5

199.5

69.043.6

589.1

39.415.3 8.3 14.1 10.1 18.1

3.1 11.0 10.9 19.9 12.3 3.3 8.4 8.8 5.0 5.5 5.4 9.7 20.3 6.4 6.6

27.4 5.8

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22

1T数据量 TPCH性能对比

Flink 1.6.0 Blink 2.0

2372s vs 236s > 10X

Optimizations for Batch Processing in Blink2.0

Page 26: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

Next Steps

Grand Unification of Data Processing

Flink Machine Learning/AI

Switch between batch processing and streaming

seamlessly

PyFlink, TableAPI, TensorFlow, Julia, Flink ML

Improvements

EcosystemHive, Zeppelin, Jupyter, Livy

Page 27: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

TensorFlow on Flink

Data IngestionData Process and Transformation

ModelTraining

Testing and Validation

Deployment

Change model or parameters

Page 28: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

Flink on Zeppelin

Page 29: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

Flink Forward China – 北京 –12.20

Page 30: 大数据和人工智能计算 - CtripIntroducing Alibaba Blink Blink1.0: high performance stream compute engine Apache Flink Alibaba’s Improvements Blink2.0: a new unified high

本PPT来自2018携程技术峰会

更多技术干货,请关注“携程技术中心”微信公众号