24

UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

  • Upload
    others

  • View
    7

  • Download
    0

Embed Size (px)

Citation preview

Page 1: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena
Page 2: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

UNICORNとは?

UNICORNの課題 UNICORNの事例

Page 3: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

UNICORNとは?

Page 4: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

アプリデベロッパー向けの

Automated Marketing Platformです

Page 5: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

UNICORNの課題

Page 6: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

UNICORNはRTBでの広告買い付けに対応しています。

ユーザー

サイト

UNICORN

② ④

⑤ RTB

① ユーザーがサイトに訪問

②広告買い付けリクエストを受け

③最適な広告を選ぶ

④買う単価を決める

⑤他のDSPより単価が高い場合に広告が表示される

秒間20万Request

一万種類以上の広告在庫

10ms前後100+の要素

機械学習の精度

Page 7: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

Architecture

Amazon EC2 Elastic Load Balancing (ELB) Fluentd

Amazon S3

Kafka

Amazon Aurora MongoDB

Amazon Athena

Amazon Redshift

Flink

Amazon Glue Machine Learning

Real Time

AnalyticsStorage

Database

Page 8: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

Architecture

Max 200,000 QPSReal Time

PB Size DataStorage

Fast and Cheaper Analytics

Variety Database

Page 9: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

UNICORNの事例

Page 10: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

Architecture

S3を活用してHttpRequestの処理スピードを10ms台まで短縮しました!

Page 11: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

Architecture - Real Time

Request

Program

Cache

Response

DatabaseNew Request Connections

Maintenance

Page 12: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

Architecture - Real Time

Shared Memory

Response

Request

Program

S3 DatabaseFast High Availability Maintain Easy10ms

100ms → 10ms !

Page 13: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

Architecture

S3とAthenaの利用で機械学習の高速化に成功!!!

Page 14: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

Architecture - Storage

Aurora Redshift MongoDB

Driver

Hive

Report Aggregation Machine Learning

HDFS

Hard AutoScale

Wait slowest

Page 15: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

Architecture - Storage

Aurora Redshift MongoDB

Glue

Athena

Report Aggregation Machine Learning

S3Read Fast

Async

Page 16: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

Architecture - Analytics

Amazon EC2 Elastic Load Balancing (ELB) Fluentd

Amazon S3

Kafka

Amazon Aurora MongoDB

Amazon Athena

Amazon Redshift

Flink

Amazon Glue Machine Learning

Analytics40(0)TB / Day

10,000 Jobs / Day

Page 17: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

Architecture - Analytics

Parquet

Page 18: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

Architecture - Analytics

Athena Cost

90% Down!

Page 19: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

Architecture - Analytics

S3 Cost

x 10!

Page 20: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

JSON And Parquet対象5ファイルに対してAthenaのSelectを実行します。

SELECT col_1 FROM table

JSONの場合Athena Scan Data: 5GB * 5 = 25GBS3 Request Count: 5

Parquetの場合Athena Scan Data: 5GB * 5 / 10 = 2.5GBS3 Request Count: 5

Parquet使うとAthena Costが1/10に!

※ 5GB / file,10 columns / row

Page 21: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

JSON And Parquet対象5ファイルに対してAthenaのSelectを実行します。

SELECT col_1 … col_10 FROM table

JSONの場合Athena Scan Data: 5GB * 5 = 25GBS3 Request Count: 5

Parquetの場合Athena Scan Data: 5GB * 5 = 25GBS3 Request Count: 5 * 10 = 50

※ 5GB / file,10 columns / row

Parquetのファイル数を考慮しないとS3コスト増!

Page 22: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

JSON And Parquet

Parquetを使用する時にS3のアクセス回数も考慮する必要があります!!!

Cost = Athena Scan Data + S3 Requests Count解決策:Parquet生成する時に , 1パーティション1ファイルに固める

Page 23: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

チーム構成:

Engineer 5人

BizDev 3人

AWS EC2 400+台

ほぼ全日本のモバイル端末からの広告トラフィックを処理できています。

これからGlobal展開していきます。QPSまだまだ上がります!

[email protected]

Page 24: UNICORN · 2018. 8. 1. · 機械学習の精度. Architecture Elastic Load Balancing (ELB) Amazon EC2 Fluentd Amazon S3 Kafka Amazon Aurora Amazon Redshift MongoDB Amazon Athena

ご清聴ありがとうございました