Keystone event processing pipeline on a dockerized microservices architecture

Zhenzhong Xu

About Me

● Real-time Data Infrastructure – Netflix

● Cloud Infrastructure – Microsoft

About Netflix

● 83M+ Subscribers

● 125M+ Streaming Hours / Day

● > 1/3 Peak NA Internet Traffic

● Thousands of Device Types

● Many Tens of Thousands of VMs

● 3 Active-Active Regions Across the World

[Diagram, built up over five slides: an Observe / Orient / Decide / Act (OODA) loop, annotated with the labels Innovation, Big Data, Culture, Cloud, and CD]

Microservices Ecosystem

Why an Event Processing Platform at Netflix?

● 500+ Billion events generated per day

● 1+ trillion events processed per day

○ >1 PB

○ 4M – 16M / sec

○ 13 GB – 43 GB / sec

● Message Payload: 3 KB – 10 MB
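As a rough sanity check, the daily totals above can be converted to average per-second rates. This is only a sketch: it uses the lower bounds quoted on the slide (1 trillion events and 1 PB per day) and assumes decimal units (1 PB = 10^15 bytes); the variable names are illustrative.

```python
# Convert the slide's daily lower bounds into average per-second rates.
# Assumptions (not from the slides): decimal units, exactly 1e12 events
# and 1e15 bytes per day, traffic averaged evenly over 24 hours.
SECONDS_PER_DAY = 24 * 60 * 60

events_per_day = 1e12   # "1+ trillion events processed per day"
bytes_per_day = 1e15    # ">1 PB"

avg_events_per_sec = events_per_day / SECONDS_PER_DAY
avg_bytes_per_sec = bytes_per_day / SECONDS_PER_DAY
avg_bytes_per_event = bytes_per_day / events_per_day

print(f"{avg_events_per_sec / 1e6:.1f}M events/sec on average")  # 11.6M
print(f"{avg_bytes_per_sec / 1e9:.1f} GB/sec on average")        # 11.6
print(f"{avg_bytes_per_event:.0f} bytes per event on average")   # 1000
```

The average event rate of about 11.6M/sec falls inside the quoted 4M – 16M / sec range, which suggests that range describes the daily trough and peak.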

Data Driven Culture

● Realtime System Failure Detection

● A/B Testing

● Recommendation Algorithm

● Fraud Detection

● Distributed Tracing

● Log Querying

Paved Road in a Microservices Ecosystem

Microservices produce events → Event Processing Pipeline → Storage services and Batch/Stream Processing services

Supports Batch & Streaming

Evolution of Netflix Keystone Pipeline

In the Old Days ...

EMR

EventProducers

About a year ago

[Diagram components: Event Producer, Suro, Suro Proxy, Kafka, Consumer Kafka, Suro Router, Druid, EMR, Stream Consumers]

Today

[Diagram components: Event Producer, KS Proxy, Fronting Kafka, Samza Router, Consumer Kafka, EMR, Stream Consumers, Control Plane, Self Service UI]

Event flow

Keystone Pipeline As a Service

[Diagram, repeated across five slides: Event Producer, Fronting Kafka, Samza Router, Consumer Kafka, EMR, Stream Consumers, Control Plane, Self Service UI]

What exactly is Keystone?

Keystone is ...

… a collection of microservices & components

● Stream Processing Service

● Elastic Pub/Sub Queue

● Producer API

● Control Plane

● Consumer API

● Self Service UI

Keystone is ...

… a single self-contained logical service

Event Processing Pipeline

Keystone is ...

… a self-scaling, multi-tenant service that embraces CI/CD

Keystone is ...

… a self-healing, cloud-failure-tolerant service that guarantees at-least-once delivery semantics

Let’s drill down ...

For the purposes of this talk, we’ll focus on...

Stream Processing Service

Elastic Pub/Sub Queue

Producer SDK

Control Plane

Consumer SDK

Self Service UI

Overview

Self Service UI

Routing Infrastructure

Routing Infrastructure

[Diagram labels: EC2 Instances; Zookeeper (instance ID assignment); Job (×3); ksnode; Checkpointing Cluster; Server Group (Cluster); Store logs in S3]

Routing Infrastructure

[Diagram labels: Checkpointing Cluster; 0.9.1; Go; C language]

Control Plane

Custom Cluster Orchestration and Scheduling Layer

Control Plane

• Decides container resources

• Schedules container placements

• Orchestrates cluster deployments
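To make the placement responsibility concrete, here is a minimal sketch of a scheduler that assigns containers to instances. The slides do not say which algorithm Keystone's control plane uses; first-fit decreasing is swapped in purely for illustration, and every name here (`Instance`, `ContainerSpec`, `place`) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Instance:
    """A host with remaining capacity (hypothetical model)."""
    name: str
    cpu_free: float
    mem_free_mb: int
    containers: List[str] = field(default_factory=list)


@dataclass(frozen=True)
class ContainerSpec:
    """Resources decided for one container (hypothetical model)."""
    name: str
    cpu: float
    mem_mb: int


def place(specs: List[ContainerSpec], instances: List[Instance]) -> List[str]:
    """First-fit decreasing: place the largest containers first, each on the
    first instance with enough free CPU and memory. Returns the names of
    containers that could not be placed."""
    unplaced = []
    for spec in sorted(specs, key=lambda s: (s.cpu, s.mem_mb), reverse=True):
        for inst in instances:
            if inst.cpu_free >= spec.cpu and inst.mem_free_mb >= spec.mem_mb:
                inst.cpu_free -= spec.cpu
                inst.mem_free_mb -= spec.mem_mb
                inst.containers.append(spec.name)
                break
        else:
            # No instance had room; the cluster would need to grow.
            unplaced.append(spec.name)
    return unplaced


hosts = [Instance("i-1", 4, 8000), Instance("i-2", 4, 8000)]
wanted = [
    ContainerSpec("a", 3, 2000),
    ContainerSpec("b", 2, 2000),
    ContainerSpec("c", 2, 2000),
    ContainerSpec("d", 2, 2000),
]
leftover = place(wanted, hosts)
print(leftover)  # ['d']
```

In this run container `d` cannot be placed even though one CPU remains free on `i-1`: leftover capacity that is too small to use is one simple form of fragmentation.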

Design Decisions?

Distributed systems are all about trade-offs.

Container

● Process Isolation

● Fast Startup

Service Protocol

● Declarative

● Idempotent

● Reconciliation
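The three properties above can be sketched together. In this minimal illustration the caller declares the desired state, a reconciliation step computes the difference from the actual state, and applying the resulting actions is idempotent. The function names and the dictionary shape are assumptions made for the example, not Keystone's actual protocol.

```python
from typing import Dict, List, Optional, Tuple

State = Dict[str, dict]
Action = Tuple[str, str, Optional[dict]]


def reconcile(desired: State, actual: State) -> List[Action]:
    """Compute the actions that move `actual` toward the declared `desired` state."""
    actions: List[Action] = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name, spec))
        elif actual[name] != spec:
            actions.append(("update", name, spec))
    for name in sorted(actual.keys() - desired.keys()):
        actions.append(("delete", name, None))
    return actions


def apply_actions(actions: List[Action], actual: State) -> State:
    """Apply actions in place. Repeating the same actions leaves the same
    result, so a retry after a partial failure is safe."""
    for op, name, spec in actions:
        if op == "delete":
            actual.pop(name, None)
        else:
            actual[name] = spec
    return actual


desired = {"router-a": {"containers": 3}, "router-b": {"containers": 1}}
actual = {"router-a": {"containers": 2}, "router-c": {"containers": 1}}

plan = reconcile(desired, actual)
apply_actions(plan, actual)
apply_actions(plan, actual)  # applying the same plan again changes nothing
print(reconcile(desired, actual))  # [] once the states converge
```

Because the caller sends the full desired state rather than a sequence of commands, a lost or duplicated request does no harm: the next reconciliation pass converges to the same result.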

State Management

● Stateless vs Stateful service

● Single source of truth

Scaling

● Self Scaling

● Partition boundary

● Idempotent operations

● Immutable server deployments

Delivery Semantics

● At-most-once

● At-least-once (best effort)

● Exactly-once

At-least-once under failure condition

● Checkpointing mechanism

● Optimize for writes

● Occasional reads
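A minimal sketch of how checkpointing yields at-least-once delivery under failure: the consumer writes its offset only after a batch has been delivered, and reads it back only on restart, so a crash causes replay (duplicates) rather than loss. The in-memory `log`, `checkpoint` dictionary, and `crash_at` parameter are hypothetical stand-ins for a Kafka partition, a checkpoint store, and a real failure.

```python
from typing import Dict, List, Optional


def consume(log: List[str], checkpoint: Dict[str, int], sink: List[str],
            batch_size: int, crash_at: Optional[int] = None) -> None:
    """Deliver events starting from the last checkpointed offset.

    The checkpoint is read once at startup (occasional read) and written
    after every delivered batch (frequent write)."""
    offset = checkpoint["offset"]
    while offset < len(log):
        batch = log[offset:offset + batch_size]
        for i, event in enumerate(batch):
            if crash_at is not None and offset + i == crash_at:
                raise RuntimeError("simulated crash before checkpoint")
            sink.append(event)
        offset += len(batch)
        checkpoint["offset"] = offset  # commit only after delivery


log = ["e0", "e1", "e2", "e3", "e4"]
checkpoint = {"offset": 0}
sink: List[str] = []

try:
    consume(log, checkpoint, sink, batch_size=2, crash_at=3)
except RuntimeError:
    pass  # the process died mid-batch; the checkpoint still says offset 2

consume(log, checkpoint, sink, batch_size=2)  # restart from the checkpoint
print(sink)  # ['e0', 'e1', 'e2', 'e2', 'e3', 'e4']
```

Event `e2` is delivered twice because the crash happened after it was delivered but before the batch was checkpointed. Nothing is lost, which is the trade-off at-least-once semantics accept; downstream consumers must tolerate duplicates.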

Multi-tenancy

● Isolation

● Heterogeneous

● Cluster fragmentation

Failure Recovery

• Back pressure

• Network blip

• Container level failure

• Instance level failure

• Zone level failure

• Cluster level failure - Kafka-Kong

• Regional failure - Chaos-Kong

Stream Processing Engine

• Discovery integration

• Custom wire format integration

• Samza: Per partition serialized process loop

• Samza: Simple payload transformation

• Pluggable abstraction
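The processing model in the bullets above can be illustrated conceptually: each partition is handled by its own loop, one message at a time and in order, with a simple payload transformation and pluggable sinks. This is not Samza's API (Samza is a Java framework); the function and type names below are hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Event = Dict[str, object]
Transform = Callable[[Event], Event]
Sink = Callable[[Event], None]


def run_partition(messages: Iterable[Event], transform: Transform,
                  sinks: List[Sink]) -> None:
    """Process one partition's messages strictly in order, one at a time."""
    for message in messages:
        event = transform(message)   # simple payload transformation
        for sink in sinks:           # pluggable destinations
            sink(event)


def add_route_tag(message: Event) -> Event:
    """Example transformation: copy the payload and tag it as routed."""
    return {**message, "routed": True}


partitions = {0: [{"id": 1}, {"id": 2}], 1: [{"id": 3}]}
delivered: List[Event] = []

# In a real deployment each partition's loop would run in its own task;
# here they run one after another for simplicity.
for partition_id, messages in partitions.items():
    run_partition(messages, add_route_tag, [delivered.append])

print(delivered)
```

Keeping the sink behind a plain callable means a new destination can be added without touching the loop itself.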

Current Scale - Routing Service

● 14,000+ Docker containers

● 1,400+ EC2 c3.4xlarge instances

● 3 regions
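A back-of-the-envelope view of container density follows from these figures. The sketch treats the quoted numbers as exact lower bounds and assumes containers are spread evenly across instances, which the slides do not state. The 16 vCPU / 30 GiB figures are AWS's published c3.4xlarge specification, not something taken from the talk.

```python
# Rough container density from the slide's figures.
# Assumptions: the "+" lower bounds are treated as exact, and containers
# are spread evenly across instances.
containers = 14_000
instances = 1_400

C3_4XLARGE_VCPUS = 16      # AWS published spec for c3.4xlarge
C3_4XLARGE_MEM_GIB = 30    # AWS published spec for c3.4xlarge

containers_per_instance = containers / instances
vcpus_per_container = C3_4XLARGE_VCPUS / containers_per_instance
mem_gib_per_container = C3_4XLARGE_MEM_GIB / containers_per_instance

print(containers_per_instance)  # 10.0
print(vcpus_per_container)      # 1.6
print(mem_gib_per_container)    # 3.0
```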

Future Improvements

● Integrate with more sophisticated orchestration/scheduling/cluster management ecosystem

● Unlock value in real-time unbounded data streams

● Data Discovery

● Data Silos

Questions?
