Hitesh Ballani, Daniel Cletheroe, Paolo Costa, Istvan Haller, Krzysztof Jozwik, Fotini Karinou, Sophie Lange, Kai Shi, Benn Thomsen (Microsoft Research)

Microsoft Research · Fast optical switching starts looking viable 20 Cross-layer approach that leverages DC peculiarities to achieve sub-ns CDR Good trade-off in this setting 20




Will free scaling of the network continue?

[Figure: switch performance per dollar (Gbps/$) plotted against years]


Requirements for data center switching

1. Ultra-fast

- nanosecond switching

Bursty cloud applications

- 90% of packets are smaller than 576 bytes


2. Scalable (ideally, high-radix)

- flat network

3. Reliable and easy to manage

Flatter network
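
To make the link between small packets and nanosecond switching concrete, here is a minimal sketch that computes how long a 576-byte packet occupies a link. The 100 Gbps link rate is an assumption chosen for illustration, and `serialization_time_ns` is a hypothetical helper name.

```python
def serialization_time_ns(packet_bytes: int, link_gbps: float) -> float:
    """Time to put a packet on the wire, in nanoseconds.

    Dividing bits by Gbit/s yields nanoseconds, because 1 Gbit/s is 1 bit/ns.
    """
    return packet_bytes * 8 / link_gbps


# Assumed link rate: 100 Gbps. The packet size is the 576-byte figure above.
t = serialization_time_ns(576, 100.0)
print(f"576-byte packet at 100 Gbps: {t:.2f} ns")
```

Under this assumption most packets last only tens of nanoseconds on the wire, so a switch that reconfigures in microseconds or milliseconds would spend far longer switching than transmitting.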


Could photonics offer a new growth curve?


Potential benefits

Low and predictable latency

[Figure: cloud network cost over time, electrical vs. optical]


Why the hold-up?

[Figure: scale of solution, from cluster scale (10s-100s) to data center scale (1,000s-10,000s), plotted against switching speed, from slow (ms) to fast (ns). Technologies shown: EAMs, MZIs, SOAs, tunable lasers and AWGRs]

Maintaining two separate networks

- Too much complexity

Why is the last mile so hard?


Bridging the last mile

[Diagram: two end hosts, each with the layers Apps / Network Stack / NIC / PHY, connected through an Optical Circuit Switch]

A fundamentally different abstraction

From asynchronous packet switches to synchronous circuit switches


Lots of very hard problems to solve to make optical switching practical

Need for cloud-centric, cross-layer solutions


Two case studies

Problem              Traditional solution    Cloud-centric solution
Lack of buffering    Centralized scheduler   Scheduler-less network
Sub-nanosecond CDR   -                       Phase caching


Case study #1: Scheduling the network

[Diagram: nodes 1, 2, 3, ..., N send bit streams (1011…) through a high-radix optical switch with no buffers, coordinated by a scheduler]

Building a data center wide scheduler is hard

Scale to 100K servers, Infer demand, Communicate demand


Scheduler-less network [Cheng et al., 2000]

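Static pre-defined schedule for nodes A-H. Each row lists, for one time slot, the node that each of A-H is connected to:

Time slot   A  B  C  D  E  F  G  H
1           B  C  D  E  F  G  H  A
2           C  D  E  F  G  H  A  B
3           D  E  F  G  H  A  B  C
4           E  F  G  H  A  B  C  D
5           F  G  H  A  B  C  D  E
6           G  H  A  B  C  D  E  F
7           H  A  B  C  D  E  F  G

Each time slot is a permutation of connections (a cyclic permutation); the schedule spans N-1 time slots.

Uniform traffic: 100% throughput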
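
The cyclic schedule on this slide can be generated mechanically. The sketch below is illustrative only: it assumes nodes are numbered 0..N-1 and that in time slot k node i is connected to node (i + k) mod N, which matches the A-H pattern shown. The function names are hypothetical.

```python
from typing import List


def cyclic_schedule(n: int) -> List[List[int]]:
    """schedule[k - 1][i] is the node that node i connects to in time slot k."""
    return [[(i + k) % n for i in range(n)] for k in range(1, n)]


def is_permutation(slot: List[int]) -> bool:
    """True if every node appears exactly once as a destination in this slot."""
    return sorted(slot) == list(range(len(slot)))


nodes = "ABCDEFGH"
schedule = cyclic_schedule(len(nodes))
for k, slot in enumerate(schedule, start=1):
    print(k, " ".join(nodes[d] for d in slot))

# Every slot is collision-free, and over N-1 slots each ordered pair of
# distinct nodes is connected exactly once.
assert all(is_permutation(slot) for slot in schedule)
pairs = {(i, d) for slot in schedule for i, d in enumerate(slot)}
assert len(pairs) == len(nodes) * (len(nodes) - 1)
```

Because every pair gets one slot per cycle regardless of demand, no scheduler is involved; the cost is that the schedule only matches the demand when traffic is uniform.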


Scheduler-less network [Cheng et al., 2000] (continued)

[Figure: the same static pre-defined schedule for nodes A-H over time slots 1-7]

Arbitrary traffic pattern

- Load balancing turns it into uniform traffic, which the static pre-defined schedule can serve
- 50% throughput in the worst case
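
One way to picture the load-balancing step is as a 2-hop path over the cyclic schedule: the source hands the packet to whichever node it is connected to right now, and that node forwards it when its own slot to the destination comes up. The sketch below is a simplified illustration under assumptions that are not in the slides: nodes are numbered 0..N-1, time t = 1, 2, 3, ... uses slot ((t - 1) mod (N - 1)) + 1, and queueing at the intermediate node is ignored. It is not the mechanism of any of the cited systems.

```python
from typing import Tuple


def slot_at(t: int, n: int) -> int:
    """Slot index (1..n-1) in use at time t; the schedule repeats every n-1 slots."""
    return ((t - 1) % (n - 1)) + 1


def next_time_connected(src: int, dst: int, now: int, n: int) -> int:
    """Earliest time >= now at which the cyclic schedule connects src to dst."""
    if src == dst:
        raise ValueError("src and dst must differ")
    k = (dst - src) % n  # the slot in which src is connected to dst
    t = now
    while slot_at(t, n) != k:
        t += 1
    return t


def two_hop_delivery(src: int, dst: int, now: int, n: int) -> Tuple[int, int]:
    """Return (intermediate node, arrival time) for a packet sent at time `now`."""
    mid = (src + slot_at(now, n)) % n
    if mid == dst:
        return mid, now  # the current slot happens to reach dst directly
    return mid, next_time_connected(mid, dst, now + 1, n)


# A (0) sends to D (3) at time 1 in an 8-node network: via B (1), arriving at time 2.
print(two_hop_delivery(0, 3, 1, 8))
```

Since a packet may cross the switch twice, each delivered byte can consume two link slots, which is where the 50% worst-case throughput on the slide comes from.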


Good trade-off

- Throughput and latency overhead of 2-hop paths

+ No scheduling!

“Shoal: A Network Architecture for Disaggregated Racks”, Cornell and Microsoft Research, NSDI 2019

“RotorNet: A Scalable, Low-complexity, Optical Datacenter Network”, UCSD, SIGCOMM 2017


Case study #2: Clock and Data Recovery (CDR)

Today’s network

• Links are point-to-point and always on
  − CDR locking time does not matter

Optical network

• Links can be established every few ns
  − Locking time impacts overall throughput

[Diagram: high-radix optical switch with transient links]


Status quo on CDR

Per-packet timeline for 256-byte data at 100 Gbps (4x25):

                    Optical switching   CDR locking time   256-byte data
Commercial          1 ns                >100 ns            20 ns
State-of-the-art*   1 ns                8 ns               20 ns

High locking time impacts network throughput: 69% throughput with the state-of-the-art 8 ns locking time.

* A Cevrero et al., IBM Research and EPFL, OFC 2018
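
The 69% figure can be reproduced with a short calculation. This is a minimal sketch, assuming every 256-byte packet pays the optical switching time and the CDR locking time once before its 20 ns of data; `slot_efficiency` is a hypothetical helper name, and 100 ns is used as a lower bound for the commercial ">100 ns" case.

```python
def slot_efficiency(switch_ns: float, cdr_lock_ns: float, data_ns: float) -> float:
    """Fraction of each slot that carries data rather than overhead."""
    return data_ns / (switch_ns + cdr_lock_ns + data_ns)


# Figures from the slide: 1 ns switching, 20 ns for 256 bytes at 100 Gbps.
state_of_the_art = slot_efficiency(1.0, 8.0, 20.0)
commercial_bound = slot_efficiency(1.0, 100.0, 20.0)  # upper bound, since locking is >100 ns

print(f"state-of-the-art CDR (8 ns): {state_of_the_art:.0%}")
print(f"commercial CDR (>100 ns):    below {commercial_bound:.0%}")
```

With 8 ns locking the result is 20 / 29, about 69%, matching the slide; with commercial CDR the link would spend most of each slot locking rather than carrying data.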


Need for sub-nanosecond CDR


Phase caching: sub-nanosecond CDR

[Diagram: synchronous Tx and Rx]

- Optical switches require synchronisation to avoid packet collisions!
- With a synchronous network, the CDR problem is reduced to phase discovery
- But it can still take >40 ns to recover the phase
- Phase does not change often, so cache it!
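
The caching idea can be sketched in software terms: a receiver remembers the last phase it recovered for each sender and only falls back to the slow discovery when it has no entry. Everything below is hypothetical and for illustration only, including the class name, the per-sender keying, and the stand-in `slow_discovery` function with its made-up phase values; the real mechanism lives in hardware.

```python
from typing import Callable, Dict


class PhaseCache:
    """Per-sender cache of the last recovered sampling phase (illustrative)."""

    def __init__(self, discover_phase: Callable[[int], float]) -> None:
        self._discover_phase = discover_phase  # slow path: full phase discovery
        self._cached: Dict[int, float] = {}
        self.discoveries = 0                   # how often the slow path ran

    def phase_for(self, sender: int) -> float:
        """Return the cached phase, running the slow discovery only on a miss."""
        if sender not in self._cached:
            self.discoveries += 1
            self._cached[sender] = self._discover_phase(sender)
        return self._cached[sender]

    def refresh(self, sender: int, measured_phase: float) -> None:
        """Track slow drift by overwriting the entry with a fresh measurement."""
        self._cached[sender] = measured_phase


def slow_discovery(sender: int) -> float:
    # Stand-in for the >40 ns search; returns a made-up phase in unit intervals.
    return (sender * 0.125) % 1.0


cache = PhaseCache(slow_discovery)
for sender in [1, 2, 1, 1, 2]:
    cache.phase_for(sender)
print(cache.discoveries)  # 2: one discovery per distinct sender
```

The design choice mirrors the slide: because the phase changes rarely, the expensive discovery is paid once per sender and later connections reuse the stored value.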


Fast optical switching starts looking viable

Cross-layer approach that leverages DC peculiarities to achieve sub-ns CDR

Good trade-off in this setting

[Figure: per-packet timeline at 4×25 Gbps: optical switching 1 ns, CDR locking time 625 ps, data 20 ns]

“Sub-Nanosecond Clock and Data Recovery in an Optically-Switched Data Centre Network”, UCL and Microsoft Research, ECOC PDP, 2018


Innovation across the cloud stack needed

[Diagram: layers of the cloud stack (Physical, Network, Stack, Architecture) alongside the innovations needed (new optics, new hardware/software, new protocols, new algorithms)]

Need an end-to-end, cloud-centric approach to make optical switching viable

http://aka.ms/OpticsForTheCloud