44
1 Core-Stateless Fair Queueing: Core-Stateless Fair Queueing: Achieving Approximately Fair Achieving Approximately Fair Bandwidth Allocations in High Bandwidth Allocations in High Speed Networks Speed Networks Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999 Presented by Bob Kinicki

Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Embed Size (px)

DESCRIPTION

Core-Stateless Fair Queueing: Achieving Approximately Fair Bandwidth Allocations in High Speed Networks. Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999 Presented by Bob Kinicki. Outline. Introduction CSFQ Architecture Flow Arrival Rate - PowerPoint PPT Presentation

Citation preview

Page 1: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

1

Core-Stateless Fair Queueing: Achieving Core-Stateless Fair Queueing: Achieving Approximately Fair Bandwidth Allocations Approximately Fair Bandwidth Allocations in High Speed Networksin High Speed Networks

Ion Stoica,Scott Shenker, and Hui Zhang

SIGCOMM’99, Vancouver, August 1999

Presented by Bob Kinicki

Page 2: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ2

OutlineOutline

IntroductionCSFQ Architecture

– Flow Arrival Rate– Link Fair Share Rate Estimation

Many SimulationsConclusions

Page 3: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ3

IntroductionIntroduction This paper brings forward the concept of “fair”

allocation. The claim is that fair allocation inherently

requires routers to maintain state and perform operations on a per flow basis.

They present an architecture and a set of algorithms that is “approximately fair” while using FIFO queueing at internal routers.

Page 4: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ4

An “Island” of RoutersAn “Island” of Routers

EdgeRouter

CoreRouter

SourceDestination

Destination

Page 5: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ5

Core-Stateless Fair QueueingCore-Stateless Fair Queueing

Ingress edge routers compute per-flow rate estimates and insert these estimates as labels into each packet header.

Labels are updated at each router based only on aggregate information.

FIFO queueing with probabilistic dropping of packets on input is employed at core routers.

Page 6: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ6

Edge – Core Router Edge – Core Router ArchitectureArchitecture

Page 7: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ7

Model Variables and ParametersModel Variables and Parameters

Router with output link capacity C.

ri (t) :: ith flows arrival rate

α(t) :: fair share rate

In the fluid flow model, incoming bits of flow i at a core router are dropped with probability

Max (0, 1 - α(t) / ri (t) )

Page 8: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ8

Flow Arrival RateFlow Arrival RateAt each edge router, use exponential averaging to estimate the rate

of a flow. For flow i, let

lik be the length of the kth packet.

tik be the arrival time of the kth packet.

Then the estimated rate of flow i, ri is updated every time a new packet is received:

rinew = (1-e-T/K)*L / T + e-T/K* ri

old

where

T = Tik = ti

k – tik-1

L = lik and K is a constant

Page 9: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ9

Link Fair Rate EstimationLink Fair Rate Estimation

Heuristic algorithm with aggregate state variables:

alpha_hat :: fair share estimate

Lambda_hat :: aggregate arrival rate estimate

F_hat :: aggregate acceptance rate estimate

Page 10: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ10

Algorithm IdeaAlgorithm Idea

When packet arrives, Lambda_hat is updated using exponential averaging.

If the packet is dropped, F_hat remains the same.

If the packet is not dropped, F_hat is updated using exponential averaging.

Whenever “epoch” switches congested state, update alpha_hat:

alpha_hatnew = alpha_hatold * C /F_hat

Page 11: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ11

Algorithm Idea (cont.)Algorithm Idea (cont.)

Alpha_hat now feeds into next calculation of drop probability (p) as alpha:

p = max ( 0 , 1 – alpha / label )

Page 12: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ12

CSFQ Pseudo -CodeCSFQ Pseudo -Code

Page 13: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ13

Label RewritingLabel Rewriting

At core routers, outgoing rate is merely the minimum between the incoming rate and the fair rate, α .

Hence, the packet label L can be rewritten by

L new = min (L old , α )

Page 14: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ14

SimulationsSimulations

Major effort of the paper is to compare CSFQ to four algorithms via ns-2 simulations.

FIFOREDFRED (Flow Random Early Drop)DRR (Deficit Round Robin)

Page 15: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ15

FRED (Flow Random Early Drop)FRED (Flow Random Early Drop)

Maintains per flow state in router. FRED preferentially drops a packet that has

either:– Had many packets dropped in the past– A queue larger than the average queue size

Main goal : Fairness FRED-2 guarantees to each flow a minimum

number of buffers.

Page 16: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ16

DRR (Deficit Round Robin)DRR (Deficit Round Robin)Represents an efficient implementation of

WFQ.A sophisticated per-flow queueing

algorithm.Scheme assumes that when router buffer is

full the packet from the longest queue is dropped.

Can be viewed as “best case” algorithm with respect to fairness.

Page 17: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ17

Simulation DetailsSimulation DetailsUse TCP, UDP, RLM and On-Off traffic

sources in separate simulations.Bottleneck link: 10 Mbps, 1ms latency,

64KB bufferRED, FRED min max thresholds: 16KB,

32KBConstants (K, , Kc ) : all 100 ms.Kα

Page 18: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ18

A Single Congested LinkA Single Congested Link

First Experiment : 32 UDP flows– Each UDP flow is indexed from 0 to 31 with

flow 0 sending at 0.3125 Mbps and each of the i subsequent flows sending (i+ 1) times its fair share of 0.3125 Mbps.

Second Experiment : 1 UDP flow, 31 TCP flows– UDP flow sends at 10 Mbps– 31 TCP flows share a single 10 Mbps link.

Page 19: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ19

Figure 3a: 32 UDP Flows Figure 3a: 32 UDP Flows

Only CSFQ andDRR can containUDP flows!!

Page 20: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ20

Figure 3b : One UDP Flow, 31 TCP FlowsFigure 3b : One UDP Flow, 31 TCP Flows

Only CSFQ andDRR can containFlow 0 – the onlyUDP flow!

Page 21: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ21

A Single Congested LinkA Single Congested Link

Third Experiment Set : 31 simulations– Each simulation has a different N,

N = 2 … 32.– One TCP and N-1 UDP flows with each

UDP flow sending at twice fair share rate of 10/N Mbps.

Page 22: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ22

Figure 4 : One TCP Flow, N-1 UDP FlowsFigure 4 : One TCP Flow, N-1 UDP Flows

Normalized fair share throughput for TCP source

DRR good for lessthan 22 flows.

CSFQ better thanDRR when a largenumber of flows.

CSFQ beats FRED.

Page 23: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ23

Multiple Congested LinksMultiple Congested Links

Router RouterRouter Router

UDPSinks

TCP/UDP-0Sink

TCP/UDP-0Source

UDPSources

1

1-10 K1-K10

10 11 20 K10K1

Page 24: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ24

Figure 6a : UDP sourceFigure 6a : UDP source

Fraction of UDP-0 traffic forwardedversus the number of congested links.

Page 25: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ25

Figure 6b : TCP SourceFigure 6b : TCP Source

Fraction of TCP-0 traffic forwardedversus the number of congested links.

Page 26: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ26

Receiver-driven Layered MulticastReceiver-driven Layered Multicast

RLM is an adaptive scheme in which the source sends the information encoded in a number of layers.

Each layer represents a diferent multicast group.

Receivers join and leave multicast groups based on packet drop rates experienced.

Page 27: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ27

Receiver-driven Layered MulticastReceiver-driven Layered Multicast

Simulation of three RLM flows and one TCP flow.

Fair share for each is 1 Mbps.Since router buffer set to 64 KB, K, Kc,

and are set to 250 ms.Kα

Page 28: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ28

Figure 7a : DRRFigure 7a : DRR

Page 29: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ29

Figure 7b : CSFQFigure 7b : CSFQ

Page 30: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ30

Figure 7c : FREDFigure 7c : FRED

Page 31: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ31

Figure 7d : REDFigure 7d : RED

Page 32: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ32

Figure 7e : FIFOFigure 7e : FIFO

Page 33: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ33

On-Off Flow ModelOn-Off Flow Model

One approach to modeling interactive, Web traffic :: OFF represents “think time”

ON and OFF drawn from exponential distribution with means of 100 ms and 1900 ms respectively.

During ON period source sends at 10 Mbps.

Page 34: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ34

Table 1 : One On-Off Flow, 19 TCP FlowsTable 1 : One On-Off Flow, 19 TCP Flows

Algorithm Delivered Dropped

DRR 601 6157

CSFQ 1680 5078

FRED 1714 5044

RED 5322 1436

FIFO 5452 1306

Page 35: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ35

Web TrafficWeb Traffic

A second approach to modeling Web traffic that uses Pareto Distribution to model the length of a TCP connection.

In this simulation 60 TCP flows whose inter-arrivals are exponentially distributed with mean 0.05 ms and Pareto distribution that yields a mean connection length of 20,1 KB packets.

Page 36: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ36

Table 2 : 60 Short TCP Flows, One UDP FlowTable 2 : 60 Short TCP Flows, One UDP Flow

Algorithm Mean Transfer Time for TCP

Standard Deviation

DRR 25 99

CSFQ 62 142

FRED 40 174

RED 592 1274

FIFO 840 1695

Page 37: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ37

Table 3 :Table 3 : 19 TCP Flows, One UDP Flow 19 TCP Flows, One UDP Flow with propagation delay of 100 ms.with propagation delay of 100 ms.Algorithm Mean

ThroughputStandard Deviation

DRR 6080 64

CSFQ 5761 220

FRED 4974 190

RED 628 80

FIFO 378 69

Page 38: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ38

Packet RelabelingPacket Relabeling

RouterFlow 2

Router

Sink

Flow 1

Sources

Flow 3

Link 210 Mbps

Link 110 Mbps

10 Mbps

10 Mbps

10 Mbps

Page 39: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ39

Table 4 : UDP and TCP with Table 4 : UDP and TCP with Packet Relabeling Packet Relabeling

Traffic Flow 1 Flow 2 Flow 3

UDP 3.36 3.32 3.28

TCP 3.43 3.13 3.43

Page 40: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ40

Unfriendly FlowsUnfriendly FlowsUsing TCP congestion control requires

cooperation from other flows.Three types cooperation violators:

– Unresponsive flows (e.g., Real Audio)– Not TCP-friendly flows– Flows that lie to cheat.

This paper deals with unfriendly flows!!

Page 41: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ41

ConclusionsConclusionsThis paper presents Core Stateless Fair

Queueing and offers many simulations to show how CSFQ provides better fairness than RED or FIFO.

They mention issue of “large latencies”. This is the robust versus fragile flow issue from FRED paper.

CSFQ ‘clobbers’ UDP flows!

Page 42: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ42

SignificanceSignificance

First paper to use hints from the edge of the subnet.

Deals with UDP. Many algorithms do not.

Makes a reasonable attempt to look at a variety of traffic types.

Page 43: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ43

Problems/ WeaknessesProblems/ Weaknesses

“Epoch” is related to three constants in a way that can produce different results.

How does one set K constants for a variety of situations.

No discussion of algorithm “stability”

Page 44: Ion Stoica,Scott Shenker, and Hui Zhang SIGCOMM’99, Vancouver, August 1999

Advanced Computer Networks :

CSFQ44

AcknowledgmentsAcknowledgments

Figures extracted from presentation by Nagaraj Shirali and Choong-Soo Lee in Spring 2002.