Internet Research Lab at NTU, Taiwan. A survey of Valiant Load Balancing as adopted in the VL2 (Virtual Layer 2) data center network.
Theoretical Foundation for Valiant Load Balancing and Traffic Oblivious Routing
侯宗成, Oct. 13th, 2011
• A. Greenberg et al., “VL2: A Scalable and Flexible Data Center Network”, ACM SIGCOMM 2009.
• M. Kodialam, T. V. Lakshman, S. Sengupta, “Efficient and Robust Routing of Highly Variable Traffic”, HotNets, 2004.
• James Roberts, “Public Reviews of Papers Appearing at HotNets-III”, HotNets, 2004.
Outline
• Valiant Load Balancing in VL2
• Background
• Proposed Routing Scheme
• Further Ideas
Outline
• Valiant Load Balancing in VL2
– Goals and Building Blocks of VL2
– Spreading for Uniform High Capacity
– Randomization for Volatility
– References of VLB
• Background
• Proposed Routing Scheme
• Further Ideas
Goals and Building Blocks of VL2
• Current designs prevent agility
– Poor server-to-server capacity: oversubscription
– Poor utilization: fragmentation of resources
– Poor reliability: routing and computing deadlocks
• Goals: scalable, flexible, and agile DC
– Uniform High Capacity
– Performance Isolation
– Layer-2 Semantics
Goals and Building Blocks of VL2
• Supporting Infrastructure
– Directory System / Address Mapping
• Key Innovation
– Application and Location Addresses
• Major Application of an Innovation
– VLB and ECMP
• Infrastructure
– Clos Topology
[Figure: Goals and Building Blocks of VL2, relating the goals to the building blocks labeled Supporting, Innovation, Application, and Infrastructure]
Spreading for Uniform High Capacity
• ECMP: spreads traffic among equal-cost paths for a node
• VLB: spreads traffic among nodes for the entire network
• Implement VLB by spreading traffic to bounce off several core switches
• Hot-spot free: encapsulation and anycast address of core switches
• No centralized engineering
– Seemingly contradictory to OpenFlow
– Discussed in Further Ideas
Randomization for Volatility
• Destination-independent traffic spreading
• Randomly chosen intermediate switches
• Traffic spreading ratios are uniform
• Edge constraints hold
– Theoretical model provided later
• Shim layer agent: enables path control by adjusting randomization
• Claims no problem when elephant flows occur: an area where OpenFlow can help
References of VLB
• Specific example, VLB:
– R. Zhang-Shen and N. McKeown, “Designing a Predictable Internet Backbone Network”, HotNets-III, November 2004.
• General case, Traffic Oblivious Routing:
– M. Kodialam, T. V. Lakshman, S. Sengupta, “Efficient and Robust Routing of Highly Variable Traffic”, HotNets, 2004.
• Both met at the Stanford Workshop on Load-Balancing, May 2004.
• R. Zhang-Shen: student of McKeown (Ph.D.) and Rexford (post-doc), now at Google.
• Sengupta: one of the authors of VL2, now at Microsoft Research.
• Early work by Valiant, for processor interconnection networks, 1981.
Outline
• Valiant Load Balancing in VL2
• Background
– Original Motivation in 2004
– Traditional Approach
– Multi-Commodity Flow Problem
– Preferred Routing Characteristics
– Similarities with Data Center Network
• Proposed Routing Scheme
• Further Ideas
Original Motivation in 2004
• For Internet backbone, ISP, VPN services, and Autonomous Systems.
• Also applicable to any scenario with:
– Extreme traffic variations
– Traffic matrix unknown, with no pattern
• Applying it to DCN was not considered at the time.
• VL2 found it to be ideal for DCN.
Traditional approach
• Assume we know matrix of demands of pairs of ingress/egress routers
• Network design can be formulated as a multi-commodity flow problem
• Routing and capacity are selected to:
– optimize objective functions
– while satisfying constraints.
• For example: IP shortest path routing
– implies that each demand is routed over a single path with the least hops or delay.
Multi-Commodity Flow Problem
Traditional approach
• Drawbacks
– Neglects the volatile and variable nature of the Internet.
– Lack of diversity in routing choices and long convergence times in case of failures.
• Issues:
– Traffic is variable, volatile, and unpredictable
– Need routing without dynamic adjustments to avoid instability and risk of failures.
• Complex traffic prediction and management were abandoned, which leads to overprovisioning.
Preferred Routing Characteristics
• Can handle unpredictable traffic and maintain good service
• Minimize overprovisioning
• Mostly static routing, without dynamic adjustments and complex mechanisms
Similarities with Data Center Network
• Traffic unpredictable and variant
• Mostly static routing can reduce workload
• Link bandwidth is a critical resource
• The DCN core works much like a backbone network
Outline
• Valiant Load Balancing in VL2
• Background
• Proposed Routing Scheme
– Briefing
– Modeling Traffic Variability
– Traffic Oblivious Routing
– Capacity Effectiveness
– Key Knowledge Gained
• Further Ideas
Briefing
• View the Internet backbone as fully meshed
• N nodes with inter-node links realized by tunneling
• Traffic $T_{ij}$ is routed through an intermediate node k: tunnel i→k→j
• Traffic is split over all possible two-hop routes
• Including i→i→j and i→j→j
Briefing
• The split can be performed at flow level by a hash function or by resequencing packets
• Tunnels need to be sized to accommodate all possible traffic matrices
• The only constraint: an upper bound on the total amount of incoming and outgoing capacity at each node.
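Flow-level splitting with a hash function can be sketched as follows. This is a minimal illustration rather than the mechanism of any particular router: the helper names `pick_intermediate` and `two_hop_route`, the 5-tuple flow key, the choice of SHA-256, and the uniform split ratios are all assumptions made for the example.

```python
import hashlib


def pick_intermediate(flow, nodes):
    """Map a flow key (assumed here to be a 5-tuple) to one intermediate node.

    Hashing keeps all packets of a flow on the same two-hop route, so no
    resequencing is needed, while different flows spread over all nodes.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)  # uniform split
    return nodes[index]


def two_hop_route(src, dst, flow, nodes):
    """Return the route src -> k -> dst, collapsing the degenerate cases
    k == src and k == dst into the direct route src -> dst."""
    k = pick_intermediate(flow, nodes)
    route = [src]
    for hop in (k, dst):
        if hop != route[-1]:
            route.append(hop)
    return route


if __name__ == "__main__":
    nodes = ["n0", "n1", "n2", "n3"]
    flow = ("10.0.0.1", "10.0.1.7", 6, 40000, 80)  # src IP, dst IP, proto, ports
    print(two_hop_route("n0", "n3", flow, nodes))
```

Non-uniform split ratios would need a weighted mapping from the hash value to nodes instead of the plain modulo used here.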
Modeling Traffic Variability
• A very tough condition: all nodes are at full capacity, with row sums $R_i$ and column sums $C_i$.
• If we can route any matrix in $\mathcal{T}(R, C)$, we can route any other matrix with smaller row and column sums.
• Hence we can route any demand in which nodes are below full capacity.
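The set of traffic matrices referred to here can be written out explicitly. The following is a sketch in the notation commonly used for this model, with $t_{ij}$ denoting the traffic from node $i$ to node $j$, $R_i$ the total traffic entering the network at node $i$, and $C_i$ the total traffic leaving it at node $i$; the symbol $t_{ij}$ and the set-builder form are assumptions for illustration.

```latex
\mathcal{T}(R, C) \;=\; \Bigl\{\, [t_{ij}] \;:\;
  \sum_{j \ne i} t_{ij} = R_i
  \ \text{ and } \
  \sum_{j \ne i} t_{ji} = C_i
  \quad \forall i \,\Bigr\}
```

Every matrix in this set saturates all row and column sums at once, which is why routing all of $\mathcal{T}(R, C)$ also covers any matrix whose sums are smaller.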
Traffic Oblivious Routing
• Operates in two phases.
• Phase 1:
– A pre-determined fraction $\alpha_j$ of the traffic entering the network at any node is distributed to every node j,
– independent of final destination.
• Phase 2:
– As a result of Phase 1, each node receives traffic destined for different destinations.
– It routes that traffic to the respective final destinations.
Traffic Oblivious Routing
• Implement this scheme by:
– Forming fixed-bandwidth tunnels between nodes.
– These are referred to as Phase 1 and Phase 2 tunnels.
• Bandwidth required for the tunnels depends only on the R and C values,
• not on the unknown individual entries in the varying traffic matrix.
• Modeling of tunnel demand follows on the next slide.
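Traffic Oblivious Routing
• Modeling tunnel demand.
• Consider the tunnel from node i to node j, where $\alpha_j$ is the fraction of traffic distributed to node j in Phase 1.
• Phase 1: i sends at most $\alpha_j R_i$ to j.
• Phase 2: traffic $t_{kj}$ destined for j is split to i by $\alpha_i$, for all k.
• Max. traffic that needs to be routed from i to j in Phase 2 is $\sum_k \alpha_i t_{kj} = \alpha_i C_j$.
• Max. demand: $\alpha_j R_i + \alpha_i C_j$.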
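The tunnel sizing can be checked numerically. The sketch below assumes the per-tunnel bound from Kodialam et al., namely that the tunnel from i to j carries at most `alpha[j] * R[i] + alpha[i] * C[j]`, and compares it with the load produced by one concrete traffic matrix. The function names, the 4-node network, the uniform ratios, and the cyclic-shift matrix are all hypothetical choices for illustration.

```python
def tunnel_demand(R, C, alpha):
    """Worst-case demand on tunnel i->j: alpha[j]*R[i] + alpha[i]*C[j]."""
    n = len(R)
    return [[0.0 if i == j else alpha[j] * R[i] + alpha[i] * C[j]
             for j in range(n)] for i in range(n)]


def two_phase_load(T, alpha):
    """Traffic actually carried on each tunnel for one traffic matrix T."""
    n = len(T)
    load = [[0.0] * n for _ in range(n)]
    for s in range(n):
        for d in range(n):
            if s == d:
                continue
            for k in range(n):
                share = alpha[k] * T[s][d]  # part of s->d sent via node k
                if k != s:
                    load[s][k] += share  # Phase 1: source -> intermediate
                if k != d:
                    load[k][d] += share  # Phase 2: intermediate -> destination
    return load


n = 4
alpha = [1.0 / n] * n  # uniform distribution ratios (an assumption)
R = [1.0] * n          # ingress bounds, normalized to 1
C = [1.0] * n          # egress bounds, normalized to 1

# Cyclic-shift traffic: node i sends all of its traffic to node i+1.
T = [[1.0 if j == (i + 1) % n else 0.0 for j in range(n)] for i in range(n)]

demand = tunnel_demand(R, C, alpha)
load = two_phase_load(T, alpha)
```

Because this matrix saturates every row and column sum, the load on each tunnel equals its provisioned demand; a matrix with smaller sums would load each tunnel less, without any change to the routing.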
Traffic Oblivious Routing
• Property 1: Routing is oblivious to traffic variations.
• Property 2: Provisioned capacity is traffic-matrix independent.
• Property 3: Complete utilization of provisioned capacity.
Traffic Oblivious Routing
• Does not make any assumptions about T, apart from row and column sum bounds.
• Does not require the network to detect changes in traffic.
• Handles variability in the traffic matrix set by effectively routing a transformed matrix.
• Depends only on row/column sum bounds and traffic distribution ratios.
• Not on a specific matrix.
Traffic Oblivious Routing
• Handles any traffic by routing a transformed matrix.
• If the bounds R and C change, re-optimize distribution ratios and adjust tunnels.
• Can be formulated as a linear program to minimize capacity on all links.
• k: indexes the commodities.
• s(k): source, d(k): destination.
• Flow variable: amount of flow of commodity k on link e.
• $u_i$: hardware capacity of node i.
Traffic Oblivious Routing
• Objective: minimize link capacities
• Constraint: flow conservation
• Constraint: demand satisfaction
• Constraint: within hardware capacity
• Constraint: distribution ratios
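One way a linear program with these five parts could be written is sketched here. It is an illustration under assumed notation, not the formulation from the paper: $x^k_e$ is the flow of commodity $k$ on link $e$, $c_e$ the capacity provisioned on link $e$, $\delta^+(v)$ and $\delta^-(v)$ the links leaving and entering node $v$, and $\alpha_i$ the distribution ratios. The demand of commodity $k$ is taken to be the per-tunnel bound $\alpha_{d(k)} R_{s(k)} + \alpha_{s(k)} C_{d(k)}$, and the node-capacity constraint in particular is only a guess at what "within hardware capacity" covers.

```latex
\begin{aligned}
\min\quad & \sum_{e \in E} c_e
  && \text{(minimize link capacities)} \\
\text{s.t.}\quad
& \sum_{e \in \delta^+(v)} x^k_e - \sum_{e \in \delta^-(v)} x^k_e = 0
  && \forall k,\ \forall v \notin \{s(k), d(k)\} \\
& \sum_{e \in \delta^+(s(k))} x^k_e - \sum_{e \in \delta^-(s(k))} x^k_e
  = \alpha_{d(k)} R_{s(k)} + \alpha_{s(k)} C_{d(k)}
  && \forall k \\
& \sum_{k} x^k_e \le c_e
  && \forall e \in E \\
& \sum_{e \in \delta^+(i) \cup \delta^-(i)} c_e \le u_i
  && \forall i \\
& \sum_{i} \alpha_i = 1, \qquad \alpha_i \ge 0, \qquad x^k_e \ge 0
\end{aligned}
```

Since $R$ and $C$ are constants, every constraint stays linear in $x$, $c$, and $\alpha$, so the distribution ratios can be optimized together with the routing.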
Capacity Effectiveness
• Results are given without details in the paper.
• Consider a network with 20 nodes and 33 bidirectional links (representing a US backbone).
• The $R_i$ and $C_i$ are equal and normalized to 1.
• Node capacities are identical and equal to $u_R$.
• Below a certain $u_R$, routing is infeasible.
• Lowest feasible $u_R$ = 2.595
• At $u_R$ = 2.8, bandwidth efficiency is 94%.
Key Knowledge Gained
• Violating edge constraints: the root of all network deadlocks in DCN.
• Edge and network problems can be separated.
• Edge: how to ensure capacity constraints are not violated?
• Network: how to balance loads and separate services?
Outline
• Valiant Load Balancing in VL2
• Background
• Proposed Routing Scheme
• Further Ideas
– Clos Topology
– Traffic-Oblivious, Randomized, Load-Balanced routing
– Randomization vs. Dictation
– Questions: Combining OpenFlow?
Further Ideas
• Clos topology
– Currently irreplaceable
– Ideal for traffic-oblivious, randomized routing
• Network: symmetric randomization
• Propagation delay: small, so no extra cost
• Bouncing: core switches are natural intermediate nodes
Further Ideas
• Traffic-Oblivious Routing
– Localized routing to switches
• Centralized / distributed split-ratio computation
– Needs further research
• OpenFlow Controllers and Switches
– Good for planning elephant flows
– Should be combined with traffic-oblivious and randomized distributed routing
– Randomization vs. Dictation: seemingly contradictory
How to adopt both concepts and implement them in one scheme?
• Depending on flow types and scenarios.
• Prevent edges from being overloaded.
– Design and placement of tenants and hosts.
– Policies of edge switches, soft or hard.
When switches are able to handle the routine work, leave only important and critical tasks to controllers.
When do systems initiate dictation / randomization?
• For major controllers
– When critical tasks or situations occur.
– What are critical tasks?
• For switches / secondary controllers
– Reconfigure distribution ratios when the environment changes.
– How to reconfigure?
• Logical topology / link capacities changed
– Then switches start to reconfigure.
– How to define a logical change?
What are the relations between controllers and switches?
• Controllers plan resource allocation and routing when elephant flows or critical situations occur.
• Switches utilize the resources left by controllers and optimize the distribution of the remaining traffic.
• Balance load between controllers and switches.