1©2019
Fastly’s Scalable Global Network
Fastly's scalable global network that supports rapid growth
Taiji Tsuchiya, Fastly K.K.
2©2019
About us
• Provide Edge Cloud Platform
• Founded in March 2011
• HQ in San Francisco
• 504 Employees (April 2019)
– 33% of employees work remotely
• Listed on NYSE (May 2019)
3©2019
Fastly’s Global Network
• 66 POPs, 58 Tbps capacity, 75 IX points
• Cache Servers: ~1,700
• Daily Requests: 600+ Billion
(As of 2019.09)
4©2019
Operations Team
• Distributed team (a remote team spread across the world)
– Most members are remote
– Follow the Sun shifts
– APAC 00:00-06:00 UTC (10:00-16:00 JST)
– EMEA 06:00-12:00 UTC
– US East 12:00-18:00 UTC
– US West 18:00-24:00 UTC
● 24h 365d on-call
● NetOps / CacheOps
● Peering / Circuit turn-up
● POP build
● Tooling
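The follow-the-sun rotation above amounts to a lookup from the current UTC hour to a region. A minimal sketch, assuming the four six-hour windows listed on the slide; the function name `oncall_region` and the `SHIFTS` table are illustrative, not part of any Fastly tool:

```python
from datetime import datetime, timezone

# Shift windows in UTC hours (start inclusive, end exclusive), as listed on the slide.
SHIFTS = [
    (0, 6, "APAC"),
    (6, 12, "EMEA"),
    (12, 18, "US East"),
    (18, 24, "US West"),
]


def oncall_region(now: datetime) -> str:
    """Return the region whose shift covers `now` (a timezone-aware datetime)."""
    hour = now.astimezone(timezone.utc).hour
    for start, end, region in SHIFTS:
        if start <= hour < end:
            return region
    raise ValueError("unreachable: the shifts cover all 24 hours")


if __name__ == "__main__":
    print(oncall_region(datetime.now(timezone.utc)))
```

Because the windows tile the full day with no overlap, exactly one region is responsible at any moment.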
5©2019
Agenda
This talk introduces how Fastly operates a large-scale global network with a small operations team.
How do we expand/operate our global network with a small team?
● Scalable Network Architecture
● Network Automation
6©2019 Confidential
Scalable Network Architecture
7©2019
Fastly’s Scalable Network
● No Backbone
● No Router
● No Load Balancer
8©2019
No Backbone Network
All traffic flows via the Internet.
● Easy to add new POPs
● Easy to standardize network configurations
[Figure: several POPs, each labeled AS54133 and each connected to the Internet through Transit, IX, and Peer links. Other labels: End User, ISP, Content Origin, IP anycast.]
9©2019
No Router Network
Routing software runs on the switches.
● Router ports are very expensive. Switches are an affordable way to expand the network globally.
[Figure: a switch (Arista EOS) whose userspace runs a BGP service (BIRD), an LB service, an API service, and a Metrics service. The switch speaks eBGP to Transit, IX, and Peer. Each server below the switch is labeled BGP (BIRD).]
10©2019
No (Hardware) Load Balancer Network
Inbound Traffic
● The Internet -> Fastly POP
○ ISPs choose the BGP best path
● Switch -> Cache
○ ECMP load balancing + Fastly LB app (Faild)
https://www.fastly.com/blog/building-and-scaling-fastly-network-part-2-balancing-requests
[Figure: traffic from an ISP and Transit reaches Fastly POP A or Fastly POP B via IP anycast. Inside a POP, the switch spreads traffic over the servers with ECMP, and each server runs Faild. Labels next to Faild on the switch side: Sync, Health check, Generate routes to all servers.]
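The switch-to-cache step relies on ECMP: the switch hashes each flow and uses the result to pick one of several equal-cost next hops, so packets of the same flow keep landing on the same server. A minimal sketch of that idea, assuming a SHA-256 hash over the flow 5-tuple; real switches use their own hardware hash, and this is not Faild's algorithm. The names `ecmp_next_hop` and `healthy_servers` are hypothetical:

```python
import hashlib


def ecmp_next_hop(flow, next_hops):
    """Pick a next hop for `flow` (src_ip, src_port, dst_ip, dst_port, proto).

    The hash function is an assumption made for illustration only.
    """
    if not next_hops:
        raise ValueError("no next hops available")
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]


def healthy_servers(servers, health):
    """Keep only servers whose last health check passed (hypothetical input format)."""
    return [s for s in servers if health.get(s, False)]


if __name__ == "__main__":
    servers = ["cache-1", "cache-2", "cache-3"]
    health = {"cache-1": True, "cache-2": True, "cache-3": False}
    flow = ("198.51.100.7", 50123, "203.0.113.10", 443, "tcp")
    print(ecmp_next_hop(flow, healthy_servers(servers, health)))
```

Plain modulo hashing like this remaps many existing flows whenever the set of next hops changes, which the sketch does not attempt to address.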
11©2019
No (Hardware) Load Balancer Network
Outbound Traffic (Cache -> The Internet)
● ECMP load balancing per Transit/Peer
● Use MPLS to decide routing paths on the servers
https://pc.nanog.org/static/published/meetings/NANOG71/1438/20171002_Barroso_Developing_And_Evolving_v1.pdf
[Figure: a server sends traffic for 172.20.0.0/24 over ECMP to Switch A (Label A) or Switch B (Label B); both switches connect to Transit A.]
Network          Next hop    MPLS Label
172.20.0.0/24    Switch A    Label A
172.20.0.0/24    Switch B    Label B
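The route table on the slide can be read as: the server holds two equal entries for the same prefix, each with its own next hop and MPLS label, and picks one per flow. A minimal sketch of that lookup, assuming longest-prefix match followed by a CRC32 flow hash; the `Route` type, `select_route`, and the hash choice are illustrative and not taken from the talk:

```python
import ipaddress
import zlib
from typing import NamedTuple


class Route(NamedTuple):
    network: ipaddress.IPv4Network
    next_hop: str
    mpls_label: str


# The two entries shown in the slide's table.
ROUTES = [
    Route(ipaddress.ip_network("172.20.0.0/24"), "Switch A", "Label A"),
    Route(ipaddress.ip_network("172.20.0.0/24"), "Switch B", "Label B"),
]


def select_route(dst_ip: str, flow_key: str, routes=ROUTES) -> Route:
    """Longest-prefix match, then choose among equal routes by hashing the flow."""
    dst = ipaddress.ip_address(dst_ip)
    candidates = [r for r in routes if dst in r.network]
    if not candidates:
        raise LookupError(f"no route to {dst_ip}")
    longest = max(r.network.prefixlen for r in candidates)
    best = [r for r in candidates if r.network.prefixlen == longest]
    # CRC32 is an assumed stand-in for whatever hash the real data path uses.
    return best[zlib.crc32(flow_key.encode()) % len(best)]


if __name__ == "__main__":
    route = select_route("172.20.0.25", "198.51.100.7:50123->172.20.0.25:443/tcp")
    print(route.next_hop, route.mpls_label)
```

The selected entry tells the server which switch to send the packet to and which label to push, so the path choice is made on the server rather than on the switch.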
12©2019 Confidential
Network Automation
13©2019
One-Command Network Operations
● Control Peer Traffic
● Control Transit Traffic
● Drain Traffic for Transit/Peer maintenance
● Network Configuration
14©2019
Control Peer Traffic
“netops slash --site XXX --provider XXX”
[Figure: traffic for 10.0.0.0/24, 10.0.1.0/24, and 10.0.2.0/24 flows between a Peer and a Switch. A Flow Collector reports per-prefix rates (10.0.0.0/24: 3 Gbps, 10.0.1.0/24: 2 Gbps, 10.0.2.0/24: 1 Gbps) to Peer Slasher, which is given a switch and interface and reduces the traffic by 3 Gbps.]
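The slide shows per-prefix rates going into Peer Slasher and a 3 Gbps reduction coming out, but not how prefixes are chosen. One plausible approach, sketched here as an assumption rather than Peer Slasher's actual logic, is a greedy pick of the busiest prefixes until the target is reached. The function name `prefixes_to_shed` is hypothetical:

```python
from typing import Dict, List


def prefixes_to_shed(traffic_gbps: Dict[str, float], target_gbps: float) -> List[str]:
    """Greedily pick the busiest prefixes until at least `target_gbps` is covered.

    This selection rule is an assumption for illustration only.
    """
    selected: List[str] = []
    shed = 0.0
    by_rate = sorted(traffic_gbps.items(), key=lambda kv: kv[1], reverse=True)
    for prefix, rate in by_rate:
        if shed >= target_gbps:
            break
        selected.append(prefix)
        shed += rate
    return selected


if __name__ == "__main__":
    # Per-prefix rates from the slide.
    rates = {"10.0.0.0/24": 3.0, "10.0.1.0/24": 2.0, "10.0.2.0/24": 1.0}
    print(prefixes_to_shed(rates, 3.0))
```

A greedy pick can overshoot the target; a production tool would likely also weigh where the shifted traffic ends up.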
15©2019
Network Configuration Workflow
“ansibly deploy switch-XXX --commit”
Build a new circuit
● Call APIs
● Send files
● Validate status
[Figure: workflow with stages labeled Day 0 and Day 1. Labels: Pull Request, Review, CI tests, Merge, Repository (GitHub), Datastore, Ansibly (dry run), Switches, Peer, Cache.]
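The workflow includes a dry run before changes are committed to the switches. The slides do not describe how Ansibly implements it; a common way to make a dry run reviewable is to show a unified diff between the running configuration and the rendered candidate. A minimal sketch using Python's `difflib`, with `dry_run_diff` and the sample configuration lines as hypothetical names and data:

```python
import difflib


def dry_run_diff(running: str, candidate: str, device: str) -> str:
    """Return a unified diff of running vs. candidate config; empty if identical."""
    diff = difflib.unified_diff(
        running.splitlines(keepends=True),
        candidate.splitlines(keepends=True),
        fromfile=f"{device} (running)",
        tofile=f"{device} (candidate)",
    )
    return "".join(diff)


if __name__ == "__main__":
    # Sample configuration text invented for the example.
    running = "router bgp 54133\n neighbor 192.0.2.1 remote-as 64500\n"
    candidate = running + " neighbor 192.0.2.5 remote-as 64501\n"
    print(dry_run_diff(running, candidate, "switch-XXX") or "no changes")
```

An empty diff means a commit would be a no-op, which is a cheap check to run in CI before a merge.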
16©2019
Next Step: Full Automation
Current: Call person -> Check traffic graph -> Run Peer Slash -> Peer Slasher -> Switches
Next step: Trigger -> Event Driven Platform (StackStorm) -> Call APIs -> Run Peer Slash -> Peer Slasher -> Switches
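The planned change replaces the human in the loop with a trigger that calls the same action automatically. The pattern can be sketched in plain Python without any StackStorm API: a rule pairs a condition on an incoming event with an action to run. The `TrafficEvent` fields, the utilization threshold, and `make_rule` are all assumptions made for illustration:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class TrafficEvent:
    site: str
    provider: str
    utilization: float  # fraction of link capacity in use, 0.0 to 1.0


def make_rule(threshold: float, action: Callable[[TrafficEvent], None]):
    """Build a handler that runs `action` when utilization reaches `threshold`."""

    def handle(event: TrafficEvent) -> bool:
        if event.utilization >= threshold:
            action(event)
            return True
        return False

    return handle


if __name__ == "__main__":
    commands: List[str] = []

    def run_peer_slash(event: TrafficEvent) -> None:
        # Stand-in for calling the Peer Slasher API; it only records the command.
        commands.append(f"netops slash --site {event.site} --provider {event.provider}")

    rule = make_rule(0.9, run_peer_slash)
    rule(TrafficEvent("XXX", "XXX", 0.95))
    print(commands)
```

Keeping the action identical to the manual command means the automated path exercises the same code operators already trust.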
17©2019
Summary
This talk shared what goes on behind the scenes as Fastly operates its global network with a small team.
How Fastly handles Global Network Operations.
● Scalable Network Architecture
○ No Backbone
○ No Router
○ No Load Balancer
● Network Automation
○ Control Peer Traffic
○ Network Configuration
○ Full Automation
18©2019
Discussion
• Any comments or questions about global network operations? Please feel free to ask.
• Please share any tips or knowledge you have about global network operations.
• Was this talk helpful? If these ideas would be hard to introduce into your own network, which points are the hurdles? What would need to be solved before you could adopt a similar approach?
19©2019
Thank you!