How VXLAN works on Linux
Basic mechanism and Application to OpenStack and Docker
中井悦司 / Etsuji Nakai
Senior Solution Architect and Cloud Evangelist, Red Hat K.K.
v1.1 2015/07/09
$ who am i
中井悦司 / Etsuji Nakai
– Twitter @enakai00
– Senior Solution Architect and Cloud Evangelist at Red Hat.
– The author of some OpenStack books.
Contents
– VXLAN basics
– OpenStack Neutron OVS Plugin
– VTEP implementation with Flannel
– References
VXLAN basics
The objective of VXLAN
VXLAN creates a virtual L2 network over a physical L3 network.
[Figure: VXLAN switches at three sites (Tokyo, Osaka, Fukuoka) with servers 10.1.1.0, 10.1.2.0, and 10.1.3.0 on the 10.1.0.0/16 network, shown as a physical view and as the logical view from the servers.]
Packet encapsulation with VXLAN header
VXLAN encapsulates an L2 packet inside an L3 packet.
[Figure: The VXLAN switch in Tokyo (xx.xx.xx.xx) wraps the original packet with a VXLAN header, source address xx.xx.xx.xx, and destination address yy.yy.yy.yy, and sends it to the VXLAN switch in Osaka (yy.yy.yy.yy), which recovers the original packet.]
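The slide does not spell out the header layout, so the following is only a minimal sketch of the encapsulation step, assuming the 8-byte VXLAN header defined in RFC 7348 (an 8-bit flags field with the VNI-valid bit, then a 24-bit VNI, with reserved bits in between). The function names are hypothetical, and the outer UDP/IP headers that carry the source and destination switch addresses are left out.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag from RFC 7348: the VNI field is valid


def build_vxlan_header(vni):
    """Return the 8-byte VXLAN header for a 24-bit VXLAN network identifier."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits).
    # Word 2: VNI (24 bits) + reserved (8 bits).
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def encapsulate(vni, original_frame):
    """Prepend the VXLAN header; the result is the payload of the outer packet."""
    return build_vxlan_header(vni) + original_frame


def decapsulate(payload):
    """Split an outer payload back into (VNI, original L2 frame)."""
    flags_word, vni_word = struct.unpack("!II", payload[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag is not set")
    return vni_word >> 8, payload[8:]
```

For example, `decapsulate(encapsulate(1, frame))` gives back `(1, frame)`, which is what the receiving switch needs in order to deliver the original packet to the right virtual network.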
The fundamental problem of L2 over L3
How does a VXLAN switch find the correct location of the packet's destination?
[Figure: The same Tokyo-to-Osaka encapsulation diagram, annotated with the question "How did you know that the destination is in Osaka!?"]
ARP resolution on L2 layer
VXLAN switches need to emulate the ARP resolution mechanism.
[Figure: ARP resolution between two servers on the same L2 network]
– Sender: IP 10.1.1.0, MAC xx:xx:xx:xx:xx:xx
– Receiver: IP 10.1.2.0, MAC zz:zz:zz:zz:zz:zz
① ARP Request: "What's the MAC for IP 10.1.2.0?"
② ARP Reply: "zz:zz:zz:zz:zz:zz"
③ The port <-> MAC association is recorded in the MAC table.
④ Send the L2 packet to "zz:zz:zz:zz:zz:zz".
Packet layout: L2 header (Dest MAC zz:zz:zz:..., Source MAC xx:xx:xx:...), L3 header (Dest IP 10.1.2.0, Source IP 10.1.1.0), Payload.
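The four steps above can be sketched as an ordinary learning switch. This is an illustrative model only: the class name, the port numbers, and the flooding rule are assumptions, and the placeholder MAC strings are taken from the slide.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy L2 switch that learns which port each MAC address lives on."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the sender's port, then return the list of output ports."""
        self.mac_table[src_mac] = in_port  # step 3: record port <-> MAC
        out_port = self.mac_table.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Unknown or broadcast destination: flood to every other port.
            return [p for p in self.ports if p != in_port]
        return [out_port]


switch = LearningSwitch(ports=[1, 2, 3])
# Step 1: the ARP request is broadcast from port 1.
arp_request_ports = switch.receive(1, "xx:xx:xx:xx:xx:xx", BROADCAST)
# Step 2: the ARP reply comes back from port 2.
arp_reply_ports = switch.receive(2, "zz:zz:zz:zz:zz:zz", "xx:xx:xx:xx:xx:xx")
# Step 4: the data packet now goes only to the learned port.
data_ports = switch.receive(1, "xx:xx:xx:xx:xx:xx", "zz:zz:zz:zz:zz:zz")
```

The point of the sketch is that learning depends on the broadcast in step 1 reaching the destination, which is what becomes expensive when the servers are spread over an L3 network.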
Additional features for L2 over L3
Packet encapsulation alone is not enough for L2 over L3. VXLAN switches also need to implement the following features.
– ARP resolution: reply to ARP requests from local servers without broadcasting the ARP packet.
– Destination search: find the destination location corresponding to the destination MAC.
A VXLAN endpoint that provides these features is referred to as a "VTEP" (VXLAN Tunnel End Point).
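[Figure: The VXLAN switch in Tokyo (xx.xx.xx.xx) acting as a VTEP]
① ARP Request: "What's the MAC for IP 10.1.2.0?"
– The switch itself returns the ARP Reply "zz:zz:zz:zz:zz:zz".
④ Send the L2 packet to "zz:zz:zz:zz:zz:zz".
– The switch knows that destination "zz:zz:zz:zz:zz:zz" is located in Osaka.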
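The two VTEP features can be sketched as two lookup tables. This is a simplified model under stated assumptions: the `Vtep` class and its method names are hypothetical, and how the tables get filled is deliberately left open here.

```python
class Vtep:
    """Toy VTEP holding the two tables the features above require."""

    def __init__(self):
        self.arp_table = {}  # server IP -> server MAC (ARP resolution)
        self.fdb = {}        # server MAC -> address of the remote VTEP

    def register(self, mac, ip, location):
        """Record one server as a (MAC, IP, location) entry."""
        self.arp_table[ip] = mac
        self.fdb[mac] = location

    def answer_arp(self, target_ip):
        """Answer a local ARP request from the table, without broadcasting."""
        return self.arp_table.get(target_ip)

    def find_destination(self, dst_mac):
        """Return the VTEP address behind which the destination MAC lives."""
        return self.fdb.get(dst_mac)


tokyo = Vtep()
# Placeholder values from the slides: the server in Osaka sits behind yy.yy.yy.yy.
tokyo.register(mac="zz:zz:zz:zz:zz:zz", ip="10.1.2.0", location="yy.yy.yy.yy")

mac = tokyo.answer_arp("10.1.2.0")    # the ARP reply sent to the local server
remote = tokyo.find_destination(mac)  # outer destination address for encapsulation
```

A lookup that returns `None` corresponds to a server the VTEP has not been told about yet.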
Variations of VTEP implementation
To implement the VTEP features, there must be some mechanism to share the tuple (MAC, IP address, location) of every server.
The following are some variations of VTEP implementation.
– Exchange MAC/IP information among switches using L3 multicast.
– Use an SDN controller as a central MAC/IP database.
– Use a local agent and a virtual VXLAN switch running on Linux servers.
OpenStack Neutron OVS Plugin
ML2 l2population driver
With the OpenStack Neutron OVS plugin, VXLAN encapsulation is done by the local Open vSwitch on each compute node.
– MAC/IP information is sent by the L2 agent and populated by the l2population ML2 driver.
– The l2population driver populates the following entries in OVS.
• FDB (forwarding database): a lookup table to find the destination node corresponding to the destination MAC address.
• Flow table entries for replying to ARP requests from local VMs.
[Figure: Two compute nodes, each running VMs on OVS (br-int) with an L2 agent, connected to the l2population driver through a messaging server (RabbitMQ).]
① Attaching new VM
② Send MAC/IP information
③ Populate flow table in OVS
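The three steps can be modeled as a simple fan-out. The sketch below is not Neutron code: `L2Agent`, `L2PopulationDriver`, and `vm_attached` are hypothetical names, direct method calls stand in for the messaging server, and the MAC and node addresses are made-up example values.

```python
class L2Agent:
    """Stand-in for the L2 agent and OVS tables on one compute node."""

    def __init__(self, node_ip):
        self.node_ip = node_ip
        self.fdb = {}            # destination MAC -> compute node IP
        self.arp_responder = {}  # VM IP -> VM MAC, used to answer ARP locally

    def populate(self, mac, ip, node_ip):
        """Step 3: install entries for a VM that runs on another node."""
        if node_ip == self.node_ip:
            return  # simplification: local VMs need no tunnel entry here
        self.fdb[mac] = node_ip
        self.arp_responder[ip] = mac


class L2PopulationDriver:
    """Stand-in for the driver that distributes MAC/IP information."""

    def __init__(self, agents):
        self.agents = list(agents)

    def vm_attached(self, mac, ip, node_ip):
        """Step 2: receive MAC/IP information and pass it to every agent."""
        for agent in self.agents:
            agent.populate(mac, ip, node_ip)


node1 = L2Agent("192.0.2.11")
node2 = L2Agent("192.0.2.12")
driver = L2PopulationDriver([node1, node2])

# Step 1: a new VM is attached on node1.
driver.vm_attached(mac="fa:16:3e:00:00:01", ip="10.1.1.0", node_ip="192.0.2.11")
```

After the call, only `node2` holds an FDB entry and an ARP entry for the new VM, so its local VMs can reach it without any broadcast crossing the physical network.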
Reference: ML2 – Address Population
– http://assafmuller.com/2014/02/23/ml2-address-population/
VTEP implementation with Flannel
Overlay network with Flannel
Flannel is an open source tool for creating an overlay network for Docker containers. It's often used with Kubernetes.
– It uses the Linux kernel's native VXLAN devices for packet encapsulation.
– The Flannel daemon dynamically populates the FDB and ARP table in response to kernel requests delivered via the "L2/L3 MISS" notification mechanism.
• The mechanism was originally named "DOVE extensions".
• https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e4f67addf158f98f8197e08974966b18480dc751
– The IP/MAC information is shared through the backend KVS (etcd).
[Figure: Three minions on the physical network 192.168.122.0/24, each with a VXLAN device flannel.1, sharing information through etcd. The flannel.1 devices form the internal network for container communication, 10.1.0.0/16.]
Kernel's DOVE extensions
You can use the native VXLAN device with the current Linux kernel.
– You don't necessarily need OVS to use VXLAN.
– It's just like using the traditional VLAN device on Linux :)
VTEP features are implemented by a userland agent via the "L2/L3 MISS" notification mechanism. (The notification is sent via netlink.)
– L3MISS
• The kernel asks the agent to populate the local ARP table when necessary, instead of broadcasting the ARP request packet.
– L2MISS
• The kernel asks the agent to populate the FDB when necessary.
# ip -d l show flannel.1
3: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT
    link/ether 82:ce:d5:09:06:2c brd ff:ff:ff:ff:ff:ff promiscuity 0
    vxlan id 1 local 192.168.122.101 dev eth0 srcport 0 0 dstport 8472 proxy l2miss ageing 300

# bridge fdb show dev flannel.1
56:e1:c1:d6:b7:51 dst 192.168.122.102 self

# cat /proc/sys/net/ipv4/neigh/flannel.1/app_solicit
3
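The L2MISS/L3MISS handling described above can be sketched as a userland agent that fills its tables on demand. This is not Flannel's code and it does not talk to netlink: the `MissHandler` class is hypothetical, a plain list stands in for etcd, and the pairing of container IP 10.1.2.0 with the MAC and VTEP address from the output above is an assumption made for illustration.

```python
class MissHandler:
    """Toy userland agent reacting to L3MISS and L2MISS notifications."""

    def __init__(self, kvs):
        self.kvs = kvs       # list of (mac, ip, vtep_ip), standing in for etcd
        self.arp_table = {}  # IP -> MAC, filled on L3MISS
        self.fdb = {}        # MAC -> remote VTEP IP, filled on L2MISS

    def on_l3miss(self, ip):
        """The kernel has no ARP entry for ip: look it up and populate it."""
        for mac, record_ip, _ in self.kvs:
            if record_ip == ip:
                self.arp_table[ip] = mac
                return True
        return False

    def on_l2miss(self, mac):
        """The kernel has no FDB entry for mac: look it up and populate it."""
        for record_mac, _, vtep_ip in self.kvs:
            if record_mac == mac:
                self.fdb[mac] = vtep_ip
                return True
        return False


kvs = [("56:e1:c1:d6:b7:51", "10.1.2.0", "192.168.122.102")]
handler = MissHandler(kvs)

found_ip = handler.on_l3miss("10.1.2.0")            # ARP table gets the MAC
found_mac = handler.on_l2miss("56:e1:c1:d6:b7:51")  # FDB gets the remote VTEP
```

The resulting FDB entry has the same shape as the `bridge fdb show` line above: a MAC address mapped to the `dst` address of the node that hosts it.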
Reference: Kernel patch - add DOVE extensions for VXLAN
– https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e4f67addf158f98f8197e08974966b18480dc751
References
ML2 – Address Population
– http://assafmuller.com/2014/02/23/ml2-address-population/
Kernel patch: add DOVE extensions for VXLAN
– https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e4f67addf158f98f8197e08974966b18480dc751
How Flannel's VXLAN backend works (in Japanese)
– http://enakai00.hatenablog.com/entry/2015/04/02/173739