How VXLAN works on Linux
Basic mechanism and Application to OpenStack and Docker

中井悦司 / Etsuji Nakai
Senior Solution Architect and Cloud Evangelist
Red Hat K.K.
v1.1 2015/07/09


Page 2: How VXLAN works on Linux


$ who am i

中井悦司 / Etsuji Nakai
– Twitter @enakai00
– Senior Solution Architect and Cloud Evangelist at Red Hat.
– The author of some OpenStack books.

Page 3: How VXLAN works on Linux


Contents

– VXLAN basics
– OpenStack Neutron OVS Plugin
– VTEP implementation with Flannel
– References

Page 4: How VXLAN works on Linux

VXLAN basics

Page 5: How VXLAN works on Linux


The objective of VXLAN

Creating a virtual L2 network over a physical L3 network.

[Figure: Physical view: VXLAN switches in Tokyo, Osaka, and Fukuoka. Logical view from servers: a single network, 10.1.0.0/16, containing 10.1.1.0, 10.1.2.0, and 10.1.3.0.]

Page 6: How VXLAN works on Linux


Packet encapsulation with VXLAN header

VXLAN encapsulates an L2 packet inside an L3 packet.

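[Figure: The VXLAN switch in Tokyo (xx.xx.xx.xx) prepends a VXLAN header to the original packet and wraps it in an outer packet with source address xx.xx.xx.xx and destination address yy.yy.yy.yy. The VXLAN switch in Osaka (yy.yy.yy.yy) receives it and delivers the original packet.]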
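The encapsulation step can be sketched in a few lines of Python. This is a minimal illustration that assumes the 8-byte VXLAN header layout from RFC 7348 (one flags byte with the "I" bit set, 24 reserved bits, a 24-bit VNI, 8 reserved bits). The function names are hypothetical, and the outer IP/UDP headers are left to the network stack.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag: the VNI field is valid


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit network identifier."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits).
    # Word 2: VNI (24 bits) + reserved (8 bits).
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def encapsulate(vni: int, inner_frame: bytes) -> bytes:
    """Prepend the VXLAN header to the original L2 frame."""
    return vxlan_header(vni) + inner_frame


def decapsulate(payload: bytes) -> tuple[int, bytes]:
    """Split a VXLAN payload back into (VNI, original L2 frame)."""
    if len(payload) < 8:
        raise ValueError("payload shorter than a VXLAN header")
    flags_word, vni_word = struct.unpack("!II", payload[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag is not set")
    return vni_word >> 8, payload[8:]


inner = b"original L2 frame"        # placeholder for a real Ethernet frame
packet = encapsulate(1, inner)      # VNI 1 is an arbitrary example value
```

The receiving switch only needs `decapsulate` to recover the original frame and the virtual network it belongs to.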


Page 8: How VXLAN works on Linux


The fundamental problem of L2 over L3

How does a switch find the correct location of a packet's destination?

[Figure: The VXLAN switch in Tokyo (xx.xx.xx.xx) sends the original packet, wrapped in a VXLAN header, from source address xx.xx.xx.xx to destination address yy.yy.yy.yy, the VXLAN switch in Osaka. Callout: "How did you know that the destination is in Osaka!?"]

Page 9: How VXLAN works on Linux


ARP resolution on L2 layer

VXLAN switches need to emulate the ARP resolution mechanism.

[Figure: Two servers on the same L2 switch. Sender: IP 10.1.1.0, MAC xx:xx:xx:xx:xx:xx. Receiver: IP 10.1.2.0, MAC zz:zz:zz:zz:zz:zz.]

① ARP Request: "What's the MAC for IP 10.1.2.0?"
② ARP Reply: "zz:zz:zz:zz:zz:zz"
③ Port <-> MAC association is recorded in the MAC table
④ Send L2 packet to "zz:zz:zz:zz:zz:zz"

Packet layout:
– L2 header: Dest MAC zz:zz:zz:..., Source MAC xx:xx:xx:...
– L3 header: Dest IP 10.1.2.0, Source IP 10.1.1.0
– Payload
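Steps ③ and ④ can be sketched as a toy learning switch. This is a simplified model, not how any particular switch is implemented: the class name, the port numbers, and the string MAC addresses are assumptions for illustration.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy L2 switch that learns which port each source MAC lives on."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}  # MAC address -> port

    def receive(self, in_port, src_mac, dst_mac):
        """Return the set of ports the frame is sent out of."""
        # Step 3: record the Port <-> MAC association of the sender.
        self.mac_table[src_mac] = in_port
        # Step 4: a known destination goes out of exactly one port.
        if dst_mac in self.mac_table:
            return {self.mac_table[dst_mac]}
        # Broadcast or unknown destination: flood to every other port.
        return self.ports - {in_port}


switch = LearningSwitch(ports=[1, 2, 3])
# 1. The ARP request is a broadcast, so it is flooded.
arp_request_out = switch.receive(1, "xx:xx:xx:xx:xx:xx", BROADCAST)
# 2. The ARP reply teaches the switch where zz:... lives.
arp_reply_out = switch.receive(2, "zz:zz:zz:zz:zz:zz", "xx:xx:xx:xx:xx:xx")
# 4. The data frame now goes only to the learned port.
data_out = switch.receive(1, "xx:xx:xx:xx:xx:xx", "zz:zz:zz:zz:zz:zz")
```

The flooding in the first step is exactly what becomes expensive when the "ports" are sites connected over an L3 network.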

Page 10: How VXLAN works on Linux


Additional features for L2 over L3

Packet encapsulation alone is not enough for L2 over L3. VXLAN switches need to implement the following features.
– ARP resolution: reply to ARP requests from local servers without broadcasting the ARP packet.
– Destination search: find the destination location corresponding to the destination MAC.

The VXLAN endpoint providing these features is referred to as a "VTEP" (VXLAN Tunnel End Point).

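[Figure: The VXLAN switch in Tokyo (xx.xx.xx.xx) handles both features locally.]

① ARP Request: "What's the MAC for IP 10.1.2.0?"
ARP Reply from the switch: "zz:zz:zz:zz:zz:zz"
④ Send L2 packet to "zz:zz:zz:zz:zz:zz"
Destination search by the switch: dest "zz:zz:zz:zz:zz:zz" is located in Osaka.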
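The two VTEP features amount to two lookup tables. The sketch below is a minimal model under that assumption: the class and method names are hypothetical, and the placeholder addresses are the ones used on the slides.

```python
class Vtep:
    """Toy VTEP holding an ARP table and a forwarding database (FDB)."""

    def __init__(self, arp_table, fdb):
        self.arp_table = dict(arp_table)  # server IP -> server MAC
        self.fdb = dict(fdb)              # server MAC -> address of the remote VTEP

    def answer_arp(self, target_ip):
        """ARP resolution: reply locally instead of broadcasting.

        Returns None when the IP is unknown to this VTEP.
        """
        return self.arp_table.get(target_ip)

    def find_destination(self, dst_mac):
        """Destination search: which remote VTEP hosts this MAC?"""
        return self.fdb.get(dst_mac)


tokyo = Vtep(
    arp_table={"10.1.2.0": "zz:zz:zz:zz:zz:zz"},
    fdb={"zz:zz:zz:zz:zz:zz": "yy.yy.yy.yy"},  # yy.yy.yy.yy is the Osaka switch
)

mac = tokyo.answer_arp("10.1.2.0")         # answered without any broadcast
outer_dest = tokyo.find_destination(mac)   # outer destination address to use
```

How these two tables get filled is what distinguishes the VTEP implementations discussed next.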


Page 12: How VXLAN works on Linux


Variations of VTEP implementation

To implement VTEP features, there must be some mechanism to share the tuple (MAC, IP address, location) of every server.

The following are some variations of VTEP implementation.
– Exchange MAC/IP information among switches using L3 multicast.
– Use an SDN controller as a central MAC/IP database.
– Use a local agent and a virtual VXLAN switch running on Linux servers.

Page 13: How VXLAN works on Linux

OpenStack Neutron OVS Plugin

Page 14: How VXLAN works on Linux


ML2 l2population driver

In the case of the OpenStack Neutron OVS plugin, VXLAN encapsulation is done by the local Open vSwitch on each compute node.
– MAC/IP information is sent by the L2 agent and populated by the l2population ML2 driver.
– The l2population driver populates the following entries in OVS.
  • FDB (forwarding database): a lookup table to find the destination node corresponding to the destination MAC address.
  • Flow table entries for replying to ARP requests from local VMs.

[Figure: Compute nodes, each running VMs on OVS (br-int) with an L2 agent, exchange messages with the l2population driver through the messaging server (RabbitMQ).]

① Attaching new VM
② Send MAC/IP information
③ Populate flow table in OVS
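The three steps follow a push model: every node is told about a new VM before any traffic flows. The sketch below simulates that flow in memory. It does not use the Neutron or RabbitMQ APIs; the class names, method names, and all addresses are hypothetical.

```python
class L2Agent:
    """Toy L2 agent holding the entries it would program into OVS."""

    def __init__(self, host_ip):
        self.host_ip = host_ip
        self.fdb = {}            # remote VM MAC -> IP of the compute node hosting it
        self.arp_responder = {}  # remote VM IP -> MAC to answer ARP requests with

    def on_port_update(self, mac, ip, host_ip):
        # Step 3: populate local tables, skipping VMs that run on this node.
        if host_ip == self.host_ip:
            return
        self.fdb[mac] = host_ip
        self.arp_responder[ip] = mac


class L2PopulationBus:
    """Stand-in for the l2population driver plus the messaging server."""

    def __init__(self):
        self.agents = []
        self.ports = []  # (mac, ip, host_ip) of every VM seen so far

    def register(self, agent):
        self.agents.append(agent)
        for port in self.ports:  # bring a late joiner up to date
            agent.on_port_update(*port)

    def vm_attached(self, mac, ip, host_ip):
        # Step 2: fan the MAC/IP information out to every agent.
        self.ports.append((mac, ip, host_ip))
        for agent in self.agents:
            agent.on_port_update(mac, ip, host_ip)


bus = L2PopulationBus()
node1, node2 = L2Agent("192.0.2.11"), L2Agent("192.0.2.12")
bus.register(node1)
bus.register(node2)
# Step 1: a new VM is attached on node1.
bus.vm_attached("fa:16:3e:00:00:01", "10.0.0.5", "192.0.2.11")
```

After the call, `node2` can answer ARP for 10.0.0.5 and tunnel frames to 192.0.2.11 without any broadcast, while `node1` keeps no tunnel entry for its own VM.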

Page 15: How VXLAN works on Linux


Reference: ML2 – Address Population
– http://assafmuller.com/2014/02/23/ml2-address-population/

Page 16: How VXLAN works on Linux

VTEP implementation with Flannel

Page 17: How VXLAN works on Linux


Overlay network with Flannel

Flannel is an open source tool that creates an overlay network for Docker containers. It's often used with Kubernetes.
– It uses the Linux kernel's native VXLAN devices for packet encapsulation.
– The Flannel daemon dynamically populates the FDB and ARP table in response to kernel requests delivered via the "L2/L3 MISS" notification mechanism.
  • The mechanism was originally named "DOVE extensions".
  • https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e4f67addf158f98f8197e08974966b18480dc751
– The IP/MAC information is shared through the backend KVS (etcd).

[Figure: Three minions and etcd on the physical network 192.168.122.0/24. Each minion has a flannel.1 VXLAN device, and together these form the internal network for container communication, 10.1.0.0/16.]

Page 18: How VXLAN works on Linux


Kernel's DOVE extensions

You can use the native VXLAN device with the current Linux kernel.
– You don't necessarily need OVS to use VXLAN.
– It's just like using the traditional VLAN device with Linux :)

VTEP features are implemented by a userland agent via the "L2/L3 MISS" notification mechanism. (The notification is sent via netlink.)
– L3MISS
  • The kernel asks the agent to populate the local ARP table when necessary, instead of broadcasting the ARP request packet.
– L2MISS
  • The kernel asks the agent to populate the FDB when necessary.

# ip -d l show flannel.1
3: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT
    link/ether 82:ce:d5:09:06:2c brd ff:ff:ff:ff:ff:ff promiscuity 0
    vxlan id 1 local 192.168.122.101 dev eth0 srcport 0 0 dstport 8472 proxy l2miss ageing 300

# bridge fdb show dev flannel.1
56:e1:c1:d6:b7:51 dst 192.168.122.102 self

# cat /proc/sys/net/ipv4/neigh/flannel.1/app_solicit
3
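In contrast to a push model, the MISS mechanism fills the tables on demand. The sketch below simulates an agent reacting to the two notifications. It is not Flannel's code and does not talk to netlink or etcd: the class name, the dictionary standing in for the KVS, and the container-side IP 10.1.2.0 are assumptions. The MAC and host address are taken from the FDB output above.

```python
class MissHandler:
    """Toy userland agent that fills kernel tables when asked."""

    def __init__(self, kvs):
        self.kvs = kvs        # stand-in for etcd: overlay IP -> (MAC, host IP)
        self.arp_table = {}   # stand-in for the kernel ARP (neighbor) table
        self.fdb = {}         # stand-in for the kernel FDB of the VXLAN device

    def on_l3miss(self, ip):
        """The kernel could not resolve `ip`: add an ARP entry if known."""
        entry = self.kvs.get(ip)
        if entry is None:
            return False
        mac, _host_ip = entry
        self.arp_table[ip] = mac
        return True

    def on_l2miss(self, mac):
        """The kernel has no FDB entry for `mac`: add one if known."""
        for known_mac, host_ip in self.kvs.values():
            if known_mac == mac:
                self.fdb[mac] = host_ip
                return True
        return False


kvs = {"10.1.2.0": ("56:e1:c1:d6:b7:51", "192.168.122.102")}
agent = MissHandler(kvs)

agent.on_l3miss("10.1.2.0")            # fills the ARP table
agent.on_l2miss("56:e1:c1:d6:b7:51")   # fills the FDB
```

Entries appear only for destinations that are actually contacted, so each node's tables stay proportional to its own traffic rather than to the size of the cluster.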

Page 19: How VXLAN works on Linux


Reference: Kernel patch - add DOVE extensions for VXLAN
– https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e4f67addf158f98f8197e08974966b18480dc751

Page 20: How VXLAN works on Linux

References

Page 21: How VXLAN works on Linux


References

ML2 – Address Population
– http://assafmuller.com/2014/02/23/ml2-address-population/

Kernel patch: add DOVE extensions for VXLAN
– https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e4f67addf158f98f8197e08974966b18480dc751

How Flannel's VXLAN backend works (in Japanese)
– http://enakai00.hatenablog.com/entry/2015/04/02/173739
