vTurbo: Accelerating Virtual Machine I/O Processing Using Designated Turbo-Sliced Core. Cong Xu, Sahan Gamage, Hui Lu, Ramana Kompella, Dongyan Xu. 2013 USENIX Annual Technical Conference. Embedded Lab. Kim Sewoog.


Page 1: vTurbo: Accelerating Virtual Machine I/O Processing Using Designated Turbo-Sliced Core

vTurbo: Accelerating Virtual Machine I/O Processing Using Designated Turbo-Sliced Core

Cong Xu, Sahan Gamage, Hui Lu, Ramana Kompella, Dongyan Xu
2013 USENIX Annual Technical Conference

Embedded Lab. Kim Sewoog

Page 2: Motivation

  • Pay-as-you-go: server consolidation
    Saves cost in running applications and in operational expenditure
  • Multiple VMs sharing the same core
    CPU access latency
    Low I/O throughput

Figure elements: VM1, VM2, VM3 and VM4 running on a hypervisor (or VMM).

Page 3: I/O Processing

Two basic stages:

  • Device interrupts are processed synchronously in the kernel
  • The application asynchronously copies the data in the kernel buffer

< I/O Processing Workflow >
Figure elements: IRQ processing, kernel buffer, application.

< Effect of CPU Sharing on I/O Processing >
Figure elements: VM1, VM2 and VM3 sharing the CPU over time; IRQ processing delay.
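The IRQ processing delay labelled on this slide can be estimated with a small model. The sketch below assumes strict round-robin scheduling of equally weighted VMs on one core, which is a simplification of a real hypervisor scheduler. The function names are hypothetical, and the 30 ms slice used in the note afterwards is an assumed value, not one taken from the slides.

```python
def irq_delay_ms(arrival_ms: float, vm_index: int, n_vms: int, slice_ms: float) -> float:
    """Time an interrupt waits before VM `vm_index` gets the CPU.

    Assumption: VM k runs during [k * slice, (k + 1) * slice) in every
    round-robin period of n_vms * slice milliseconds.
    """
    period = n_vms * slice_ms
    offset = arrival_ms % period          # position inside the current period
    start = vm_index * slice_ms           # start of the target VM's slice
    if start <= offset < start + slice_ms:
        return 0.0                        # target VM is running right now
    return (start - offset) % period      # wait for its next slice to begin


def worst_case_delay_ms(n_vms: int, slice_ms: float) -> float:
    """Interrupt arrives just after the target VM was descheduled."""
    return (n_vms - 1) * slice_ms
```

Under these assumptions, three VMs with a 30 ms slice give a worst-case delay of 60 ms, while the same three VMs with a 0.1 ms slice give 0.2 ms.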

Page 4: Effect of CPU Sharing on TCP Receive

Figure elements: TCP client, hypervisor shared buffer, scheduled VMs (VM1, VM2, VM3), DATA and ACK packets, IRQ processing delay.

Page 5: Effect of CPU Sharing on UDP Receive

Figure elements: UDP client, hypervisor shared buffer, scheduled VMs (VM1, VM2, VM3), DATA packets, application buffer. Labels: shared buffer full, dropped.
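The drop behaviour on this slide (shared buffer full, packets dropped) can be reproduced with a toy simulation. This is a sketch under stated assumptions, not the authors' experiment: time advances in abstract ticks, packets arrive at a constant rate, VMs are scheduled round-robin, and the receiving VM (VM 0) empties the shared buffer whenever it holds the CPU. The function name, buffer size and rates are all hypothetical.

```python
from collections import deque


def simulate_udp_receive(n_vms: int, slice_ticks: int, pkts_per_tick: int,
                         buffer_slots: int, total_ticks: int) -> tuple[int, int]:
    """Return (delivered, dropped) packet counts for VM 0."""
    shared_buffer: deque[int] = deque()
    delivered = dropped = 0
    for tick in range(total_ticks):
        # Packets arrive whether or not the receiving VM is running.
        for _ in range(pkts_per_tick):
            if len(shared_buffer) < buffer_slots:
                shared_buffer.append(tick)
            else:
                dropped += 1              # shared buffer full
        # Round-robin: which VM owns the core during this tick?
        running_vm = (tick // slice_ticks) % n_vms
        if running_vm == 0:
            delivered += len(shared_buffer)
            shared_buffer.clear()
    return delivered, dropped
```

With 3 VMs, 1 packet per tick, a 256-slot buffer and 900 ticks, a slice of 300 ticks yields `(300, 344)`, while a slice of 1 tick yields `(898, 0)`: shortening the slice keeps the buffer from overflowing in this model.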

Page 6: Effect of CPU Sharing on Disk Write

Figure elements: application, kernel memory, disk drive, scheduled VMs (VM1, VM2, VM3), DATA, IRQ processing delay.

Page 7: Intuitive Solution

  • Reduce the time slice of each VM
    Causes significant context switch overhead
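The trade-off on this slide can be made concrete with simple arithmetic. The sketch below assumes every time slice is followed by one context switch of fixed cost; the function name and the 0.02 ms cost used in the note are hypothetical values chosen for illustration, not measurements from the paper.

```python
def switch_overhead_fraction(slice_ms: float, switch_cost_ms: float) -> float:
    """Fraction of CPU time spent switching rather than running a VM.

    Assumption: one context switch of `switch_cost_ms` after every slice.
    """
    if slice_ms <= 0 or switch_cost_ms < 0:
        raise ValueError("slice must be positive and cost non-negative")
    return switch_cost_ms / (slice_ms + switch_cost_ms)
```

With the assumed 0.02 ms cost, a 30 ms slice loses well under 0.1% of the CPU, whereas a 0.1 ms slice loses about 17%. This is why shrinking every VM's slice on every core is unattractive.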

Page 8: Our Solution: vTurbo

Page 9: Our Solution: vTurbo

  • IRQ processing is offloaded to a dedicated turbo core
    Turbo core: any physical core with micro-slicing (e.g., 0.1 ms)
  • Expose the turbo core as a special vCPU to the VM
    The turbo vCPU runs on a turbo core
    Regular vCPUs run on regular cores
  • Pin the IRQ context of the guest OS to the turbo vCPU
  • Benefits
    Improved I/O throughput (TCP/UDP, disk)
    Self-adaptive system
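As a rough analogy for pinning the IRQ context to the turbo vCPU: on a stock Linux guest, a device interrupt can be steered to one CPU by writing a hexadecimal CPU mask to `/proc/irq/<irq>/smp_affinity`. The slides do not show how vTurbo performs the pinning, so the sketch below only illustrates that standard interface. The helper names, the IRQ number and the vCPU index are hypothetical.

```python
from pathlib import Path


def cpu_mask(cpu_index: int) -> str:
    """Hex bitmask with only `cpu_index` set, in smp_affinity format.

    Limited to CPUs 0..31 so the mask fits in one 32-bit group.
    """
    if not 0 <= cpu_index < 32:
        raise ValueError("this sketch only handles CPU indexes 0..31")
    return format(1 << cpu_index, "x")


def pin_irq(irq: int, cpu_index: int, proc_root: str = "/proc") -> None:
    """Steer interrupt `irq` to `cpu_index` (needs root inside the guest)."""
    target = Path(proc_root, "irq", str(irq), "smp_affinity")
    target.write_text(cpu_mask(cpu_index))
```

For example, if the turbo vCPU were visible in the guest as CPU 3, `pin_irq(42, 3)` would write the mask `8` for the hypothetical IRQ 42.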

Page 10: vTurbo Design

Page 11: vTurbo Design

Figure elements: a regular core running VM1, VM2 and VM3 in turn; a turbo core cycling through VM1, VM2 and VM3 in shorter slices; IRQ, kernel buffer (Buf), application and data shown along a time axis.

Page 12: vTurbo's Impact on Disk Write

Figure elements: application, kernel memory, disk drive, DATA; a regular core running VM1, VM2 and VM3 in turn; a vTurbo core cycling repeatedly through VM1, VM2 and VM3.

Page 13: Effect of CPU Sharing on UDP Receive

Figure elements: UDP client, hypervisor shared buffer, kernel buffer, application buffer, DATA; regular cores running VM1, VM2 and VM3 in turn; a vTurbo core cycling repeatedly through VM1, VM2 and VM3.

Page 14: Effect of CPU Sharing on TCP Receive

Figure elements: TCP client, hypervisor shared buffer, kernel buffer, backlog queue, receive queue, application buffer, DATA and ACK packets; regular cores running VM1, VM2 and VM3 in turn; a vTurbo core cycling repeatedly through VM1, VM2 and VM3. Label: locked.

Page 15: VM Scheduling Policy for Fairness

  • Turbo cores are not free
  • Maintain CPU fair share among VMs
    Calculate the credits on both regular and turbo cores
    Guarantee the CPU allocation on turbo cores
    Deduct I/O-intensive VMs' credits on regular cores
    Allocate the deduction to non-I/O-intensive VMs

The equations on this slide did not survive extraction. Their captions are:

< total capacity among the regular and turbo cores >
< total capacity >
< each VM's fair share of CPU >
< each VM's turbo core fair share >
< actual usage of the turbo core >
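The policy bullets on this slide can be sketched in code. The slide's own formulas are not recoverable, so everything below is an assumption made for illustration: VMs have equal weight, capacity is measured in cores, a VM's deduction is its turbo-core usage above its turbo fair share, and the deducted amount is split equally among VMs that stayed within their turbo share. The function name is hypothetical and this is not the paper's credit formula.

```python
def regular_core_shares(turbo_usage: list[float], regular_cores: float,
                        turbo_cores: float) -> list[float]:
    """Regular-core allocation per VM after accounting for turbo-core usage."""
    n_vms = len(turbo_usage)
    turbo_fair_share = turbo_cores / n_vms
    regular_fair_share = regular_cores / n_vms

    # Deduction for I/O-intensive VMs, capped so a share never goes negative.
    deductions = [min(regular_fair_share, max(0.0, used - turbo_fair_share))
                  for used in turbo_usage]
    pool = sum(deductions)

    # VMs with no deduction are treated as non-I/O-intensive receivers.
    receivers = [i for i, d in enumerate(deductions) if d == 0.0]
    bonus = pool / len(receivers) if receivers else 0.0

    return [regular_fair_share - d if d > 0.0 else regular_fair_share + bonus
            for d in deductions]
```

For instance, with 3 regular cores, 1 turbo core and turbo usage of `[0.55, 0.25, 0.10, 0.10]`, the first VM's regular share drops from 0.75 to about 0.45 and the other three rise to about 0.85, so the regular cores remain fully allocated.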

Page 16: Evaluation

  • VM hosts
    3.2 GHz Intel Xeon quad-core CPU, 16 GB RAM
    An independent core is assigned to the driver domain (dom0)
    Xen 4.1.2
    Linux 3.2
    1 core is chosen as the turbo core
  • Gigabit Ethernet switch (10 Gbps for 2 experiments)

Page 17: File Read/Write Throughput: Micro-Benchmark

Figure label: regular core <-> turbo core

Page 18: TCP/UDP Throughput: Micro-Benchmark

Page 19: NFS/SCP Throughput: Application Benchmark

Page 20: Apache Olio: Application Benchmark

3 components:

  • A web server to process user requests
  • A MySQL database server to store user profiles and event information
  • An NFS server to store images and documents specific to events

Page 21: Conclusions

  • Problem: CPU sharing affects I/O throughput
  • Solution: vTurbo
    Offload IRQ processing to a turbo-sliced dedicated core
  • Results
    Improves UDP throughput by up to 4x
    Improves TCP throughput by up to 3x
    Improves disk write throughput by up to 2x
    Improves NFS throughput by up to 3x
    Improves Olio throughput by up to 38.7%


Page 23: Thank you!