
Page 1: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

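Department of Computer Science and Information Engineering, National Taiwan University

薛智文 (Chih-Wen Hsueh) cwhsueh@csie.ntu.tw

http://www.csie.ntu.edu.tw/~cwhsueh/

100 Fall, Oct 28, Fri 678, DTH 104

Advanced Information Technology - Virtualization (1) - Virtualization (V12N)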

Page 2: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

NEWS Lab, Graduate Institute of Networking and Multimedia, Department of CSIE

Preface

Steve Jobs (Apple, 1955-2011): Stay hungry, stay foolish. (求知若渴,虛心若愚: "thirst for knowledge, stay humble"; or, in four characters, 持飢保愚)
Dennis Ritchie (C language, 1941-2011)
Skype → eBay (4.1B USD, 2005) → Microsoft (8.5B USD, 2011)
Linux (Linus Torvalds, 1991)
Android (Danger, 2003 → Google, 2005)
MeeGo (Intel, Samsung, Feb 2010)
Tizen (Intel, Samsung [Nokia], Sep 2011)
Windows 8 (Microsoft, nVidia, 2011)
iOS 5 (Apple, 2011)
Quanta Computer, TSMC (2011)

Page 3: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Outline

Introduction
  What is virtualization?
  Why is virtualization difficult?
  How to virtualize?
Case Study
  Inline Emulation
  Domain 1
Q&A

Page 4: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

What is Virtualization?

The creation of a virtual version of something.

Virtual class
Virtual circuit
Virtual community
Virtual device
Virtual disk
Virtual host
Virtual keyboard
Virtual machine
Virtual market
Virtual memory
Virtual money
Virtual Private Network
Virtual reality
Etc.

Virtualization enables:
  Running applications (cross-platform)
  Security
  Sharing hardware resources
  Fully utilizing hardware

Page 5: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Types of Virtualization

Hardware/platform virtualization
Desktop virtualization
Software virtualization: OS-level, workspace, application
Storage virtualization
Data virtualization
Database virtualization
Network virtualization

Page 6: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh


Page 7: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Big Questions for Virtualization

How fast can virtualization be?
What kinds of applications can there be?
What problems might it incur?
  Technical
  Security
  Business
  Politics
  …

Homework:
  Send the TA a 3-5 page report answering any of the above or related questions.
  1-3 members per group; reports will be posted on the course wiki.
  A 5-minute talk/Q&A in the last hour of class.

Page 8: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Why Is Virtualization Difficult?

The OS is moved to ring 1/ring 3.
  0/1/3 ring model, e.g. x86_32
  0/3/3 ring model, e.g. x86_64, ARM

On x86, some sensitive instructions cannot be trapped.

Critical instructions:
  Sensitive register instructions: SGDT, SIDT, SLDT, SMSW, PUSHF(D), POPF(D)
  Protected system instructions: LAR, LSL, VERR, VERW, PUSH, POP, CALL, JMP, INT, RET, STR, MOV

Page 9: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Type I - Hypervisor, e.g. Xen (layers from top to bottom)
  VM0, VM1, …, VMN
  Virtual Machine Monitor (VMM) / Hypervisor
  Hardware

Type II - Hosted VMM, e.g. VMware (layers from top to bottom)
  VM0, VM1, …, VMN
  Hosted VMM
  Host Operating System
  Hardware

VM: Virtual Machine = Guest OS + Virtual Devices

Page 10: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Software Execution Modes in Virtualization Environment

Mode        Physical mode  Virtual mode  Description
Hypervisor  Privileged     N/A           For executing the hypervisor only.
Kernel      User           Privileged    For executing the kernel of a virtual machine.
User        User           User          For executing user processes of a guest OS.

Page 11: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

The First Challenge of Virtualization: Virtualizable

According to Popek and Goldberg† (1974), virtual machines can be constructed for a platform if

  Sensitive Instructions ⊆ Privileged Instructions

Sensitive instructions: might change the state of system resources
Privileged instructions: must be executed with sufficient privilege (otherwise they trap)

† G. J. Popek and R. P. Goldberg, “Formal requirements for virtualizable third generation architectures,” Commun. ACM, vol. 17, no. 7, pp. 412–421, Jul. 1974.
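
The subset condition can be checked mechanically once instructions have been classified. The following is a minimal sketch, not part of the original slides: the function names and the toy instruction set are hypothetical, and the x86 set is only a small illustrative subset (the sensitive-register instructions listed on the earlier x86 slide, plus LGDT as an example of an instruction that really is privileged).

```python
def is_virtualizable(sensitive, privileged):
    """Popek-Goldberg condition: every sensitive instruction must be privileged."""
    return set(sensitive) <= set(privileged)


def untrappable(sensitive, privileged):
    """Sensitive instructions that would NOT trap when run in user mode."""
    return sorted(set(sensitive) - set(privileged))


# Hypothetical toy ISA in which every sensitive instruction traps.
toy_sensitive = {"LOAD_PAGE_TABLE", "DISABLE_IRQ"}
toy_privileged = {"LOAD_PAGE_TABLE", "DISABLE_IRQ", "HALT"}

# Illustrative x86 subset: LGDT traps in user mode, the others do not.
x86_sensitive = {"LGDT", "SGDT", "SIDT", "SLDT", "SMSW", "PUSHF", "POPF"}
x86_privileged = {"LGDT"}

print(is_virtualizable(toy_sensitive, toy_privileged))
print(is_virtualizable(x86_sensitive, x86_privileged))
print(untrappable(x86_sensitive, x86_privileged))
```

The non-empty result of `untrappable` is exactly the set a hypervisor has to handle by other means, such as binary translation or hypercalls.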


Page 12: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

How to Virtualize?

Full virtualization: binary translation
Paravirtualization: hypercall
Hardware-assisted virtualization: trap and emulate (Intel VT-x & AMD SVM)

Page 13: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Case Study

1. Inline Emulation†

2. Domain 1 (with Insyde Inc.)

† Yuan-Cheng Lee, Chih-Wen Hsueh, and Rong-Guey Chang, "Inline Emulation: An Optimization Technique for Virtualization on Embedded Systems," Proc. of the 17th International Conference on Real-Time and Embedded Computing Systems and Applications (RTCSA'11), Toyama, Japan, August 2011.

Page 14: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Inline Emulation

Motivation
The First Challenge of Virtualization
Idea of Inline Emulation
Design of Inline Emulation
Evaluation and Analysis
Conclusions

Page 15: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Motivation

Virtualization is fast enough on PCs, reaching 90+% of the performance of the same OS running non-virtualized.

We can further utilize multi-core embedded processors to run multiple operating systems on a mobile phone…

Page 16: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Related Work

Secure Xen on ARM (Samsung)
  It showed that virtualization is possible on the ARM platform.

The PENAR project (University of Applied Sciences, Western Switzerland)
  It integrated the source trees of Xen, RTLinux, and Linux for ARM.

OKL4 (Open Kernel Labs)
  A hypervisor that adopts a microkernel architecture for embedded systems.

Page 17: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Issues on Virtualization for ARM

The most critical issue is:

  Sensitive instructions ⊄ Privileged instructions

Example: MOVS PC, LR // move the value in the link register to PC (in an exception mode it also copies SPSR to CPSR)

It causes unpredictable behavior when executed in user mode.

SPSR: Saved Program Status Register
CPSR: Current Program Status Register

Page 18: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

The Problematic Instructions (1/3)

Type I: instructions that cause an undefined instruction (UDI) exception when executed in user mode

We call them Canonical Privileged Instructions.

Example: MCR p15, 0, r0, c2, c0, 0
  Writes r0 to the register of coprocessor p15 selected by c2 and c0, with opcode options 0 and 0.
  The operation performed depends on the operands.

Page 19: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

The Problematic Instructions (2/3)

Type II: instructions that have no effect when executed in user mode

Example: MSR cpsr_c, #0xD3
  Switches to a privileged mode (Supervisor) and disables interrupts.

Program Status Register (PSR), bit 31 down to bit 0:
  N Z C V Q | -- | J | -- | GE[3:0] | -- | E A I F T M[4:0]
  Field groups: execution flags, exception mask, execution mode
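
To see why writing 0xD3 to the control field has that effect, the byte can be decoded against the PSR layout. This is a small illustrative sketch; the function name and the dictionary are not from the slides, and the mode numbers are the standard ARM encodings of M[4:0], which the slide itself does not list.

```python
# Standard ARM mode encodings for M[4:0] (added here for illustration).
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_control_byte(value):
    """Decode the low 8 PSR bits: I (bit 7), F (bit 6), T (bit 5), M[4:0]."""
    return {
        "irq_disabled": bool(value & 0x80),
        "fiq_disabled": bool(value & 0x40),
        "thumb": bool(value & 0x20),
        "mode": MODES.get(value & 0x1F, "reserved"),
    }


print(decode_control_byte(0xD3))
# {'irq_disabled': True, 'fiq_disabled': True, 'thumb': False, 'mode': 'Supervisor'}
```

In user mode the same MSR leaves these bits unchanged and raises no exception, which is why the hypervisor never sees it.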


Page 20: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

The Problematic Instructions (3/3)

Type III: instructions that cause unpredictable behavior when executed in user mode

Example: MOVS PC, LR

Page 21: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Solutions

Complexity                               Binary translation  Hypercall        Inline emulation
Design                                   High                Low              Low
Implementation                           Medium              High             Low
Runtime                                  High                Medium           Low
Counterpart (in programming languages)   Virtual function    Normal function  Inline function

Page 22: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

The First Challenge of Virtualization: Example

  Sensitive instructions ⊄ Privileged instructions

For the ARM architecture, the instruction (Type III)

  MOVS PC, LR

changes the program counter and switches to user mode.

However, it causes unpredictable behavior when executed in user mode.

Therefore, it is a sensitive instruction but not a privileged instruction.

Page 23: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

The First Challenge of Virtualization: Solutions (1/2)

Dynamic Binary Translation

Basic block before translation:

  BL TLB_FLUSH_DENTRY
  …
  TLB_FLUSH_DENTRY:
      MCR p15, 0, R0, C8, C6, 1
      MOV PC, LR

After translation:

  BL TLB_FLUSH_DENTRY_NEW
  …
  TLB_FLUSH_DENTRY:
      MCR p15, 0, R0, C8, C6, 1
      MOV PC, LR
  …
  TLB_FLUSH_DENTRY_NEW:
      MOV R1, R0
      MOV R0, #CMD_FLUSH_DENTRY
      SWI #HYPER_CALL_TLB
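
The rewriting step can be illustrated with a toy translator that works on assembly text rather than machine code. This is only a sketch under simplifying assumptions: `HYPERCALL_FOR` and `translate_block` are hypothetical names, the single table entry mirrors the one example on the slide, and the block is rewritten in place instead of redirecting the caller to a translated copy as the slide does.

```python
# Operands of a sensitive MCR -> (hypercall command, hypercall number).
# Hypothetical table; the one entry is taken from the slide's example.
HYPERCALL_FOR = {
    "C8, C6, 1": ("#CMD_FLUSH_DENTRY", "#HYPER_CALL_TLB"),
}

PREFIX = "MCR P15, 0, R0, "


def translate_block(block):
    """Replace known sensitive MCR instructions with a hypercall sequence."""
    out = []
    for insn in block:
        text = insn.strip()
        if text.upper().startswith(PREFIX):
            operands = text[len(PREFIX):].upper()
            if operands in HYPERCALL_FOR:
                cmd, call = HYPERCALL_FOR[operands]
                # Pass the original argument in R1 and the command in R0.
                out += ["MOV R1, R0", f"MOV R0, {cmd}", f"SWI {call}"]
                continue
        out.append(text)  # non-sensitive instructions are copied unchanged
    return out


for line in translate_block(["MCR p15, 0, R0, C8, C6, 1", "MOV PC, LR"]):
    print(line)
```

A real translator has to decode binary instructions, discover basic-block boundaries at run time, and fix up branch targets, which is where the high design and runtime complexity comes from.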


Page 24: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

The First Challenge of Virtualization: Solutions (2/2)

Virtualization APIs: hypercalls

/* In Guest OS */

  BL TLB_FLUSH_DENTRY
  …
  TLB_FLUSH_DENTRY:
      MOV R1, R0
      MOV R0, #CMD_FLUSH_DENTRY
      SWI #HYPER_CALL_TLB

/* In Hypervisor */

  SWI Handler → Hypercall Handler:
      ……
      LDR R1, [SP, #4]
      MCR p15, 0, R1, C8, C6, 1
  → Restore User Context & PC

Page 25: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Hypercall

[Figure: hypercall control flow]
  Guest OS: issues hypercalls through a software interrupt
  Hypervisor: SWI Handler → Hyper Call Handler → reschedule? (No / Yes: context switch)

Page 26: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Idea of Inline Emulation

The original instruction:

  MOV R0, VIRT_ADDR
  MCR p15, 0, R0, C8, C6, 1

Hypercall:

  Guest OS:
      MOV R0, #CMD_FLUSH_DENTRY
      MOV R1, VIRT_ADDR
      SWI #HYPER_CALL_TLB
  Hypercall Handler:
      ……
      LDR R1, [SP, #4]
      MCR p15, 0, R1, C8, C6, 1
      Restore User Context & PC

Inline emulation:

  Guest OS (unmodified):
      MOV R0, VIRT_ADDR
      MCR p15, 0, R0, C8, C6, 1
  Inline Emulation Handler:
      ……
      /* restore user context */
      LDMIA SP, {R0-R14}
      MCR p15, 0, R0, C8, C6, 1    /* the trapped instruction, inlined */
      Restore PC

Page 27: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Inline Emulation

[Figure: control flow with inline emulation]
  Guest OS: Canonical Privileged Instructions (Type I) raise a UDI exception
  Hypervisor: UDI Handler → Inline Emulation → return to guest
  Guest OS: hypercalls raise a software interrupt
  Hypervisor: SWI Handler → Hyper Call Handler → reschedule? (No / Yes: context switch)

Page 28: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Design of Inline Emulation: The Main Handler

[Figure: main handler flow, with two outcomes]
  A handler for the instruction is found
  No handler for the instruction is found

Page 29: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

The Issue of Finding an Inline Emulation Handler

It is hard to find a simple hash function, because the encoding of ARM instructions is complicated.

Instead, we can construct an efficient search table, because only a few instructions are used frequently.

Instruction                  Ratio (%)
mcr p15, 0, Rd, c3, c0, 0    58.44
mcr p15, 0, Rd, c7, c14, 1   39.73
mcr p15, 0, Rd, c8, c5, 1    0.49
mcr p15, 0, Rd, c8, c6, 1    0.49
mcr p15, 0, Rd, c7, c10, 4   0.24
mcr p15, 0, Rd, c2, c0, 0    0.23
mcr p15, 0, Rd, c7, c5, 0    0.11
mcr p15, 0, Rd, c8, c5, 0    0.08
mcr p15, 0, Rd, c8, c6, 0    0.08
mrc p15, 0, Rd, c7, c14, 3   0.11
Others                       <0.01

Page 30: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Example of Mto1 Search Table

Encoding of MCR instruction
Syntax: MCR{cond} cp, op1, Rd, CRn, CRm, op2

Bit layout (bit 31 down to bit 0):
  cond | 1110 | op1 | 0 | CRn | Rd | cp | op2 | 1 | CRm

mask        value       handler       Set
0x0F1F0F10  0x0E130F10  handler_CR3   MCR 15, op1, Rd, c3, CRm, op2
0x0F1C0F10  0x0E100F10  handler_CR02  MCR 15, op1, Rd, {c0 - c2}, CRm, op2
0x0F100F10  0x0E100F10  handler_CRX   MCR 15, op1, Rd, {c4 - c15}, CRm, op2
……
0x00000000  0x00000000  0x00000000    End of Table

* An entry E is matched if
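
A mask/value table of this kind is usually searched with a first-match rule. The sketch below assumes the rule is `(instruction & mask) == value`, which the truncated line above does not confirm. The names `mcr_word`, `SEARCH_TABLE` and `find_handler` are hypothetical, and the value fields are derived from the bit layout on this slide rather than copied from the table: under the standard ARM encoding bit 20 is 0 for MCR and 1 for MRC, so the printed constants (bit 20 set) would appear to select MRC words instead.

```python
def mcr_word(crn, crm, op2, rd=0, op1=0, cp=15, cond=0xE):
    """Assemble an MCR word from the slide's bit layout (bit 20 = 0, bit 4 = 1)."""
    return ((cond << 28) | (0b1110 << 24) | (op1 << 21) | (crn << 16)
            | (rd << 12) | (cp << 8) | (op2 << 5) | (1 << 4) | crm)


# Bits fixed in every entry: bits 27..24 = 1110, bit 20 = 0, cp = 15, bit 4 = 1.
BASE_MASK = 0x0F100F10
BASE_VALUE = 0x0E000F10

# (mask, value, handler name); the first match wins, so narrow entries come first.
SEARCH_TABLE = [
    (BASE_MASK | 0x000F0000, BASE_VALUE | 0x00030000, "handler_CR3"),  # CRn == c3
    (BASE_MASK | 0x000C0000, BASE_VALUE, "handler_CR02"),  # CRn in c0..c3, c3 already taken
    (BASE_MASK, BASE_VALUE, "handler_CRX"),                # any remaining CRn
]


def find_handler(word):
    """Return the first handler whose masked bits equal the entry's value."""
    for mask, value, handler in SEARCH_TABLE:
        if (word & mask) == value:
            return handler
    return None  # no handler found for this instruction


print(hex(mcr_word(3, 0, 0)), find_handler(mcr_word(3, 0, 0)))
print(find_handler(mcr_word(2, 0, 0)))
print(find_handler(mcr_word(8, 6, 1)))
```

Because several instruction encodings map to one handler, a handful of entries covers the whole MCR p15 space, and the most frequent instruction is resolved on the first comparison.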


Page 31: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Design of Inline Emulation: Dynamic Inline Emulation (DIE) Handler

[Figure: DIE handler code, annotated with]
  Self-modifying
  inlining the instruction
  flushing caches

Page 32: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Design of Inline Emulation: Static Inline Emulation (SIE) Handler

[Figure: SIE handler code, annotated with]
  /* data synchronization barrier */
  executing the hard-coded instructions
  restoring user context & PC

Page 33: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Evaluation and Analysis: The Experiment Environment

Emulator     Android emulator (ARMv5)
Memory       12MB for the hypervisor
             32MB for the guest OS
Hypervisor   Xen 4.0.1 for ARMv5
Guest OS     Linux 2.6.29-Goldfish
Compilation  Using GCC with debug (-g) flag

Page 34: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Evaluation and Analysis: The Distribution of Emulated Instructions

Instruction  CRn, CRm, op2  Ratio (%)
MCR          c3, c0, 0      58.44
             c7, c14, 1     39.73
             c8, c5, 1      0.49
             c8, c6, 1      0.49
             c7, c10, 4     0.24
             c2, c0, 0      0.23
             c7, c5, 0      0.11
             c8, c5, 0      0.08
             c8, c6, 0      0.08
MRC          c7, c14, 3     0.11
Others                      <0.01

The first two rows account for more than 98%.
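
The skew in this distribution is what makes a linear search table cheap. As a rough sketch, assume a hypothetical table with one entry per listed instruction, in the order shown (the real table maps many instructions to one entry, so this is an upper-bound style estimate, and the variable names are illustrative only):

```python
# Ratios (%) in table order; "Others" (<0.01) is ignored.
ratios = [58.44, 39.73, 0.49, 0.49, 0.24, 0.23, 0.11, 0.08, 0.08, 0.11]

# Share of emulated instructions covered by the first two entries.
top_two = ratios[0] + ratios[1]

# Average number of table entries compared before a match is found,
# weighting each 1-based table position by its frequency.
expected_probes = sum(pos * r for pos, r in enumerate(ratios, start=1)) / sum(ratios)

print(f"top two entries cover {top_two:.2f}%")
print(f"expected comparisons per lookup: {expected_probes:.2f}")
```

Under these assumptions a lookup needs fewer than two comparisons on average, which is why a simple ordered table can stand in for the hash function that the ARM encoding makes hard to design.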


Page 35: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Evaluation and Analysis: The Micro-Level Analysis (1/2)

Operation: invalidating TLB. Columns give instructions executed per mode.
PV: paravirtualization (hypercall); IE: inline emulation.

Operation                          USER   UND    SWI     Total   Improvement PV/IE (%)
A single entry (DIE handler)  PV   13.00  0.00   305.97  318.97  613.39
                              IE   3.00   49.00  0.00    52.00
The entire TLB (SIE handler)  PV   11.00  0.00   305.80  316.80  704.01
                              IE   3.00   42.00  0.00    45.00

Page 36: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Evaluation and Analysis: The Micro-Level Analysis (2/2)

Columns give instructions executed per mode.

Instruction                                     USER   UND    SWI     Total   Improvement PV/IE (%)
MCR p15, 0, Rd, c3, c0, 0 (DIE handler)    PV   9.00   0.00   203.29  212.29  424.57
                                           IE   3.00   47.00  0.00    50.00
MCR p15, 0, Rd, c7, c14, 1 (DIE handler)   PV   13.00  0.00   304.50  317.50  566.94
                                           IE   3.00   53.00  0.00    56.00

Inline emulation achieves at least 4.24x the performance of hypercalls in most cases (about 98% of emulated instructions).
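
The 4.24x figure follows from the Total column: the improvement is the paravirtualized instruction count divided by the inline-emulation count. A small sketch of that arithmetic (variable and function names are illustrative; because the table's inputs are rounded, the recomputed ratios can differ from the printed percentages in the last digit):

```python
# (PV total, IE total) instruction counts from the two micro-level rows above.
totals = {
    "MCR p15, 0, Rd, c3, c0, 0": (212.29, 50.00),
    "MCR p15, 0, Rd, c7, c14, 1": (317.50, 56.00),
}


def speedup(pv_total, ie_total):
    """Improvement factor of inline emulation over a hypercall (PV / IE)."""
    return pv_total / ie_total


for name, (pv, ie) in totals.items():
    print(f"{name}: {speedup(pv, ie):.2f}x")

worst_case = min(speedup(pv, ie) for pv, ie in totals.values())
```

The smaller of the two ratios belongs to the most frequent instruction, so it sets the "at least" bound.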

Page 37: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Evaluation and Analysis: The Macro-Level Analysis

                                    Data Processing  Data Transfer  Branch  Software Interrupt  Coprocessor and Other  Total
Paravirtualization (instructions)   89.22M           91.28M         27.08M  48560               4.79M                  212.42M
Inline Emulation (instructions)     89.04M           90.66M         26.93M  33658               4.93M                  211.59M
(PV - IE) / PV (%)                  0.20             0.68           0.53    30.69               -2.72                  0.39
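
The last row is the relative reduction in executed instructions. A minimal sketch of the formula, applied to the two columns whose inputs are precise enough to reproduce (the function name is illustrative; columns given in rounded millions, such as "Coprocessor and Other", do not recompute exactly from the rounded figures):

```python
def reduction_percent(pv, ie):
    """Relative reduction in instruction count: (PV - IE) / PV, in percent."""
    return (pv - ie) / pv * 100


swi = reduction_percent(48560, 33658)          # software interrupts
total = reduction_percent(212.42e6, 211.59e6)  # all instructions

print(f"software interrupt: {swi:.2f}%")
print(f"total: {total:.2f}%")
```

The large drop in software interrupts barely moves the total because software interrupts are a tiny fraction of all executed instructions.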


Page 38: 薛智文 cwhsueh@csie.ntu.tw csie.ntu.tw/~cwhsueh

Conclusions

Inline emulation:
  Reduces the effort needed to port guest operating systems
  Speeds up the handling of sensitive instructions (4-7x)
  Improves overall system performance (by 0.39%)

Future work:
  Optimization for memory virtualization
  A much higher overall speedup is possible.