A Secure System-wide Process Scheduling across Virtual Machines
Hidekazu Tadokoro (Tokyo Institute of Technology)
Kenichi Kourai (Kyushu Institute of Technology)
Shigeru Chiba (Tokyo Institute of Technology)

Scheduling Problem across VMs
- Server consolidation using virtual machines (VMs)
  - To improve resource utilization
- VMs make it difficult to execute processes as administrators intend
  - Guest OSes schedule only their own processes
  - A low-priority process in one VM may interfere with a high-priority process in another VM
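
[Figure: two VMs, each with its own guest OS, run on the VMM above the hardware; one VM runs an indexing process and the other a web server]
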
System-wide Process Scheduler
- Necessary for scheduling processes across VMs
- It can suppress the execution of less important processes
  - Because it knows which processes are important among all VMs
  - E.g., it can run the file indexing process only when the whole system is idle

[Figure: a system-wide scheduler in the VMM checks that the VMs are idle and then runs the indexing process]

Issue: Difficult to Implement
- Implementing a system-wide process scheduler in the VMM is unsuitable
  - The VMM cannot recognize processes
  - Processes are an abstraction of OSes
- Passing process information to the VMM requires modification of guest OSes
  - Modification of guest OSes is often unacceptable

[Figure: semantic gap; the VMM cannot tell which process is running in each VM]

Prior approaches that pass process information from guest OSes:
1) Guest-aware VM scheduling [Euro-Par’08 Kim et al.]
2) Task grain scheduling [HPCC’08 Kinebuchi et al.]

Issue: Vulnerable to a DoS Attack
- A process in a compromised VM can prevent processes in other VMs from running through the scheduler
  - E.g., a busy-loop process can easily stop the file indexing process in another VM
  - The indexing is configured to run at idle time

[Figure: a malicious loop in one VM keeps the VMs from being idle, so the system-wide scheduler never runs the indexing process in the other VM]

Monarch Scheduler
- A system-wide process scheduler in the VMM
  - Manipulates internal data in guest OSes for process scheduling
  - Recognizes processes
- Hybrid scheduling to mitigate a DoS attack
  - Periodically switches between system-wide process scheduling and original scheduling

[Figure: the Monarch scheduler in the VMM changes the scheduling of the indexing and web server processes inside the VMs]

Process Scheduling by the VMM
- The VMM monitors and manipulates the run queue and the process structure in guest OSes
- Suspending a process
  - Remove it from the run queue
  - Rewrite its state so that it appears to have stopped spontaneously
- Resuming a process
  - Insert it into a run queue

[Figure: the Monarch scheduler modifies the memory of a VM to change a process and the run queue]

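The suspend and resume steps above can be sketched on a toy run queue. This is only an illustration under stated assumptions: `toy_task`, `toy_rq`, the state constants, and the function names are hypothetical and do not match the real Linux 2.6 structures, and a real VMM would apply these writes to guest memory rather than to its own pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical process states; real Linux uses different constants. */
enum toy_state { TOY_RUNNABLE, TOY_SLEEPING };

/* Toy stand-in for a guest process structure. */
struct toy_task {
    int pid;
    enum toy_state state;
    struct toy_task *prev, *next;   /* run-queue links, NULL when dequeued */
};

/* Toy run queue: circular doubly linked list with a sentinel node. */
struct toy_rq {
    struct toy_task head;
    unsigned nr_running;
};

void rq_init(struct toy_rq *rq)
{
    rq->head.prev = rq->head.next = &rq->head;
    rq->nr_running = 0;
}

/* Resume: mark the task runnable and append it to the run queue. */
void monarch_resume(struct toy_rq *rq, struct toy_task *t)
{
    if (t->next != NULL)
        return;                     /* already on the run queue */
    t->state = TOY_RUNNABLE;
    t->prev = rq->head.prev;
    t->next = &rq->head;
    rq->head.prev->next = t;
    rq->head.prev = t;
    rq->nr_running++;
}

/* Suspend: unlink the task and rewrite its state so that the guest
 * sees a process that went to sleep on its own. */
void monarch_suspend(struct toy_rq *rq, struct toy_task *t)
{
    if (t->next == NULL)
        return;                     /* not on the run queue */
    assert(rq->nr_running > 0);
    t->prev->next = t->next;
    t->next->prev = t->prev;
    t->prev = t->next = NULL;
    t->state = TOY_SLEEPING;
    rq->nr_running--;
}
```

Both operations are idempotent in this sketch, so calling them repeatedly from a periodic scheduler does not corrupt the list.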
Hybrid Scheduling
- To guarantee some CPU time to every process
- Periodically switches between two modes
  - Controlled mode: performs system-wide scheduling
  - Autonomous mode: stops system-wide scheduling
    - The VMM and guest OSes perform their own original scheduling

[Figure: the Monarch scheduler switches between controlled mode, in which it stops a process, and autonomous mode, in which processes run freely; the VMs run a malicious loop and the indexing process]

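The periodic switch between the two modes can be sketched as a function of time. This is a minimal sketch, assuming a fixed cycle length and a configurable share of autonomous mode; `hybrid_cfg`, `period_ms`, `autonomous_pct`, and the choice to place the controlled phase first in each cycle are all assumptions made for illustration.

```c
#include <assert.h>

enum monarch_mode { MODE_CONTROLLED, MODE_AUTONOMOUS };

/* Hypothetical configuration of the hybrid scheduler. */
struct hybrid_cfg {
    unsigned long period_ms;     /* assumed length of one switching cycle */
    unsigned autonomous_pct;     /* share of each cycle in autonomous mode, 0..100 */
};

/* Return the mode that is active at time now_ms.
 * Each cycle starts in controlled mode and ends in autonomous mode. */
enum monarch_mode mode_at(const struct hybrid_cfg *cfg, unsigned long now_ms)
{
    assert(cfg->period_ms > 0 && cfg->autonomous_pct <= 100);
    unsigned long controlled_ms =
        cfg->period_ms * (100 - cfg->autonomous_pct) / 100;
    unsigned long phase = now_ms % cfg->period_ms;
    return phase < controlled_ms ? MODE_CONTROLLED : MODE_AUTONOMOUS;
}
```

With a 1000 ms cycle and a 20% autonomous share, the first 800 ms of every cycle are controlled and the last 200 ms let every process, including a suppressed one, run under the original schedulers.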
Implementation
- Implemented in Xen 3.4.2
  - The supported guest OS is Linux 2.6 (x86_64)
- The scheduler is invoked by timer interrupts in the VMM
  - Pause a DomainU
    - To prevent conflicts between the Monarch scheduler and the guest OS
  - Get the CPU time of each process
  - Schedule when in the controlled mode

[Figure: on a timer interrupt in Xen, the Monarch scheduler schedules the processes in the run queue of a DomainU]

Accessing Kernel Data
- The Monarch scheduler accesses the internal data of guest OSes based on information about the guest kernel
  - Obtain debug information from the kernel image in advance
  - Translate virtual addresses of a DomainU into machine addresses of the VMM at run time
    - Page tables of guest OSes
    - P2M tables
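
[Figure: a virtual address in DomU is translated through the guest page table and the P2M table into machine memory in the Xen VMM; the kernel image supplies the debug information]
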
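The two-step translation named above (guest page table, then P2M table) can be illustrated with a deliberately simplified model. The sketch assumes a single-level page table that maps virtual page numbers to pseudo-physical frame numbers; real x86_64 page tables are multi-level, and all names here (`toy_guest`, `toy_translate`, `TOY_INVALID_FRAME`) are hypothetical rather than Xen APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT    12
#define TOY_PAGE_MASK     ((UINT64_C(1) << TOY_PAGE_SHIFT) - 1)
#define TOY_INVALID_FRAME UINT64_MAX

/* Simplified view of one guest (assumption: single-level tables). */
struct toy_guest {
    const uint64_t *page_table;  /* virtual page number -> pseudo-physical frame */
    size_t nr_vpages;
    const uint64_t *p2m;         /* pseudo-physical frame -> machine frame */
    size_t nr_pfns;
};

/* Translate guest virtual address va into a machine address.
 * Returns 0 on success and -1 if either lookup has no mapping. */
int toy_translate(const struct toy_guest *g, uint64_t va, uint64_t *ma)
{
    assert(g != NULL && ma != NULL);
    uint64_t vpn = va >> TOY_PAGE_SHIFT;
    if (vpn >= g->nr_vpages)
        return -1;
    uint64_t pfn = g->page_table[vpn];            /* step 1: guest page table */
    if (pfn == TOY_INVALID_FRAME || pfn >= g->nr_pfns)
        return -1;
    uint64_t mfn = g->p2m[pfn];                   /* step 2: P2M table */
    if (mfn == TOY_INVALID_FRAME)
        return -1;
    *ma = (mfn << TOY_PAGE_SHIFT) | (va & TOY_PAGE_MASK);
    return 0;
}
```

The page offset is carried over unchanged; only the frame number is rewritten at each step.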
Finding Process Structures
- The Monarch scheduler traverses the process list
  - Every process structure is linked to the list
- The starting point is init_task
  - The address of init_task is invariant in each kernel image

[Figure: the process list in the Linux kernel, starting from init_task]

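The traversal from init_task can be sketched with a circular list embedded in each process structure, which is how Linux links its tasks. The structures below are toy stand-ins (`toy_task`, `toy_list`, `collect_pids` are hypothetical), and a real VMM would have to translate every `next` pointer from a guest virtual address before following it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy version of an embedded circular list link. */
struct toy_list { struct toy_list *next, *prev; };

/* Toy process structure; the real one has many more fields. */
struct toy_task {
    int pid;
    struct toy_list tasks;   /* link in the global process list */
};

/* Recover the enclosing toy_task from a pointer to its list link. */
#define TOY_TASK_OF(ptr) \
    ((const struct toy_task *)((const char *)(ptr) - offsetof(struct toy_task, tasks)))

/* Walk the circular list once, starting at init_task.
 * Stores up to cap PIDs and returns the number of processes seen. */
size_t collect_pids(const struct toy_task *init_task, int *pids, size_t cap)
{
    assert(init_task != NULL);
    size_t n = 0;
    const struct toy_list *pos = &init_task->tasks;
    do {
        if (n < cap)
            pids[n] = TOY_TASK_OF(pos)->pid;
        n++;
        pos = pos->next;
    } while (pos != &init_task->tasks);   /* back at the start: done */
    return n;
}
```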
Finding Run Queues
- The Monarch scheduler finds a run queue for each v-CPU
  - The address is unknown until the guest OS boots
  - The number of v-CPUs is not determined until boot
- The starting point is the GS register of each v-CPU
  - GS points to x8664_pda, which contains a pointer to a run queue

struct x8664_pda {
    task_t* current;
    ulong data_offset;
    …
};

[Figure: the GS register points to x8664_pda in Linux memory; the run queue is located at data_offset + PER_CPU_RUNQUEUES]

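The address calculation shown on the slide, data_offset + PER_CPU_RUNQUEUES, can be written out per v-CPU. This sketch assumes that the PDA of each v-CPU has already been read through its GS register and that PER_CPU_RUNQUEUES is an address taken from the kernel's debug information; `toy_pda` and `find_run_queues` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy copy of the two x8664_pda fields shown on the slide,
 * as they would be read out of guest memory. */
struct toy_pda {
    uint64_t current;       /* guest address of the running task */
    uint64_t data_offset;   /* per-CPU data offset of this v-CPU */
};

/* pdas[i] is the PDA found through the GS register of v-CPU i.
 * per_cpu_runqueues is the value the slide calls PER_CPU_RUNQUEUES
 * (assumed here to come from debug information).
 * Writes the guest address of each v-CPU's run queue to rq_addrs[i]. */
void find_run_queues(const struct toy_pda *pdas, size_t nr_vcpus,
                     uint64_t per_cpu_runqueues, uint64_t *rq_addrs)
{
    assert(pdas != NULL && rq_addrs != NULL);
    for (size_t i = 0; i < nr_vcpus; i++)
        rq_addrs[i] = pdas[i].data_offset + per_cpu_runqueues;
}
```

Because the offsets differ per v-CPU, the same symbol address yields a distinct run queue for each one, which is why the lookup has to wait until the guest has booted.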
Guaranteeing Consistency
- The Monarch scheduler checks the lock of the data structure
  - To guarantee that the guest is not accessing the data whenever the Monarch scheduler accesses it
- Acquiring the lock is not needed
  - The domain is paused

schedule() {
    spin_lock(runqueue);
    /* run queue operation */
    spin_unlock(runqueue);
}

[Figure: the scheduler of the Linux OS takes and releases the spin lock on the run queue; the Monarch scheduler only checks the lock]

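The lock check can be sketched as a read-only test before any manipulation. The slides do not say what happens when the lock is found held; this sketch assumes the operation is simply deferred to a later invocation. The lock encoding (nonzero means held) and the names `toy_spinlock`, `toy_rq`, and `monarch_try_dequeue` are hypothetical, and the real spinlock_t layout depends on the kernel version.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed encoding: nonzero means the guest holds the lock. */
struct toy_spinlock { unsigned int locked; };

struct toy_rq {
    struct toy_spinlock lock;
    unsigned int nr_running;
};

/* The domain is paused, so the lock state cannot change while we look.
 * A free lock means the guest was not in the middle of an update. */
bool rq_is_consistent(const struct toy_rq *rq)
{
    assert(rq != NULL);
    return rq->lock.locked == 0;
}

/* Apply a (toy) dequeue only when the run queue is consistent.
 * Returns true if applied, false if it has to be retried later. */
bool monarch_try_dequeue(struct toy_rq *rq)
{
    if (!rq_is_consistent(rq) || rq->nr_running == 0)
        return false;
    rq->nr_running--;     /* stands in for the real run-queue update */
    return true;
}
```

The scheduler never writes to the lock itself: pausing the domain already excludes the guest, so checking is enough.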
Monitoring Process Time
- The Monarch scheduler records the execution time of each process
  - It tracks the switches of virtual address spaces
    - By trapping modifications of the CR3 register
  - It binds virtual address spaces to processes
    - By using process information in guest OSes
- Time recorded by guest OSes is inaccurate

[Figure: the Monarch scheduler tracks changes of CR3 and binds each CR3 value to a process]

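Time accounting by CR3 switches can be sketched as follows: on every trapped CR3 write, the time elapsed since the previous write is charged to the address space that was running. This is a minimal sketch with a small fixed-size table; `cr3_tracker`, `on_cr3_write`, and `time_of` are hypothetical names, and binding a CR3 value to a guest process is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SPACES 8   /* toy limit on tracked address spaces */

struct space_time { uint64_t cr3; uint64_t ns; };

struct cr3_tracker {
    struct space_time spaces[MAX_SPACES];
    size_t nr_spaces;
    uint64_t cur_cr3;    /* address space currently running */
    uint64_t last_ns;    /* time of the last CR3 write */
    int started;
};

/* Find the entry for cr3, creating it if there is room. */
static struct space_time *lookup(struct cr3_tracker *t, uint64_t cr3)
{
    for (size_t i = 0; i < t->nr_spaces; i++)
        if (t->spaces[i].cr3 == cr3)
            return &t->spaces[i];
    if (t->nr_spaces == MAX_SPACES)
        return NULL;
    t->spaces[t->nr_spaces].cr3 = cr3;
    t->spaces[t->nr_spaces].ns = 0;
    return &t->spaces[t->nr_spaces++];
}

/* Called from the (hypothetical) trap handler for CR3 writes. */
void on_cr3_write(struct cr3_tracker *t, uint64_t new_cr3, uint64_t now_ns)
{
    assert(t != NULL);
    if (t->started) {
        struct space_time *e = lookup(t, t->cur_cr3);
        if (e != NULL)
            e->ns += now_ns - t->last_ns;   /* charge the outgoing space */
    }
    t->cur_cr3 = new_cr3;
    t->last_ns = now_ns;
    t->started = 1;
}

/* Total time charged to the address space identified by cr3. */
uint64_t time_of(const struct cr3_tracker *t, uint64_t cr3)
{
    for (size_t i = 0; i < t->nr_spaces; i++)
        if (t->spaces[i].cr3 == cr3)
            return t->spaces[i].ns;
    return 0;
}
```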
Experiments
- Examining overheads
  - Scheduling overheads
  - Monitoring overheads
  - Performance degradation
- Examining the scheduling behavior
  - System-wide idle-time scheduling
  - Hybrid scheduling with the idle-time scheduling
- Examining the impact of updating the guest OS

Environment:
- Core 2 Duo 2.4 GHz, 6 GB memory
- Xen 3.4.2
- Dom0: Linux 2.6.18.8
- DomU: Linux 2.6.16.33 (1 GB)

Scheduling Overheads
- Time for traversing the process list
  - Change the number of processes in one VM
  - Change the number of VMs with a fixed number of processes
- Traversing time is negligible
  - 36 ns/process
  - 880 ns/VM

[Figure: execution time (usec) vs. total number of processes]
[Figure: execution time (usec) vs. total number of VMs]

Monitoring Overheads
- Time for recording the execution time of processes with CR3
- The total number of context switches per second
- Overhead is negligible

              Time to record        Number of context    Overhead (%)
              (us/context switch)   switches (/sec)
Boot time     0.26                  1467                 0.04
Steady state  0.20                  129                  0.003

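The Overhead column is consistent with multiplying the two measured columns: time per context switch times context switches per second gives the fraction of each second spent recording. That relation is inferred from the numbers rather than stated on the slide, and `monitoring_overhead_percent` is a hypothetical helper written only to check it.

```c
#include <assert.h>

/* Percentage of CPU time spent recording, given the cost of one
 * recording (microseconds) and the context-switch rate (per second). */
double monitoring_overhead_percent(double us_per_switch, double switches_per_sec)
{
    assert(us_per_switch >= 0.0 && switches_per_sec >= 0.0);
    double busy_us_per_sec = us_per_switch * switches_per_sec;
    return busy_us_per_sec / 1e6 * 100.0;   /* 1 second = 1e6 microseconds */
}
```

For boot time, 0.26 us x 1467 /sec is about 381 us per second, or roughly 0.04%; for the steady state, 0.20 us x 129 /sec is about 26 us per second, or roughly 0.003%.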
Performance Degradation
- Throughput and response time of lighttpd
  - Changing the scheduling interval
    - Only traversing the process list
  - Changing the number of processes
- Slightly degraded when the interval is 10 ms

[Figure: Throughput. Throughput (req/sec) vs. scheduling interval (msec) for 36 and 500 processes]
[Figure: Response time. Response time (msec) vs. scheduling interval (msec) for 36, 500, and 2000 processes]

System-wide Idle-time Scheduling
- Examining whether the Monarch scheduler correctly achieves the idle-time scheduling
  - Stop Hyper Estraier whenever lighttpd runs
- The Monarch scheduler achieved the policy
  - Hyper Estraier degrades lighttpd without the scheduling

[Figure: lighttpd and Hyper Estraier run in separate VMs (VM1, VM2) on the Xen VMM; Hyper Estraier runs only at idle time]
[Figure: CPU utilization (%) of Hyper Estraier and lighttpd vs. elapsed time (sec), without scheduler and with scheduler]

Hybrid Scheduling
- Examining the effectiveness of hybrid scheduling
  - Changing the ratio of the autonomous mode
- The indexing process was executed according to the ratio of the autonomous mode
  - A steep rise of CPU utilization when the ratio is more than 80%

[Figure: CPU utilization (%) vs. ratio of autonomous mode (%)]
[Figure: CPU utilization (%) of Hyper Estraier and lighttpd vs. elapsed time (sec)]

Impact of Updating the Guest OS
- How much the Monarch scheduler has to be modified when the Linux kernel is updated
  - Inspected 33 versions of the Linux kernel 2.6

Version   Change                                             Difficulty
2.6.14    Internal structure of spinlock_t                   Easy
2.6.18    runqueue is renamed to rq                          Easy
2.6.23    Process scheduler changed from O(1) to CFS         Hard but possible
2.6.30    The way to calculate the address of a run queue    Easy

Related Work
- Guest-aware VM scheduling [Euro-Par’08 Kim et al.]
  - Guest OSes notify the VMM of their highest priority
  - Modification of guest OSes is required
- Task grain scheduling [HPCC’08 Kinebuchi et al.]
  - Guest OSes notify L4 of the priorities of all processes
  - Not suitable for Xen due to frequent VM switches
- Task-aware VM scheduling [VEE’09 Kim et al.]
  - Using gray-box knowledge
  - Not for process scheduling

Conclusion
- Monarch scheduler
  - A secure system-wide process scheduler running in the VMM
    - Monitors the execution of processes
    - Changes the scheduling behavior of each guest OS
    - Provides hybrid scheduling to mitigate a DoS attack
- Future work
  - Completion of the support for Windows guest OSes