Power-Saving in Large-Scale Storage Systems with Data Migration
Koji Hasebe, Tatsuya Niwa, Akiyoshi Sugiki, and Kazuhiko Kato (University of Tsukuba, Japan)
Background
IT systems consume 1-2% of the total energy in the world. (Green IT: A New Industry Shock Wave, Gartner Symp/ITxpo, 2007)
In large data centers, storage systems consume up to 40% of the total power. (Greg Schulz, StorageIO)
Power-saving in storage systems is a central issue.
Previous Studies
[Figure: daily workload curve with peak and off-peak time; during off-peak time, part of the disks can switch to low-power mode]
In the literature: MAID [Colarelli-Grunwald '02], PDC [Pinheiro-Bianchini '04], DIV [Pinheiro et al. '06], Pergamum [Storer et al. '08], RIMAC [Yao-Wang '06], eRAID [Wang-Zhu-Li '08], Hibernator [Zhu et al. '05], PARAID [Weddle et al. '07], etc.
Commonly observed technique: skew the workload toward a subset of disks so that the others can switch to low-power mode.
Previous Studies
Limitations:
• Central controller to manage data accesses
• Relatively small number of disks (up to several dozen)
Harnik et al. [IPDPS'09] propose an efficient allocation of replicated data.
Motivation and Objective
• Apply the skewing technique to large-scale storage systems.
• Explore an efficient technique based on data migration, instead of the replication approach.
Central Idea (1): Underlying System
[Figure: 12 physical nodes P1-P12 partitioned into Blocks 1-4; consecutive blocks form parent-child pairs]
Assume that 3 physical nodes are required at off-peak time; the demand may increase up to four-fold at peak time.
Central Idea (1): Underlying System
[Figure: virtual nodes V1-V9 are mapped onto the physical nodes P1-P12 of Blocks 1-4]
The mapping is managed by a distributed hash table (DHT); cf. Chord [Stoica et al. '01].
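The Chord-style DHT mapping of virtual nodes to physical nodes can be sketched with consistent hashing. This is a minimal illustration, not the paper's implementation; the class and key names are invented for the example:

```python
import hashlib
from bisect import bisect_right

def ring_hash(key: str) -> int:
    """Position on the identifier ring (Chord-style, 160-bit SHA-1 space)."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16)

class VirtualNodeRing:
    """Each physical node hosts several virtual nodes; a data item is
    assigned to the first virtual node clockwise from the item's hash."""
    def __init__(self, physical_nodes, vnodes_per_physical=3):
        self.ring = sorted(
            (ring_hash(f"{p}#v{i}"), p)
            for p in physical_nodes
            for i in range(vnodes_per_physical)
        )

    def lookup(self, item_key: str) -> str:
        pos = ring_hash(item_key)
        # First virtual node clockwise from pos, wrapping around the ring.
        idx = bisect_right(self.ring, (pos, chr(0x10FFFF)))
        return self.ring[idx % len(self.ring)][1]

ring = VirtualNodeRing(["P1", "P2", "P3"])
owner = ring.lookup("block1/item7")  # deterministic owner among P1-P3
```

Because each physical node appears at several ring positions (one per virtual node), load spreads roughly evenly, and migrating a single virtual node remaps only its own slice of the keys.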
Central Idea (1): Underlying System
[Figure: in Block 1, each physical node hosts three virtual nodes (P1: V1-V3, P2: V4-V6, P3: V7-V9); the data items 1-9 are stored in every block, permuted differently across the blocks]
Central Idea (2): Migration of Virtual Nodes
[Figure sequence: when Block 1 (P1-P3) becomes overloaded, a virtual node (e.g., V1) is divided into two halves, V1 1/2 and V1 2/2, and one half is migrated to the child block (P4-P6); the same split is repeated for V4, V7, and eventually for all of V1-V9, so that the parent and child blocks share the load]
[Figure: the same parent-child migration applies between every pair of consecutive blocks among P1-P12 (Blocks 1-4)]
Power-Saving Algorithms
• Short-term optimization: Extension and Reduction
• Long-term optimization
Power-Saving Algorithm 1: Short-term Optimization (Extension)
Procedure:
1. Each physical node checks its own workload.
2. If the workload exceeds its capacity, one of its virtual nodes is split and one half is migrated to the child block.
[Figure: halves of V1, V4, and V7 migrate from the parent nodes P1-P3 to the child nodes P4-P6]
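The two-step extension procedure can be sketched with a toy in-memory model. The node names, the capacity threshold, and the load numbers are illustrative; the real system migrates the actual stored data, while here only load figures move:

```python
def extend(physical_nodes, capacity):
    """Short-term extension: each node checks its workload; while it
    exceeds capacity, the heaviest virtual node is split in half and
    one half migrates to the child block. `physical_nodes` maps a node
    to its child and its virtual nodes' loads."""
    migrations = []
    for name, node in physical_nodes.items():
        while sum(node["vnodes"].values()) > capacity:       # Step 1: check workload
            vname, load = max(node["vnodes"].items(), key=lambda kv: kv[1])
            node["vnodes"][vname] = load / 2                 # Step 2: split in half...
            child = physical_nodes[node["child"]]
            child["vnodes"][vname + "'"] = load / 2          # ...and migrate one half
            migrations.append((vname, name, node["child"]))
    return migrations

nodes = {
    "P1": {"child": "P4", "vnodes": {"V1": 0.8, "V2": 0.5}},
    "P4": {"child": None, "vnodes": {}},
}
moves = extend(nodes, capacity=1.0)  # splits V1: half stays on P1, half goes to P4
```

After the call, P1 carries 0.4 + 0.5 = 0.9 of load, below the capacity of 1.0, and the other half of V1 sits on the child node P4.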
Power-Saving Algorithm 1: Short-term Optimization (Extension)
Notes:
• Reusing the data stored on the previous day enables migration by copying only the difference.
• The mapping of virtual nodes effectively skews the workload.
Power-Saving Algorithm 2: Short-term Optimization (Reduction)
Problem: when the workload decreases, virtual nodes should be merged back, but the migration targets must be chosen within the remaining capacity of the physical nodes.
[Figure: parent nodes P1-P3 and child nodes P4-P6; the remaining capacities of the physical nodes are (1), (1), and (2), and the workload of each virtual node = 1]
Power-Saving Algorithm 2: Short-term Optimization (Reduction)
Wrong migration:
[Figure: a greedy choice of targets exhausts the remaining capacities, shown as (0), (0), before all virtual nodes are absorbed, so some halves are left with no destination]
Power-Saving Algorithm 2: Short-term Optimization (Reduction)
The solution:
[Figure: with properly chosen targets, all virtual nodes are absorbed and the remaining capacities reach (0), (0), (0)]
Power-Saving Algorithm 2: Short-term Optimization (Reduction)
Procedure:
1. C → P: the information about the workloads of every virtual node.
2. P lists all possible combinations of subsets of physical nodes such that P can absorb their virtual nodes.
3. P → C: the result of Step 2.
4. C calculates the intersection over all possible combinations of the results.
[Figure: the candidate lists are P1: {P4, P5}, {P4, P6}; P2: {P4, P5}, {P5, P6}; P3: {P4, P5}; their intersections yield the candidates {P4, P5}, {P5}, {P4}, of which the largest, {P4, P5}, is the solution, so P4 and P5 can hand over their virtual nodes and power down]
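Steps 2 and 4 can be sketched as a small set computation. This is a simplified brute-force model: the capacities and the unit loads are illustrative, and function names are invented for the example:

```python
from itertools import combinations, product

def absorbable_sets(parent_capacity, child_loads):
    """Step 2: subsets of child nodes whose total virtual-node workload
    fits into this parent's remaining capacity. `child_loads` maps a
    child node to the total workload of its virtual nodes."""
    result = []
    for r in range(1, len(child_loads) + 1):
        for subset in combinations(sorted(child_loads), r):
            if sum(child_loads[c] for c in subset) <= parent_capacity:
                result.append(frozenset(subset))
    return result

def reduction_candidates(per_parent_sets):
    """Step 4: intersect one candidate set per parent; the child nodes in
    the largest common intersection can hand over their virtual nodes
    and power down."""
    candidates = set()
    for choice in product(*per_parent_sets):
        common = frozenset.intersection(*choice)
        if common:
            candidates.add(common)
    return max(candidates, key=len) if candidates else frozenset()

# Illustrative numbers: three parents with spare capacity 2 each,
# child workloads P4 = 1, P5 = 1, P6 = 2.
child_loads = {"P4": 1, "P5": 1, "P6": 2}
per_parent = [absorbable_sets(cap, child_loads) for cap in (2, 2, 2)]
best = reduction_candidates(per_parent)  # the children P4 and P5 can power down
```

In the slides' example the per-parent candidate lists differ (e.g., P1 returns {P4, P5} and {P4, P6}), but the intersection step is the same.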
Power-Saving Algorithm 3: Long-term Optimization
To maintain effective power-saving, load-balancing is required within each block.
Example:
[Figure sequence: virtual nodes V1-V9 with workloads (1)-(9) are initially assigned so that the heavy nodes concentrate on a few physical nodes; reassigning them (e.g., P1: V1, V5, V9) balances the load across P1-P3]
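The reassignment in this example can be sketched as a greedy longest-first balancing pass. This illustrates the load-balancing goal, not necessarily the paper's exact algorithm:

```python
import heapq

def balance(vnode_loads, physical_nodes):
    """Assign virtual nodes to physical nodes so per-node totals come out
    even: sort virtual nodes by descending load and always place the
    next one on the currently lightest physical node (greedy LPT)."""
    heap = [(0, p, []) for p in physical_nodes]   # (total load, node, vnodes)
    heapq.heapify(heap)
    for vname, load in sorted(vnode_loads.items(), key=lambda kv: -kv[1]):
        total, p, assigned = heapq.heappop(heap)  # lightest node so far
        assigned.append(vname)
        heapq.heappush(heap, (total + load, p, assigned))
    return {p: (total, assigned) for total, p, assigned in heap}

# Workloads (1)-(9) as in the example figure.
loads = {f"V{i}": i for i in range(1, 10)}
plan = balance(loads, ["P1", "P2", "P3"])  # per-node totals: 14-16 out of 45
```

Compared with an assignment that stacks the heavy virtual nodes on one physical node, the greedy pass keeps every node's total within a small band, which is what lets the block run near its capacity without overload.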
Evaluation (Simulation)
Purposes:
• Evaluate the efficiency of skewing the workload.
• Evaluate the validity of the long-term optimization.
Simulation environment:
• Number of physical nodes: 800
• Number of virtual nodes: 10,000
• Term of simulation: 1 day
• Migration condition: split when load exceeds 90%; merge when load falls below 70%
• Workload of all virtual nodes: initially at its lowest, increasing until the middle of the day; the gap was sixfold
• Virtual node groups: the gap of the loads between groups is twofold
Simulation Results (Average Load)
• Without long-term optimization: 57-69%
• With long-term optimization: 67-74%
The long-term optimization algorithm improves the average load as expected: physical nodes run effectively, coping with the daily variation of the workload.
[Figure: average load of active physical nodes (%) vs. time (hour)]
Simulation Results (Active Nodes)
• Long-term optimization reduces the number of active nodes by 7-14% on average, and by 17-39% at best.
Optimization improves the power consumption consistently and continually.
[Figure: number of active physical nodes vs. time (hour)]
Evaluation (Prototype Implementation)
Purposes:
• Verify the efficiency of load skewing on real machines.
• Verify whether the response time stays below the desired time.
Response time: measured from sending a request until the data are loaded into memory on the server.
Experiment environment:
• Number of physical nodes: 40 (two Xeon 3.60 GHz CPUs, about 2 GB of memory, and a 36 GB SCSI HDD each)
• Number of files: 60,000 x 1 MB (60 GB in total)
• Term of experiment: 1 day
• Migration condition: split when load is over 90%; merge when load is under 70%
• Workload of all virtual nodes: initially at its lowest, increasing until the middle of the day; the gap is sixfold
• Virtual node groups: a twofold gap between the two groups
• Amount of each migration: 10% of all the data
Response Time
• Average response time: 80 ms
• Maximum response time: 534 ms
Our algorithms keep the response time almost always below the desired value.
[Figure: response time per request (ms) vs. time (hour)]
Average Load
• Overall average load: 67% of the capacity
The prototype also skews the workload effectively, as in the simulation.
[Figure: load of active physical nodes (%) vs. time (hour)]
Number of Active Physical Nodes
• Migration involved 0.14 virtual nodes on average and 20 virtual nodes at maximum.
Our system adjusts the number of physical nodes to the variation of the workload and reduces power effectively.
[Figure: number of active physical nodes and number of migrations vs. time (hour)]
Conclusions
• A power-saving method for large-scale distributed storage systems.
• Short-term and long-term optimization algorithms for reducing power consumption.
• Performance evaluation:
  - Simulation results showed that our method kept the average workload at 67-74%.
  - Prototype implementation results showed an overall average load of 67% while maintaining a preferred response time.
Future work:
• Implement a replication mechanism to improve reliability.
• Improve the long-term optimization algorithm.