Performance and Energy Efficiency Evaluation of Big Data Systems
Presented by Yingjie Shi, Institute of Computing Technology, CAS
2013-10-31
BPOE 2013 | HPCChina 2013
Goals of Big Data Systems
• Larger
• Faster
• Greener
Performance vs. Energy Efficiency
Performance: Faster & More Powerful
• More servers
• Bigger clusters
• Powerful processors
• Sophisticated processing algorithms
• …
Energy Efficiency: Greener & Cheaper
• Lightweight servers
• Efficient processors
• Simpler processing algorithms
• …
Tradeoff
Evaluation
Evaluation of Performance & Energy Efficiency Tradeoff
• How to measure? AxPUE: Application Level Metrics for Power Usage Effectiveness in Data Centers
• How to get balance? The Implications from Benchmarking Three Big Data Systems
Motivation
If you cannot measure it, you cannot improve it. – Lord Kelvin
PUE (Power Usage Effectiveness): a measure of how efficiently a computer data center uses its power; specifically, how much of the power is actually used by the information technology equipment.
PUE & Its Variants
Metric | Year | Organization | Formula
PUE | 2007 | GreenGrid | Total Facility Energy / IT Equipment Energy
DCiE | 2008 | GreenGrid | (IT Equipment Energy / Total Facility Energy) × 100%
DCeP | 2008 | GreenGrid | Useful Work Produced / Total Quantity of Resource Consumed Producing this Work
pPUE | 2012 | GreenGrid | Total Facility Energy inside the Boundary / IT Equipment Energy inside the Boundary
PUE Scalability | 2013 | GreenGrid | (m_Actual / m_PUE) × 100%
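The ratio-style metrics in this table can be computed directly from metered energy totals. The sketch below is a minimal Python illustration of the PUE, DCiE and DCeP formulas as listed; the function names and the sample readings are assumptions made for illustration, not values from the talk.

```python
def pue(total_facility_energy: float, it_equipment_energy: float) -> float:
    """PUE = Total Facility Energy / IT Equipment Energy."""
    return total_facility_energy / it_equipment_energy


def dcie(total_facility_energy: float, it_equipment_energy: float) -> float:
    """DCiE = IT Equipment Energy / Total Facility Energy * 100 (percent)."""
    return it_equipment_energy / total_facility_energy * 100.0


def dcep(useful_work: float, resource_consumed: float) -> float:
    """DCeP = Useful Work Produced / Total Resource Consumed Producing this Work."""
    return useful_work / resource_consumed


# Hypothetical meter readings in kWh (illustrative only).
total_kwh = 1500.0
it_kwh = 1000.0

print(f"PUE  = {pue(total_kwh, it_kwh):.2f}")
print(f"DCiE = {dcie(total_kwh, it_kwh):.1f}%")
```

PUE and DCiE are functions of the same two energy totals, so application throughput does not enter either formula.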
Motivation
• Scenario 1: Data Management Researcher
– An improved data classification algorithm: does it contribute to greening the data centers?
– Run the algorithms on the data center
– Compare the PUEs
– No obvious variations!
PUE cannot measure the effectiveness of changes made on top of the data center infrastructure!
Motivation
• Scenario 2: Data Center Administrators
– Give a budget plan for the data center's energy consumption in the next year
– Estimate the data volume based on business development
– How to estimate the increase in energy consumption?
PUE provides little reference information for data center planning according to data scale and application complexity.
Calculation Framework
[Figure: calculation framework, with labels PUE and AxPUE]
Definition - ApPUE
• ApPUE (Application Performance Power Usage Effectiveness): a metric that measures the power usage effectiveness of IT equipment, specifically, how much of the power entering the IT equipment is used to improve application performance.
• Computation formula:
ApPUE = Application Performance / IT Equipment Power
– Application Performance: the data processing performance of applications
– IT Equipment Power: the average rate of IT equipment energy consumed
Definition - AoPUE
• AoPUE (Application Overall Power Usage Effectiveness): a metric that measures the power usage effectiveness of the overall data center system, specifically, how much of the total facility power is used to improve application performance.
• Computation formulas:
AoPUE = Application Performance / Total Facility Power
– Total Facility Power: the average rate of total facility energy used
AoPUE = ApPUE / PUE
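The two definitions and the relation AoPUE = ApPUE / PUE can be checked numerically. The following Python sketch assumes performance is measured in MB/s and power in watts, so both metrics come out in MB per joule; the function names and all sample numbers are hypothetical.

```python
import math


def ap_pue(application_performance: float, it_equipment_power: float) -> float:
    """ApPUE = Application Performance / IT Equipment Power."""
    return application_performance / it_equipment_power


def ao_pue(application_performance: float, total_facility_power: float) -> float:
    """AoPUE = Application Performance / Total Facility Power."""
    return application_performance / total_facility_power


# Hypothetical measurements over one observation window (illustrative only).
perf_mb_per_s = 200.0      # data processing rate of the application
it_power_w = 4000.0        # average IT equipment power
facility_power_w = 6000.0  # average total facility power

# Over the same window, the ratio of average powers equals the ratio of energies.
pue = facility_power_w / it_power_w

ap = ap_pue(perf_mb_per_s, it_power_w)
ao = ao_pue(perf_mb_per_s, facility_power_w)

# The relation from the slide: AoPUE = ApPUE / PUE.
assert math.isclose(ao, ap / pue)
print(f"PUE = {pue:.2f}, ApPUE = {ap:.4f} MB/J, AoPUE = {ao:.4f} MB/J")
```

Unlike PUE, both values change when the application processes more data for the same power draw.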
Acquisition – Application Performance
Application Category | Examples | Metric
Service Application | Search engine, ad-hoc queries | Number of requests answered in unit time
Data Analysis Application | Data mining, reporting, decision support, log analysis | Volume of data processed in unit time
Interactive Real-time Application | E-commerce, profile data management | Number of transactions completed in unit time
High Performance Computing | Scientific computing | Number of floating-point operations in unit time
Acquisition – Benchmark
• Requirements of Benchmarks
– Provide representative workloads for big data applications
– Provide a scalable data generation tool
• BigDataBench
– A big data benchmark suite, recently open-sourced and publicly available
– All the requirements are well fulfilled
Experiment Overview
• Testbed
– Data center of 18 racks, 362 servers
– Sample 8 servers
• Workloads
• Two experiments
– Different applications
– Different implementation algorithms
Experiments on Different Applications
[Figure: bar chart of PUE, ApPUE and AoPUE per workload; chart labels: BigDataBench, SVM, Sort, Grep, Linpack; data labels: 17.2, 11.5, 269.9, 179.7]
Experiments on Different Algorithms
• Two Implementations for Sort
– Several reducers with random sampling partitioning
– One reducer without partitioning
[Figure: PUE and ApPUE of Sort1 and Sort2 at data sizes 10G, 25G, 50G and 100G]
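The slide names the two Sort variants but does not show them. As a rough single-process illustration of the general technique, the Python sketch below contrasts a sort that range-partitions records across several "reducers" using split points drawn from a random sample with a sort handled by one reducer. It is not the MapReduce code used in the experiments; the function names, reducer count and sample size are assumptions.

```python
import random
from bisect import bisect_right


def sample_split_points(records, num_reducers, sample_size, seed=0):
    """Pick num_reducers - 1 split points from a sorted random sample."""
    rng = random.Random(seed)
    sample = sorted(rng.sample(records, min(sample_size, len(records))))
    if not sample:
        return []
    step = len(sample) / num_reducers
    return [sample[int(step * i)] for i in range(1, num_reducers)]


def partitioned_sort(records, num_reducers=4, sample_size=100):
    """Several reducers: route each record to a key range, sort ranges separately."""
    splits = sample_split_points(records, num_reducers, sample_size)
    buckets = [[] for _ in range(num_reducers)]
    for record in records:
        # bisect_right counts the split points <= record, which is the bucket index.
        buckets[bisect_right(splits, record)].append(record)
    result = []
    for bucket in buckets:  # each bucket stands in for one reducer's input
        result.extend(sorted(bucket))
    return result


def single_reducer_sort(records):
    """One reducer, no partitioning: all records are sorted in one place."""
    return sorted(records)


data = [random.Random(42 + i).randint(0, 10_000) for i in range(1_000)]
assert partitioned_sort(data) == single_reducer_sort(data)
```

Because the buckets cover disjoint, ordered key ranges, concatenating the per-bucket results gives a globally sorted output, while the per-bucket sorts could run in parallel on separate reducers.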
Conclusions
• We analyze the requirements of application-level energy effectiveness metrics (AxPUE) in data centers.
• We propose two novel application-level metrics, ApPUE and AoPUE, to measure the energy consumed to improve application performance.
• The experiment results show that AxPUE could provide meaningful guidance for data center design and optimization.
Evaluation of Performance & Energy Efficiency Tradeoff
• How to measure? AxPUE: Application Level Metrics for Power Usage Effectiveness in Data Centers
• How to get balance? The Implications from Benchmarking Three Big Data Systems
New Solutions
……
Experimental Platforms
Basic information:
• Xeon (common processor)
• Atom (low-power processor)
• Tilera (many-core processor)
Brief comparison:
CPU Type | Intel Xeon E5310 | Intel Atom D510 | Tilera TilePro36
CPU Core | 4 cores @ 1.6GHz | 2 cores @ 1.66GHz | 36 cores @ 500MHz
L1 I/D Cache | 32KB | 24KB | 16KB/8KB
L2 Cache | 4096KB | 512KB | 64KB
Benchmark Selection
BigDataBench
• A big data benchmark suite from big data applications
• Representative applications
• An innovative data generation tool
Application | Time Complexity | Characteristics
Sort | O(n*log2(n)) | Integer comparison
WordCount | O(n) | Integer comparison and calculation
Grep | O(n) | String comparison
Naïve Bayes | O(m*n) | Floating-point computation
SVM | O(n^3) | Floating-point computation
Metrics
• Performance: data processed per second (DPS)
• Energy efficiency: Application Performance Power Usage Effectiveness, measured as data processed per joule (DPJ)
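Both metrics can be derived from three measurements of a run: data volume, elapsed time and energy consumed. The Python sketch below assumes DPJ stands for data processed per joule and that energy equals average power multiplied by elapsed time; the function names and sample numbers are hypothetical.

```python
import math


def dps(data_bytes: float, elapsed_seconds: float) -> float:
    """Data processed per second (bytes/s)."""
    return data_bytes / elapsed_seconds


def dpj(data_bytes: float, energy_joules: float) -> float:
    """Data processed per joule (bytes/J); assumed meaning of DPJ."""
    return data_bytes / energy_joules


# Hypothetical run: 10 GB processed in 500 s at an average draw of 200 W.
data_bytes = 10e9
elapsed_s = 500.0
avg_power_w = 200.0
energy_j = avg_power_w * elapsed_s  # energy = average power * time

run_dps = dps(data_bytes, elapsed_s)
run_dpj = dpj(data_bytes, energy_j)

# DPJ equals DPS divided by average power, which matches the ApPUE form
# (application performance / IT equipment power).
assert math.isclose(run_dpj, run_dps / avg_power_w)
print(f"DPS = {run_dps:.3e} B/s, DPJ = {run_dpj:.3e} B/J")
```

A faster machine raises DPS, but it raises DPJ only if its power draw grows more slowly than its throughput.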
General Observations
[Figure: DPS and DPJ on Xeon, Atom and Tilera]
Data scale has a significant impact on the performance and energy efficiency of big data systems.
The performance and energy efficiency trends of different applications are diverse.
[Figure: Xeon vs. Atom – DPS]
[Figure: Xeon vs. Atom – DPJ]
Xeon vs. Atom – DPS & DPJ
Application | Metric | 500MB | 1GB | 10GB | 25GB | 50GB | 100GB
Sort | DPS | 3.67 | 4.51 | 1.89 | 1.54 | 1.36 | 1.40
Sort | DPJ | 0.87 | 1.08 | 0.45 | 0.36 | 0.32 | 0.33
Wordcount | DPS | 2.27 | 2.38 | 2.74 | 2.84 | 2.82 | 2.79
Wordcount | DPJ | 0.55 | 0.58 | 0.61 | 0.61 | 0.62 | 0.60
Grep | DPS | 1.83 | 1.82 | 2.30 | 2.79 | 2.87 | 2.89
Grep | DPJ | 0.48 | 0.46 | 0.54 | 0.62 | 0.63 | 0.64
Naïve Bayes | DPS | 3.83 | 3.89 | 4.52 | 4.64 | 4.54 | 4.58
Naïve Bayes | DPJ | 0.89 | 0.87 | 1.01 | 0.99 | 0.97 | 0.90
SVM (four data sizes only): DPS 3.19, 3.06, 3.17, 3.14; DPJ 0.69, 0.64, 0.66, 0.67
• Xeon is more powerful than Atom in processing capacity.
• Atom is more energy-saving than Xeon when dealing with applications with simple computation logic.
Xeon vs. Atom – Summary
• Xeon is more powerful than Atom in processing capacity.
• Atom is more energy-saving than Xeon when dealing with applications with simple computation logic.
• Atom doesn't show an energy advantage when dealing with complex applications.
[Figure: Xeon vs. Tilera – DPS]
[Figure: Xeon vs. Tilera – DPJ]
Xeon vs. Tilera – DPS & DPJ
Application | Metric | 500MB | 1GB | 10GB | 25GB
Sort | DPS | 3.67 | 3.39 | 2.41 | 2.60
Sort | DPJ | 0.48 | 0.45 | 0.31 | 0.34
Wordcount | DPS | 5.19 | 5.04 | 7.35 | 7.78
Wordcount | DPJ | 0.67 | 0.65 | 0.87 | 0.92
Grep | DPS | 3.60 | 3.52 | 7.45 | 9.93
Grep | DPJ | 0.51 | 0.48 | 0.94 | 1.21
Naïve Bayes | DPS | 5.91 | 5.78 | 7.59 | 7.94
Naïve Bayes | DPJ | 0.75 | 0.70 | 0.89 | 0.92
• Xeon is more powerful than Tilera in processing capacity.
• Tilera is more energy-saving than Xeon when dealing with applications with simple computation logic and with I/O intensive applications.
• Tilera doesn't show an energy advantage when dealing with complex applications.
Xeon vs. Tilera
[Figure: the DPS of Xeon, Atom and Tilera]
[Figure: the DPS of Tilera]
Tilera is more suitable for processing I/O intensive applications.
Xeon vs. Tilera – Summary
• Xeon is more powerful than Tilera in processing capacity.
• Tilera is more energy-saving than Xeon when dealing with applications with simple computation logic and with I/O intensive applications.
• Tilera doesn't show an energy advantage when dealing with complex applications.
• Tilera is more suitable for processing I/O intensive applications.
Implications
• The performance of a big data system depends not only on the hardware itself, but also on the application type and data volume of the workloads.
• Weak processors aren't suitable for complex applications. Even though they have a lower TDP, they don't show an energy cost advantage.
Implications (Cont.)
• Xeon generally has better processing capacity, accompanied by high energy consumption, especially for some light scale-out applications.
• Atom and Tilera show an energy consumption advantage when dealing with light scale-out applications.
• Tilera shows an energy advantage when processing I/O intensive applications.