November 14, 2014 | Las Vegas, NV
Jason Stowe, Cycle Computing
Patrick Saris, USC
David Hinz, HGST
BETTER ANSWERS FASTER
We believe access to
Cloud Cluster Computing
accelerates invention & discovery
Cluster Computing is Everywhere
• Strategic Answers
• Speed & Agility
Jevons Paradox
• UK in the 1860s: “we need a fixed amount of steam power”
• People thought: more efficient coal use = less coal used
• Jevons disagreed! He argued that increasing the efficiency of turning coal into steam, and making it simpler to consume, radically increases demand.
Cloud helps with capacity. Fixed clusters are:
• Too small when needed most,
• Too large every other time…
But this work is hard to move to the cloud: data scheduling, encryption, multi-AZ deployment, security, etc.
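The capacity mismatch can be sketched numerically. A toy model with made-up demand numbers (only the 500-server cluster size comes from the deck):

```python
# Toy model (illustrative demand numbers only): a fixed 500-server
# cluster against bursty daily demand.
weekly_demand = [200, 150, 100, 2000, 120, 90, 80]  # servers needed per day

fixed_size = 500  # internal cluster capacity

# Days the fixed cluster is too small for the work:
shortfall_days = sum(1 for d in weekly_demand if d > fixed_size)

# Average utilization on the remaining days is wasted capacity:
utilization = sum(min(d, fixed_size) for d in weekly_demand) / (
    fixed_size * len(weekly_demand)
)

print(shortfall_days)         # too small 1 day out of 7
print(round(utilization, 2))  # ~0.35 average utilization the rest of the time
```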
Cycle powers access at scale
[Diagram: a drug designer’s internal cluster (500 servers, 100% full) feeding a data workflow and cloud orchestration layer]
Cycle solutions help access
• 40 years of drug design in 9 hours
• 3 new compounds, $4,372 in Spot
• 10,600 servers
[Diagram: molecule data bursting from the cluster container to 10,600 cloud servers]
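Taking the slide’s figures at face value, the cost per year of computing is easy to check:

```python
# From the slide: 40 years of drug-design computing for $4,372 in Spot.
compute_years = 40
spot_cost_usd = 4372.0

cost_per_compute_year = spot_cost_usd / compute_years
print(round(cost_per_compute_year, 2))  # 109.3 USD per compute-year
```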
Thanks to cloud, people can:
Ask the right questions
Get better answers, faster
Record Scale, Enterprise Speed
• Very innovative work by:
– Patrick Saris, USC
– David Hinz, HGST
• Both will show the importance of:
– Asking the right question, regardless of scale
– Getting results faster to increase throughput
November 14, 2014 | Las Vegas, NV
Patrick Saris, University of Southern California
[Figure: U.S. energy consumption by source. Fossil Fuels 79%, Nuclear 10%, Renewables 11% (Biomass 5.6%, Hydroelectric 3.1%, Wind 2.0%, Solar 0.4%, Geothermal 0.3%). Source: U.S. Energy Information Administration, Monthly Energy Review, Table 1.2]
[Figure: donor-acceptor organic photovoltaic schematic under applied bias (-V)]
[Figure: calculated vs. experimental values (0-25 scale) for indigo, chlorophyll, graphite fragments, and nitrogen-containing variants]
[Figure: computed energy levels (-3.66 eV, -2.65 eV) for candidate structures 1-4; Mat Halls, Schrodinger Inc.]
[Figure: candidate counts (nitrogen: 3,473; phenyl: 3,473; 553,855 overall)]
[Figure: donor-acceptor schematic showing the ‘band gap’ of the parent structure]
Production Cycle Deployment
• First live deployment: 2008
[Architecture diagram: an internal HPC cluster and file system (PBs) send jobs and scheduled data to an auto-scaling external environment consisting of an HPC cluster, a cloud filer, blob data (S3), and Glacier cold storage; the internal path applies only if an internal cluster exists]
Metric                    Count
Compute Hours of Work     2,312,959 hours
Compute Days of Work      96,373 days
Compute Years of Work     264 years
Molecule Count            205,000 materials
Run Time                  < 18 hours
Max Scale (cores)         156,314 cores across 8 regions
Max Scale (instances)     16,788 instances
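The table’s unit conversions check out:

```python
# Converting the run's compute hours into the days and years the
# table lists.
compute_hours = 2_312_959

compute_days = compute_hours / 24    # ~96,373 days, as listed
compute_years = compute_days / 365   # ~264 years, as listed

print(int(compute_days), int(compute_years))
```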
How did we do this?
[Diagram: JUPITER, a distributed queue feeding data and auto-scaling execute nodes, automated in 8 cloud regions on 4 continents with double resiliency; 14 nodes controlling 16,788 instances]
© 2014 HGST, INC.
David Hinz
Global Director, IS&T
Cloud and DataCenter Computing Solutions
Cost Effective
High Performance Computing
On Amazon Web Services
Agenda
• Who is HGST and how is HPC used?
• HGST’s AWS HPC Journey
• Use of Cloudability for Cost Analysis and RI Planning
• What’s Next
HGST product lines:
• Capacity Enterprise: Ultrastar® & MegaScale DC™ 7200 RPM & CoolSpin HDDs
• Performance Enterprise: Ultrastar® 10K & 15K HDDs
• Cloud & Datacenter
• Enterprise SSD (+3 acquisitions in 2013): PCIe, SAS
HGST History
Founded in 2003 through the combination of the hard drive
businesses of IBM, the inventor of the hard drive, and
Hitachi, Ltd (“Hitachi”)
Acquired by Western Digital in 2012
More than 4,200 active worldwide patents
Headquartered in San Jose, California
Approximately 42,000 employees worldwide
Develops innovative, advanced hard disk drives (HDD), enterprise-class solid state drives (SSD), external storage solutions and services
Delivers intelligent storage devices that tightly integrate hardware and software to maximize solution performance
[Image: Ultrastar He6 with HelioSeal™ technology]
HPC Modeling and Simulation:
HGST’s Innovation Engine
• Improved Mechanical Innovation
  - Internal/external mechanical structural analysis of HDD
  - Critical lubricant attributes and physics
  - Airflow / He inside HDD
  - Optimal combination of HDD head and media compositions, spindle design, lubricants
  - Storage array: HDD location, airflow investigations
• Faster Areal Density Improvements
  - Micro-magnetic analysis for Heat-Assisted Magnetic Recording (HAMR)
  - Head-Medium Spacing (HMS)
[Diagram: magnetic head sensors over the magnetic medium, showing the magnetic spacing and the trailing edge of the slider]
HPC Doing The “Physics Work” Driving HGST Innovation
HGST’s HPC AWS Evolution
• Stage 1 : PoC
- First HPC PoC: Sept 2013
• Stage 2 : Small Start
- 1st HPC Production Cluster: Nov 2013
• Stage 3 : Optimize Workloads And Flexibility
- AWS C3 Deployments Jan 2014
- 4th HPC Production Cluster: June 2014
• Stage 4 : Lower Cost
- Use Spot And Reserved Instances: Oct / Nov / Dec 2014
• Stage 5 : Business metrics
- Utilization and cost reports to HGST engineers : Dec 2014
Stage 1 + 2 + 3: Shape and Scale Compute

Fluid Dynamics: 1.4x overall throughput gain

Parameter Sweeps    Throughput Gain
Model 1             1.23x - 1.78x
Model 2             1.01x - 1.67x
Model 3             1.23x - 1.69x
Model 4             Up to 2.7x

MicroMagnetics
Simulation Type                     Throughput Increase
Head Drive Interface Vacuum Gaps    1.99x
Vacuum Gap "collection"             4.00x
Media Grains for HAMR (FePt/C)      2.03x
4 Carbon Molecule Clusters          5.67x

Molecular Dynamics: 1.67x initial overall throughput gain
Stage 3: Optimize For Workload Flexibility
Not all workloads require the same compute resources 24 x 7 x 365
Stage 4: Spot Instances For Advanced HDD Research
• Simulated 22 advanced head designs across 3 materials possibilities = 15 compute years
• Used AWS c3 instances
• 6x faster run-time: ran in 5 days, not 30!
• Total cost: $4,026.02
[Diagram: new drive-head design workloads submitted and HPC clusters orchestrated over VPN; data encrypted and routed to AWS, results returned; HPC cluster of 1,024 Spot instance cores backed by EBS]
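A rough consistency check on the slide’s numbers (15 compute-years finished in 5 days on 1,024 Spot cores for $4,026.02):

```python
# Back-of-the-envelope check on the Stage 4 Spot run.
core_hours = 15 * 365 * 24   # 15 compute-years = 131,400 core-hours
wall_hours = 5 * 24          # the 5-day run

cores_implied = core_hours / wall_hours    # ~1,095, close to the 1,024 used
cost_per_core_hour = 4026.02 / core_hours  # ~0.031 USD per core-hour on Spot

print(round(cores_implied), round(cost_per_core_hour, 3))
```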
World’s Largest F500 Cloud Run
Transforming drive design to store the world’s data
• Ran 1 million drive-head designs = 70.75 core-years
• 90x throughput: ran in 8 hours, not 30 days!
• 3 days from idea to running!
• 70,908 cores, 729 TFLOPS
• c3 and r3 instances with Intel Ivy Bridge
• Cost: $5,594 on Spot instances
[Diagram: new drive-head design workloads submitted and HPC clusters orchestrated over VPC; data encrypted and routed to AWS, results returned; 70,908-core cluster of Spot instances backed by EBS]
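The headline numbers for this run are self-consistent:

```python
# Checking the million-design run: 70.75 core-years in 8 wall-clock
# hours for $5,594 on Spot.
core_hours = 70.75 * 365 * 24  # 70.75 core-years = 619,770 core-hours
wall_hours = 8

speedup = (30 * 24) / wall_hours       # 30 days compressed into 8 hours
cost_per_core_hour = 5594 / core_hours

print(int(speedup))                  # 90  (the claimed 90x throughput)
print(round(cost_per_core_hour, 4))  # ~0.009 USD per core-hour on Spot
```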
Stage 4 + 5: Optimize For Utilization and Cost
Great Solutions Are Available To Ease Optimization Effort
Summary
• HGST’s HPC AWS Journey ~15 months
• Take The Right Steps Along The Journey To the Cloud
• Pick The Right Partners and Tools For Success
• Continually Evaluate Environment and Needs
Thank You
Take advantage of efficiency
• Find more uses for this efficient, inexpensive compute
Please ask the right questions, get answers quickly
Go invent and discover!
THANK YOU