AWS at the II Workshop de Computação Científica em Astronomia (2nd Workshop on Scientific Computing in Astronomy)

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.

Interplanetary Cloud Computing

How the cloud is used by the scientific and astronomy communities

Julio Faerman

IIWCCA, São Paulo, June 2014

Why?

Scientific Computing?

Scientific Computing!

a.k.a. Cloud Computing

Why AWS for HPC?

Low cost with flexible pricing

Efficient clusters

Unlimited infrastructure

Faster time to results

Concurrent Clusters on-demand

Increased collaboration

High Performance Computing

TOP500 64th fastest supercomputer on-demand

Nov 2013 Top 500 list

484.2 TFlop/s

26,496 cores in a cluster of EC2 C3 instances

LinPack Benchmark

Schrödinger & Cycle Computing: computational chemistry

Simulation by Mark Thompson of the University of Southern California to see which of 205,000 organic compounds could be used for photovoltaic cells for solar panel material.

1.21 petaFLOPS (Rpeak)

$68M => $33,000

Estimated computation time of 264 years completed in 18 hours

“… A 156,314-core …, totaling 1.21 petaFLOPS…, to simulate 205,000 materials, crunched 264 compute years in only 18 hours”
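The figures on this slide can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope check: it assumes a 365-day year, reads "264 compute years" as single-core years, and treats the "$68M" figure as the cost of equivalent in-house infrastructure. None of these readings is stated on the slide, and the variable names are illustrative.

```python
# Back-of-the-envelope check of the figures quoted on the slide.
HOURS_PER_YEAR = 365 * 24        # assumption: 365-day year

serial_years = 264               # estimated computation time (slide)
wall_clock_hours = 18            # actual run time (slide)
cores = 156_314                  # cluster size (slide)
in_house_cost_usd = 68_000_000   # "$68M" (assumed: equivalent in-house build)
cloud_run_cost_usd = 33_000      # "$33,000" (slide)

compute_hours = serial_years * HOURS_PER_YEAR

# How much faster the run finished compared with the serial estimate.
speedup = compute_hours / wall_clock_hours

# How much cheaper the cloud run was than the in-house figure.
cost_ratio = in_house_cost_usd / cloud_run_cost_usd

# Share of the cluster's core-hours accounted for by the 264 years,
# assuming "compute years" means single-core years.
utilization = compute_hours / (cores * wall_clock_hours)

print(f"speedup over serial estimate: {speedup:,.0f}x")
print(f"cost ratio: {cost_ratio:,.0f}x")
print(f"implied utilization: {utilization:.0%}")
```

Under these assumptions the 264 years account for most, but not all, of the core-hours available during the 18-hour run, which is consistent with a cluster that ramps up and down rather than running every core for the full window.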

Amazon EC2

Resizable compute capacity

Complete control of your computing resources

Reduces the time required to obtain and boot new server instances to minutes

Flexible Pricing Options

On-Demand Instances

Pay as you go for computing power

Pay only for what you use, no up-front commitments or long-term contracts

Reserved Instances

Pay an up-front fee and receive a significant discount on the hourly pricing for that instance

Also helps ensure that compute capacity is available when needed

1- or 3-year terms

Spot Instances

Bid on available EC2 capacity

If the current Spot Price is below your bid, your instances will start

If there is a capacity constraint, your instances may be evicted
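The Spot rules above can be illustrated with a small simulation. This is a minimal sketch of the bidding model as the slide describes it, not a call into any AWS API: the function name `is_running`, the hourly price series, and the bid are all hypothetical, and the code assumes that running hours are billed at the Spot Price rather than at the bid.

```python
def is_running(bid: float, spot_price: float, capacity_available: bool = True) -> bool:
    """Return True when a Spot request would be running under the slide's rules.

    The instance runs only while the Spot Price is strictly below the bid
    and there is no capacity constraint.
    """
    return capacity_available and spot_price < bid


# Hypothetical hourly Spot Prices in USD; not real market data.
hourly_prices = [0.08, 0.09, 0.10, 0.12, 0.09]
bid = 0.10

states = [is_running(bid, price) for price in hourly_prices]

# Assumption: each running hour is billed at the Spot Price, not at the bid.
cost = sum(price for price, up in zip(hourly_prices, states) if up)

for hour, (price, up) in enumerate(zip(hourly_prices, states)):
    print(f"hour {hour}: price ${price:.2f} -> {'running' if up else 'stopped'}")
print(f"total cost: ${cost:.2f}")
```

Because an instance can be stopped at any hour, this pricing model suits workloads that checkpoint their progress or can simply be restarted.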


Figure: EC2 instance types plotted by EC2 Compute Units (HP) on the horizontal axis (1 to 128) against Memory (GB) on the vertical axis (1 to 256)

General Purpose

Compute Optimized

GPU

Memory Optimized

Storage Optimized

Micro

10 Gigabit Networking

30+ Instance Types

SSD instance store

Instance Families

48 TB local storage

32 vCPUs, 244 GB memory, 6.4 TB local SSD

Free Tier; as low as ½¢/hour (RIs)

32 vCPUs (1 vCPU per hyperthread), 2.8 GHz Intel Xeon E5-2680v2 (Ivy Bridge) processor, SR-IOV Enhanced Networking

NVIDIA GRID GPU (“Kepler” GK104), 1,536 CUDA cores, 4 GB of video memory, 8 real-time 720p@30fps streams, 4 real-time 1080p@30fps streams

Demo!

http://www.infoq.com/presentations/JPL-cloud

Fear, Uncertainty, and Doubt?

Security, Regulation, Privacy, Reliability (TF), Cost, Complexity, Billing, Functionality, Scalability, Availability, Performance (V/L), …

“In many cases, cloud computing can be more secure than your internal infrastructure”

Tom Soderstrom, JPL CTO @ Re:Invent 2013 Hackathon

Storage and Big Data

Volume

“2.5 quintillion (10^18) bytes / day”

“[D,D-1] > [0,2003]”

Velocity

“... 0.5 seconds of delay on the search page can reduce traffic by 20%”

“... 100 ms of latency can cost 1% in sales”

“... a broker can lose $4 million per millisecond if its platform is 5 ms behind”

Variety

“~49% of data is in unstructured or semi-structured formats”

More data vs. better algorithms

Storage Options

• Simple Storage Service (S3) and Glacier
  • Designed for high durability: 99.999999999%

• Elastic Block Store (EBS)
  • Between 0.1% and 0.5% AFR per volume

• Local Instance Storage
  • Up to 48 terabytes per instance (spinning disks)
  • Up to 5.7 terabytes of SSD storage
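To put the durability and failure-rate figures above on a common footing, they can be turned into expected losses per year. The sketch below assumes that the S3 durability figure is an annual, per-object probability and that failures are independent. The object and volume counts are hypothetical, and `expected_annual_losses` is an illustrative helper, not part of any AWS SDK.

```python
from fractions import Fraction

# Figures from the slide, kept as exact fractions to avoid rounding error.
S3_DURABILITY = Fraction(99_999_999_999, 100_000_000_000)  # 99.999999999%
EBS_AFR_LOW = Fraction(1, 1000)    # 0.1% annual failure rate
EBS_AFR_HIGH = Fraction(5, 1000)   # 0.5% annual failure rate


def expected_annual_losses(items: int, annual_loss_rate: Fraction) -> Fraction:
    """Expected number of items lost per year, assuming independent failures."""
    return items * annual_loss_rate


# Hypothetical workload sizes, chosen only for illustration.
objects = 10_000_000
volumes = 1_000

s3_losses = expected_annual_losses(objects, 1 - S3_DURABILITY)
ebs_losses_low = expected_annual_losses(volumes, EBS_AFR_LOW)
ebs_losses_high = expected_annual_losses(volumes, EBS_AFR_HIGH)

print(f"S3: {float(s3_losses)} objects/year, one loss every {1 / s3_losses} years")
print(f"EBS: between {ebs_losses_low} and {ebs_losses_high} volume failures/year")
```

The two figures differ by many orders of magnitude, which is why block volumes are typically backed up to object storage through snapshots.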

Amazon S3

Storage for the Internet. Natively online, HTTP access

Store and retrieve any amount of data, any time, from anywhere on the web

Highly scalable, reliable, fast and durable

Amazon Glacier

• Meet your compliance requirements
• Long-term archival and near-line DR
• Same eleven nines of durability as S3 Standard
• All data encrypted using Server Side Encryption
• Starting at $0.01/GB/month

“Every day our genome sequencers produce terabytes of data. As our company moves into the clinical space, we face a legal requirement to archive patient data for years that would drastically raise the cost of storage. Thanks to Amazon Glacier’s secure and scalable solution, we will be able to provide cost-effective, long-term storage and thereby eliminate a barrier to providing whole genome sequencing for medical treatment of cancer and other genetic diseases.”

- Keith Raffel, Senior Vice President and Chief Commercial Officer, Complete Genomics
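The archival cost argument can be made concrete using the "$0.01/GB/month" starting price quoted for Glacier. This is a rough sketch only: it assumes 1 TB = 1,000 GB, a constant price over the whole period, and no retrieval or request fees. The archive size and retention period are hypothetical, and `archive_cost_usd` is an illustrative helper.

```python
from decimal import Decimal

PRICE_PER_GB_MONTH = Decimal("0.01")  # starting price quoted on the slide
GB_PER_TB = 1000                      # assumption: decimal terabytes


def archive_cost_usd(size_tb: int, years: int,
                     price: Decimal = PRICE_PER_GB_MONTH) -> Decimal:
    """Storage-only cost of keeping size_tb terabytes archived for the given years."""
    months = years * 12
    return size_tb * GB_PER_TB * price * months


# Hypothetical archive: 100 TB retained for 7 years.
cost = archive_cost_usd(size_tb=100, years=7)
print(f"100 TB for 7 years: ${cost:,.2f}")
```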

Long-Term Data Archival

Demo!

Distributed Network Filesystem

HDFS, PVFS, Lustre, Gluster FS, Orange FS, NFS, …

Sample AWS Architecture

Processing large amounts of parallel data using a scalable cluster

Use commonly-available tools, including Grid Engine, Condor, StarCluster, Mesos, YARN, …
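The pattern behind this kind of architecture is scatter and gather: split the input into independent chunks, process them in parallel, and combine the partial results. The sketch below shows that pattern on a single machine using Python's standard `concurrent.futures` thread pool as a stand-in for cluster nodes. It is not how Grid Engine, Condor, or the other tools are driven, and the functions `split`, `process_chunk`, and `run` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, n_chunks):
    """Split data into at most n_chunks contiguous, roughly equal pieces."""
    if not data:
        return []
    size = -(-len(data) // n_chunks)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def process_chunk(chunk):
    """Stand-in for per-node work: here, a sum of squares."""
    return sum(x * x for x in chunk)


def run(data, workers=4):
    """Scatter chunks to the workers, then gather and combine the partial results."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return sum(partials)


total = run(list(range(1000)))
print(total)
```

On a real cluster the worker count would follow the number of nodes the scheduler provisions, which is what makes the approach scale with demand.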

Customers running HPC workloads on AWS

http://aws.amazon.com/solutions/case-studies/

Rigid On-Premise Resources

Predicted demand

Actual demand

Waste

Customer Dissatisfaction

Elastic Cloud-Based Resources

Actual demand

Resources scaled to demand

Agility
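The contrast between rigid and elastic capacity can be quantified with a toy model. The sketch below uses a hypothetical demand series and a hypothetical fixed capacity, and it idealizes elastic capacity as tracking demand exactly. Real autoscaling lags behind demand, so actual figures would sit between the two cases.

```python
def capacity_report(demand, capacity):
    """Return (waste, unmet) for per-period demand and per-period capacity.

    waste: capacity that was paid for but not used.
    unmet: demand that could not be served (customer dissatisfaction).
    """
    waste = sum(max(c - d, 0) for d, c in zip(demand, capacity))
    unmet = sum(max(d - c, 0) for d, c in zip(demand, capacity))
    return waste, unmet


# Hypothetical demand per period, in arbitrary units of servers.
demand = [20, 35, 50, 80, 120, 90, 40, 25]

# Rigid on-premise: capacity fixed in advance from a prediction.
rigid = capacity_report(demand, [70] * len(demand))

# Elastic cloud (idealized): capacity follows actual demand.
elastic = capacity_report(demand, list(demand))

print(f"rigid:   waste={rigid[0]}, unmet={rigid[1]}")
print(f"elastic: waste={elastic[0]}, unmet={elastic[1]}")
```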

Massive scale allows AWS to constantly reduce costs, while improving quality and reliability

TCO of cloud is much lower than on-premise IT when all costs are considered

Large-scale datacenter-to-cloud migrations are happening now

Scale

On-Premise

Experiment infrequently

Failure is expensive

Less Innovation

Cloud

Experiment often

Fail quickly at a low cost

More Innovation

Innovation

References: Case Studies

• Cycle Computing
http://www.cyclecomputing.com/blog/back-to-the-future-121-petaflopsrpeak-156000-core-cyclecloud-hpc-runs-264-years-of-materials-science/
https://www.youtube.com/watch?v=KhM5I0ABdvw
http://youtu.be/5vtVj5PIK_0

• JPL Curiosity
http://aws.amazon.com/solutions/case-studies/nasa-jpl-curiosity/

• JPL MER and CARVE
http://aws.amazon.com/solutions/case-studies/swfnasa/

References: AWS Services

• Amazon EC2
http://aws.amazon.com/ec2

• Amazon S3
http://aws.amazon.com/s3

• Amazon Elastic MapReduce
http://aws.amazon.com/elasticmapreduce

• Amazon Redshift
http://aws.amazon.com/redshift

References: Presentations

• AWS Re:Invent 2013 Big Data + HPC Track
https://www.youtube.com/playlist?list=PLhr1KZpdzukf53KMFcF26hxv1eF5n9JKt

• JPL: Dare Mighty Things
http://www.infoq.com/presentations/JPL-cloud
