BSUIR Cluster: Extended Capabilities

, . : , , , , . : BigData ;: .

: Rack 10 Units

. , 7 ( !). , . , .. , , . , - , , .

7x Blade: GPU SuperBlade SBI-7127RG
2 CPU Intel Xeon E5-2650
32 GB RAM
2x Tesla M2075 6 GB RAM
InfiniBand 4x QDR (40Gbps)
Network 2x Gigabit Ethernet

1x Blade: GPU SuperBlade SBI-7127RG
2 CPU Intel Xeon E5606
24 GB RAM
2x SSD 80 GB
4x HDD 300 GB
InfiniBand 4x QDR (40Gbps)
Network 2x Gigabit Ethernet

, HPC , .. ...

1. Intel Xeon processor E5-2600 family; QPI up to 8.0 GT/s
2. Intel C602 Chipset
3. Up to 256GB RDIMM or 64GB UDIMM; 8x DIMM slots
4. Intel i350 Dual port Gigabit Ethernet
5. 4x QDR (40Gb) InfiniBand or 10GbE mezzanine HCA
6. IPMI 2.0, KVM over IP, Virtual Media
7. 1x SATA DOM up to 64GB
8. Integrated Matrox G200eW Graphics

Infiniband 40Gbit, IPv4 (.. IPoverIB) 10Gbit. , K-means .

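CPU Intel Xeon E5-2650

Family: Intel Xeon E5-2600

Microarchitecture: Sandy Bridge

Release year: 2012

Cores: 8

Threads per core: 2

Base frequency, MHz: 2000

Turbo frequency, MHz: 2800 (1 or 2 active cores), 2700 (3 cores), 2500 (4, 5 or 6 cores), 2400 (7 or 8 cores)

L3 cache, MB: 20

Memory channels: 4, DDR3

Instruction set extensions: AVX, SSE1-4, EM64T, AES

Peak performance (double): ~150 Gflops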
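The ~150 Gflops figure can be cross-checked with the usual peak-performance formula: cores x clock x floating-point operations per cycle. The sketch below uses the 8 cores, the 2000 MHz base clock and the 2400 MHz clock with 7 or 8 active cores listed above. The value of 8 double-precision operations per cycle per core is an assumption that is not on the slide (AVX on Sandy Bridge: one 4-wide add plus one 4-wide multiply per cycle).

```python
def peak_gflops(cores: int, ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak = cores x clock (GHz) x floating-point ops per cycle per core."""
    return cores * ghz * flops_per_cycle


# Assumption (not from the slide): 8 double-precision flops per cycle per core
# with AVX on Sandy Bridge (4-wide add + 4-wide multiply).
AVX_DP_FLOPS_PER_CYCLE = 8

base = peak_gflops(8, 2.0, AVX_DP_FLOPS_PER_CYCLE)   # at the 2000 MHz base clock
turbo = peak_gflops(8, 2.4, AVX_DP_FLOPS_PER_CYCLE)  # at 2400 MHz, all 8 cores active

print(f"base: {base:.1f} Gflops, all-core turbo: {turbo:.1f} Gflops")
```

Under these assumptions the all-core turbo estimate is close to the ~150 Gflops quoted for one CPU; sustained performance of real applications is normally lower than this theoretical peak.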

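GPU Nvidia Tesla M2075

Architecture: Fermi

Release year: 2011

CUDA cores: 448

Frequency, MHz: 1215

Memory, GB: 6

Memory bandwidth: 144 GB/s

Peak performance (double): ~500 Gflops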

2 : 14 Tesla14. x 448 x 32 => 200704 , 65535 !
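The product on this slide can be checked with a few lines of arithmetic. The sketch assumes 7 GPU blades with 2 Tesla M2075 cards each (which gives the 14 Tesla cards named above) and 448 CUDA cores per card. The factor 32 is taken from the slide as-is; reading it as the CUDA warp size is an assumption, since its label did not survive.

```python
GPU_BLADES = 7        # blades equipped with GPUs
GPUS_PER_BLADE = 2    # 2x Tesla M2075 per blade
CORES_PER_GPU = 448   # CUDA cores of one Tesla M2075
FACTOR_PER_CORE = 32  # from the slide; assumed here to be the CUDA warp size

total_gpus = GPU_BLADES * GPUS_PER_BLADE
total = total_gpus * CORES_PER_GPU * FACTOR_PER_CORE

print(total_gpus, total)
```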

High performance computing (HPC)

( ) . .

HPC-

HPC . .

Alt Linux 7.0

TORQUE

gcc

OpenMP

OpenMPI

OpenCL

Nvidia CUDA Toolkit

OpenSUSE 13.2 (SLES 11.4)

, .. AltLinux . OpenSUSE SLES.

0OpenMP + MPI + CUDA

1 CPU GPU

OpenMP + MPI

CUDA

: 60%

2 CPU

: 10-15%

Deep learning

(EDA)

,

- . .. . , Mathematica 2014. LIGO, ,

...

... : (Albert@home, Asteroids@home, Cosmology@Home, Einstein@Home)

(ATLAS@Home)

(BURP, Electric Sheep)

(CAS@home)

(Climate Prediction)

(Collatz Conjecture)

(DENIS@Home)

(DistributedDataMining)

(Distributed.net)

(DreamLab)

(Folding@home) : https://en.wikipedia.org/wiki/List_of_distributed_computing_projects

.

!!!!

Infrastructure-as-a-Service (IaaS)
Platform-as-a-Service (PaaS)

, .

(IaaS, . Infrastructure-as-a-Service) , ,

(PaaS, . Platform-as-a-Service) ,

, - . . - , , , - . (SaaS, . Software-as-a-Service), . IaaS PaaS.

OpenStack

OpenStack , .

(IaaS, . Infrastructure-as-a-Service) , , , , , , . (PaaS, . Platform-as-a-Service) , Amazon EC2, Microsoft Azure, ElasticHosts...

GlusterFS , , ...

IaaS , . . PaaS ssh( .) RemoteDesktop/TeamViewer

BigData

. . .

. . ... , , .

LHPChadoop

2011 Yahoo , Hadoop, Hortonworks. Hortonworks . Ambari, , . . ...

HDFS

HDFS (Hadoop Distributed File System) , , , GoogleFS

BigData . HPC BigData , . ... , ., GoogleFS, HDFS (Hadoop Distributed File System) , , . HDFS ( ) , , ( , ) . . HDFS ( ), . : , , , .

HDFS LHPChadoop

GlusterFS , HDFS

HDFS , , Hadoop HDFS, Amazon S3 CloudStore . , HDFS MapReduce-, , , NoSQL- HBase, Apache Mahout. HDFS Hadoop, ...- , ;- GlusterFS ;- ;- HDFS ;- Ethernet InfiniBand.

Hadoop MapReduce

Hadoop HDFS MapReduce

MapReduce , Google .
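As a rough illustration of the MapReduce model, here is a minimal single-process sketch in plain Python. It does not use the Hadoop API; the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are hypothetical and chosen only for this example. On a real cluster the map and reduce calls run in parallel on many nodes, and the shuffle step moves data between them.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word of one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


print(word_count(["big data", "big cluster"]))
```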

YARN

Yet Another Resource Negotiator

YARN , .

YARN ,

YARN (. Yet Another Resource Negotiator ) , 2.0 (2013), . MapReduce, (JobTracker), YARN (ResourceManager), . YARN MapReduce-, , ; YARN .YARN , , .

Hive

Apache Hive Hadoop (.. HDFS+MapReduce) , .HiveQL SQL- HDFS

Apache Hive Hadoop (.. HDFS+MapReduce) , . Facebook, . Netflix, Amazon, Amazon Elastic MapReduce Amazon Web Services. SQL- HDFS HiveQL, MapReduce . . Bitmap index .

Pig

Pig Latin

User Defined Functions in Java, Python, JavaScript, Ruby or Groovy

lazy evaluation

extract, transform, load (ETL)

is able to store data at any point during a pipeline

declares execution plans

supports pipeline splits, thus allowing workflows to proceed along DAGs instead of strictly sequential pipelines
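The ideas of lazy evaluation and an ETL pipeline from the list above can be sketched with plain Python generators. This is not Pig Latin and not the Pig API; the function names, record fields and sample data are hypothetical. Building the pipeline only declares a plan, and no record is processed until the final step consumes it.

```python
from typing import Dict, Iterable, Iterator, List

Record = Dict[str, str]


def extract(lines: Iterable[str]) -> Iterator[Record]:
    """Extract: parse raw 'user, url' lines into records."""
    for line in lines:
        user, url = line.split(",")
        yield {"user": user.strip(), "url": url.strip()}


def transform(records: Iterable[Record]) -> Iterator[Record]:
    """Transform: keep only http URLs and normalise the user name."""
    for record in records:
        if record["url"].startswith("http"):
            yield {**record, "user": record["user"].lower()}


def load(records: Iterable[Record]) -> List[Record]:
    """Load: materialise the result; this is the step that triggers the work."""
    return list(records)


raw = ["Alice, http://a.example", "Bob, ftp://b.example", "Carol, http://c.example"]
plan = transform(extract(raw))  # lazy: nothing has been parsed yet
result = load(plan)             # the whole pipeline runs here
print(result)
```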

Pig , MapReduce Hadoop. Yahoo 2006. Pig Latin, Java MapReduce SQL. Java, Python, JavaScript, Ruby or Groovy.: ; extract, transform, load (ETL) , , ; ; ; .

Mahout

Distributed Row Matrix API with R and Matlab like operators

Similarity Analysis

Collaborative Filtering

Classification

Clustering

Dimensionality Reduction

Frequent itemset mining

etc.
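To give a feel for the Clustering entry, here is a minimal single-machine k-means sketch (Lloyd's algorithm) in plain Python. It is not Mahout code; the function names and sample points are hypothetical. The assignment step resembles a map and the averaging step a reduce, which is why the algorithm distributes well.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def closest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid nearest to the point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: (point[0] - centroids[i][0]) ** 2 + (point[1] - centroids[i][1]) ** 2,
    )


def kmeans(points: Sequence[Point], centroids: Sequence[Point], iterations: int = 10) -> List[Point]:
    """Run a fixed number of Lloyd iterations from the given starting centroids."""
    current = list(centroids)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters: List[List[Point]] = [[] for _ in current]
        for p in points:
            clusters[closest(p, current)].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        current = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c)) if c else old
            for c, old in zip(clusters, current)
        ]
    return current


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
print(kmeans(points, [(0.0, 0.0), (10.0, 10.0)]))
```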

Mahout . MapReduce, .

Mahout . MapReduce, . Mahout: Basic Linear Algebra; ; Collaborative filtering ; ; Frequent itemset mining .

Giraph

Giraph MapReduce.

Facebook: 200 4

Giraph MapReduce.Giraph: Facebook G 200 4

HBase

HBase features compression

in-memory operation

Bloom filters on a per-column basis

Replication across the data center

Atomic and strongly consistent row-level operations

Near real time lookups

cells no larger than 10 MB

between 1 and 3 column families per table

Time based versions
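The Bloom filters mentioned in the list above let a store skip data that certainly does not contain a key. The sketch below shows the general data structure in plain Python; it is not HBase's implementation, and the class name, sizes and hashing scheme are assumptions made for the example. A Bloom filter can return false positives but never false negatives.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Minimal Bloom filter: a bit array plus several hash functions."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for clarity

    def _positions(self, key: str) -> Iterator[int]:
        # Derive several hash values by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


bf = BloomFilter()
bf.add("row-42")
print(bf.might_contain("row-42"))
```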

HBase NoSQL , Google BigTable. HDFS BigTable- Hadoop.

HBase NoSQL ; Java; Google BigTable. HDFS BigTable- Hadoop, . Facebook . , , , , , CAP consistency, availability, partition tolerance

Kafka

Apache Kafka .

Apache Kafka . LinkedIn , , . , -, , .

Storm

Fast

Scalable

Fault-tolerant

Reliable

Easy to operate

Apache Storm near real-time . MISD ( ).

Apache Storm real-time . Twitter MISD , .. .Fast 100 . -

Scalable

Fault-tolerant , ..

Reliable , .

Easy to operate

Spark speed

Logistic regression in Hadoop and Spark

.

100 Hadoop MapReduce 10

, Hadoop. Hadoop, MapReduce , , , , . 100 Hadoop MapReduce 10

Spark Ease of Use

Word count in Spark's Python API

Java, Scala, Python, R.

Scala, Python R.

Java, Scala, Python, R. 80 , Map Reduce, . Scala, Python R.

Spark Generality

Streaming, SQL, Graph processing and machine learning

SQL and DataFrames, MLlib for machine learning, GraphX, and Spark Streaming.

SQL and DataFrames, MLlib for machine learning, GraphX, and Spark Streaming. . Spark Streaming , Storm, MISD , SIMD

Spark Runs Everywhere

Access diverse data sources including HDFS, Cassandra, HBase, S3, Hive, Tachyon, and any Hadoop data source

Zeppelin

Hadoop Spark. , Scala, Hive, SparkSQL, Linux Shell,

Z Hadoop Spark. . , Scala, Hive, SparkSQL, Linux Shell, iPython. , HDFS, NFS S3, Twitter ..

Zeppelin


|grep http,GET,POST,CONNECT...

Kafka

, ,

BigData

BD. . , , ,

- ..!

HP

1. BigData

2. BigData: Data Computing, Data Science

3.

, . . , . . BigData . , , .. - , , . , , . : , . , .