
Page 1: Grid Computing

1/27

Grid Computing

李孝治

Page 2: Grid Computing

Outline

Introduction to Grid Computing

Standards for Grid Computing

Scheduling for Grid Systems

Page 3: Grid Computing

1. Introduction to Grid Computing

“Grid Services for Distributed System Integration,” Foster, Kesselman, Nick, and Tuecke, IEEE Computer, June 2002.

Grid technologies and infrastructures support the sharing and coordinated use of diverse resources in dynamic, distributed virtual organizations (VOs).

Page 4: Grid Computing

Virtual Organizations (VOs) involve the creation of virtual computing systems:

Geographically distributed components operated by distinct organizations with differing policies,

Sufficiently integrated to deliver the desired QoS.

Resources: computational devices, networks, online instruments, storage archives, etc.

Page 5: Grid Computing

“What is the Grid? A Three Point Checklist” http://www-ip.mcs.anl.gov/~foster/Articles/WhatIsTheGrid.pdf

1) coordinates resources that are not subject to centralized control …

2) … using standard, open, general-purpose protocols and interfaces …

3) … to deliver nontrivial qualities of service.

Page 6: Grid Computing

“The Anatomy of the Grid” http://www.globus.org/research/papers/anatomy.pdf

“Grid” computing has emerged as an important new field, distinguished from conventional distributed computing by its focus on large-scale resource sharing, innovative applications, and, in some cases, high-performance orientation.

Page 7: Grid Computing

Grid Applications

Application service providers

Storage service providers

Consultants engaged by a car manufacturer to perform scenario evaluation during planning for a new factory

Members of an industrial consortium bidding on a new aircraft

A crisis management team and the databases and simulation systems that they use to plan a response to an emergency situation

Members of a large, international, multiyear high-energy physics collaboration

Page 8: Grid Computing

2. Standards for Grid Computing

A de facto standard for Grid systems: OGSA (Open Grid Services Architecture), implemented by the Globus Toolkit version 3 (GT3).

Defines a uniform exposed service semantics (the Grid service);

Defines standard mechanisms for creating, naming, and discovering transient Grid service instances;

Provides location transparency and multiple protocol bindings for service instances;

Supports integration with underlying native platform facilities.

Page 9: Grid Computing

Core Grid service interfaces

Base: resource management, data transfer, reservation, monitoring, information services

Data: data management

Grid: workload management, diagnostics

Page 10: Grid Computing

Figure 1. OGSA Grid service.

Page 11: Grid Computing

Figure 2. Three different VO structures

(1) A simple hosting environment: a set of resources located within a single administrative domain.

(2) A virtual hosting environment: a set of resources spanning two B2B domains.

(3) Collective services: a set of resources spanning two E2E domains.

Page 12: Grid Computing

3. Scheduling in Grid Systems

“Scheduling strategies for mixed data and task parallelism on heterogeneous clusters and grids,” O. Beaumont, A. Legrand, and Y. Robert, Proceedings of the Eleventh Euromicro Conference on Parallel, Distributed and Network-Based Processing (Euro-PDP’03).

Page 13: Grid Computing

The complex application consists of a suite of identical, independent problems to be solved.

Each problem consists of a set of tasks.

There are dependences (precedence constraints) between these tasks.

Fork graph (task graph): a problem instance, e.g. a loop iteration of 4 tasks. Each instance operates on different data.
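The fork graph above can be written down explicitly. The exact edge set is an assumption here (T1 forking into T2, T3 and T4), since the original figure is not reproduced in the text:

```python
# Assumed fork graph: T1 forks into T2, T3 and T4. Each problem instance
# P(m) is an independent copy of this graph operating on its own data.

fork_graph = {          # predecessors of each task type
    "T1": [],
    "T2": ["T1"],
    "T3": ["T1"],
    "T4": ["T1"],
}

def topological_order(graph):
    """Order task types so each one appears after all of its predecessors."""
    order, done = [], set()
    while len(done) < len(graph):
        for task, preds in graph.items():
            if task not in done and all(p in done for p in preds):
                order.append(task)
                done.add(task)
    return order

print(topological_order(fork_graph))  # ['T1', 'T2', 'T3', 'T4']
```

The precedence constraints are exactly what the scheduler must respect: T2, T3 and T4 of instance P(m) can only run once T1 of P(m) has completed and its output data has reached the chosen processor.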

Page 14: Grid Computing

Platform graph

E.g. 4 resources and 5 communication links, both with different rates.

Master node: an initial node.

The question for the master is to decide which tasks to execute itself, and how many tasks to forward to each of its neighbors.

Each neighbor faces in turn the same dilemma.

Page 15: Grid Computing

Due to heterogeneity, the neighbors may receive different amounts of work (maybe none for some of them).

Because the problems are independent, their execution can be pipelined. At a given time-step, different processors may well compute different tasks belonging to different problem instances.

Page 16: Grid Computing

Objective: to determine the optimal steady-state scheduling policy for each processor, that is, the fraction of time spent computing, and the fraction of time spent sending or receiving each type of task along each communication link, so that the (averaged) overall number of problems processed at each time-step is maximum.

Page 17: Grid Computing

The model: the application

Let P(1), P(2), …, P(n) be the n problems to solve, where n is large.

Each problem P(m) corresponds to a copy G(m) = (V(m), E(m)) of the same task graph (V, E).

The number |V| of nodes in V is the number of task types. In the example of Figure 1, there are four task types, denoted T1, T2, T3 and T4.

Overall, there are n × |V| tasks to process, since there are n copies of each task type.

Page 18: Grid Computing

The architecture

The target heterogeneous platform is represented by a directed graph, the platform graph.

There are p nodes P1, P2, …, Pp that represent the processors. In the example of Figure 2 there are four processors, hence p = 4.

Each edge represents a physical interconnection. Each edge eij: Pi → Pj is labeled by a value cij which represents the time to transfer a message of unit length between Pi and Pj, in either direction.

We assume a full-overlap, single-port operation mode, where a processor node can simultaneously receive data from one of its neighbors, perform some (independent) computation, and send data to one of its neighbors.

Page 19: Grid Computing

Execution time

Processor Pi requires wi,k time units to process a task of type Tk.

Assume that wi,k = wi × δk, where wi is the inverse of the relative speed of processor Pi, and δk the weight of task Tk.

Communication time

Each edge ek,l: Tk → Tl in the task graph is weighted by a communication cost datak,l that depends on the tasks Tk and Tl. It corresponds to the amount of data output by Tk and required as input to Tl.

The time to transfer the data from Pi to Pj is equal to datak,l × ci,j.

Page 20: Grid Computing

Steady-state equations

The (average) fraction of time spent each time-unit by Pi to send to Pj data involved by the edge ek,l:

s(Pi → Pj, ek,l) = sent(Pi → Pj, ek,l) × (datak,l × ci,j)   (1)

where sent(Pi → Pj, ek,l) denotes the (fractional) number of files sent per time-unit.

The (average) fraction of time spent each time-unit by Pi to process tasks of type Tk:

α(Pi, Tk) = cons(Pi, Tk) × wi,k   (2)

where cons(Pi, Tk) denotes the (fractional) number of tasks of type Tk processed per time-unit by processor Pi.
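As a quick sanity check, equations (1) and (2) can be evaluated directly; the rates and costs below are made-up numbers for illustration, not values from the paper:

```python
# Evaluating equations (1) and (2) with hypothetical rates and costs.

def comm_fraction(sent_rate, data_kl, c_ij):
    """Eq. (1): fraction of each time-unit Pi spends sending e_{k,l} files to Pj."""
    return sent_rate * (data_kl * c_ij)

def comp_fraction(cons_rate, w_ik):
    """Eq. (2): fraction of each time-unit Pi spends on tasks of type Tk."""
    return cons_rate * w_ik

# Pi sends 0.5 files per time-unit, each of size 2, over a link with
# c_ij = 0.4, and processes 0.25 tasks per time-unit with w_ik = 2.
print(comm_fraction(0.5, 2.0, 0.4))   # 0.4  (40% of the time sending)
print(comp_fraction(0.25, 2.0))       # 0.5  (50% of the time computing)
```

Both quantities are dimensionless time fractions, which is what lets the constraints on the following slides bound their sums by 1.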

Page 21: Grid Computing

Activities during one time-unit

One-port model for outgoing communications

One-port model for incoming communications
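The one-port constraints themselves appeared as figures on the original slides; a plausible reconstruction, using the notation of equations (1) and (2) and writing n(Pi) for the neighbors of Pi, is:

```latex
% One-port model: a processor sends to (receives from) at most one
% neighbor at a time, so its total send (receive) activity per
% time-unit is bounded by 1.
\forall i:\quad
  \sum_{P_j \in n(P_i)} \sum_{e_{k,l} \in E} s(P_i \rightarrow P_j, e_{k,l}) \le 1
  \qquad \text{(outgoing)}

\forall i:\quad
  \sum_{P_j \in n(P_i)} \sum_{e_{k,l} \in E} s(P_j \rightarrow P_i, e_{k,l}) \le 1
  \qquad \text{(incoming)}
```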

Page 22: Grid Computing

Full overlap
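Since computation overlaps with communication in this model, the compute fractions are constrained separately; the full-overlap constraint would read:

```latex
% Computation proceeds concurrently with sends and receives, so only
% the compute fractions of Pi are summed and bounded by 1.
\forall i:\quad \sum_{T_k \in V} \alpha(P_i, T_k) \le 1
```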

Page 23: Grid Computing

Conservation laws
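The conservation laws, also shown only as figures on the original slide, likely state that in steady state every file a processor needs is either produced locally or received, and every file it produces is either consumed locally or forwarded. A reconstruction in the paper's notation:

```latex
% For each processor Pi and each task-graph edge e_{k,l}: files of type
% e_{k,l} produced locally (by executing Tk) plus files received must
% equal files consumed locally (by executing Tl) plus files sent on.
\forall i,\ \forall e_{k,l} \in E:\quad
  cons(P_i, T_k) + \sum_{P_j \in n(P_i)} sent(P_j \rightarrow P_i, e_{k,l})
  \;=\;
  cons(P_i, T_l) + \sum_{P_j \in n(P_i)} sent(P_i \rightarrow P_j, e_{k,l})
```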

Page 24: Grid Computing

Page 25: Grid Computing

Theorem 1. The solution to the previous linear programming problem provides the optimal solution to SSSP(G), the steady-state scheduling problem for platform graph G.
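The paper solves the general problem with a linear program. For intuition, the special case of a star platform (one master, several workers) with a single task type is known to admit a simple "bandwidth-centric" steady state: saturate the cheapest links first until the master's one-port send budget is spent. A sketch with hypothetical speeds, not the paper's general LP:

```python
# Bandwidth-centric steady state for a star platform with one task type.
# This is a known special case, not the paper's general LP; all speeds
# below are hypothetical.

def steady_state_throughput(w_master, workers):
    """Tasks processed per time-unit under the one-port, full-overlap model.

    w_master: time for the master to compute one task.
    workers:  list of (c_i, w_i) pairs, where c_i is the time to send one
              task to worker i and w_i its time to compute one task.
    """
    throughput = 1.0 / w_master   # master computes at full rate (full overlap)
    budget = 1.0                  # fraction of time the master may spend sending
    for c_i, w_i in sorted(workers):   # cheapest links first
        want = c_i / w_i          # send time needed to keep worker i busy
        give = min(want, budget)
        throughput += give / c_i  # tasks per time-unit delivered to worker i
        budget -= give
        if budget <= 0.0:
            break
    return throughput

# Master (w = 1) plus two workers with (c, w) = (1, 2) and (2, 2):
print(steady_state_throughput(1.0, [(1.0, 2.0), (2.0, 2.0)]))  # 1.75
```

In the example, the master contributes 1 task per time-unit, the near worker is fully fed (0.5 tasks per time-unit), and the remaining half of the send budget feeds the far worker at 0.25 tasks per time-unit.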

Page 26: Grid Computing

Page 27: Grid Computing

Conclusions

We have shown how to determine the best steady-state scheduling strategy for a general task graph and for a general platform graph, using a linear programming approach.

This work can be extended in the following two directions:

On the theoretical side, we could try to solve the problem of maximizing the number of tasks that can be executed within K time-steps, where K is a given time bound. [Initialization phase]

On the practical side, we need to run actual experiments rather than simulations.

End

Page 28: Grid Computing

Other scheduling for Grid systems

“Local grid scheduling techniques using performance prediction,” Spooner, D.P.; Jarvis, S.A.; Cao, J.; Saini, S.; Nudd, G.R.; IEE Proceedings - Computers and Digital Techniques, Volume 150, Issue 2, March 2003.

“Ant algorithm-based task scheduling in grid computing,” Zhihong Xu; Xiangdan Hou; Jizhou Sun; Canadian Conference on Electrical and Computer Engineering (IEEE CCECE 2003), Volume 2, May 4–7, 2003.

Page 29: Grid Computing

“Near-optimal dynamic task scheduling of independent coarse-grained tasks onto a computational grid,” Fujimoto, N.; Hagihara, K.; Proceedings of the 2003 International Conference on Parallel Processing, 6–9 Oct. 2003.

“Scheduling in a grid computing environment using genetic algorithms,” Di Martino, V.; Mililotti, M.; Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS 2002), 15–19 April 2002.

End