
VLDB - Data Management in Grids

B. Del-Fabbro, D. Laiymani, J.M. Nicod and L. Philippe

Laboratoire d’Informatique de l’Université de Franche-Comté

Séoul, Koréa, 11 September 2006

Design and experimentations of an efficient data management service for NES architectures

Outline

Introduction: the NES context
Related work
Motivations and issues
The data management service
Experimental results
Conclusion and future work

VLDB - DMG Workshop - Séoul, Koréa - 11 September 2006

Introduction: the NES context

Example: antenna positioning

Introduction: the NES context

RPC-based model

[Figure: a Client, an Agent (Broker) and Servers (provide services); the client issues Exec(EXTRACTION, img1, img2)]

Introduction: the NES context

[Figure: Exec(ANTENNA, img3); Agent]

Data can be reused for further computations

Introduction: the NES context

[Figure: Exec(EXTRACTION, img1, img2); Agent]

It is necessary to allow the storage of some data
  Data persistency

Introduction: the NES context

[Figure: Exec(ANTENNA, &img3); Agent]

It is necessary to allow the storage of some data
  Data persistency

Introduction: the NES context

[Figure: three concurrent calls sent through the Agent]
  Exec(ANTENNA, &img3)
  Exec(RENDU, &img3)
  Exec(ANTENNA, &img3)

It is necessary to exploit the parallelism offered by independent tasks
  Data replication
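
The motivation built up over the previous slides can be illustrated with a minimal sketch. It is not DIET or DTM code: `ToyPlatform`, `execute`, the placeholder services and the 1000-byte images are all assumptions made for illustration. It only counts the bytes that cross the client link when the intermediate result `img3` is either shipped back and resent, or kept on the platform and passed by reference (`&...`).

```python
import itertools


class ToyPlatform:
    """Hypothetical stand-in for an NES platform that can keep data server-side."""

    def __init__(self):
        self._store = {}                 # handle -> bytes kept on the platform
        self._ids = itertools.count(1)
        self.bytes_from_client = 0       # traffic sent by the client
        self.bytes_to_client = 0         # traffic returned to the client

    def _resolve(self, arg):
        # A str starting with "&" is treated as a handle to stored data;
        # anything else is raw data that the client has to ship.
        if isinstance(arg, str) and arg.startswith("&"):
            return self._store[arg]
        self.bytes_from_client += len(arg)
        return arg

    def execute(self, service, *args, persistent=False):
        inputs = [self._resolve(a) for a in args]
        result = service(*inputs)
        if persistent:
            handle = f"&data{next(self._ids)}"
            self._store[handle] = result
            return handle                # only a small reference goes back
        self.bytes_to_client += len(result)
        return result


def extraction(img1, img2):
    return img1 + img2                   # placeholder computation


def antenna(img3):
    return img3[:16]                     # placeholder computation


img1, img2 = b"x" * 1000, b"y" * 1000

# Without persistence: img3 returns to the client and is sent again.
plain = ToyPlatform()
img3 = plain.execute(extraction, img1, img2)
plain.execute(antenna, img3)

# With persistence: img3 stays on the platform and is passed by handle.
kept = ToyPlatform()
h3 = kept.execute(extraction, img1, img2, persistent=True)
kept.execute(antenna, h3)

print(plain.bytes_from_client, kept.bytes_from_client)
```

In this toy setting the persistent run ships half as many bytes from the client, because the intermediate image never leaves the platform. Replication would add the ability to run several such calls on different servers at once.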

Goal

To propose a data management service for NES architectures which implements data persistency and data replication concepts in the most transparent way for end-users


Related work: non-NES architectures

Concepts

Data Grid context
  Separation of the physical and logical views of data
  European Data Grid…
Grid Computing context
  Large number of widely distributed nodes
  GASS, LegionFS…
Stork
  Pre-placement tool
  Generally coupled with a meta-scheduler

Related work: non-NES architectures

Drawbacks

Mainly storage and system oriented
Difficult to adapt to NES environments
Data transfers are explicitly performed at the client level
Lack of transparency

Related work: NES architectures

Concepts

Decreasing network traffic
  Between clients and servers
  Ensuring that no unnecessary data are transmitted
NetSolve
  Request Sequencing
  Distributed Storage Infrastructure (DSI)

Drawbacks

Data management is performed for only one computation sequence
Data transfers are explicit at client level


Issues

A NES data management service must address the following issues:

Replica consistency
  For update operations
    Do all the replicas have to be updated?
    Or are all the replicas independent copies?
Data storage
  To store data as close as possible to servers
  Physical limitations of storage resources
Security
  Secure access policy
  Data can be shared: access rights

Issues

A NES data management service must address the following issues:

Data localization
  For data items stored inside the platform
  To find where a data item is stored
Data identification
  A data item must be fully identified
  A client does not have to know where its data are stored
  Data handle = unique reference to a data item
Data redistribution
  Bandwidth is better between servers than between clients and servers
  Move data between computational servers


The data management service: DTM

Basics

Data Tree Manager (DTM)
  Distributed as a part of the DIET platform
  Flexible enough to be implemented in other platforms
Distributed Interactive Engineering Toolbox (DIET)
  CORBA-based NES platform
  Hierarchical architecture
    Master and Local Agents
  Performance forecasting tool (FAST)

The data management service: DTM

Architecture

[Figure]

The data management service: DTM

Components

The Logical Data Manager
  It manages a list of tuples (data handle, owners) for the data present in its sub-tree
  It provides the localization knowledge
The Physical Data Manager
  It manages a list of persistent data
  It stores data and provides them to its server
  It informs its parent when update operations (add, move, delete) occur
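
The division of labour between the two managers can be sketched as follows. This is a minimal illustration, not the DTM implementation: the class and method names (`notify`, `locate`, `add`, `delete`) are hypothetical, and the rule that a change is forwarded upward only when it alters what the sub-tree holds is an assumption about how updates might stay local.

```python
class LogicalDataManager:
    """Sits at an agent; records which children own each data handle."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.owners = {}                 # data handle -> set of child names

    def notify(self, handle, child, present):
        # Called by a child when data appears in or disappears from its sub-tree.
        before = handle in self.owners
        if present:
            self.owners.setdefault(handle, set()).add(child)
        else:
            holders = self.owners.get(handle, set())
            holders.discard(child)
            if not holders:
                self.owners.pop(handle, None)
        after = handle in self.owners
        # Assumption: forward the update only if this sub-tree's view changed.
        if self.parent is not None and before != after:
            self.parent.notify(handle, self.name, after)

    def locate(self, handle):
        return sorted(self.owners.get(handle, ()))


class PhysicalDataManager:
    """Sits at a server; stores persistent data and reports changes upward."""

    def __init__(self, name, parent):
        self.name = name
        self.parent = parent
        self.data = {}                   # data handle -> stored bytes

    def add(self, handle, payload):
        self.data[handle] = payload
        self.parent.notify(handle, self.name, True)

    def delete(self, handle):
        if self.data.pop(handle, None) is not None:
            self.parent.notify(handle, self.name, False)


ma = LogicalDataManager("MA")
la1 = LogicalDataManager("LA1", parent=ma)
la2 = LogicalDataManager("LA2", parent=ma)
s1 = PhysicalDataManager("S1", la1)
s2 = PhysicalDataManager("S2", la1)
s3 = PhysicalDataManager("S3", la2)

s1.add("img3", b"pixels")
s2.add("img3", b"pixels")
where_at_root = ma.locate("img3")   # the root only knows which sub-tree holds it
where_at_la1 = la1.locate("img3")   # the local agent knows the servers

s1.delete("img3")                   # LA1 still has a copy, MA is not notified
s2.delete("img3")                   # last copy gone, the removal reaches MA
```

A move operation would be a `delete` on one server followed by an `add` on another. Under the assumed propagation rule, a move inside one sub-tree never reaches the root.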

The data management service: DTM

Components

The Data Mover
  It provides mechanisms for data transfers between Data Managers
  Data transfer management and data recording are separated
  Integration of different transfer protocols: GridFTP, RFT…
The Replica Manager
  It sends replication orders to the Data Mover
  It allows the choice of the best replica to be transferred (NWS tool)
  It uses a distributed protocol: no distinction between the original data and its replicas
  Replicas are read-only but the architecture allows the implementation of any consistency technique
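
Choosing the best replica can be sketched as picking the source with the shortest estimated transfer time. The sketch below is an assumption-laden illustration: `choose_best_replica` is a hypothetical helper, and the bandwidth table is a plain dictionary standing in for forecasts that a tool such as NWS would supply. No NWS API is used here.

```python
def choose_best_replica(owners, dest, bandwidth_mbit_s, size_mbit):
    """Return (source, estimated_seconds) for the fastest transfer to dest.

    owners           -- servers currently holding a replica
    dest             -- server that needs the data
    bandwidth_mbit_s -- dict mapping (source, dest) to a forecast in Mbit/s
    size_mbit        -- size of the data item in Mbit
    """
    candidates = []
    for src in owners:
        if src == dest:
            return src, 0.0              # already local, nothing to transfer
        forecast = bandwidth_mbit_s.get((src, dest))
        if forecast:
            candidates.append((size_mbit / forecast, src))
    if not candidates:
        raise LookupError("no reachable replica")
    seconds, src = min(candidates)
    return src, seconds


# Hypothetical forecasts: S1 is on the same LAN as S3, S2 is behind a slow link.
forecasts = {("S1", "S3"): 100.0, ("S2", "S3"): 16.0}
best = choose_best_replica(["S1", "S2"], "S3", forecasts, size_mbit=800)
print(best)
```

Because the slides state that the original and its replicas are not distinguished, the function treats every owner the same way and looks only at the forecast.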

The data management service: DTM

Architecture advantages

Communication occurs between DIET and DTM components
  Low bandwidth consumption for data management
Update operations are limited to sub-trees
  Again low bandwidth consumption for data management
DTM minimizes the number of data copy operations (CORBA)
  Crucial for large data

The data management service: DTM

The end-user point of view

Only end-users have the knowledge of the application they submit
  Only end-users have the knowledge of the data that must be managed
The persistence mode
  It allows end-users to choose whether data must be persistent or not
The data handle
  End-users do not need to know where data are stored
The API
  Based on the profile concept
  Problem name + data or data handle + persistence mode
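
The profile idea (problem name, plus data or a data handle, plus a persistence mode) can be modelled as below. This is a hypothetical Python rendering for illustration only. It does not reproduce the DIET client API, and every type and field name here is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Union


class Persistence(Enum):
    VOLATILE = "volatile"        # data may be discarded after the call
    PERSISTENT = "persistent"    # data is kept on the platform for reuse


@dataclass(frozen=True)
class DataHandle:
    ident: str                   # unique reference; says nothing about location


@dataclass
class Argument:
    value: Union[bytes, DataHandle]
    mode: Persistence = Persistence.VOLATILE


@dataclass
class Profile:
    problem: str
    arguments: List[Argument] = field(default_factory=list)

    def payload_bytes(self):
        """Bytes the client itself would have to ship with this request."""
        return sum(
            len(a.value) for a in self.arguments if isinstance(a.value, bytes)
        )


# First request: raw images are sent with the call.
first = Profile(
    "EXTRACTION",
    [Argument(b"a" * 1000), Argument(b"b" * 1000)],
)

# Later request: the stored result is named by its handle only.
second = Profile(
    "ANTENNA",
    [Argument(DataHandle("img3"), Persistence.PERSISTENT)],
)

print(first.payload_bytes(), second.payload_bytes())
```

The second profile carries no payload from the client, which matches the point that end-users name their data by handle and never by location.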


Experimental results

Description

Previous experiments show:
  The good scalability and low overhead of DTM
The following tests show:
  The relevance of the data persistency approach
  The performance of the data replication policy
Platform: DTM deployed over two laboratories 100 km apart

Experimental results

Data persistence benefits

1 MA (Master Agent), 2 LA (Local Agents) and 2 servers locally interconnected (100 Mbit/s)
1 client in the remote site (16 Mbit/s)
Linear algebra application
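
A back-of-the-envelope calculation shows why the two link speeds matter. The 100 Mbyte data size below is an assumed example value, not a figure from the experiments, and the estimate ignores latency and protocol overhead.

```python
def transfer_seconds(size_mbytes, link_mbit_s):
    """Idealised transfer time: size converted to Mbit, divided by link rate."""
    return size_mbytes * 8 / link_mbit_s


size = 100                                   # assumed data size in Mbytes
from_client = transfer_seconds(size, 16)     # remote client link
between_servers = transfer_seconds(size, 100)  # local server interconnect

print(from_client, between_servers)
```

Under these assumptions, every resend avoided on the client link saves several times what a server-to-server move costs, which is the effect the persistence experiment targets.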

Experimental results

Data persistence benefits

[Figure]

Experimental results

Replication benefits

1 MA - 6 servers
Computing the number of occurrences of a letter in a file
Synchronous requests are sent to the platform
When data items are not present, they are replicated
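
The test service itself is simple. A minimal sketch of such a letter-counting routine is given below; the function name, the chunked reading and the temporary demo file are assumptions, since the slides do not show the code that was actually deployed on the servers.

```python
import os
import tempfile


def count_letter(path, letter, chunk_size=1 << 16):
    """Count occurrences of one character in a text file, reading in chunks."""
    if len(letter) != 1:
        raise ValueError("letter must be a single character")
    total = 0
    with open(path, "r", encoding="utf-8") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            total += chunk.count(letter)
    return total


# Small demonstration on a temporary file.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("data management in grids")
    demo_path = tmp.name

demo_count = count_letter(demo_path, "a")
os.remove(demo_path)
print(demo_count)
```

The computation is cheap compared with moving the file, so the cost of each request is dominated by data placement. That makes it a reasonable probe for a replication policy.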

Experimental results

Replication benefits

[Figure]

Experimental results

Use case: Dividing Cubes

Medical imaging application
Input files (from 0.1 Mbytes up to 500 Mbytes)
Several extraction parameters are applied
Result = JPEG file

Experimental results

Use case: Dividing Cubes

[Figure]

Conclusion

Feasibility for NES environments
Fully implemented and integrated in DIET since version 1.1
Promising experimental results
Standardization proposal (GGF)

Future work

Finalization of the GGF proposal
Tests on the Grid5000 platform
Fault tolerance
Integration of DTM in data grids