DILIGENT: A DIgital Library Infrastructure on Grid ENabled Technology
DILIGENT Project
Andrea Manzi, ISTI-CNR, Pisa
09/01/2006 NA4 Generic Application Meeting 2
Outline
Project Description
Interaction with EGEE
gLite DILIGENT Infrastructures
gLite Experimentation
Problems Using gLite Services
DILIGENT Requirements
Future plans
Project Description
Duration: 36 Months
Start Date: Sept 2004
Person/Months: 1024
Total Costs: 9.5 M € (6.3 M € from EU)
[Pie chart: project activities split into Technological development, Validation Activities and Innovation Activities; segments of 15%, 24% and 61%]
Objective: Create a Digital Library Infrastructure that will allow members of dynamic virtual research organizations to create on-demand transient digital libraries based on shared computing, storage, multimedia, multi-type content, and application resources
Participants
Italian National Research Council – ISTI (Italy, Scientific Co-ordinator)
European Research Consortium for Informatics and Mathematics (France, Administrative Co-ordinator)
European Organization for Nuclear Research (Switzerland)
Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. – IPSI (Germany)
University of Athens (Greece)
University of Basel (Switzerland)
University for Health Informatics and Technology Tyrol (Austria)
University of Strathclyde (United Kingdom)
Engineering Ingegneria Informatica SpA (Italy)
Fast Search & Transfer ASA (Norway)
4D SOFT Software Development Ltd. (Hungary)
European Space Agency – ESRIN (Italy)
Scuola Normale Superiore (Italy)
RAI Radio Televisione Italiana (Italy)
[Figure: DILIGENT DL infrastructure. Labels: DL Creation service; Services A to E; simulation, speech recognition, feature extraction, 3D processing; Consumers and Producers; example communities: Implementation of Environmental Conventions, Research on Cultural Heritage]
Interaction with EGEE
Coordination with EGEE
Technical interactions
  9 technical meetings (mainly with JRA1)
  gLite mailing lists subscription (two lists at cern.ch): [email protected], [email protected]
  1 training on "Grid Technologies for Digital Libraries"
  1 tutorial on "gLite Deployment"
Other interactions
  4 EGEE conferences (Cork, The Hague, Athens, Pisa)
Interaction with EGEE
Feedback to EGEE
On EGEE activities
  gLite bugs submission (JRA1)
On DILIGENT project
  status
  access to EGEE prototype testbeds (JRA1)
  access to EGEE PPS testbed (SA1)
  grid related DL requirements (JRA1, NA4)
  future plans
gLite DILIGENT Infrastructures
DILIGENT has 2 independent infrastructures (gLite v1.4)
  Development infrastructure
  Testing infrastructure
Infrastructures are geographically distributed, linking 6 sites in Athens, Budapest, Darmstadt, Pisa, Innsbruck and Rome
Running gLite experimentation tests since July 2005
Development Infrastructures
Testing Infrastructure
[Figure: testing infrastructure. Service groups shown: Job Management Services, Data Management Services, Information Services, Security Services; partner labels: 4DSOFT, CNR, ENG]
gLite Experimentation
Goal
  store/manage collections of objects
  run applications organized in DAGs
  store the application results for future usage
Test plan
  Data Upload
  Job Submission
  Data transfer
Data
  800K XML files of the Reuters corpus (from Aug96 to Aug97)
Application
  Feature extraction tool (JIRE Application)
Implementation of prototypes to test the feasibility of the proposed solutions
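To illustrate what "applications organized in DAGs" implies for scheduling, the sketch below groups the stages of a small workflow into batches whose members could run concurrently. It uses Python's standard graphlib module only to show the idea; the stage names are hypothetical and are not taken from the DILIGENT prototypes or from gLite.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key maps a stage to the stages it depends on.
WORKFLOW = {
    "extract_features": {"upload_collection"},
    "extract_metadata": {"upload_collection"},
    "store_results": {"extract_features", "extract_metadata"},
}


def parallel_batches(graph):
    """Return the stages as ordered batches; stages in one batch are independent."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for a stable printout
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished so successors unlock
    return batches


if __name__ == "__main__":
    for number, batch in enumerate(parallel_batches(WORKFLOW), start=1):
        print(number, batch)
```

In this example the two extraction stages land in the same batch, because both depend only on the upload.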
gLite Experimentation – Data Upload
Two Mass Storage Systems (MSS) were tested: dCache and DPM
dCache:
  success rate: 69.06%
  avg. rate: 16.18 s/file
  several problems!
DPM:
  success rate: 97.26%
  avg. rate: 6.10 s/file
[Chart: Upload Rate for DPM and dCache at the UMIT, UoA, CNR and FhG sites]
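The two figures reported per storage system, success rate and average seconds per file, can be derived from a simple log of upload attempts. The sketch below is one plausible way to compute them. It assumes the average is taken over all attempts, failed ones included, which the slides do not state, and the record format is hypothetical.

```python
def upload_summary(records):
    """Summarise upload attempts.

    records: iterable of (succeeded, seconds) tuples, one per attempted upload.
    Returns (success rate in percent, average seconds per attempted file).
    Assumption: failed attempts count towards the average time.
    """
    records = list(records)
    if not records:
        raise ValueError("no upload attempts recorded")
    successes = sum(1 for succeeded, _ in records if succeeded)
    total_seconds = sum(seconds for _, seconds in records)
    success_rate = 100.0 * successes / len(records)
    avg_seconds_per_file = total_seconds / len(records)
    return success_rate, avg_seconds_per_file


if __name__ == "__main__":
    # Made-up sample data, not measurements from the experiment.
    sample = [(True, 4.0), (True, 6.0), (False, 8.0), (True, 6.0)]
    rate, avg = upload_summary(sample)
    print(f"success rate: {rate:.2f}%  avg. rate: {avg:.2f} s/file")
```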
gLite Experimentation – Job Submission
Jobs using dCache data MSS:
  several problems!
Jobs using DPM data MSS:
  success rate: 100%
  avg. rate: 5.77 s/file
  comparable performance using 10 and 100 jobs due to the small number of available worker nodes
[Charts: Execution Rate (dCache) and Execution Rate (DPM), plotted by number of jobs and number of files]
gLite Experimentation
DILIGENT vs. PPS infrastructures
Data upload
  similar results (for DPM)
Job submission
  similar results
  DILIGENT dCache not considered (didn't work with 1000 files)
[Charts: Execution Rate for 1000 files, comparing DIL (DPM) and PPS (DPM) with 1, 10 and 100 jobs; Upload Rate, comparing DIL (dCache), DIL (DPM) and PPS (DPM) with 1 and 3 threads]
gLite Experimentation
The experimental DILIGENT DL exploits gLite to store products on the Grid and to process them on demand. This makes it possible to produce usable end-user manifestations upon request.
[Figure: architecture of the experimental DL. Labels: User Interface; Process Management; Content Management; Metadata Management; Index and Search Management; Storage Management; Authentication Authorization; Information Service; gLite StorageBroker; gLite JM; gLite SE; gLite WMS; Inf. Service R-GMA; DVOS; VOMS]
Problems Using gLite Services
gLite deployment
  gLite architecture and configuration are complex
  gLite 1.0 was released in April 2005 (since then four new releases were made available)
  limited information available (it has been made available gradually)
  several bugs were found in deploying and using gLite (many are solved)
Software porting to 64 bit is not complete. Some gLite services (WMS, CE) cannot be deployed on 64-bit machines.
Problems Using gLite Services [cont]
Job submission:
  Slow job execution phase
  Nevertheless, the gLite job management system proved to be reliable: more jobs, same performance
Data upload:
  Many performance issues using the dCache backend
    simultaneous glite-put/glite-get/glite-rm
    large amount of small files
  DILIGENT needs a 100% successful upload rate -> DPM
  dead links on Fireman when glite-put ends with errors
DILIGENT Requirements
DILIGENT aims to run executables that repeat the same operations for each input file belonging to a given collection.
Each single execution takes a few minutes (or less), but it must be repeated hundreds of thousands of times (even millions).
These executables are usually organised in a DAG to deliver more complex functionality.
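A back-of-the-envelope calculation shows why this workload depends on the number of CPUs available. The sketch below estimates ideal wall-clock time, ignoring scheduling and data-transfer overhead. The inputs (500,000 executions of 60 seconds each) are illustrative values chosen to match the orders of magnitude above, not measurements.

```python
import math

SECONDS_PER_DAY = 86_400


def wall_clock_days(n_items, seconds_per_item, n_cpus):
    """Ideal wall-clock time in days when n_items are spread evenly over n_cpus.

    Each CPU processes its share sequentially; overheads are ignored.
    """
    if n_items < 0 or seconds_per_item < 0 or n_cpus < 1:
        raise ValueError("n_items and seconds_per_item must be >= 0, n_cpus >= 1")
    items_per_cpu = math.ceil(n_items / n_cpus)  # the busiest CPU sets the pace
    return items_per_cpu * seconds_per_item / SECONDS_PER_DAY


if __name__ == "__main__":
    for cpus in (1, 10, 100):
        days = wall_clock_days(500_000, 60, cpus)
        print(f"{cpus:>3} CPUs: {days:.1f} days")
```

With these illustrative inputs, one CPU needs roughly 347 days, while 100 CPUs need about 3.5 days.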
DILIGENT Requirements [cont]
In order to support this framework, it should be possible:
To query for the maximum number of CPUs concurrently available
  so that a DILIGENT high-level service can automatically prepare a DAG in which each node processes a partition of the data collection
To use parametric jobs/automatic partitioning on data
  Submitting the same computation on a set of n input data should be more efficient than submitting n jobs
To use Condor as LRMS (Local Resource Management System)
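The first two requirements amount to splitting a collection into as many partitions as there are CPUs, so that each DAG node handles one partition. A minimal sketch of that partitioning step follows. The function name and file names are hypothetical, and the CPU count is passed in by hand because the query for available CPUs is exactly the capability being requested.

```python
def partition(items, n_parts):
    """Split items into at most n_parts contiguous chunks of near-equal size."""
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    items = list(items)
    if not items:
        return []
    n_parts = min(n_parts, len(items))  # never create empty chunks
    base, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for index in range(n_parts):
        size = base + (1 if index < extra else 0)  # first `extra` chunks get one more
        chunks.append(items[start:start + size])
        start += size
    return chunks


if __name__ == "__main__":
    files = [f"doc{i}.xml" for i in range(10)]  # hypothetical input collection
    available_cpus = 4  # assumed value; would come from the information service
    for node, chunk in enumerate(partition(files, available_cpus)):
        print(f"node {node}: {len(chunk)} files")
```

Each chunk would then become the input of one node in the generated DAG.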
DILIGENT Requirements [cont]
To support service certificates
  it should be possible to obtain a service certificate for a high-level service
To specify a job-specific priority
  the same user/service should be able to specify priorities for his/its own jobs
To specify a priority for a user or for a service
  jobs of the DILIGENT infrastructural services must be prioritized over end-user service requests
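The two priority requirements can be read as a two-level ordering: first by the priority of the submitting user or service, then by the priority that submitter gives to each of its own jobs. The sketch below shows that ordering with a plain heap. It is only an illustration of the requested behaviour, not a description of how gLite schedules jobs, and the class, job names and priority values are hypothetical.

```python
import heapq
from itertools import count


class JobQueue:
    """Orders jobs by owner priority, then job priority, then submission order.

    Lower numbers mean higher priority.
    """

    def __init__(self):
        self._heap = []
        self._sequence = count()  # tie-breaker that preserves submission order

    def submit(self, name, owner_priority, job_priority=0):
        entry = (owner_priority, job_priority, next(self._sequence), name)
        heapq.heappush(self._heap, entry)

    def pop(self):
        """Remove and return the name of the next job to run."""
        return heapq.heappop(self._heap)[3]

    def __len__(self):
        return len(self._heap)


if __name__ == "__main__":
    queue = JobQueue()
    queue.submit("end-user search", owner_priority=1)
    queue.submit("infrastructure reindex", owner_priority=0, job_priority=5)
    queue.submit("infrastructure replication", owner_priority=0, job_priority=1)
    while queue:
        print(queue.pop())
```

Here both infrastructure jobs run before the end-user job, and the infrastructure service's own ordering decides which of its two jobs goes first.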
DILIGENT Requirements [cont]
To ask for on-disk encryption of data
  It should be possible to ask for encryption of the data on disk to prevent data leaks at the storage site level
To dynamically manage VO creation
  The creation of a new VO should be supported without deploying and configuring services by hand
To dynamically support user/service affiliation to a VO
  The user/service affiliation to a VO should be automated as much as possible
Future Plans
Monitor gLite developments and continue the current work of deploying gLite in DILIGENT infrastructures
Continue the ongoing gLite experimentation using DILIGENT and EGEE PPS infrastructures
Continue gridifying the following services needed in the DILIGENT DL experimentation:
  Metadata Management
  Content Management
  Index and Search Management
  Process (workflow) Management
Tips / Summary
DILIGENT has successfully installed and now maintains its own gLite infrastructures.
The DILIGENT development infrastructure can join the EGEE infrastructure.
An active EGEE-DILIGENT collaboration has been established, and this has been key to achieving our first goals.
DILIGENT has identified a concrete set of open issues that we need to address.
The gLite and DL experimentation activities have shown that we are on the right track.
DILIGENT Web Site http://www.diligentproject.org
DILIGENT Training DL http://diligent-training.isti.cnr.it
Experimental DL http://diligent-dl1.isti.cnr.it
Andrea Manzi [email protected]
Thank you