Forschungszentrum Jülichin der Helmholtz-Gesellschaft
Grid Computing at NIC
September 2005
Achim Streit + [email protected]
2 Forschungszentrum Jülich
Grid Projects at FZJ
UNICORE 08/1997-12/1999
UNICORE Plus 01/2000-12/2002
EUROGRID 11/2000-01/2004
GRIP 01/2002-02/2004
OpenMolGRID 09/2002-02/2005
VIOLA 05/2004-04/2007
DEISA 05/2004-04/2009
UniGrids 07/2004-06/2006
NextGrid 09/2004-08/2007
CoreGRID 09/2004-08/2008
D-Grid 09/2005-02/2008
3 Forschungszentrum Jülich
a vertically integrated Grid middleware system
provides seamless, secure, and intuitive access to distributed resources and data
used in production and in projects worldwide
Features: intuitive GUI with single sign-on; X.509 certificates for AA and job/data signing; only one open port required in the firewall; workflow engine for complex multi-site/multi-step workflows; matured job monitoring; extensible application support with plug-ins; interactive access with UNICORE-SSH; integrated secure data transfer; resource management (full control of resources remains with the site); production quality; ...
4 Forschungszentrum Jülich
Architecture
[Architecture diagram: the Client connects via SSL to the Gateway of a Usite (authentication; optionally behind a firewall). Each Usite hosts one or more Vsites: an NJS performs authorization against the UUDB and incarnates Abstract jobs via the IDB into Non-Abstract jobs, which the TSI submits to the local RMS and disc storage. Multi-site jobs span several Usites/Vsites.]
TSI: similar to the Globus jobmanager; supports fork, LoadLeveler, (Open)PBS(Pro), CCS, LSF, NQE/NQS, CONDOR, GT 2.4, ...
UUDB: similar to /etc/grid-security/grid-mapfile
Covered functionality: Workflow-Engine, Resource Management, Job-Monitoring, File Transfer, User Management, Application Support
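The UUDB lookup can be illustrated with a short sketch. The file format follows the Globus grid-mapfile convention named above; the function names and parsing details are illustrative assumptions, not UNICORE code.

```python
# Sketch of a grid-mapfile-style DN -> local-login lookup, conceptually
# similar to what the UUDB does for authorization.

def parse_mapfile(lines):
    """Parse lines of the form: "/O=Org/CN=Name" login"""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        dn, _, login = line.rpartition(" ")  # DN is quoted, login follows
        mapping[dn.strip('"')] = login
    return mapping

def authorize(mapping, dn):
    """Return the local login for a certificate DN, or None if unauthorized."""
    return mapping.get(dn)

entries = [
    '"/O=GridGermany/OU=FZJ/CN=Jane Doe" jdoe',
    '"/O=GridGermany/OU=FZJ/CN=John Roe" jroe',
]
m = parse_mapfile(entries)
print(authorize(m, "/O=GridGermany/OU=FZJ/CN=Jane Doe"))  # jdoe
print(authorize(m, "/O=Unknown/CN=Mallory"))              # None
```

An unauthorized DN simply maps to nothing, which is why full control of resources remains with the site operating the UUDB.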
5 Forschungszentrum Jülich
UNICORE Client
6 Forschungszentrum Jülich
UNICORE-SSH
uses standard UNICORE security mechanisms to open an SSH connection through the standard SSH port
UNICORE-SSH button in the client
7 Forschungszentrum Jülich
Automate, integrate, and speed up drug discovery in the pharmaceutical industry
University of Ulster: Data Warehouse
University of Tartu: Compute Resources
FZ Jülich: Grid Middleware
ComGenex Inc.: Data, User
Istituto di Ricerche Farmacologiche "Mario Negri": User
[Workflow diagram: 280 2D structures downloaded from the EPA ECOTOX database; the conventional pipeline (descriptors → QSAR → 3D output) takes > 5 days, the automated OpenMolGRID pipeline < 2 hours.]
Workflow Automation & Speed-up
8 Forschungszentrum Jülich
Workflow Automation & Speed-up
automatic split-up of data-parallel tasks
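The split-up idea can be sketched in a few lines: a list of input structures is partitioned into chunks, each chunk becomes an independent sub-job, and the results are merged in order. The function names and chunking policy are illustrative assumptions, not OpenMolGRID code.

```python
# Minimal sketch of automatically splitting a data-parallel task
# across a number of Grid resources.

def split_task(items, n_resources):
    """Partition items into at most n_resources roughly equal chunks."""
    n = max(1, min(n_resources, len(items)))
    size, rest = divmod(len(items), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < rest else 0)  # spread the remainder
        chunks.append(items[start:end])
        start = end
    return chunks

def merge_results(partial):
    """Concatenate per-chunk result lists in submission order."""
    return [r for chunk in partial for r in chunk]

structures = [f"mol{i}" for i in range(10)]
chunks = split_task(structures, 4)  # 4 sub-jobs with sizes 3, 3, 2, 2
results = merge_results([[s.upper() for s in c] for c in chunks])
print(len(chunks), len(results))  # 4 10
```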
9 Forschungszentrum Jülich
Open Source under BSD license Supported by FZJ
Integration of own results and of results from other projects
Release management, problem tracking, CVS, mailing lists, documentation, assistance
Viable basis for many projects: DEISA, VIOLA, UniGrids, D-Grid, NaReGI
http://unicore.sourceforge.net
10 Forschungszentrum Jülich
From Testbed to Production
Success factor: vertical integration
2002 → 2005: different communities, different computing resources (supercomputers, clusters, …), know-how in Grid middleware
11 Forschungszentrum Jülich
Production
National high-performance computing centre "John von Neumann Institute for Computing"
About 650 users in 150 research projects
Access via UNICORE to:
IBM p690 eSeries Cluster (1312 CPUs, 8.9 TFlops)
IBM BlueGene/L (2048 CPUs, 5.7 TFlops)
Cray XD1 (72+ CPUs)
116 active UNICORE users: 72 external, 44 internal
Resource usage (CPU-hours): Dec 18.4%, Jan 30.4%, Feb 30.5%, Mar 27.1%, Apr 29.7%, May 39.1%, Jun 22.3%, Jul 20.2%, Aug 29.0%
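A quick arithmetic check of the average utilization implied by these monthly figures (a trivial sketch, assuming the nine months are weighted equally):

```python
# Mean of the nine monthly UNICORE resource-usage figures (CPU-hours, in %)
# quoted above.
usage = [18.4, 30.4, 30.5, 27.1, 29.7, 39.1, 22.3, 20.2, 29.0]
mean = sum(usage) / len(usage)
print(round(mean, 1))  # 27.4
```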
12 Forschungszentrum Jülich
Grid Interoperability: UNICORE – Globus Toolkit
Uniform Interface to Grid Services: OGSA-based UNICORE/GS
WSRF-Interoperability
13 Forschungszentrum Jülich
Architecture: UNICORE jobs on GLOBUS resources
[Architecture diagram: the UNICORE Client talks through the Gateway to the NJS (with UUDB, IDB, and Uspace). A Globus-enabled TSI uses a GRAM Client and a GridFTP Client to reach the Globus 2 side: the GRAM Gatekeeper and Job-Manager submit to the local RMS, the GridFTP Server handles file staging, and the MDS provides resource information.]
Consortium
Research Center Jülich (project manager)
Consorzio Interuniversitario per il Calcolo Automatico dell’Italia Nord Orientale
Fujitsu Laboratories of Europe
Intel GmbH
University of Warsaw
University of Manchester
T-Systems SfR
Funded by EU grant: IST-2002-004279
Unicore/GS Architecture
Web Services interface: access Unicore components as Web Services, and integrate Web Services into the Unicore workflow.
[Architecture diagram: existing Unicore components (Unicore Client, Unicore Gateway, Network Job Supervisor, TSI, User Database) are complemented by new Web Services components (UniGrids Portal, OGSA Client, OGSA Servers A and B, Resource Database, Resource Broker).]
UNICORE basic functions:
Site Management (TSF/TSS): Compute Resource Factory; Submit, Resource Information
Job Management (JMS): Start, Hold, Abort, Resume
Storage Management (SMS): List directory, Copy, Make directory, Rename, Remove
File Transfer (FTS): File import, File export
Standardization: the JSDL WG was revitalized by UniGrids and NAREGI; the Atomic Services are input to the OGSA-BES WG
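As an illustration of the SMS operation set listed above, here is a toy implementation over a local directory. The method names mirror the slide; everything else is an assumption, not the real UNICORE/GS interface.

```python
# Toy Storage Management Service exposing the five operations from the slide,
# backed by a plain local directory instead of a remote Grid storage.
import os
import shutil
import tempfile

class StorageManagementService:
    def __init__(self, root):
        self.root = root
    def _p(self, rel):
        return os.path.join(self.root, rel)
    def list_directory(self, rel="."):
        return sorted(os.listdir(self._p(rel)))
    def copy(self, src, dst):
        shutil.copy(self._p(src), self._p(dst))
    def make_directory(self, rel):
        os.makedirs(self._p(rel), exist_ok=True)
    def rename(self, src, dst):
        os.rename(self._p(src), self._p(dst))
    def remove(self, rel):
        os.remove(self._p(rel))

root = tempfile.mkdtemp()
sms = StorageManagementService(root)
sms.make_directory("data")
with open(os.path.join(root, "a.txt"), "w") as f:
    f.write("hi")
sms.copy("a.txt", "b.txt")
sms.rename("b.txt", "c.txt")
print(sms.list_directory())  # ['a.txt', 'c.txt', 'data']
sms.remove("c.txt")
```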
[Diagram: Atomic Services — TSF, TSS, JMS, SMS, and FTS, each exposed through a WSRF interface.]
Three levels of interoperability
Level 1: Interoperability between WSRF services
UNICORE/GS passed the official WSRF interop test
GPE and JOGSA hosting environments successfully tested against UNICORE/GS and other endpoints
The WSRF specification will be finalized soon! Currently: UNICORE/GS uses WSRF 1.3, GTK uses WSRF 1.2 draft 1
[Diagram: WSRF hosting environments (JOGSA-HE, GPE-HE, GTK4-HE, UNICORE/GS-HE) beneath a common WSRF Service API; atomic services (CGSP, GTK4, UNICORE/GS) and advanced services (GPE-Workflow, UoM-Broker, GPE-Registry) on top.]
Three levels of interoperability
Level 2: Interoperability between atomic service implementations
Client API hides details about WSRF hosting environment
Client code works with different WSRF implementations and WSRF versions, although at the moment different stubs must be used for each
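The decoupling described here can be sketched as a toy abstraction layer: portable client code is written against one atomic-service interface, and concrete WSRF stubs plug in behind it. The class and method names below are illustrative assumptions, not the actual GPE client API.

```python
# Level-2 interoperability sketch: the client sees one Job Management
# interface; which WSRF implementation answers is an interchangeable detail.
from abc import ABC, abstractmethod

class JobManagementService(ABC):
    """Abstract atomic service: all the client code is allowed to see."""
    @abstractmethod
    def submit(self, job_description: str) -> str: ...
    @abstractmethod
    def status(self, job_id: str) -> str: ...

class UnicoreGSStub(JobManagementService):
    """Stand-in for a UNICORE/GS (WSRF 1.3) stub."""
    def submit(self, job_description):
        return "unicore-gs:job-1"
    def status(self, job_id):
        return "RUNNING"

class GTK4Stub(JobManagementService):
    """Stand-in for a Globus Toolkit 4 (WSRF 1.2 draft) stub."""
    def submit(self, job_description):
        return "gtk4:job-1"
    def status(self, job_id):
        return "RUNNING"

def run_anywhere(jms: JobManagementService) -> str:
    """Portable client code: unaware of the WSRF implementation behind jms."""
    job_id = jms.submit("<jsdl:JobDefinition/>")
    return jms.status(job_id)

for stub in (UnicoreGSStub(), GTK4Stub()):
    print(run_anywhere(stub))  # RUNNING for both backends
```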
[Diagram: the Atomic Service Client API (GPE) sits between the clients and higher-level services (Portal, Visit, Apps, Expert) and the atomic service implementations (CGSP, GTK4, UNICORE/GS) in their WSRF hosting environments (JOGSA-HE, GPE-HE, GTK4-HE, UNICORE/GS-HE), with advanced services (GPE-Workflow, UoM-Broker, GPE-Registry) alongside.]
Three levels of interoperability
Level 3: GridBeans working on top of different Client implementations
Independent of atomic service implementations
Independent of specification versions being used
GridBeans run on GTK or UNICORE/GS without modifications
GridBeans survive version changes in the underlying layers and are easy to maintain
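The GridBean idea can be sketched as a minimal plugin contract: a bean bundles application-specific job preparation and stays unaware of the client layer beneath it. Interface, bean, and backend names here are illustrative assumptions.

```python
# Level-3 interoperability sketch: any client backend that can submit a
# job dictionary can run any GridBean, and vice versa.

class GridBean:
    """Application plugin: turns user parameters into a job description."""
    name = "generic"
    def make_job(self, params):
        raise NotImplementedError

class POVRayBean(GridBean):
    name = "POVRay"
    def make_job(self, params):
        return {"app": "povray", "scene": params["scene"]}

class CPMDBean(GridBean):
    name = "CPMD"
    def make_job(self, params):
        return {"app": "cpmd", "input": params["input"]}

def submit_via(client_submit, bean, params):
    """Any client implementation offering submit(job_dict) can run any bean."""
    return client_submit(bean.make_job(params))

# Two stand-in client backends (e.g. UNICORE/GS- or GTK-based):
unicore_submit = lambda job: f"unicore:{job['app']}"
gtk_submit = lambda job: f"gtk:{job['app']}"

print(submit_via(unicore_submit, POVRayBean(), {"scene": "demo.pov"}))  # unicore:povray
print(submit_via(gtk_submit, CPMDBean(), {"input": "si64.inp"}))        # gtk:cpmd
```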
[Diagram: GridBeans (POVRay, PDBSearch, Compiler, Gaussian, CPMD, Visit) sit on the GridBean API (GPE), which rests on the Atomic Service Client API; below are the clients and higher-level services (Portal, Apps, Expert), the advanced services (GPE-Workflow, UoM-Broker, GPE-Registry), the WSRF Service API, and the hosting environments (GTK4-HE, UNICORE/GS-HE) with their atomic service implementations (JOGSA, GTK4, UNICORE/GS).]
20 Forschungszentrum Jülich
Consortium
DEISA is a consortium of leading national supercomputer centers in Europe
IDRIS – CNRS, France
FZJ, Jülich, Germany
RZG, Garching, Germany
CINECA, Bologna, Italy
EPCC, Edinburgh, UK
CSC, Helsinki, Finland
SARA, Amsterdam, The Netherlands
HLRS, Stuttgart, Germany
BSC, Barcelona, Spain
LRZ, Munich, Germany
ECMWF (European Organization), Reading, UK
Granted by: European Union FP6. Grant period: May 1st, 2004 – April 30th, 2008
21 Forschungszentrum Jülich
DEISA objectives
To enable Europe’s terascale science by the integration of Europe’s most powerful supercomputing systems.
Enabling scientific discovery across a broad spectrum of science and technology is the only criterion for success
DEISA is a European Supercomputing Service built on top of existing national services.
DEISA deploys and operates a persistent, production quality, distributed, heterogeneous supercomputing environment with continental scope.
22 Forschungszentrum Jülich
Basic requirements and strategies for the DEISA Research Infrastructure
Fast deployment of a persistent, production quality, grid empowered supercomputing infrastructure with continental scope.
A European supercomputing service built on top of existing national services requires reliability and non-disruptive behavior.
User and application transparency
Top-down approach: technology choices result from the business and operational models of our virtual organization. DEISA technology choices are fully open.
23 Forschungszentrum Jülich
The DEISA supercomputing Grid: a layered infrastructure
Inner layer: a distributed super-cluster resulting from the deep integration of similar IBM AIX platforms at IDRIS, FZ-Jülich, RZG-Garching, and CINECA (phase 1), then CSC (phase 2). To external users it appears as a single supercomputing platform.
Outer layer: a heterogeneous supercomputing Grid:
IBM AIX super-cluster (IDRIS, FZJ, RZG, CINECA, CSC), close to 24 Tf
BSC: IBM PowerPC Linux system, 40 Tf
LRZ: Linux cluster (2.7 Tf), moving to an SGI ALTIX system (33 Tf in 2006, 70 Tf in 2007)
SARA: SGI ALTIX Linux cluster, 2.2 Tf
ECMWF: IBM AIX system, 32 Tf
HLRS: NEC SX8 vector system, close to 10 Tf
24 Forschungszentrum Jülich
Logical view of the phase 2 DEISA network
DFN
RENATER
GARR
GÉANT
SURFnet
UKERNA
RedIRIS
FUnet
25 Forschungszentrum Jülich
AIX Super-Cluster May 2005
CSC
ECMWF
Services:
High-performance data grid via GPFS: access to remote files uses the full available network bandwidth
Job migration across sites: used to load-balance the global workflow when a huge partition is allocated to a DEISA project at one site
Common Production Environment
26 Forschungszentrum Jülich
Service Activities
SA1 – Network Operation and Support (FZJ): deployment and operation of a gigabit-per-second network infrastructure for a European distributed supercomputing platform; network operation and optimization during project activity.
SA2 – Data Management with Global File Systems (RZG): deployment and operation of global distributed file systems, as basic building blocks of the "inner" super-cluster and as a way of implementing global data management in a heterogeneous Grid.
SA3 – Resource Management (CINECA): deployment and operation of global scheduling services for the European super-cluster, as well as for its heterogeneous Grid extension.
SA4 – Applications and User Support (IDRIS): enabling the adoption by the scientific community of the distributed supercomputing infrastructure as an efficient instrument for the production of leading computational science.
SA5 – Security (SARA): providing administration, authorization, and authentication for a heterogeneous cluster of HPC systems, with special emphasis on single sign-on.
27 Forschungszentrum Jülich
DEISA Supercomputing Grid services
Workflow management: based on UNICORE plus further extensions and services coming from DEISA’s JRA7 and other projects (UniGrids, …)
Global data management: a well defined architecture implementing extended global file systems on heterogeneous systems, fast data transfers across sites, and hierarchical data management at a continental scale.
Co-scheduling: needed to support Grid applications running on the heterogeneous environment.
Science Gateways and portals: specific Internet interfaces to hide complex supercomputing environments from end users, and to facilitate access for new, non-traditional scientific communities.
28 Forschungszentrum Jülich
Workflow Application with UNICORE / Global Data Management with GPFS
[Diagram: five sites, each with CPU and GPFS, connected via the DEISA network + NRENs; the Client submits a job whose workflow steps run at 1) FZJ, 2) CINECA, 3) RZG, 4) IDRIS, 5) SARA, while the data remains globally visible through GPFS.]
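The five-step workflow can be mimicked in a few lines: a dict stands in for the shared GPFS namespace, and each step, nominally running at a different site, reads its predecessor's output directly. Site names follow the slide; everything else is illustrative.

```python
# Sketch of a multi-site workflow over a shared global file system:
# each step writes where the next step, at another site, will read.

shared_fs = {}  # stands in for the global GPFS namespace

def run_step(site, in_path, out_path):
    data = shared_fs.get(in_path, "")
    shared_fs[out_path] = data + site + ";"  # "compute" appends the site name
    return out_path

sites = ["FZJ", "CINECA", "RZG", "IDRIS", "SARA"]
path = "/gpfs/input"
shared_fs[path] = ""
for i, site in enumerate(sites, 1):
    path = run_step(site, path, f"/gpfs/step{i}")

print(shared_fs[path])  # FZJ;CINECA;RZG;IDRIS;SARA;
```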
29 Forschungszentrum Jülich
Workflow Application with UNICORE / Global Data Management with GPFS
30 Forschungszentrum Jülich
Usage in other Projects
NaReGI: UNICORE as basic middleware for research and development; development of the UNICONDORE interoperability layer (UNICORE – CONDOR); access to about 3000 CPUs with approx. 17 TFlops peak in the NaReGI testbed
D-Grid Integration Project: UNICORE is used in the Core D-Grid infrastructure; development of tools for (even) easier installation and configuration of client and server components
31 Forschungszentrum Jülich
Summary
UNICORE
establishes seamless access to Grid resources and data
is designed as a vertically integrated Grid middleware
provides matured workflow capabilities
is used in production at NIC and in the DEISA infrastructure
is available as Open Source from http://unicore.sourceforge.net
is used in research projects worldwide
is continuously enhanced by an international expert team of Grid developers
is currently being transformed into the Web Services world towards OGSA and WSRF compliance
32 Forschungszentrum Jülich
October 11–12, 2005, ETSI Headquarters, Sophia Antipolis, France
http://summit.unicore.org/2005
In conjunction with Grids@work: Middleware, Components, Users, Contest and Plugtests
http://www.etsi.org/plugtests/GRID.htm
Supported by