Neil Geddes NeSC, May 2004 Compute Grids, Data Grids and Service Grids Dr Neil Geddes CCLRC Head of e-Science Director of the UK Grid Operations Centre

- What they are
- What they can do
- Where they can be found
- What the future holds in this arena

What are they?

SIAC 2000, Wright State University, August 21, 2000

What is a computational grid?

• A pool of computational resources that can be “plugged into” via standard interfaces.

• Processors
• Data storage devices
• Instruments

Compute Grids

• Focus on high throughput computing
  – Clusters of computers
• Some very big
  – Clusters of clusters
  – HPC meta-computing
  – HPC + pre + post processing
• Grids enable coordination across administrative boundaries
• Key components:
  – Authentication, Authorisation
  – Resource discovery
  – Job submission/retrieval
  – Networking
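The key components listed above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not any real grid middleware: the `Site` class, the `discover` and `submit` functions, and all site, user and job names are hypothetical. Authentication is reduced to a user name, authorisation to a per-site allow-list, and networking is omitted entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """One administrative domain offering CPUs (hypothetical model)."""
    name: str
    free_cpus: int
    allowed_users: set                      # authorisation: who may run here
    jobs: list = field(default_factory=list)


def discover(sites, user, cpus_needed):
    """Resource discovery: sites that authorise the user and have capacity."""
    return [s for s in sites
            if user in s.allowed_users and s.free_cpus >= cpus_needed]


def submit(sites, user, job, cpus_needed):
    """Job submission: place the job on the candidate site with most free CPUs.

    Returns the chosen site name, or None if no site can take the job.
    """
    candidates = discover(sites, user, cpus_needed)
    if not candidates:
        return None
    best = max(candidates, key=lambda s: s.free_cpus)
    best.free_cpus -= cpus_needed
    best.jobs.append(job)
    return best.name


# Three sites under different administrations (illustrative values only).
sites = [
    Site("site-a", free_cpus=16, allowed_users={"alice"}),
    Site("site-b", free_cpus=64, allowed_users={"alice", "bob"}),
    Site("site-c", free_cpus=128, allowed_users={"bob"}),
]

chosen = submit(sites, "alice", "job-001", cpus_needed=32)
print(chosen)
```

The point of the sketch is that the user names a job and a requirement, not a machine: the choice of site follows from authorisation and discovery across administrative boundaries.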

NASA Information Power Grid

Data Grids

• Focus on
  – Large data volumes
  – Coordinated data access
• Heterogeneous and distributed data
  – Importance of metadata
• e.g.
  – Virtual Observatories
  – Medical images
• Important components
  – Authentication, Authorisation
  – Resource discovery
  – Data transfer
  – Confidentiality
  – Networking
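Why metadata matters for heterogeneous, distributed data can be shown with a small example. The following is a sketch only: the two archives, their field names, the file locations and the `normalise` and `find` helpers are all hypothetical, and a real data grid would query remote catalogues rather than in-memory lists.

```python
# Hypothetical archives: the same kind of observation, different local field names.
XRAY_ARCHIVE = [
    {"obj": "M31", "band": "x-ray", "url": "archive-a/m31.fits"},
]
RADIO_ARCHIVE = [
    {"target_name": "M31", "waveband": "radio", "location": "archive-b/m31.dat"},
    {"target_name": "M87", "waveband": "radio", "location": "archive-b/m87.dat"},
]

# Common metadata schema -> local field name, one mapping per archive.
CATALOGUES = [
    (XRAY_ARCHIVE, {"target": "obj", "band": "band", "location": "url"}),
    (RADIO_ARCHIVE, {"target": "target_name", "band": "waveband",
                     "location": "location"}),
]


def normalise(record, field_map):
    """Translate one archive-specific record into the common schema."""
    return {common: record[local] for common, local in field_map.items()}


def find(target):
    """Return every record for `target`, across all archives, in common form."""
    hits = []
    for records, field_map in CATALOGUES:
        for record in records:
            common = normalise(record, field_map)
            if common["target"] == target:
                hits.append(common)
    return hits


m31 = find("M31")
for hit in m31:
    print(hit["band"], hit["location"])
```

With a shared schema, one query returns the X-ray and radio holdings for the same object even though the archives describe them differently.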

X-ray, optical, infra-red, radio

Service Grids

• Focus on
  – Everything else:
  – What you want to do rather than how it is done
  – Integrate audio visual tools
  – Remote control and tele-presence
    • Microscopes, Beamlines, test equipment
• Integrated with compute and data grid
• Integrate with other services
  – Journal archives, website management
• Service based architectures
  – Web services
• Important components
  – Authentication, Authorisation
  – Resource discovery
  – Data transfer
  – Confidentiality
  – Common Interfaces

Common Grid Features

– Authentication
– Authorisation
– Accounting
– Resource discovery
– Data transfer
– Confidentiality
– Security
– Automation

Different emphasis for different deployments/problems

Grid computing is about common standards/interfaces to enable inter-enterprise, collaborative computing.

What can they do?

Where can they be found?

(some) US Grid Projects:

• Information Power Grid (IPG): Production Grid for aerosciences and other NASA missions.
• Network for Earthquake Engineering Simulation Grid (NEESgrid): Production Grid for earthquake engineering.
• National Virtual Observatory (NVO): Production Grids for data analysis in astronomy.
• Particle Physics Data Grid (PPDG): Production Grids for data analysis in high energy and nuclear physics.
• Southern California Earthquake Center 2: Full geophysics modeling using Grids and knowledge-based systems.
• TeraGrid: U.S. science infrastructure linking four major resource sites at 40 Gb/s.
• DOE Science Grid (DOESG): Supplies persistent Grid services.
• EdGrid: Promotes applications of modeling and visualization in science and mathematics education, and remote control of instruments (electron microscope) for K-12.
• Biomedical Informatics Research Network (BIRN): An NCRR initiative aimed at creating a testbed to address biomedical researchers' need to access and analyze data at a variety of levels of aggregation located at diverse sites throughout the country.

UK e-Science Projects

• CLEF: A Co-operative Clinical e-Science Framework
• BiosimGRID: A GRID Database for biomolecular simulations
• e-HPTX: An e-Science resource for High Throughput Protein Crystallography
• AstroGrid: A Virtual Observatory for the UK
• BAIR: Biological Atlas of Insulin Resistance
• climateprediction.net: Distributed computing for a global climate (NERC Pilot)
• DAME: Distributed Aircraft Maintenance Environment

• e-Protein: A distributed pipeline for structural-based proteome annotation using GRID technology
• e-Minerals: Environment from the molecular level: an e-Science proposal for modelling the atomistic processes involved in environmental issues
• Integrative Biology: A robust and fault tolerant Grid infrastructure for biomedical science
• GENIE: Grid Enabled Integrated Earth system model
• GEODISE: Grid Enabled Optimisation & Design Search for Engineering
• myGrid: Directly Supporting the E-Scientist
• Comb-e-Chem: Structure-Property Mapping: Combinatorial Chemistry & the Grid
• NERC DataGrid: Data discovery and delivery for the NERC community
• GridPP: The Grid for UK Particle Physics

e-science and the UK GRID

Figure labels: LHCb, ATLAS, CMS

LHC Computing Grid Project

climateprediction.net

• Launch ensemble of coupled simulations of 1950-2000 and compare with observations.
• Largest climate model ensemble ever (by factor of >200)
• >45,000 users, >15,000 complete model runs, >1,000,000 model years in ~3 months (this is equivalent to 1.5 Earth Simulators)
• “Screensaver” requires
  – 10 CPU days on a 1.4GHz P4, >128MB memory, 600MB disk space
• Global outreach (participants on all 7 continents, inc. Antarctica!)
• Generated much interest in schools (coolkidsforacoolclimate.com)

http://www.nbirn.net

What is BIRN?

• Testbed for a biomedical knowledge infrastructure
  – Creation and support of federated bioscience databases
  – Data integration
  – Interoperable analysis tools
  – Datamining software
  – Scalable and extensible
• Driven by research needs pull, not technology push

BIRN Today

• Established three neuroscience testbeds building on previously funded R01 research projects:
  – Mouse BIRN
  – Morph BIRN
  – Functional BIRN
  – BIRN Coordinating Center
• Integrating the activities of the advanced biomedical imaging and clinical research centers in the US.
• Developing hardware and software infrastructure for managing distributed data: creation of data grids.
• Exploring data using “intelligent” query engines that can make inferences upon locating “interesting” data.
• Building bridges across tools and data formats.
• Changing the use pattern for research data from the individual laboratory/project to shared use.

BIRN Network

IT Infrastructure to hasten the derivation of new understanding and treatment of disease through use of distributed knowledge

Through the NEESgrid, researchers will:
• perform tele-observation and tele-operation of experiments;
• publish to and make use of a curated data repository using standardized markup;
• access computational resources and open-source analytical tools;
• access collaborative tools for experiment planning, execution, analysis, and publication.

The components of the NEESgrid system will be completed by September 2004, when management and operation of the NEES system will be turned over to a consortium of earthquake engineering researchers and practitioners.

About: NEESgrid will link earthquake researchers across the U.S. with leading-edge computing resources and research equipment, allowing collaborative teams (including remote participants) to plan, perform, and publish their experiments.

[Diagram: Generic Experiment in Progress (an instance or “test”). Components shown: Web Portal, User Interface, Simulation Programs, NEES Equipment, LabVIEW DAQ, Local Storage, Local Video Processor (Internet Appliance), camera, Video Server, Streaming Data Server, NEESPOP, Grid Services, Collaboration Services, Tele-Presence and Video Servers. Participants: Remote Collaborator, Equipment Specialist, Remote Investigator. Links: operation and control lines, data and video, (live) video streams, and collaboration via e-mail, VTC and web pages.]

What does the future hold?

The Global Grid Forum (GGF) is a community-initiated forum of thousands of individuals from industry and research leading the global standardization effort for grid computing. GGF's primary objectives are to promote and support the development, deployment, and implementation of Grid technologies and applications via the creation and documentation of "best practices": technical specifications, user experiences, and implementation guidelines.

The drive toward standardisation

• Horizontal and e-business framework
• Web Services
• Security
• Public Sector
• Vertical industry applications
• WS-RF (from GGF)

OASIS is a not-for-profit, global consortium that drives the development, convergence and adoption of e-business standards

Enabling Grids for E-science in Europe, for Everyone

EGEE - Consortia

10 European Consortia (incl. GEANT/TERENA/DANTE) + US + Russia

UK e-Science: PPARC + Core Programme

Oxford and Leeds (White Rose Grid)

Manchester and CCLRC-RAL

Also includes:

• http://www.csar.cfs.ac.uk/ : 256 Itanium2 processor SGI Altix; 512 processor Origin3800
• http://www.hpcx.ac.uk/ : Full installation = 1600 IBM p690+ Regatta processors, currently 1236 processors

• Thus, the NGS provides access to over 2000 processors, over 36TB of "data-grid" capacity, common scientific applications and extensive data archives.
• Other resource providers anticipated to join in the future …

EMBL Nucleotide Sequences

NCBI, BLAST, EMBOSS, FASTA, Gaussian

More than just computation and data resources…

In future will include services to facilitate collaborative (grid) computing:
• Authentication (PKI X.509)
• Job submission/batch service
• Authorisation
• Certificate management
• Virtual Organisation management
• Data access/integration services (SRB/OGSA-DAI/DQPS)
• Information service
• National Registry (of registries)
• Data replication
• Data caching
• Grid monitoring
• Accounting

Concluding Remarks

• Huge worldwide research activity
• Push towards standardisation and intersection with e-Business (web services)
• Increasing grid infrastructure deployed

‘[The Grid] intends to make access to computing power, scientific data repositories and experimental facilities as easy as the Web makes access to information.’

Tony Blair, 2002

The End

Response of Atlantic circulation to freshwater forcing

The Particle Physics Challenge

Storage – Raw recording rate 0.1 – 1 GByte/sec

Accumulating at ~10 PetaBytes/year

10 PetaBytes of disk

Processing – >100,000 of today’s fastest PCs
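The storage figures above can be related by a rough calculation. This is only a sketch: the figure of about 10^7 seconds of data taking per year is an assumption commonly used for accelerator running time and is not stated on the slide, and decimal units (1 PB = 10^6 GB) are assumed. The function name `petabytes_per_year` is hypothetical.

```python
# Assumption (not from the slide): roughly 1e7 seconds of data taking per year.
LIVE_SECONDS_PER_YEAR = 1e7
GB_PER_PB = 1e6  # decimal units assumed


def petabytes_per_year(rate_gb_per_s, live_seconds=LIVE_SECONDS_PER_YEAR):
    """Yearly raw data volume in PB for a given recording rate in GB/s."""
    return rate_gb_per_s * live_seconds / GB_PER_PB


low = petabytes_per_year(0.1)   # lower end of the quoted recording rate
high = petabytes_per_year(1.0)  # upper end of the quoted recording rate
print(f"{low:.0f} to {high:.0f} PB/year")
```

Under that assumption, the upper end of the quoted recording rate corresponds to the ~10 PetaBytes/year accumulation quoted above.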

CMS

LHCb

ATLAS

CERN/LHC Community

Europe: 267 institutes, 4603 users
Elsewhere: 208 institutes, 1632 users