
Page 1: Overview of the GLUE Project

Overview of the GLUE Project (Grid Laboratory Unified Environment)

Author: Piotr Nowakowski, M.Sc., Cyfronet, Kraków

AT, Cyfronet, June 7, 2002

[Diagram: relationships between HICB, HIJTB, iVDGL, DataTAG and GLUE]

Page 2: Presentation Summary


Presentation Summary

• Goals of GLUE
• Key GLUE contributors
• GLUE schema
• GLUE activities
• Unresolved issues

Page 3: Goals of GLUE

Goals of GLUE

• Promote coordination between European and US Grid projects
• Define, construct, test and deliver interoperable middleware to all Grid projects
• Experiment with intercontinental Grid deployment and operational issues
• Establish procedures and policies regarding interoperability

Once the GLUE collaboration establishes the necessary minimum requirements for interoperability of middleware, any future software designed by the projects under the umbrella of the HICB and JTB must maintain the achieved interoperability.

Page 4: GLUE Organizationally

GLUE Organizationally

• Management by iVDGL and DataTAG
• Guidance and oversight by the High Energy Physics InterGrid Coordination Board (HICB) and Joint Technical Board (JTB)
• Participating organizations (19 entities in all):
  • Grid projects (EDG, GriPhyN, CrossGrid etc.)
  • LHC experiments (ATLAS, CMS etc.)

Page 5: HENP Collaboration

HENP Collaboration

The HENP (High-Energy Nuclear Physics) Grid R&D projects (initially DataGrid, GriPhyN, and PPDG, as well as the national European Grid projects in the UK, Italy, the Netherlands and France) have agreed to coordinate their efforts to design, develop and deploy a consistent, open-source, standards-based global Grid infrastructure.

To that effect, their common efforts are organized in three major areas:

• A HENP InterGrid Coordination Board (HICB) for high-level coordination
• A Joint Technical Board (JTB)
• Common Projects and Task Forces to address needs in specific technical areas

Page 6: The DataTAG Project

The DataTAG Project

Aim: Creation of an intercontinental Grid testbed using DataGrid (EDG) and GriPhyN components.

Work packages:

WP1: Establishment of an intercontinental testbed infrastructure
WP2: High performance networking
WP3: Bulk data transfer validations and performance monitoring
WP4: Interoperability between Grid domains
WP5: Information dissemination/exploitation
WP6: Project management

Page 7: DataTAG WP4

DataTAG WP4

Aims:
• To produce an assessment of interoperability solutions
• To provide a test environment for LHC applications to extend existing use cases to test interoperability of Grid components
• To provide input to a common Grid LHC architecture
• To plan EU-US integrated Grid deployment

WP4 Tasks:

T4.1: Develop an intergrid resource discovery schema
T4.2: Develop intergrid Authentication, Authorization and Accounting (AAA) mechanisms
T4.3: Plan and deploy an "intergrid VO" in collaboration with iVDGL

Page 8: DataTAG WP4 Framework and Relationships

DataTAG WP4: Framework and Relationships

Page 9: The iVDGL Project

The iVDGL Project (International Virtual Data Grid Laboratory)

Aim: To provide high-performance global computing infrastructure for keynote experiments in physics and astronomy (ATLAS, LIGO, SDSS etc.)

iVDGL activities:

• Establishing supercomputing sites throughout the U.S. and Europe; linking them with a multi-gigabit transatlantic link
• Establishing a Grid Operations Center (GOC) in Indiana
• Maintaining close cooperation with partnership projects in the EU and the GriPhyN project

Page 10: U.S. iVDGL Network

U.S. iVDGL Network

Selected participants:

• Fermilab
• Brookhaven National Laboratory
• Argonne National Laboratory
• Stanford Linear Accelerator Center (SLAC)
• University of Florida
• University of Chicago
• California Institute of Technology
• Boston University
• University of Wisconsin
• Indiana University
• Johns Hopkins University
• Northwestern University
• University of Texas
• Pennsylvania State University
• Hampton University
• Salish Kootenai College

Page 11: iVDGL Organization Plan

iVDGL Organization Plan

Note: The GLUE effort is coordinated by the Interoperability Team (aka GLUE Team)

• Project Steering Group – advises iVDGL directors on important project decisions and issues.
• Project Coordination Group – provides a forum for short-term planning and tracking of project activities and schedules. The PCG includes representatives of related Grid projects, particularly EDT/EDG.
• Facilities Team – identification of testbed sites, hardware procurement
• Core Software Team – definition of software suites and toolkits (Globus, VDT, operating systems etc.)
• Operations Team – performance monitoring, networking, coordination, security etc.
• Applications Team – planning the deployment of applications and the related requirements
• Outreach Team – website maintenance, planning conferences, publishing research materials etc.

Page 12: The GriPhyN Project

The GriPhyN Project

Aims:
• To provide the necessary IT solutions for petabyte-scale data-intensive science by advancing the Virtual Data concept
• To create Petascale Virtual Data Grids (PVDG) to meet the computational needs of thousands of scientists spread across the globe

GriPhyN applications:

• The CMS and ATLAS LHC experiments at CERN
• LIGO (Laser Interferometer Gravitational Wave Observatory)
• SDSS (Sloan Digital Sky Survey)

Timescale: 5 years (2000-2005)

Page 13: The Virtual Data Concept

The Virtual Data Concept

Virtual data: the definition and delivery to a large community of a (potentially unlimited) virtual space of data products derived from experimental data. In virtual data space, requests can be satisfied via direct access and/or computation, with local and global resource management, policy, and security constraints determining the strategy used.

GriPhyN IT targets:

• Virtual Data technologies: new methods of cataloging, characterizing, validating, and archiving software components to implement virtual data manipulations
• Policy-driven request planning and scheduling of networked data and computational resources: mechanisms for representing and enforcing both local and global policy constraints, and new policy-aware resource discovery techniques
• Management of transactions and task execution across national-scale and worldwide virtual organizations: new mechanisms to meet user requirements for performance, reliability, and cost
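To make the "direct access and/or computation" idea concrete, here is a minimal, illustrative Python sketch (not GriPhyN code; all names are invented): a toy catalog that returns a data product directly when it is already materialized and otherwise derives it from a registered recipe, caching the result.

    from typing import Callable, Dict


    class VirtualDataCatalog:
        """Toy catalog: products are either already materialized or derivable."""

        def __init__(self) -> None:
            self._materialized: Dict[str, bytes] = {}           # products that already exist
            self._recipes: Dict[str, Callable[[], bytes]] = {}  # how to derive missing products

        def register_recipe(self, name: str, derive: Callable[[], bytes]) -> None:
            # Record how a virtual product can be computed from other data.
            self._recipes[name] = derive

        def request(self, name: str) -> bytes:
            # Satisfy a request by direct access if possible, otherwise by computation.
            if name in self._materialized:
                return self._materialized[name]   # direct access
            product = self._recipes[name]()       # derive on demand
            self._materialized[name] = product    # cache for future direct access
            return product


    catalog = VirtualDataCatalog()
    catalog.register_recipe("histogram.dat", lambda: b"derived from raw events")
    print(catalog.request("histogram.dat"))  # first request computes, later ones reuse the result

In a real Petascale Virtual Data Grid the "recipe" would be a Grid job scheduled subject to the policy and resource constraints listed above, not a local function call.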

Page 14: Sample VDG Architecture

Sample VDG Architecture

Page 15: Petascale Virtual Data Grids

Petascale Virtual Data Grids

Petascale – both computationally intensive (petaflops) and data intensive (petabytes).
Virtual – containing little ready-to-use information, instead focusing on methods of deriving this information from other data.

The Tier Concept

Developed for use by the most ambitious LHC experiments: ATLAS and CMS.

• Tier 0: CERN HQ
• Tier 1: National center
• Tier 2: Regional center
• Tier 3: HPC center
• Tier 4: Desktop PC cluster

Page 16: The DataGrid (EDG) Project

The DataGrid (EDG) Project

Aim: To enable next-generation scientific exploration which requires intensive computation and analysis of shared large-scale databases, from hundreds of terabytes to petabytes, across widely distributed scientific communities.

DataGrid Work Packages:

WP1: Workload Management
WP2: Data Management
WP3: Monitoring Services
WP4: Fabric Management
WP5: Storage Management
WP6: Integration (testbeds)
WP7: Network
WP8: Application – Particle Physics
WP9: Application – Biomedical Imaging
WP10: Application – Satellite Surveys
WP11: Dissemination
WP12: Project Management

Page 17: GLUE Working Model

GLUE Working Model

The following actions take place once an interoperability issue is encountered:

• The DataTAG/iVDGL managers define a plan and sub-tasks to address the relevant issue. This plan includes integrated tests and demonstrations which define overall success.
• The DataTAG/iVDGL sub-task managers assemble all the input required to address the issue at hand. The HIJTB and other relevant experts would be strongly involved.
• The DataTAG/iVDGL sub-task managers organize getting the work done using the identified solutions.
• At appropriate points the work is presented to the HICB, which discusses it on a technical level. Iterations take place.
• At appropriate points the evolving solutions are presented to the HICB.
• At an appropriate point the final solution is presented to the HICB with a recommendation that it be accepted by Grid projects.

Page 18: GLUE Working Model - Example

GLUE Working Model - example

Issue: DataGrid and iVDGL use different data models for publishing resource information. Therefore resource brokers (RBs) cannot work across domains.

• The HIJTB recognizes this and proposes it as an early topic to address. The DataTAG/iVDGL management is advised to discuss this early on.
• DataTAG management has already identified this as a sub-task.
• DataTAG/iVDGL employees are assigned to the problem.
• Many possible solutions exist, from consolidation to translation on various levels (the information services level or even the RB level). The managers discuss the problem with clients in order to ascertain the optimal solution.
• The group involved organizes its own meetings (independently of the monthly HIJTB meetings). [This is taking place now.]
• A common resource model is proposed. Once it has been demonstrated to work within a limited test environment, the HIJTB/HICB will discuss if and when to deploy it generally, taking into account the ensuing modifications which will be needed to other components such as the resource broker.
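As a purely illustrative sketch of what "translation at the information services level" could look like (the attribute names below are invented, not the actual EDG or iVDGL schema attributes), a thin adapter can republish resource records under the naming convention a foreign resource broker expects:

    # Hypothetical mapping from "domain A" attribute names to the names
    # a "domain B" resource broker understands.
    ATTRIBUTE_MAP = {
        "FreeCPUs": "AvailableSlots",
        "TotalCPUs": "TotalSlots",
        "QueueName": "Queue",
    }


    def translate_record(record_a: dict) -> dict:
        """Rename the attributes of one resource record; drop anything unmapped."""
        return {ATTRIBUTE_MAP[k]: v for k, v in record_a.items() if k in ATTRIBUTE_MAP}


    published_by_domain_a = {"FreeCPUs": 12, "TotalCPUs": 64, "QueueName": "short"}
    print(translate_record(published_by_domain_a))
    # {'AvailableSlots': 12, 'TotalSlots': 64, 'Queue': 'short'}

The common resource model eventually proposed by the group removes the need for such per-attribute translation altogether.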

Page 19: GLUE Schemas

GLUE Schemas

GLUE schemas: descriptions of objects and attributes needed to describe Grid resources and their mutual relations.

GLUE schemas include:

• Computing Element (CE) schema – in development
• Storage Element (SE) schema – TBD
• Network Element (NE) schema – TBD

The development of schemas is coordinated by JTB with collaboration from Globus, PPDG and EDG WP managers.

Page 20: CE Schema

CE Schema (version 4 – 24/05/2002)

• Computing Element: an entry point into a queuing system. Each queue points to one or more clusters.
• Cluster: a group of subclusters or individual nodes. A cluster may be referenced by more than one computing element.
• Subcluster: a homogeneous group of individual computing nodes (all nodes must be represented by a predefined set of attributes).
• Host: a physical computing element. No host may be part of more than one subcluster.
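The hierarchy described above can be pictured as nested records. A minimal Python sketch follows; the class and field names are illustrative only, not the normative GLUE attribute names.

    from dataclasses import dataclass, field
    from typing import List


    @dataclass
    class Host:
        """A physical computing node; belongs to exactly one subcluster."""
        name: str
        cpu_model: str
        memory_mb: int


    @dataclass
    class SubCluster:
        """A homogeneous group of hosts sharing a predefined set of attributes."""
        name: str
        hosts: List[Host] = field(default_factory=list)


    @dataclass
    class Cluster:
        """A group of subclusters; may be referenced by more than one CE."""
        name: str
        subclusters: List[SubCluster] = field(default_factory=list)


    @dataclass
    class ComputingElement:
        """An entry point into a queuing system; each queue points to clusters."""
        queue_name: str
        clusters: List[Cluster] = field(default_factory=list)


    node = Host(name="wn001.example.org", cpu_model="PIII 1GHz", memory_mb=512)
    sub = SubCluster(name="sub0", hosts=[node])
    cluster = Cluster(name="cluster0", subclusters=[sub])
    ce = ComputingElement(queue_name="short", clusters=[cluster])
    print(ce.queue_name, len(ce.clusters[0].subclusters[0].hosts))

Note that, as in the bullet list above, a cluster instance could be shared by several computing elements, while a host belongs to exactly one subcluster.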

Page 21: GLUE Schema Representation

GLUE Schema Representation

In existing MDS models, GLUE schemas and their hierarchies can be represented through DITs (Directory Information Trees). Globus MDS v2.2 will be updated to handle the new schema.

In future OGSA-based implementations (Globus v3.0) the structure can be converted to an XML document.
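For illustration only (the DN components and element names below are invented, not the normative MDS or OGSA ones), the same toy hierarchy can be laid out either as an LDAP-style directory tree or as an XML document:

    import xml.etree.ElementTree as ET

    # LDAP-style DIT: each entry's distinguished name (DN) encodes its position
    # in the hierarchy, so the whole tree can be reconstructed from the DNs alone.
    dns = [
        "ce=short,o=grid",
        "cluster=cluster0,ce=short,o=grid",
        "subcluster=sub0,cluster=cluster0,ce=short,o=grid",
    ]
    for dn in dns:
        print(dn)

    # XML rendering of the same hierarchy, nesting elements instead of chaining DNs.
    ce = ET.Element("ComputingElement", {"queue": "short"})
    cluster = ET.SubElement(ce, "Cluster", {"name": "cluster0"})
    ET.SubElement(cluster, "SubCluster", {"name": "sub0"})
    print(ET.tostring(ce, encoding="unicode"))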

Page 22: GLUE Stage I

GLUE Stage I

Aims: Integration of the US (iVDGL) and European (EDG) testbeds; development of a permanent set of reference tests for new releases and services.

Phase I
• Cross-organizational authentication
• Unified service discovery and information infrastructure
• Test of Phase I infrastructure

Phase II
• Data movement infrastructure
• Test of Phase II infrastructure

Phase III
• Community authorization services
• Test of the complete service

Status: in progress

Page 23: Grid Middleware and Testbed

Grid Middleware and Testbed

The following middleware will be tested in Stage I of GLUE:

• EDG Work Packages: WP1 (Workload Management), WP2 (Data Management), WP3 (Information and Monitoring Services), WP5 (Storage Management)
• GriPhyN middleware – Globus 2.0, Condor v6.3.1, VDT 1.0

The GLUE testbed will consist of:

• Computational resources: several CEs from DataTAG and iVDGL respectively.
• Storage: access to mass storage systems at CERN and US Tier 1 sites.
• Network: standard production networks should be sufficient.

Page 24: GLUE Stage I Schedule

GLUE Stage I Schedule

Feb 2002: Test interoperating certificates between US and EU – done
May 2002: Review of common resource discovery schema – in progress
Jun 2002:
  • Full testbed proposal available for review
  • Review of common storage schema
  • First version of common use cases (EDG WP8)
  • Refinement of testbed proposals through HICB feedback
Jul 2002: Intercontinental resource discovery infrastructure in test mode for production deployment in September
Sep 2002:
  • Interoperating community and VO authorization available
  • Implementation of common use cases by the experiments
Nov 2002: Demonstrations planned
Dec 2002: Sites integrated into the Grid, executing all goals of Stage I

Page 25: Unresolved Issues

Unresolved Issues

• Ownership of GLUE schemas
• Maintenance of GLUE schemas
• Ownership (and maintenance) of MDS information providers

Page 26: Web Addresses

Web Addresses

• GLUE Homepage at HICB: http://www.hicb.org/glue/glue.html
• GLUE-Schema site: http://www.hicb.org/glue/glue-schema/schema.htm
• HENP Collaboration page: http://www.hicb.org
• The DataTAG Project: http://www.datatag.org
• The iVDGL Project: http://www.ivdgl.org
• The GriPhyN Project: http://www.griphyn.org
• European DataGrid: http://www.eu-datagrid.org