1 P-GRADE Portal: Towards a User-friendly Grid Environment Gergely Sipos MTA SZTAKI, Hungary...


1

P-GRADE Portal: Towards a User-friendly Grid Environment

Gergely Sipos

MTA SZTAKI, Hungary
sipos@sztaki.hu

www.lpds.sztaki.hu/pgportal
pgportal@lpds.sztaki.hu

2

Technology concerns of Grid systems

• Fast evolution of Grid systems and middleware:
– GT1, GT2, OGSA, GT3 (OGSI), GT4 (WSRF), LCG-2, gLite, …

• Many Grid systems are built on these different technologies:
– EGEE (LCG-2), UK NGS (GT2), Open Science Grid (GT3), etc.

3

Grid systems for HPC – User concerns

• How to cope with the variety of Grid systems?

• How to develop/create new Grid applications?

• How to execute Grid applications?

• How to observe the application execution in the Grid?

• How to tackle performance issues?

• How to execute Grid applications over several Grids in a transparent way?

The P-GRADE Grid Portal gives the answer!

4

Properties of the P-GRADE Portal

• General purpose, workflow-oriented computational Grid portal. Supports the development and execution of workflow-based Grid applications.

• Support for multi-grid workflows

• GridSphere based
– Easy to expand with new portlets (e.g. application-specific portlets)
– Easy to tailor to end-user needs

• Grid middleware services supported by the portal:

Service | LCG-2 specific grids | Globus-specific grids
Job execution | Computing Element | GRAM
File storage | Storage Element | GSIFTP server
Certificate management | MyProxy (both)
Information system | BDII | MDS-2
Brokering | Workload Management System | ---
Job monitoring | Mercury (both)
Workflow & job visualization | PROVE (both)

5

What is a P-GRADE Portal workflow?

• a directed acyclic graph where
– Nodes represent jobs (executable batch programs)
– Ports represent input/output files the jobs expect/produce
– Arcs represent file transfer between the jobs

• semantics of the workflow:
– A job can be executed if all of its input files are available
  • local input files: on the portal server
  • remote input files: on storage elements
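The execution rule above (a job may start once every input file is available) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the portal's implementation: the `Job` class, the `ready_jobs` function and the file names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """A workflow node. Hypothetical model, not a portal API."""
    name: str
    inputs: set = field(default_factory=set)    # files expected on input ports
    outputs: set = field(default_factory=set)   # files produced on output ports


def ready_jobs(jobs, available_files, finished):
    """Return the names of unfinished jobs whose input files are all available."""
    return [job.name for job in jobs
            if job.name not in finished and job.inputs <= available_files]


# Two jobs connected by one arc: "prepare" produces the file "simulate" needs.
jobs = [
    Job("prepare", inputs={"raw.dat"}, outputs={"grid.dat"}),
    Job("simulate", inputs={"grid.dat"}, outputs={"result.dat"}),
]

first = ready_jobs(jobs, {"raw.dat"}, finished=set())
second = ready_jobs(jobs, {"raw.dat", "grid.dat"}, finished={"prepare"})
print(first, second)
```

Only `prepare` is runnable at first; `simulate` becomes runnable once `grid.dat` has been produced and transferred.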

6

Two levels of parallelism by a workflow

• The P-GRADE Portal workflow concept enables the efficient parallelization of complex problems

• Semantics of the workflow enables two levels of parallelism:
– Parallel execution inside a workflow node: the job can be a parallel program
– Parallel execution among workflow nodes: multiple jobs can run in parallel
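Parallelism among workflow nodes can be illustrated by grouping jobs into "waves" whose members do not depend on each other. The sketch below uses Python's standard `graphlib` module (Python 3.9+); the job names and the `execution_waves` helper are hypothetical, and the portal's own scheduler may work differently.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def execution_waves(deps):
    """Group jobs into waves; jobs within one wave can run in parallel.

    deps maps each job name to the set of jobs whose output files it consumes.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()                  # raises graphlib.CycleError if not a DAG
    waves = []
    while sorter.is_active():
        ready = sorter.get_ready()    # jobs whose predecessors are all done
        waves.append(sorted(ready))
        sorter.done(*ready)
    return waves


# Hypothetical diamond-shaped workflow: two simulations between split and merge.
deps = {
    "split": set(),
    "sim_a": {"split"},
    "sim_b": {"split"},
    "merge": {"sim_a", "sim_b"},
}
waves = execution_waves(deps)
print(waves)
```

Here `sim_a` and `sim_b` land in the same wave, so they may run at the same time, possibly on different resources. Each of them may additionally be a parallel program, which is the second level of parallelism.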

7

Ultra-short range weather forecast (Hungarian Meteorology Service)

Forecasting dangerous weather situations (storms, fog, etc.) is a crucial task in the protection of life and property

Processed information: surface level measurements, high-altitude measurements, radar, satellite, lightning, results of previously computed models

Requirements:
• Execution time < 10 min
• High resolution (1 km)

[Figure: forecast workflow graph; job multiplicity labels: 25 x, 10 x, 25 x, 5 x]

8

The problem of current portals

• They are tightly connected and tailored to only one particular Grid (e.g. NGS portal, NorduGrid portal)

• If the user wants to move to another Grid:
– (She has to obtain a certificate for the new Grid)
– She has to register for the new Grid
– She has to get an account for its portal
– She has to learn the new environment
– She has to copy the grid files & modify the application

• P-GRADE Portal release 2.1 and above solve these problems:
– (Obtain a certificate for the new Grid)
– Register for the new Grid
– Map some of the jobs of your workflow onto resources of this Grid

9

Multi-Grid P-GRADE Portal

The portal can be connected to multiple grids

Different jobs of a workflow can be executed in different grids

[Figure: P-GRADE Portal connected to the EGEE Grid (e.g. VOCE) and the UK NGS; sites shown: London, Rome, Athens]

10

The typical P-GRADE Portal scenario
Part 1 - development phase

Certificate servers

Portal server

Grid services

OPEN EDITOR

OPEN & EDIT or DEVELOP WORKFLOW

SAVE WORKFLOW

DEFINE GRID ENVIRONMENT

11

The typical P-GRADE Portal scenario
Part 2 - execution phase

Certificate servers

Portal server

Grid services

DOWNLOAD PROXY CERTIFICATES

TRANSFER FILES, SUBMIT JOBS

MONITOR JOBS

VISUALIZE JOBS and WORKFLOW PROGRESS

DOWNLOAD RESULTS

12

Developing workflows with the P-GRADE Portal

Main steps

1. Define the Grid environment
2. Define the workflow

13

The typical P-GRADE Portal scenario
Development phase – step 1:

Certificate servers

Portal server

Grid services

DEFINE THE GRID ENVIRONMENT

14

Resource Manager
(settings portlet)

• To define which computational resources my workflows will use

• Two levels:
1. Define grids or VOs (administrator):
   1. Name (e.g. EGrid)
   2. Information system (e.g. egrid-2.egrid.it)
2. Define computational resources for each grid:
   1. Automatically from information system (only from MDS-2)
   2. Centrally by the administrator
   3. Individually by each user

15

Resource Manager
(settings portlet – user view)

List of available grids

To define computational resources for such a grid

16

Resource Manager
(settings portlet – user view)

Every computational resource is identified by a
• host name
• port number (or use default)
• local jobmanager (queue name)

e.g. egrid-3.egrid.it/jobmanager-fork
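As an illustration, a resource identifier of this kind can be split into its three parts as follows. The `host[:port]/jobmanager` form with an explicit port, the `Resource` type, the `parse_resource` helper and the default port value (2119, the usual Globus GRAM gatekeeper port) are assumptions made for this sketch and do not describe the portal's own parsing.

```python
from typing import NamedTuple

DEFAULT_PORT = 2119  # assumption: the usual Globus GRAM gatekeeper port


class Resource(NamedTuple):
    """Hypothetical container for one computational resource."""
    host: str
    port: int
    jobmanager: str


def parse_resource(contact: str) -> Resource:
    """Split 'host[:port]/jobmanager' into host, port and local jobmanager."""
    hostport, _, jobmanager = contact.partition("/")
    if not jobmanager:
        raise ValueError("missing local jobmanager in %r" % contact)
    host, _, port = hostport.partition(":")
    return Resource(host, int(port) if port else DEFAULT_PORT, jobmanager)


res = parse_resource("egrid-3.egrid.it/jobmanager-fork")
print(res)
```

When no port is given, the default is filled in, which mirrors the "or use default" option of the settings portlet.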

17

The typical P-GRADE Portal scenario
Development phase – step 2:

Certificate servers

Portal server

Grid services

OPEN EDITOR

OPEN & EDIT or DEVELOP or IMPORT WORKFLOW

SAVE WORKFLOW

18

Workflow development
opening the workflow editor

The editor is a Java Web Start application

dynamic download and installation!

19

Workflow Editor
defining the graph

• The aim is to define a DAG of batch jobs:

1. Drag & drop components: jobs and ports

2. Define their properties

3. Connect ports by channels (no cycles, no loops, no conditions)
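The "no cycles, no loops" rule can be checked mechanically. The following is a minimal sketch, assuming channels are given as hypothetical `(producer_job, consumer_job)` pairs; it uses a plain depth-first search and is not taken from the Workflow Editor itself.

```python
def find_cycle(channels):
    """Return a list of jobs forming a cycle, or None if the graph is a DAG.

    channels is an iterable of (producer_job, consumer_job) pairs.
    """
    graph = {}
    for src, dst in channels:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, done
    color = {job: WHITE for job in graph}
    path = []

    def visit(job):
        color[job] = GREY
        path.append(job)
        for nxt in graph[job]:
            if color[nxt] == GREY:        # back edge: nxt is already on the path
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        color[job] = BLACK
        path.pop()
        return None

    for job in graph:
        if color[job] == WHITE:
            found = visit(job)
            if found:
                return found
    return None


ok = find_cycle([("A", "B"), ("B", "C")])
bad = find_cycle([("A", "B"), ("B", "C"), ("C", "A")])
print(ok, bad)
```

A channel from a job back to itself (a loop) is reported the same way, as a cycle of length one.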

20

Workflow Editor
defining the jobs

Define the job:
• Executable file
• Executable type
• Number of required processors
• Command line parameters
• The resource to be used for the execution:
  • Grid
  • (Comp. resource)

21

Which resource to use?

The information system portlet helps characterize resources!

I still don’t know which resource to use!

22

Automatic resource selection
Since P-GRADE Portal v2.2

1. Describe the requirements of the job

2. Select an LCG-2 middleware-based Grid (e.g. VOCE) for it

3. The workflow manager will use the broker of that Grid during the execution to find the best resource for the job

23

Workflow Editor
defining jobs in v2.2

Select an LCG-2 based Grid (*_LCG_2_BROKER)!

Ignore the resource field!

Define optional requirements using the built-in JDL editor!

24

Workflow Editor
JDL editor in v2.2

JDL: look at the LCG-2 Users’ manual!
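For readers who have not seen JDL, the sketch below renders a small, generic job description of the kind an LCG-2 broker evaluates. It is an assumption-laden illustration: the `render_jdl` helper, the file names and the chosen requirement are hypothetical, and the text that the portal's JDL editor actually generates may differ. The attribute names (`Executable`, `Requirements`, `Rank`, and the Glue schema attributes) are standard JDL.

```python
def render_jdl(executable, arguments="", requirements=None, rank=None):
    """Render a minimal JDL job description as text (hypothetical helper)."""
    lines = ['Executable = "%s";' % executable]
    if arguments:
        lines.append('Arguments = "%s";' % arguments)
    lines += [
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        'OutputSandbox = {"std.out", "std.err"};',
    ]
    if requirements:
        # Boolean expression the broker matches against each resource.
        lines.append("Requirements = %s;" % requirements)
    if rank:
        # Numeric expression; the matching resource with the highest rank wins.
        lines.append("Rank = %s;" % rank)
    return "\n".join(lines)


jdl = render_jdl(
    "forecast.sh",                                    # hypothetical executable
    requirements="other.GlueCEStateFreeCPUs >= 4",    # at least 4 free CPUs
    rank="-other.GlueCEStateEstimatedResponseTime",   # prefer fast response
)
print(jdl)
```

The `Requirements` expression filters candidate resources and `Rank` orders the remaining ones, which is how the broker arrives at the "best" resource for a job.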

25

Workflow Editor
defining the ports

Type:
• input: the job requires
• output: the job produces

File type:
• local: from/to my desktop
• remote: from/to a storage resource

File: location of the file

Storage type:
• Permanent: final result of the WF
• Volatile: just inter-job data transfer

26

Location of files

Input file, local:
• Client side location: c:\experiments\11-04.dat

Input file, remote:
• Grid Unique IDentifier (GUID): guid:1fd75fdf-dccc-4603-998b-e17facb0d034
• RLS logical file name: lfn:/sipos_11_04.dat
• LFC logical file name (NOT IN VOCE!): lfn:/grid/egrid/sipos/11-04.dat

Output file, local:
• Client side location: result.dat

Output file, remote:
• RLS logical file name: lfn:/sipos_11_04-result.dat
• LFC logical file name (NOT IN VOCE!): lfn:/grid/egrid/sipos/11-04-result.dat
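The location formats above can be told apart by their prefix. The following sketch shows one way to do so; the `classify_location` function and its return labels are hypothetical and only illustrate the distinction between local and remote files.

```python
def classify_location(location: str) -> str:
    """Classify a port's file location by its prefix (illustrative only)."""
    if location.startswith("guid:"):
        return "remote (GUID)"
    if location.startswith("lfn:"):
        return "remote (logical file name)"
    return "local (client side)"


# The three input-file examples from the slide.
kinds = [classify_location(s) for s in (
    r"c:\experiments\11-04.dat",
    "guid:1fd75fdf-dccc-4603-998b-e17facb0d034",
    "lfn:/grid/egrid/sipos/11-04.dat",
)]
print(kinds)
```

Note that the prefix alone does not distinguish RLS from LFC logical file names; that depends on the catalogue used by the selected Grid.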

27

Local vs. remote files

[Figure: portal server, Grid services, computational resources and storage resources, with file transfers labelled as follows]

• LOCAL INPUT FILES & EXECUTABLES
• LOCAL OUTPUT FILES
• REMOTE INPUT FILES
• REMOTE OUTPUT FILES

Only the permanent files!

28

Workflow Editor
saving the workflow

Workflow has been defined!

Let’s execute it!

29

Executing workflows with the P-GRADE Portal

Main steps

1. Download proxies
2. Submit workflow
3. Observe workflow progress
4. If some error occurs, correct the graph
5. Download result

30

The typical P-GRADE Portal scenario
Execution phase – step 1:

Certificate servers

Portal server

Grid services

DOWNLOAD PROXY CERTIFICATES

31

Certificate Manager
certificates portlet

• To access GSI-based Grids the portal server application needs proxy certificates

• “Certificates” portlet:
– to upload X.509 certificates into MyProxy servers
– to download short-term proxy credentials into the portal server application

32

Certificate Manager
downloading a proxy

1. MyProxy server access details:
• Hostname (egrid-1.egrid.it)
• Port number (7512)
• User name (from upload)
• Password (from upload)

2. Proxy parameters:
• Lifetime
• Comment

33

Certificate Manager
associating the proxy with a grid

This operation displays the details of the certificate and the list of available Grids

34

Certificate Manager
browsing proxies

Multiple proxies can be available on the portal server at the same time!

Comp. resources of SEE-GRID

Comp. resources of HUNGRID

35

The typical P-GRADE Portal scenario
Execution phase – step 2:

Certificate servers

Portal server

Grid services

TRANSFER FILES, SUBMIT JOBS

36

Workflow Management
(workflow portlet)

• The portlet presents the status, size and output of the available workflows in the “Workflow” list

• The portlet also contains the “Abort”, “Attach”, “Details”, “Delete” and “Delete all” buttons to handle execution of workflows

• It has a Quota manager to control the users’ storage space on the server

• The “Details” button gives an overview of the jobs of the workflow

• The “Attach” button opens the workflow in the Workflow Editor

37

Workflow Execution
(observation by the workflow portlet)

White/Red/Green color means the job is in the initial/running/finished state

38

Workflow Execution
(observation by the workflow portlet)

White/Red/Green color means the job is initialised/running/finished

39

Workflow Execution

I still don’t know what’s happening inside my workflow!

40

The typical P-GRADE Portal scenario
Execution phase – step 3:

Certificate servers

Portal server

Grid services

MONITOR JOBS

VISUALIZE JOBS and WORKFLOW PROGRESS

41

On-Line Monitoring both at the workflow and job levels
(workflow portlet)

- The portal monitors and displays workflows

42

On-Line Monitoring both at the workflow and job levels
(workflow portlet)

- The portal also monitors and visualizes parallel jobs (if they were developed with the P-GRADE Environment)

- The portal also generates a statistical view

43

Rescuing a failed workflow 1.
(from v2.2)

A job failed during workflow execution

Read the error log to know why

44

Rescuing a failed workflow 2.
(from v2.2)

Map the failed job onto a different resource or download a new proxy for it.

Don’t touch the finished jobs!

The execution can continue from the point of failure!
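The rescue idea (remap the failed job, leave finished jobs alone, continue with whatever has not finished) can be sketched as follows. The state names, the `rescue` helper and the resource host names are hypothetical; this is an illustration of the concept, not the portal's rescue mechanism.

```python
def rescue(status, mapping, failed_job, new_resource):
    """Remap one failed job and list the jobs that still have to run.

    status maps job name -> 'finished', 'failed' or 'init';
    mapping maps job name -> resource. Both inputs are left unmodified.
    """
    if status.get(failed_job) != "failed":
        raise ValueError("%s has not failed" % failed_job)
    new_status = {**status, failed_job: "init"}          # ready to be resubmitted
    new_mapping = {**mapping, failed_job: new_resource}  # finished jobs untouched
    pending = sorted(j for j, s in new_status.items() if s != "finished")
    return new_status, new_mapping, pending


status = {"prepare": "finished", "simulate": "failed", "visualize": "init"}
mapping = {job: "ce-a.example.org" for job in status}   # hypothetical resource

new_status, new_mapping, pending = rescue(
    status, mapping, "simulate", "ce-b.example.org")
print(pending)
```

Only `simulate` and `visualize` remain to be executed; the output of `prepare` is reused, so execution continues from the point of failure.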

45

The typical P-GRADE Portal scenario
Execution phase – step 5

Certificate servers

Portal server

Grid services

DOWNLOAD RESULTS

46

Downloading the results…

47

Grid systems for HPC – User concerns

• How to cope with the variety of Grid systems?

• How to develop/create new Grid applications?

• How to execute Grid applications?

• How to observe the application execution in the Grid?

• How to tackle performance issues?

• How to execute Grid applications over several Grids in a transparent way?

48

References

• Official portal of
– SEE-GRID infrastructure
– VOCE infrastructure
– HUNGRID infrastructure

• P-GRADE portal is available as service for:
– Croatian Grid
– UK National Grid Service
– EGrid (Italy)

49

How to access P-GRADE portal?

• If you are interested in using P-GRADE Portal:
– Take a look at www.lpds.sztaki.hu/pgportal (slideshows, manuals, etc.)
– Get an account for one of its production installations:
  • VOCE portal – SZTAKI
  • SEEGRID portal – SZTAKI
  • HUNGrid portal – SZTAKI
  • NGS portal – University of Westminster
– If you are the administrator of a Globus/LCG-2 based Grid/VO then ask SZTAKI to install the P-GRADE Portal for you!
– If you know the administrator of a P-GRADE Portal you can ask him/her to give access to your Grid through his/her portal installation!

50

What more we can offer

• GEMLCA-specific P-GRADE Portal:
– Share legacy applications as jobs with members of a community
– Portal service for the UK NGS: www.cpc.wmin.ac.uk/ngsportal
– LCG-2 extension will be added by the end of 2005.

• Collaborative P-GRADE Portal:
– Develop workflows together with your colleagues in a real-time fashion!
– Execute the different jobs with different users’ certificates
– Will be available in 2006

51

Final conclusions

• P-GRADE portal provides:
– Easy-to-use workflow concept for solving complex problems
– Fast development of Grid applications
– Integrating various components into large Grid applications:
  • Sequential codes
  • MPI codes
  • (Legacy codes: GEMLCA-specific P-GRADE Portal)
– Application monitoring, performance visualization, guarantee correctness
– Interoperability between different Grid systems can be solved
– Simultaneous execution of application components in different Grids
– Easy to port applications among Grids (switching between Grid technologies will be transparent to the end-user)
  • Learn once, use everywhere
  • Develop once, execute anywhere

52

Thank you!

www.lpds.sztaki.hu/pgportal

pgportal@lpds.sztaki.hu
