Misurare il tempo di esecuzione - STAR-LAB - Homeenrico/IngegneriaDelSoftware/anno08-09/Lezione_5.pdfMisurare il tempo di esecuzione ... A powerful technique for this purpose is the

Measuring execution time

MPI provides primitives that let you delimit a section of code and measure its execution time.

/* Variable declarations */

double starttime, endtime;

...

starttime = MPI_Wtime();

/* Code to be timed */

endtime = MPI_Wtime();

(endtime - starttime) gives the elapsed wall-clock time in SECONDS.

If the code to be measured is inside a loop, you can proceed as follows:

double s1, s2, e1, e2, acc = 0;

s1 = MPI_Wtime();

for (....) {

    s2 = MPI_Wtime();
    /* Code1 */
    e2 = MPI_Wtime();

    acc += (e2 - s2);

    /* Code2 */

}

e1 = MPI_Wtime();

The time spent in Code1 is acc; the time spent in Code2 is e1 - s1 - acc (this second figure also includes the loop and timer-call overhead).
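The loop-timing pattern above can be packaged as a small helper. The following is only a sketch: `wtime()` is a stand-in for `MPI_Wtime()` built on the POSIX monotonic clock so that the example builds without an MPI library, and `time_loop`, `LoopTimes`, `work_a` and `work_b` are hypothetical names that do not come from the slides.

```c
#define _POSIX_C_SOURCE 199309L
#include <time.h>

/* Stand-in for MPI_Wtime() (assumption): wall-clock seconds from a
   monotonic clock. In an MPI program, call MPI_Wtime() instead. */
double wtime(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

typedef struct {
    double code1;  /* accumulated time of the first part of the loop body */
    double code2;  /* the rest: second part plus loop and timer overhead */
} LoopTimes;

/* Two toy workloads (hypothetical) standing in for Code1 and Code2. */
volatile double sink;
void work_a(void) { for (int i = 0; i < 200000; i++) sink += i * 0.5; }
void work_b(void) { for (int i = 0; i < 100000; i++) sink += i * 0.25; }

/* Time two phases of a loop body with one accumulator, as in the slide. */
LoopTimes time_loop(int iterations, void (*code1)(void), void (*code2)(void)) {
    double acc = 0.0;
    double s1 = wtime();
    for (int i = 0; i < iterations; i++) {
        double s2 = wtime();
        code1();
        double e2 = wtime();
        acc += e2 - s2;      /* time of Code1 in this iteration */
        code2();
    }
    double e1 = wtime();
    LoopTimes t = { acc, (e1 - s1) - acc };
    return t;
}
```

Calling `time_loop(5, work_a, work_b)` returns both totals in seconds. Only the inner pair of timer calls is needed to separate the two phases, which keeps the measurement overhead low.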

Automatic tools for analyzing the performance and behavior of MPI programs

MPE: MPI Parallel Environment

Main components:

• A set of routines for creating log files for examination by the graphical visualization tool Jumpshot-4

• A shared-display parallel X graphics library

• Routines for sequentializing a section of code being executed in parallel

• Debugger setup routines

Log file format: SLOG-2 (Scalable LOGfile)

Using specific tools

http://www.mcs.anl.gov/research/projects/perfvis/

Some considerations

A parallel program is intrinsically more complex than its serial counterpart.

To write an efficient and scalable parallel program, one must understand the behavior and performance of the program.

A powerful technique for this purpose is the visualization of trace files (also called log files).

A trace file contains several basic elements. Typically, these are generated during the execution of a program by very short code sequences (so as to minimize the perturbation of the execution caused by the tracing) and are written either to disk (buffered) or to memory as they are generated.

Considerations

Trace files typically contain sequences of events; an event has a timestamp and some data.

A collection of events for a single process, thread, or processor is sometimes called a timeline.

Such postmortem analysis based on trace files has been an important tool for performance analysis of parallel programs.

Many performance visualization tools display a trace file as a Gantt chart, with the x-axis representing time and the y-axis representing process or thread number.
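To make the idea of events, timelines and low-perturbation tracing concrete, here is a minimal sketch of an in-memory trace buffer. It is an illustration under assumptions, not the MPE implementation: the names `TraceEvent`, `TraceBuffer`, `trace_record` and `trace_count_timeline`, the fixed capacity, and the choice of an integer event code as the "data" are all hypothetical.

```c
#include <stddef.h>

#define TRACE_CAPACITY 1024   /* assumed fixed buffer size */

typedef struct {
    double timestamp;  /* when the event happened, in seconds */
    int    timeline;   /* process, thread, or processor id (the y-axis) */
    int    kind;       /* application-defined event code (the "data") */
} TraceEvent;

typedef struct {
    TraceEvent events[TRACE_CAPACITY];
    size_t     count;    /* number of events stored so far */
    size_t     dropped;  /* events discarded because the buffer was full */
} TraceBuffer;

/* Record one event. The body is deliberately tiny (a bounds check and
   three stores) so that tracing perturbs the traced program very little.
   Returns 1 if the event was stored, 0 if it was dropped. */
int trace_record(TraceBuffer *buf, double timestamp, int timeline, int kind) {
    if (buf->count >= TRACE_CAPACITY) {
        buf->dropped++;
        return 0;
    }
    TraceEvent *e = &buf->events[buf->count++];
    e->timestamp = timestamp;
    e->timeline  = timeline;
    e->kind      = kind;
    return 1;
}

/* Count the events that belong to one timeline (one row of the chart). */
size_t trace_count_timeline(const TraceBuffer *buf, int timeline) {
    size_t n = 0;
    for (size_t i = 0; i < buf->count; i++) {
        if (buf->events[i].timeline == timeline) {
            n++;
        }
    }
    return n;
}
```

A real tracing library would flush the buffer to disk when it fills instead of dropping events; the drop counter is used here only to keep the sketch short.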

Specific software

The latest version of MPE2 that contains slog2sdk was released on 08/06/2007.

MPE2 has been shown to work with OpenMPI-1.2.3, MPICH-1.2.7, LAM/MPI-7.1.2b24, MPICH2-1.0.6, and the commercial MPI implementations on IBM's AIX and BG/L, NEC SX-8, Cray X1E, Cray XT4 (catamount) and is also reported to work with HP-MPI 2.0.1 (Linux AMD64).

The new release provides thread-safe MPI logging support and makes Jumpshot friendlier for threaded slog2 files.

slog2sdk version 1.2.6, which includes SLOG-2 and Jumpshot-4, was released on 07/12/2007.

A bit of history

Performance visualization traces its history to the original version of gist, a program that was delivered with, and ran only on, the BBN Butterfly parallel computer.

Gist is still available as part of the TotalView debugging environment, although it is now known as TimeScan.

The original version of the logfile viewer, Upshot, was written to provide the most useful feature of gist, its zoomable and scrollable timeline window, in a color version (gist was black and white only) that was non-proprietary and would run on any workstation supporting X-windows. (Upshot gets its name from gist, for which it is a synonym.)

Over the years Upshot expanded to include a number of new features. The enhancements were motivated by an application whose highly irregular and input-dependent behavior made an analysis tool of this type indispensable.

The logfile generation library developed along with Upshot was called ALOG, for Argonne logging.

Upshot originally was written in X (with the Athena widget set).

Because it was cumbersome to expand and maintain, we completely rewrote it in the Tcl scripting language, using the Tk companion graphics package.

This approach made for extremely rapid development, but the interpreted nature of the language made the graphics component slow when logfiles were large.

To deal with this problem, the main graphics components of Upshot were rewritten in C, using Tcl's C interface.

This rewrite, called Nupshot (for new upshot), solved the performance problem but introduced a dependency on an unstable part of Tcl itself, as the C interface changed right after the publication of this version.

Nupshot can still be used, however, if one obtains an earlier version of Tcl/Tk. Upshot, in pure Tcl/Tk, is also still available, but it too requires an old version of Tcl. This instability of Tcl/Tk encouraged us to find an alternative.

In order to address the portability and maintainability issues faced by Upshot and Nupshot, a Java-based viewer, Jumpshot, was born. Four different iterations of the Java-based viewer exist. Each addresses a different set of requirements at the time of development. For more information, see Jumpshot.

A few years ago, we rewrote the logging library to improve in multiple ways on ALOG. There was also an intermediate library called BLOG.

For historical continuity, therefore, the new package had to be called CLOG (pronounced see-log). When scalability became a central issue in logfile design, SLOG (an abbreviation for Scalable LOGfile) was developed to address the problem.

Logfile-based tools similar to ours have a rich history. Commercial tools include TimeScan and Vampir; academic tools that are distributed for use by others include ParaGraph, TraceView, XPVM, XMPI, and Pablo. Our visualization tool shares some features with these other tools, but it is unique in its particular combination of features.

Performance Visualization for Parallel Program

The software in this project, Performance Visualization for Parallel Program, is distributed with MPE, which is a software package that provides MPI (Message Passing Interface) programmers with a suite of performance analysis tools based on a postmortem approach.

There are two sides to this project, Log Format and Viewers. They are loosely coupled by a common file format to allow flexibility in the development of both sides of the project.

Logfile Format

Over the years, several logfile formats have been developed for postmortem performance visualization.

The following list has the logfile formats arranged in reverse chronological order of development. The format has evolved from an event-based logging mechanism used in ALOG, BLOG, and CLOG to a state-based logging mechanism used in SLOG-1, which has further evolved to the drawable-based logging mechanism employed in SLOG-2.

Logging Mechanism

event-based logging

Basic logging entity has only one timestamp. Every incident during the course of the program is described by one event.

state-based logging

Basic logging entity consists of two timestamps, beginning and ending timestamps. A state is described by at least two events, one for beginning event and one for ending event. Some intermediate events may take place between the beginning and ending timestamps.

drawable-based logging

Basic logging entity describes a complete graphical object. In general, a drawable is described by one or more state-like entities, plus some events if necessary.
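The step from event-based to state-based logging can be sketched as a pairing pass over an event stream. This is an illustration only, under stated assumptions: events arrive sorted by timestamp, states on the same timeline do not nest, and timeline ids are small non-negative integers. The names `Event`, `State`, `events_to_states` and `MAX_TIMELINES` are hypothetical and are not part of ALOG, CLOG or SLOG.

```c
#include <stddef.h>

#define MAX_TIMELINES 64   /* assumed upper bound on timeline ids */

/* Event-based entity: a single timestamp. */
typedef struct {
    double timestamp;
    int    timeline;
    int    is_begin;   /* 1 = beginning event, 0 = ending event */
} Event;

/* State-based entity: beginning and ending timestamps. */
typedef struct {
    double start;
    double end;
    int    timeline;
} State;

/* Pair each ending event with the most recent unmatched beginning event
   on the same timeline. Returns the number of states written to out. */
size_t events_to_states(const Event *ev, size_t n, State *out, size_t max_out) {
    double open_start[MAX_TIMELINES] = {0};
    int    is_open[MAX_TIMELINES]    = {0};
    size_t produced = 0;

    for (size_t i = 0; i < n; i++) {
        int t = ev[i].timeline;
        if (t < 0 || t >= MAX_TIMELINES) {
            continue;                      /* outside the assumed id range */
        }
        if (ev[i].is_begin) {
            open_start[t] = ev[i].timestamp;
            is_open[t] = 1;
        } else if (is_open[t]) {
            if (produced < max_out) {
                out[produced].start    = open_start[t];
                out[produced].end      = ev[i].timestamp;
                out[produced].timeline = t;
                produced++;
            }
            is_open[t] = 0;
        }
    }
    return produced;
}
```

A drawable-based format goes one step further and stores objects that are already in the paired, ready-to-draw form, so the viewer does not need a pass like this one.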

Renormalization

The concept of renormalization group transformation is often used in statistical mechanics and quantum field theory. In physics, renormalization group transformation refers to a coarse-graining transformation followed by a rescaling of some system components or parameters. Since the creation of a SLOG-2 tree can be viewed as a coarse-graining process, it is conceivable that rescaling can be done easily.

SLOG

A scalable logfile format.

SLOG addresses the data-scalability issue of the logfile for visualization.

It is very easy to generate a large trace file for a simple parallel program. A typical graphical viewer like Jumpshot-2, which views CLOG files, requires the whole logfile to be parsed and stored in physical memory.

This requirement renders Jumpshot-2 slow or even useless for even moderately sized logfiles.

SLOG allows the viewer to read only portion(s) of the logfile for visualization. One of the main goals of SLOG is to help users locate interesting portion(s) of the logfile for analysis.

SLOG-2

A drawable-based logging format.

SLOG-2 is a new iteration of the SLOG effort. Based on the experience gained while developing SLOG-1, SLOG-2 addresses the data-scalability issue again through a complete redesign of the logfile format.

The new design employs the concept of Bounding Box, often used in graphic design, to solve the data-scalability problem.

SLOG-2 categorizes the objects in the logfile along the time-axis of the program into a binary tree of the Bounding Boxes. (The hierarchical structure of SLOG-2 is not limited to a binary tree, but a binary tree possesses an ideal structure for optimal visual performance.)

In addition, the tree structure of SLOG-2 also allows the logfile to be renormalized to provide a better level-of-detail support at the logfile level.

The renormalized object is called a preview drawable in the SLOG-2 context. These objects reduce the amount of data processed by the visualization tool at low-resolution display while still providing an overall high-level graphical description of the whole logfile. The enhanced SLOG-2 format allows tight coupling between the logfile format and the zoomable graphical user interface offered by Jumpshot-4.
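The idea of sorting drawables into a binary tree of time intervals can be sketched as follows. This is a simplified illustration and not the actual SLOG-2 file layout or API: the tree is implicit (heap-style indices, root 0, children 2i+1 and 2i+2), each drawable is placed in the deepest node whose time range fully contains it, and the names `bbox_node`, `bbox_range` and `bbox_overlaps` are hypothetical.

```c
/* Index of the deepest tree node whose time range contains [start, end].
   [t0, t1] is the time range of the whole logfile (the root node). */
int bbox_node(double start, double end, double t0, double t1, int max_depth) {
    int index = 0;
    for (int depth = 0; depth < max_depth; depth++) {
        double mid = 0.5 * (t0 + t1);
        if (end <= mid) {            /* fits entirely in the left half */
            index = 2 * index + 1;
            t1 = mid;
        } else if (start >= mid) {   /* fits entirely in the right half */
            index = 2 * index + 2;
            t0 = mid;
        } else {
            break;                   /* straddles the midpoint: stays here */
        }
    }
    return index;
}

/* Time range [*lo, *hi] covered by node `index`. */
void bbox_range(int index, double t0, double t1, double *lo, double *hi) {
    if (index == 0) {
        *lo = t0;
        *hi = t1;
        return;
    }
    double plo, phi;
    bbox_range((index - 1) / 2, t0, t1, &plo, &phi);   /* parent's range */
    double mid = 0.5 * (plo + phi);
    if (index % 2 == 1) { *lo = plo; *hi = mid; }      /* left child */
    else                { *lo = mid; *hi = phi; }      /* right child */
}

/* Does node `index` overlap the viewing window [w0, w1]? A viewer only
   needs to read the nodes for which this returns 1. */
int bbox_overlaps(int index, double t0, double t1, double w0, double w1) {
    double lo, hi;
    bbox_range(index, t0, t1, &lo, &hi);
    return lo < w1 && w0 < hi;
}
```

With a logfile spanning 0 to 8 seconds, a drawable covering 6 to 7 seconds lands in a deep node that a viewer zoomed on the window 0 to 1 never has to read, while a long drawable covering 3 to 5 seconds stays at the root and is always visible.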

Viewers

Over the years, many viewers have been developed for postmortem performance visualization.

Jumpshot

Jumpshot is a Java-based visualization tool for doing postmortem performance analysis. Using Java instead of Tcl/Tk (that was used in some of the older visualization tools) improves the portability, maintainability and functionalities of the tools. There are several iterations of the tool.

Jumpshot-4

http://www.mcs.anl.gov/research/projects/perfvis/software/viewers/jumpshot-4/usersguide.html

Jumpshot-4 is the latest viewer and is a total redesign of the graphical tool for SLOG-2.

The new scalable logfile format allows the viewer to provide functionalities never made possible before.

Level-of-detail support is provided through preview drawables, which give a high-level abstraction of the details without reading huge amounts of data into the graphical display engine.

The new Jumpshot allows seamless scrolling from the beginning to the end of the logfile at any zoom level.

In addition, new functionalities like dragged-zoom, instant zoom in/out, grasp and scroll, easy vertical expansion of timelines, as well as manipulation of timelines, are available in both the Timeline and Histogram modules. The new Legend table provides a central control for both the Timeline and Histogram modules and makes manipulation of drawable categories easy. A new search/scan facility is provided to locate hard-to-find objects in very large logfiles.

Combined with the special preview-state display option and the disabling of non-MPI states in the Legend table, graphical analysis of MPI overhead in user MPI applications becomes easy.

The new viewer also provides an integrated Logfile Convertor for all the known SLOG-2 convertible trace formats, like CLOG, CLOG-2, RLOG, and UTE, and it attempts to conform to the standard Look and Feel that is expected by most users.

Profiling Libraries:

1. MPI profiling interface: accumulation of time spent in MPI routines;

2. Automatic logging: generates a file of timestamped states;

3. Customized logging.

1 + 2: triggered on occurrence of MPI calls;

3: triggered on occurrence of user-defined states/events.

JUMPSHOT-4: viewer of SLOG-2 files

• Big log files (gigabytes);

• Level-of-detail support: scrolling up and down, zooming;

• High-level abstraction: preview drawables summarize real entities;

• Timeline canvas

– Timeline vs. time coordinate system

– Each point in the canvas is identified by two numbers: a timestamp and a timeline ID

Two kinds of drawable objects:

• Primitive: the simplest drawable, the basic element of SLOG-2 files

• Categorized with respect to their topology

• “state”, “arrow” and “event”

• A state and an arrow are each identified by two points in the timeline canvas (hence two timestamp/timeline ID pairs):

• State: start timeline ID = final timeline ID

• Arrow: the final timeline may differ from the initial one

• An event is identified by only one point

• Composite: a collection of primitive or other composite drawables; provides level of detail

• Preview drawable: provides a high-level description of what is going on.
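The three primitive topologies can be written down as a small data structure. This is a sketch under assumptions, not the SLOG-2 representation: the type and function names are hypothetical, and an event is stored here with its single point duplicated in both `start` and `end`.

```c
typedef enum { DRAW_EVENT, DRAW_STATE, DRAW_ARROW } Topology;

/* One point of the timeline canvas: a timestamp and a timeline ID. */
typedef struct {
    double time;
    int    timeline;
} CanvasPoint;

typedef struct {
    Topology    topology;
    CanvasPoint start;
    CanvasPoint end;      /* equal to start for an event (assumption) */
} Primitive;

/* Check the constraints listed above for each topology. */
int primitive_is_valid(const Primitive *p) {
    switch (p->topology) {
    case DRAW_EVENT:
        /* a single point */
        return p->start.time == p->end.time &&
               p->start.timeline == p->end.timeline;
    case DRAW_STATE:
        /* begins and ends on the same timeline */
        return p->start.timeline == p->end.timeline &&
               p->start.time <= p->end.time;
    case DRAW_ARROW:
        /* may connect two different timelines */
        return 1;
    }
    return 0;
}
```

A composite drawable would then hold a list of such primitives (or of other composites), which is left out here to keep the sketch short.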

Examples - Edge detection

Each process is responsible for processing a portion of the input data:

• During the input phase, the data are read in parallel in blocks of rows

• During the computation phase, “neighboring” processes exchange the borders of their contiguous data

– This exchange is repeated at every iteration of a for loop in the code

• During the output phase, the data are collected in parallel into a single file
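The row-block decomposition and the choice of neighbors for the border exchange can be sketched as follows. The slides do not show the actual edge-detection code, so this is only one plausible layout: `RowBlock` and `row_block` are hypothetical names, rows are split as evenly as possible, and a missing neighbor is marked with -1.

```c
typedef struct {
    int first_row;   /* index of the first image row owned by this rank */
    int row_count;   /* number of rows owned by this rank */
    int up;          /* rank owning the rows just above, or -1 if none */
    int down;        /* rank owning the rows just below, or -1 if none */
} RowBlock;

/* Split `rows` image rows among `nprocs` processes in contiguous blocks.
   The first (rows % nprocs) ranks receive one extra row. */
RowBlock row_block(int rank, int nprocs, int rows) {
    int base  = rows / nprocs;
    int extra = rows % nprocs;
    RowBlock b;
    b.row_count = base + (rank < extra ? 1 : 0);
    b.first_row = rank * base + (rank < extra ? rank : extra);
    b.up   = (rank > 0)          ? rank - 1 : -1;
    b.down = (rank < nprocs - 1) ? rank + 1 : -1;
    return b;
}
```

With 10 rows on 4 processes, ranks 0 and 1 own three rows each and ranks 2 and 3 own two. In an MPI program the -1 marker could be replaced by `MPI_PROC_NULL`, so that a send or receive addressed to a missing neighbor does nothing and the first and last ranks need no special case in the exchange loop.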

Edge detection

Example with 4 processes:

[Figure: f( ) shown once for each of the 4 processes]

Jumpshot-4

Communications

• Preview arrow: represents the communications between two preview states.

• Message: shows the information for a single communication

Histogram for edge detection

How to obtain these data

• First of all, compile the code as follows:

mpicc -c filename.c

mpicc -o filename -mpilog filename.o [other .o files]

• After the run you will find a file named filename.clog

• Launch the tool:

jumpshot edge.clog

• If you are asked to convert the file, do so
