P-GRADE: a graphical environment to create and execute workflows in various Grids[1]

G. Dózsa, P. Kacsuk, R. Lovas, N. Podhorszki, D. Drótos

MTA SZTAKI Laboratory of Parallel and Distributed Systems,
H-1518 Budapest P.O.Box 63, Hungary

{rlovas | dozsa | kacsuk | pnorbert | drotos}@sztaki.hu

Abstract. In this paper we present P-GRADE, a graphical Grid application development environment that supports the design, execution, monitoring, and performance visualisation of workflow grid applications. The described workflow concept can provide interoperability among different types of legacy applications on heterogeneous computational platforms, such as Condor or Globus based grids. The major design and implementation issues concerning the integration of the Condor/Condor-G/DAGMan tools, the Mercury/GRM grid monitoring infrastructure, the PROVE performance visualisation tool, and the new high-level workflow editor and manager of the P-GRADE development environment are discussed for both the integrated and the portal version. The integrated version of P-GRADE represents the thick client concept, while the portal version needs only a thin client and can be accessed by a standard web browser. To illustrate the application of our approach in the grid, an ultra-short range weather prediction system is presented that can be executed on a Condor-G/Globus based testbed and whose execution can be visualised not only at the workflow level but also at the level of individual jobs.

1  Introduction

The workflow concept is a widely accepted approach to compose large scale applications by connecting programs into an interoperating set of jobs in the Grid [6][12][17][19][20].

Our main aim was to develop a workflow solution for complex grid applications that supports the design, execution, monitoring, and performance visualisation phases of development in a user-friendly way. Beside efficient monitoring and visualisation facilities in the grid, the presented approach particularly addresses interoperability among different types of legacy applications executed on heterogeneous platforms, such as Condor or Globus based computational grids.

Several achievements of different grid-related projects have been exploited in the presented work to hide the low-level details of heterogeneous software components as well as to provide a unified view for application developers. These targets are crucial for the successful utilisation of grid environments by users from other scientific areas, such as physics, chemistry, or meteorology. The design and implementation issues concerning the integration of the Condor/Condor-G/DAGMan tools [1][2][20], the Mercury/GRM grid monitoring infrastructure [3], the PROVE performance visualisation tool [4], and the new high-level workflow editor and manager layer of the P-GRADE programming environment [10] are discussed in Section 2 and Section 3.

As the main result, a new extension of the P-GRADE graphical programming environment was developed; the integrated workflow support enables the construction, execution, and monitoring of complex applications on both Condor and Globus based grids (see Section 2 and Section 3). The portal version of the workflow layer offers facilities similar to those of the integrated version via a web interface, but the occasionally slow and unreliable network connection must be taken into consideration more rigorously when separating client-side and server-side functionality (see Section 4).

To illustrate the application of our approach in the grid, an ultra-short range weather prediction system is presented that can be executed on a Condor-G/Globus based testbed and whose execution can be visualised not only at the workflow level but also at the level of individual jobs.

2  Component Based Grid Programming by Workflow

The presented workflow connects existing sequential or parallel programs into an interoperating set of jobs. Connections define dependency relations among the components of the workflow with respect to their execution order; such a workflow can naturally be represented as a graph. The graph representation of a meteorological application is depicted in Fig. 1. Nodes (labelled as delta, visib, etc. in Fig. 1) represent jobs of the following four types: sequential, PVM, MPI, or GRAPNEL job (generated by the P-GRADE programming environment).

Small rectangles (labelled by numbers) around nodes represent data files (dark grey ones are input files, light grey ones are output files) of the corresponding job, and directed arcs interconnect pairs of input and output files if an output file serves as input for another job. In other words, arcs denote the necessary file transfers between jobs.

Therefore, the workflow describes both the control flow and the data flow of the application. A job can be started when all the necessary input files are available and have been transferred by GridFTP to the site where the job is allocated for execution. Managing these file transfers and recognising when the necessary files become available are the tasks of our workflow manager, which extends the capabilities of Condor DAGMan.

For illustration purposes we use a meteorological application [5] called MEANDER, developed by the Hungarian Meteorological Service. The main aim of MEANDER is to analyse and predict in the ultra short-range (up to 6 hours) those weather phenomena which might be dangerous for life and property. Typical examples of such events are snowstorms, freezing rain, fog, convective storms, wind gusts, hail storms and flash floods. The complete MEANDER package consists of more than ten different algorithms, from which we have selected four to compose a workflow application for demonstration purposes. Each calculation algorithm is computation intensive and is implemented as a parallel program containing C/C++ and FORTRAN sequential code.


Fig. 1. Workflow representation of MEANDER meteorological application
and the underlying design layers of P-GRADE parallel programming environment

The first graph depicted in Fig. 1 (see Workflow Layer) consists of four jobs (nodes) corresponding to four different parallel algorithms of the MEANDER ultra-short range weather prediction package, plus a sequential visualisation job that collects the final results and presents them to the user as a meteorological map:

·  Delta: a P-GRADE/GRAPNEL program compiled as a PVM program with 25 processes

·  Cummu: a PVM application with 10 processes

·  Visib: a P-GRADE/GRAPNEL program compiled as an MPI program with 20 worker processes (see the Application window with the process farm and the master process in Fig. 1.)

·  Satel: an MPI program with 5 processes

·  Ready: a sequential C program

This distinction among job types is necessary because the job manager on the selected grid site must be able to support the corresponding parallel execution mode, and the workflow manager is responsible for handling the various job types by generating the appropriate submit files.
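As an illustration, a submit description generated for the satel MPI job in the Condor-G/Globus scenario (described in Section 3) could resemble the following minimal sketch. The file names, host name and jobmanager are assumptions made only for the example; the exact set of attributes produced by P-GRADE may differ.

    # satel.submit -- illustrative Condor-G submit description (all names are examples)
    universe        = globus
    globusscheduler = grid-site.example.org/jobmanager-pbs
    # request an MPI job with 5 processes via the Globus RSL
    globusrsl       = (jobtype=mpi)(count=5)
    executable      = satel
    transfer_input_files = satel_input.dat
    output          = satel.out
    error           = satel.err
    log             = satel.log
    queue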

Generally, the executables of the jobs can be existing legacy applications or can be developed with P-GRADE. A GRAPNEL job can be translated into either a PVM or an MPI job, but it should be distinguished from the other types of parallel jobs since P-GRADE provides fully interactive development support for GRAPNEL jobs: designing, debugging, performance evaluation and testing of the parallel code [10]. By simply clicking on such a node of the workflow graph, P-GRADE invokes the Application window in which the inter-process communication topology of the GRAPNEL job can be defined and modified graphically [23] (see Fig. 1, Application window) using notation similar to that of the workflow level. From this Application window the lower design layers, such as the Process and the Text levels, are also accessible by the user to change the graphically or textually described program code of the current parallel algorithm (see the Process and Text windows of the visibility calculation in Fig. 1). It means that the introduced workflow represents a new P-GRADE layer on top of its three previously existing hierarchical design layers [13].

Besides the type of the job and the name of the executable (see Fig. 1), the user can specify the necessary arguments and the hardware/software requirements (architecture, operating system, minimal memory and disk size, number of processors, etc.) for each job. To specify the resource requirements, the application developer can currently use either the Condor resource specification syntax and semantics for Condor based grids, or the explicit declaration of the grid site where the job is to be executed for Globus based grids (see Fig. 1, Job Attributes window, Requirement field).
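For instance, a Condor-style requirement and a Globus-style explicit site declaration might look like the following two lines; the attribute values and the host name are illustrative only.

    # Condor based grid: a standard ClassAd requirement expression
    Requirements    = (Arch == "INTEL") && (OpSys == "LINUX") && (Memory >= 256)

    # Condor-G/Globus based grid: the target site is named explicitly
    globusscheduler = grid-site.example.org/jobmanager-condor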

In order to define the necessary file operations (see Fig. 1) of the workflow execution, the user should define the attributes of the file symbols (the ports of the workflow graph; see the File I/O Attributes window in Fig. 1) and of the file transfer channels (the arcs of the workflow graph). The main attributes of the file symbols in the File I/O Attributes window are as follows:

·  file name

·  type

The type can be permanent or temporary. Permanent files should be preserved during the workflow execution, while temporary files can be removed as soon as the job using them (as input) has finished. It is the task of the workflow manager to transfer the input files to the selected site where the corresponding job will run. The transfer can be done in two ways. The off-line transfer mode means that the whole file must be transferred to the site before the job is started. The on-line transfer mode enables the producer job and the consumer job of the file to run in parallel: whenever a part of the file is produced, the workflow manager transfers it to the consumer's site. However, this working mode obviously assumes a restricted usage of the file at both the producer and the consumer side, and hence the user must declare that the producer and the consumer meet these special conditions. In the current implementation only the off-line transfer mode is supported.
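Purely as a hypothetical illustration (the listing below is not P-GRADE's actual internal format, and the file name is invented), the information attached to one output file symbol of the cummu job could be summarised as:

    file name : cummu_out.dat   (as entered in the File I/O Attributes window)
    type      : temporary       (may be deleted once the consumer job has finished)
    transfer  : off-line        (the whole file is copied before the consumer starts)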

3  Execution and Monitoring of Workflow

Two different scenarios can be distinguished according to the underlying grid infrastructure:

·  Condor-G/Globus based grid

·  Pure Condor based grid

In this section we describe the more complex Condor-G/Globus scenario in detail, but the major differences concerning the pure Condor support are also pointed out.

The execution of the designed workflow is a generalisation of the Condor job mode of P-GRADE [9]; to execute the workflow in the grid we utilise the Condor-G and DAGMan tools [1][2] to schedule and control the execution of the workflow on Globus resources by generating

·  a Condor submit file for each node of the workflow graph

·  a DAGMan input file (a sketch of such a file follows this list) that contains the following information:

1  List of jobs of the workflow (associating the jobs with their submit files)

2  Execution order of jobs in textual form as relations

3  The number of re-execution attempts for each job in case it aborts

4  Tasks to be executed before starting a job and after finishing the job (implemented in PRE and POST scripts).
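The sketch below illustrates what such a generated DAGMan input file could look like for the five-job MEANDER workflow. The JOB, PARENT/CHILD, RETRY and SCRIPT keywords are standard DAGMan directives, but the file and script names, the retry count and the dependency shown are simplified assumptions (the actual execution order is given by the arcs of Fig. 1).

    # workflow.dag -- illustrative DAGMan input file (names are examples)
    JOB delta delta.submit
    JOB cummu cummu.submit
    JOB visib visib.submit
    JOB satel satel.submit
    JOB ready ready.submit

    # execution order: the visualisation job may start only after the
    # calculation jobs it depends on have finished (simplified here)
    PARENT delta cummu visib satel CHILD ready

    # number of re-executions after an abort
    RETRY ready 2

    # file-transfer tasks before and after a job (PRE and POST scripts)
    SCRIPT PRE  ready ready_pre.sh
    SCRIPT POST ready ready_post.sh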

The extension of the Condor DAGMan mechanism is realised by the PRE and POST scripts of the P-GRADE workflow system. These scripts are generated automatically from the P-GRADE workflow description and they realise the necessary input and output file transfer operations between jobs. In the current implementation GridFTP commands [19] are applied to deliver the input and output files between grid sites in a secure way (in the pure Condor scenario this can be done by simple file operations). These scripts are also responsible for the detection of successful file transfers, since a job can be started only if all its input files are already available. In order to improve efficiency, the data files are transferred in parallel if the same output file serves as an input file of more than one job.
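A possible shape of such a generated PRE script is sketched below. globus-url-copy is the standard GridFTP command-line client; the host names, paths and the overall structure of the script are assumptions made for the sake of the example and do not reproduce the actual generated code.

    #!/bin/sh
    # ready_pre.sh -- illustrative PRE script for the "ready" job
    # (host names and paths are examples only)

    # fetch the input files of "ready" in parallel over GridFTP
    globus-url-copy gsiftp://site-a.example.org/data/cummu_out.dat \
                    file:///home/user/work/cummu_out.dat &
    PID1=$!
    globus-url-copy gsiftp://site-b.example.org/data/visib_out.dat \
                    file:///home/user/work/visib_out.dat &
    PID2=$!

    # the job may start only if every transfer completed successfully;
    # a non-zero exit status makes DAGMan consider the PRE script failed
    wait $PID1 || exit 1
    wait $PID2 || exit 1
    exit 0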

Additionally, before the execution of each job a new instance of the GRM monitor [3] is launched and attached (via a subscription protocol) to the Mercury main monitor [4] located at the grid site where the current job will be executed. In order to visualise the trace information collected on the jobs by the GRM/Mercury monitoring infrastructure, the PROVE performance visualisation tool [4] is used (see Fig. 2). Furthermore, the scripts also generate a PROVE-compliant trace file for the whole workflow, including events regarding the start and finish of each job as well as the file transfers.

The actual execution of the workflow can be automatically started from P-GRADE. If the P-GRADE system is running on a Condor pool, the command is immediately interpreted and executed by Condor. If the P-GRADE submit machine is not in a Condor pool, the following extra operations are supported by P-GRADE:

1.  A remote Condor pool should be selected by the user via the Mapping Window of P-GRADE.

2.  All the necessary files (executables, input files, DAGMan input file, Condor submit files) are transferred automatically to a machine of the selected Condor pool.

3.  The "condor_submit_dag" command is automatically called in the selected Condor pool (see the sketch after this list).

4.  After finishing the workflow execution, the necessary files are automatically transferred back to the P-GRADE client machine.

5.  The on-line visualisation with PROVE can be performed locally in the client machine.
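Steps 2 and 3 correspond to operations similar to the following commands; condor_submit_dag is the standard Condor command for submitting a DAG, while the use of scp/ssh, the host name and the file names are assumptions made only to illustrate what P-GRADE automates.

    # copy the generated files to a machine of the selected Condor pool
    scp workflow.dag *.submit *_pre.sh *_post.sh delta cummu visib satel ready \
        user@pool-head.example.org:meander/

    # submit the whole workflow to Condor DAGMan on that machine
    ssh user@pool-head.example.org 'cd meander && condor_submit_dag workflow.dag'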

Notice that currently we use Condor DAGMan as the basis of the workflow engine. However, in the near future we are going to create a general Grid Workflow Manager that takes care of possible optimisations concerning the selection of computing sites and file resources in the grid, controls the migration of workflow jobs among different grid resources, handles the user's control requests during execution, etc.

During the execution, the job status (submitted, idle, running, finished) of each component job is reflected by the colour of the corresponding node in the graph, i.e., the progress of the whole workflow is animated within the workflow editor window.

Fig. 2. Space-time diagram of the whole workflow and one of its component jobs

The PROVE visualisation tool provides a much more detailed view of the progress of the whole workflow and of each component job than the status animation within the workflow editor. PROVE co-operates with the Mercury/GRM grid monitoring system (developed within the EU GridLab project [15]) to collect the trace events. These events can be generated by any of the component jobs running on any of the grid resources, provided that the corresponding programs are instrumented and linked against the GRM library and Mercury is installed on each grid site. Having accessed the appropriate trace events, PROVE displays them on-line in separate space-time diagrams for each component job. An overall view of the progress of the workflow execution is displayed as the same kind of space-time diagram. Fig. 2 depicts the space-time diagram of our workflow-based meteorological application and of one of its parallel component jobs, cummu.