DataGrid

WP1 - WMS Software Administrator and User Guide

Document identifier: / DataGrid-01-TEN-0118-0_9
Date: / 03/12/2002
Work package: / WP1
Partner: / Datamat SpA
Document status
Deliverable identifier:
Abstract: This note provides the administrator and user guide for the WP1 WMS software.
IST-2000-25182 / PUBLIC / 76 / 166
/ WP1 - WMS Software Administrator and User Guide / Doc. Identifier:
DataGrid-01-TEN-0118-0_9
Date: 02/12/2002
Delivery Slip
Name / Partner / Date / Signature
From / Fabrizio Pacini / Datamat SpA / 03/12/2002
Verified by / Stefano Beco / Datamat SpA / 03/12/2002
Approved by
Document Log
Issue / Date / Comment / Author
0_0 / 21/12/2001 / First draft / Fabrizio Pacini
0_1 / 14/01/2002 / Draft / Fabrizio Pacini
0_2 / 24/01/2002 / Draft / Fabrizio Pacini
0_3 / 05/02/2002 / Draft / Fabrizio Pacini
0_4 / 15/02/2002 / Draft / Fabrizio Pacini
0_5 / 08/04/2002 / Draft / Fabrizio Pacini
0_6 / 13/05/2002 / Fabrizio Pacini
0_7 / 19/07/2002 / Fabrizio Pacini
0_8 / 16/09/2002 / Fabrizio Pacini
0_9 / 03/12/2002 / Fabrizio Pacini
Document Change Record
Issue / Item / Reason for Change
0_1 / General update / -  Take into account changes in the rpm generation procedure.
-  Add missing info about daemons (RB/JSS/CondorG) starting accounts
-  Some general corrections
0_2 / General Update / -  Add Cancelling and Cancel Reason information.
-  Add OUTPUTREADY job state.
-  Add new profile rpms.
-  Remove /etc/workload* shell scripts.
-  Add summary map table (user / daemon).
-  Add CEId format check.
-  Add new job cancel notification.
0_3 / General Update / -  Modified RB/JSS start-up procedure
-  Add gridmap-file users/groups issues
-  Add proxy certificate usage by daemons
-  Job attribute CEId changed to SubmitTo
-  Add DGLOG_TIMEOUT setting
-  Add workload-profile and userinterface-profile rpms
0_4 / General Update / -  Add configure option --enable-wl for system configuration files
-  Add installation checking option --with-globus for Globus to the Workload configure
-  Add new Information Index configure options
-  Remove edg-profile and edg-user-env rpms from II and UI dependencies
-  Add security configuration rpms for all the Certificate Authorities to UI dependencies
-  Add new parameters to RB configuration file
-  Add new Job Exit Code field to the returned job status info
-  Remove dependency on SWIG in the userinterface binary rpm
0_5 / General Update / -  Modify command options syntax (getopt-like style)
-  Add MyProxy server and client package installation/utilisation
-  Modify job cancel notification
-  Add Userguide rpm
0_6 / General Update / -  Modify configure options for the various components
-  UI commands modified to use python2 executable
-  Clarify myproxy usage
-  Explain how RB/LB addresses in the UI config file are used by the commands
-  Add --logfile option to the UI commands
0_7 / General Update / -  Modify configure options for the various components
-  Clarify UI commands --notify option usage
-  Add make test target for UI
0_8 / General Update / -  Specified dependencies of profile rpms
-  Update needed env vars for UI
-  Explain how to include default constraints in the job requirements
-  Explain that the lc field in the ReplicaCatalog address is now mandatory
-  Explain how to specify wildcards and special chars in "Arguments" in the JDL expression
0_9 / General Update / -  Defaults for Rank and Requirements in the UI config file
-  Added reference to the “.BrokerInfo” file document
-  other.CEId in Requirements vs --resource option
-  Explain MyProxy Server configuration
-  Added description of new parameters in RB configuration file
-  RB/JSS databases clean-up procedure added
-  Explain usage of RetryCount JDL attribute
-  Better explain how to specify wildcards and special chars in "Arguments" in the JDL expression
-  Updated reference to JDL Attributes note
-  Added Annex on Submission failures analysis
Files
Software Products / User files
Word 97 / DataGrid-01-TEN-0118-0_9_Document.doc
Acrobat Exchange 4.0 / DataGrid-01-TEN-0118-0_9-Document.pdf


Content

1. Introduction
1.1. Objectives of this document
1.2. Application area
1.3. Applicable documents and reference documents
1.4. Document evolution procedure
1.5. Terminology
2. Executive summary
3. Build Procedure
3.1. Required Software
3.2. Build Instructions
3.2.1. Environment Variables
3.2.2. Compiling the code
3.3. RPM Installation
4. Installation and Configuration
4.1. Logging and Bookkeeping services
4.1.1. Required software
4.1.1.1. LB local-logger
4.1.1.2. LB Server
4.1.2. RPM installation
4.1.3. The installation tree structure
4.1.3.1. LB local-logger
4.1.3.2. LB Server
4.1.4. Configuration
4.1.5. Environment Variables
4.2. RB and JSS
4.2.1. Required software
4.2.1.1. PostgreSQL installation and configuration
4.2.1.2. Condor-G installation and configuration
4.2.1.3. ClassAd installation and configuration
4.2.1.4. ReplicaCatalog installation and configuration
4.2.2. RPM installation
4.2.3. The Installation Tree structure
4.2.4. Configuration
4.2.4.1. RB configuration
4.2.4.2. JSS configuration
4.2.5. Environment variables
4.2.5.1. RB
4.2.5.2. JSS
4.3. Information Index
4.3.1. Required software
4.3.2. RPM installation
4.3.3. The Installation tree structure
4.3.4. Configuration
4.3.5. Environment Variables
4.4. User Interface
4.4.1. Required software
4.4.2. RPM installation
4.4.3. The tree structure
4.4.4. Configuration
4.4.5. Environment variables
4.5. DOCUMENTATION
5. Operating the System
5.1. LB local-logger
5.1.1. Starting and stopping daemons
5.1.2. Troubleshooting
5.2. LB Server
5.2.1. Starting and stopping daemons
5.2.2. Purging the LB database
5.2.3. Troubleshooting
5.3. RB and JSS
5.3.1. Starting PostGreSQL
5.3.2. Starting and stopping JSS and RB daemons
5.3.3. RB and JSS databases clean-up
5.3.4. RB troubleshooting
5.3.5. JSS troubleshooting
5.4. Information Index
5.4.1. Starting and stopping daemons
6. User Guide
6.1. User interface
6.1.1. Security
6.1.1.1. MyProxy
6.1.2. Common behaviours
6.1.3. Commands description
7. Annexes
7.1. JDL Attributes
7.2. Job Status Diagram
7.3. Job Event Types
7.4. Submission Failures Analysis
7.4.1. Job OutputReady
7.4.2. Job Cleared
7.4.3. Job Aborted (no matching resources - II not reachable)
7.4.4. Job Aborted (Standard output of job wrapper does not contain useful data)
7.4.5. Job Aborted (CondorG failure)
7.5. wildcard patterns
7.6. The Match Making Algorithm
7.6.1. Direct Job Submission
7.6.2. Job submission without data-access requirements
7.6.3. Job submission with data-access requirements
7.7. Process/User Mapping Table

1. Introduction

This document provides a guide to the building, installation and usage of the WP1 WMS software released within the DataGrid project.

1.1. Objectives of this document

The goal of this document is to describe the complete process by which the WP1 WMS software can be installed and configured on the DataGrid test-bed platforms.

Guidelines for operating the whole system and for accessing its functionality are also given.

1.2. Application area

Administrators can use this document as a basis for installing, configuring and operating the WP1 WMS software. Users can refer to the User Guide chapter for information on accessing the provided services through the User Interface.

1.3. Applicable documents and reference documents

Applicable documents

[A1] / Job Description Language HowTo – DataGrid-01-TEN-0102-0_2 – 17/12/2001
(http://www.infn.it/workload-grid/docs/DataGrid-01-TEN-0102-0_2.pdf)
[A2] / DATAGRID WP1 Job Submission User Interface for PM9 (revised presentation) – 23/03/2001
(http://www.infn.it/workload-grid/docs/20010320-JS-UI-datamat.pdf)
[A3] / WP1 meeting - CESNET presentation in Milan – 20-21/03/2001
(http://www.infn.it/workload-grid/docs/20010320-L_B-matyska.pdf)
[A4] / Logging and Bookkeeping Service – 07/05/2001
(http://www.infn.it/workload-grid/docs/20010508-lb_draft-ruda.pdf)
[A5] / Results of Meeting on Workload Manager Components Interaction – 09/05/2001
(http://www.infn.it/workload-grid/docs/20010508-WM-Interactions-pacini.pdf)
[A6] / Resource Broker Architecture and APIs – 13/06/2001
(http://www.infn.it/workload-grid/docs/20010613-RBArch-2.doc)
[A7] / JDL Attributes - DataGrid-01-NOT-0101-0_7 – 03/12/2002
(http://www.infn.it/workload-grid/docs/DataGrid-01-NOT-0101-0_7.{doc,pdf})

Reference documents

[R1] / The Resource Broker Info file – DataGrid-01-TEN-0135-0_0
(http://www.infn.it/workload-grid/docs/DataGrid-01-TEN-0135-0_0.{doc,pdf})

1.4. Document evolution procedure

This document will be revised in response to the following events:

·  Comments received from DataGrid project members,

·  Changes/evolutions/additions to the WMS components.

1.5. Terminology

Definitions

Condor / Condor is a High Throughput Computing (HTC) environment that can manage very large collections of distributively owned workstations.
Globus / The Globus Toolkit is a set of software tools and libraries aimed at the building of computational grids and grid-based applications.

Glossary

class-ad / Classified advertisement
CE / Computing Element
DB / Data Base
FQDN / Fully Qualified Domain Name
GDMP / Grid Data Management Pilot Project
GIS / Grid Information Service, aka MDS
GSI / Grid Security Infrastructure
job-ad / Class-ad describing a job
JDL / Job Description Language
JSS / Job Submission Service
LB / Logging and Bookkeeping Service
LRMS / Local Resource Management System
MDS / Metacomputing Directory Service, aka GIS
MPI / Message Passing Interface
PID / Process Identifier
PM / Project Month
RB / Resource Broker
RC / Replica Catalogue
SE / Storage Element
SI00 / Spec Int 2000
SMP / Symmetric Multi Processor
TBC / To Be Confirmed
TBD / To Be Defined
UI / User Interface
UID / User Identifier
WMS / Workload Management System
WP / Work Package

2. Executive summary

This document comprises the following main sections:

Section 3: Build Procedure

Outlines the software required to build the system and the actual process for building it and generating rpms for the WMS components; a step-by-step guide is included.

Section 4: Installation and Configuration

Describes changes that need to be made to the environment and the steps to be performed for installing the WMS software on the test-bed target platforms. The resulting installation tree structure is detailed for each system component.

Section 5: Operating the System

Provides the procedures for starting and stopping the WMS component processes and utilities.

Section 6: User Guide

Describes, in Unix man-page style, all the User Interface commands that give the user access to the services provided by the WMS.

Section 7: Annexes

Expands on topics introduced in the User Guide section that help the user better understand system behaviour.

3. Build Procedure

In the following section we give detailed instructions for the installation of the WP1 WMS software package. We provide a source code distribution as well as a binary distribution and explain installation procedures for both cases.

3.1. Required Software

The WP1 software runs and has been tested on platforms running Globus Toolkit 2.0 Beta Release 21 on top of Linux RedHat 6.2.

The software packages listed below, in addition to WP1 software version 1.0, must be installed locally at a site in order to build the WP1 WMS there:

-  Globus Toolkit 2.0 Beta 21 or higher (download at http://datagrid.in2p3.fr/distribution/globus/beta-21)

-  Python 2.1.1 (download at http://datagrid.in2p3.fr/distribution/config/external.html)

-  Swig 1.3.9 (download at http://datagrid.in2p3.fr/distribution/config/external.html)

-  Expat 1.95.1 (download at http://datagrid.in2p3.fr/distribution/config/external.html)

-  Expat-devel 1.95.1 (download at http://datagrid.in2p3.fr/distribution/config/external.html)

-  MySQL Version 9.38 Distribution 3.22.32, for pc-linux-gnu (i686) (download at http://datagrid.in2p3.fr/distribution/config/external_services.html)

-  MySQL Version 11.15 Distribution 3.23.42, for pc-linux-gnu (i686) (download at http://datagrid.in2p3.fr/distribution/external/RPMS/). The needed rpms are:

MySQL-shared-3.23.42-1

MySQL-client-3.23.42-1

MySQL-3.23.42-1

MySQL-devel-3.23.42-1

-  PostgreSQL 7.1.3 (download at http://datagrid.in2p3.fr/distribution/config/external_services.html)

-  Classads library (download at http://datagrid.in2p3.fr/distribution/external/RPMS/classads-0.0-edg2.i386.rpm)

-  CondorG 6.3.1 for INTEL-LINUX-GLIBC21 (download at http://datagrid.in2p3.fr/distribution/external/RPMS/CondorG-6.3.1-edg5.i386.rpm)

-  Perl IO Stty 0.02, Perl IO Tty 0.04 (download at http://datagrid.in2p3.fr/distribution/config/external.html)

-  MyProxy-0.4.4 (download at http://datagrid.in2p3.fr/distribution/external/RPMS/). The needed rpms are:

myproxy-server-0.4.4-edg6.i386.rpm (for the MyProxy Server machine)

myproxy-client-0.4.4-edg6.i386.rpm (for the UI machine)

-  Perl 5 (download at http://datagrid.in2p3.fr/distribution/config/external.html)

-  gcc version 2.95.2

-  GNU make version 3.78.1 or higher

-  GNU autoconf version 2.13

-  GNU libtool 1.3.5

-  GNU automake 1.4

-  GNU m4 1.4 or higher

-  RPM 3.0.5

-  sendmail 8.11.6
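Before starting a build, it can help to check which of the tools listed above are visible on the PATH of the build account. The following is a minimal sketch, not part of the WP1 distribution: the function name report_tools is hypothetical, and the executable names passed to it (for example python and swig) are assumptions about how the packages are installed on a given site. It reports presence only and does not compare version numbers.

```shell
#!/bin/sh
# Hypothetical helper: print, for each tool name given as an argument,
# whether an executable with that name is found on PATH.
report_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
        fi
    done
}

# Executable names are assumptions; adjust them to the local installation.
report_tools gcc make autoconf libtool automake m4 rpm perl python swig
```

For packages installed as rpms, the rpm database can also be queried directly, for example with rpm -q MySQL MySQL-client MySQL-shared MySQL-devel.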

3.2. Build Instructions

The following instructions deal with the building of the WMS software and hence apply to the source code distribution.

3.2.1. Environment Variables

Before starting the compilation, some environment variables related to the WMS components can be set, or configured by means of the configure script. This is needed only if the package defaults are not suitable. The variables involved are listed below:

-  GLOBUS_LOCATION: base directory of the Globus installation. The default path is /opt/globus.

-  MYSQL_INSTALL_PATH: base directory of the MySQL installation. The default path is /usr.

-  EXPAT_INSTALL_PATH: base directory of the Expat installation. The default path is /usr.

-  GDMP_INSTALL_PATH: base directory of the GDMP installation. The default path is /opt/edg.

-  PGSQL_INSTALL_PATH: base directory of the PostgreSQL installation. The default path is /usr.

-  CLASSAD_INSTALL_PATH: base directory of the ClassAd library installation. The default path is /opt/classads.

-  CONDORG_INSTALL_PATH: base directory of the CondorG installation. The default path is /opt/CondorG.

-  PYTHON_INSTALL_PATH: base directory of the Python installation. The default path is /usr.

-  SWIG_INSTALL_PATH: base directory of the Swig installation. The default path is /usr/local.

-  MYPROXY_INSTALL_PATH: base directory of the MyProxy installation. The default path is /usr/local.
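As an illustration of the list above, the variables could be set in a Bourne-compatible shell as sketched below. This is only an example, assuming the defaults quoted in this section; any value already present in the environment is left untouched, and the paths should be replaced wherever the local installation differs.

```shell
#!/bin/sh
# Give each variable the default quoted in this section unless it is
# already set to a non-empty value in the environment.
: "${GLOBUS_LOCATION:=/opt/globus}"
: "${MYSQL_INSTALL_PATH:=/usr}"
: "${EXPAT_INSTALL_PATH:=/usr}"
: "${GDMP_INSTALL_PATH:=/opt/edg}"
: "${PGSQL_INSTALL_PATH:=/usr}"
: "${CLASSAD_INSTALL_PATH:=/opt/classads}"
: "${CONDORG_INSTALL_PATH:=/opt/CondorG}"
: "${PYTHON_INSTALL_PATH:=/usr}"
: "${SWIG_INSTALL_PATH:=/usr/local}"
: "${MYPROXY_INSTALL_PATH:=/usr/local}"

# Export them so that processes started from this shell inherit them.
export GLOBUS_LOCATION MYSQL_INSTALL_PATH EXPAT_INSTALL_PATH \
       GDMP_INSTALL_PATH PGSQL_INSTALL_PATH CLASSAD_INSTALL_PATH \
       CONDORG_INSTALL_PATH PYTHON_INSTALL_PATH SWIG_INSTALL_PATH \
       MYPROXY_INSTALL_PATH
```

The assignments affect only the current shell and the processes it starts, so they need to be made (or the file sourced) in the same session from which the build is launched.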

In order to build the whole WP1 package, all the environment variables in the previous list must be set. To build only the User Interface module, the environment variables that need to be set are the following:

-  GLOBUS_LOCATION

-  CLASSAD_INSTALL_PATH

-  PYTHON_INSTALL_PATH

-  SWIG_INSTALL_PATH

-  EXPAT_INSTALL_PATH

If you plan to build the Job Submission and Resource Broker module, the variables to set are:

-  GLOBUS_LOCATION

-  MYSQL_INSTALL_PATH

-  EXPAT_INSTALL_PATH

-  GDMP_INSTALL_PATH

-  PGSQL_INSTALL_PATH

-  CLASSAD_INSTALL_PATH

-  CONDORG_INSTALL_PATH

If you plan to build the Proxy module, variables to set are:

-  GLOBUS_LOCATION

-  MYPROXY_INSTALL_PATH

The LB server and Local Logger modules need the following environment variables to be built:

-  GLOBUS_LOCATION

-  MYSQL_INSTALL_PATH

-  EXPAT_INSTALL_PATH

Finally, the LB library module needs:

-  GLOBUS_LOCATION

-  EXPAT_INSTALL_PATH

and the Information Index module only:

-  GLOBUS_LOCATION
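The per-module lists above can be turned into a simple pre-build check. The sketch below is illustrative only: the function names and the module labels (ui, rb_jss, proxy, lb, lblib, ii) are hypothetical and are not options of the WP1 build system. It prints the names of the variables that are unset or empty for the chosen module.

```shell
#!/bin/sh
# Hypothetical helper: map an illustrative module label to the variables
# listed for that module in this section.
required_vars() {
    case "$1" in
        ui)     echo "GLOBUS_LOCATION CLASSAD_INSTALL_PATH PYTHON_INSTALL_PATH SWIG_INSTALL_PATH EXPAT_INSTALL_PATH" ;;
        rb_jss) echo "GLOBUS_LOCATION MYSQL_INSTALL_PATH EXPAT_INSTALL_PATH GDMP_INSTALL_PATH PGSQL_INSTALL_PATH CLASSAD_INSTALL_PATH CONDORG_INSTALL_PATH" ;;
        proxy)  echo "GLOBUS_LOCATION MYPROXY_INSTALL_PATH" ;;
        lb)     echo "GLOBUS_LOCATION MYSQL_INSTALL_PATH EXPAT_INSTALL_PATH" ;;
        lblib)  echo "GLOBUS_LOCATION EXPAT_INSTALL_PATH" ;;
        ii)     echo "GLOBUS_LOCATION" ;;
        *)      return 1 ;;
    esac
}

# Print the variables that are unset or empty for the given module label.
# Returns 0 if none are missing, 1 if some are, 2 for an unknown label.
missing_vars() {
    vars=$(required_vars "$1") || { echo "unknown module: $1" >&2; return 2; }
    missing=""
    for v in $vars; do
        eval "val=\${$v:-}"          # indirect lookup of the variable named in $v
        [ -n "$val" ] || missing="$missing $v"
    done
    echo "${missing# }"
    [ -z "$missing" ]
}
```

For example, missing_vars ui prints nothing and succeeds when the five User Interface variables are all set; otherwise it lists those that still need a value.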

3.2.2. Compiling the code

After having unpacked the WP1 source distribution tar file, or having downloaded the code directly from the CVS repository, change your working directory to be the WP1 base directory, i.e. the Workload directory, and run the following command: