Project Charter
Version 2.0
Creation date: 11-04-2006
Matteo Melani, SLAC
Philippe Canal, Fermilab
Revisions:
Version number / Date / Authors / Changes
1.0 / 6/28/05 / Philippe Canal / Version 1.0 creation
2.0 / 4/11/06 / Matteo Melani / Version 2.0 creation
Table of Contents:
1 Introduction
1.1 Purpose
1.2 Project Goal
1.3 Motivation
2 Project Scope
2.1 Background
2.2 Accounting System Scope
2.3 Out-of-Scope
2.4 Objectives
2.5 Timeline and Deliverables
3 Project Environment
3.1 Assumptions
3.2 Constraints
3.3 Risks
3.4 Issues
4 Project Organization
4.1 Stakeholders
4.2 The Project Team
1 Introduction
1.1 Purpose
The purpose of this document is to define the scope and organization of the “OSG Accounting System” project. SLAC and Fermilab will provide the resources for the project, which will be carried out within the PPDG common projects and OSG management framework.
1.2 Project Goal
The OSG Accounting project goal is to deliver a software system that allows the OSG users, VOs and resource providers to track computing resource usage per user across the OSG Grid. For the first version of the system, we will not be concerned with supporting any economic or pricing model of computing resources.
1.3 Motivation
1.3.1 Resource providers’ needs
Resource providers need to link the consumption of their computing resources to the scientific projects and experiments (represented by Virtual Organizations, or VOs) that are on the grid. This need is driven mainly by the accountability requirements of the funding agencies and VOs, and by the resource providers’ desire to improve resource planning and organization, to strengthen cyber-security and, eventually, to support automatic resource allocation and consumption based on an economic model.
1.3.2 VOs’ needs
VOs use models to predict their computing needs for upcoming years. These models are crucial in the VOs’ budget process, since they are used to obtain commitments from institutions for access to computing resources. The commitments typically come in two flavors: fixed and guaranteed "quotas" of a site’s resources; and opportunistic access to sites’ resources, with some educated guess as to how much access the VO should expect given the expected use by others. At the end of the fiscal year, VOs need to be able to demonstrate to what extent their models accurately predicted actual consumption and to what extent commitments from sites were actually delivered. The former requires accounting of resource utilization. The latter also requires accounting of available resources that were promised but not necessarily consumed. This is particularly important for opportunistic use.
For example, assume that a VO overestimated its resource needs and, as a result, did not consume a significant fraction of the resources it was promised. In that case, it is in the interest of both the sites and the VO to account for available resources in addition to consumed resources.
For the VO to understand its own efficiency in exploiting available resources, this sort of data needs to be available to the VO bookkeepers throughout the year in some form. The data is generally required to be accurate to within 10% or so, and latencies of days or a week are tolerable.
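The year-end comparison described above can be illustrated with a small calculation. This is only a sketch: the function name, the parameter names and the three ratios are assumptions chosen for illustration, not quantities defined by this charter.

```python
def vo_year_end_summary(predicted, promised, available, consumed):
    """Compare a VO's model and a site's commitment against accounting data.

    All four values must use the same unit (for example CPU hours).
    The names and ratios below are illustrative assumptions.
    """
    return {
        # How far actual consumption was from the VO's model prediction.
        "model_error": abs(consumed - predicted) / predicted,
        # How much of the promised resources the site actually made available.
        "delivered_fraction": available / promised,
        # How much of the available resources the VO actually used.
        "utilization": consumed / available,
    }


summary = vo_year_end_summary(
    predicted=1000.0, promised=1000.0, available=900.0, consumed=600.0
)
print(summary)
```

In this hypothetical case the site delivered most of what it promised, while the VO consumed much less than its model predicted, which is the situation where accounting for availability matters to both parties.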
2 Project Scope
2.1 Background
2.1.1 A very simple grid model
The OSG can be seen as a distributed environment (over a WAN) where users submit requests to grid services hosted by resource providers (research labs, universities, research institutions, etc.). Each resource provider lives in a different administrative (network) domain and supports one or more VOs. Typically, users do not access grid services directly but through high-level access services (often called portals) offered by the VOs at specific sites.
2.1.2 Monitoring, accounting and auditing
There is often a lot of confusion around the activities of monitoring, logging, auditing and accounting in distributed systems. This is probably because these activities, and the systems that support them, often overlap, making it difficult to define the systems' scopes and requirements.
Our view of the problem is summarized in Figure 1. In this model, the various components of the distributed system write, directly or indirectly, log messages to files. We therefore say that the logs form an Event Space that contains all the events defining the history of the system.
We adopt the following simple view about monitoring and accounting:
A monitoring system is responsible for maintaining a real-time (or quasi real-time) view of the status of the system. This functionality supports monitoring, profiling and alerting. A monitoring system mainly has to answer the question: “What is the status of the system?”
An accounting system is in charge of linking the consumption of the system resources with users’ requests. Its main functionality is to track the consumption of system resources by users. This functionality can be used to support charging, pricing and reporting. The dependability and accuracy of accounting information are primary requirements.
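The distinction can be sketched as two different views computed over the same Event Space. The event fields used here (time, component, status, user, cpu_hours) are hypothetical and serve only to illustrate the idea; the charter does not define an event format.

```python
from collections import defaultdict

# A toy Event Space: each entry is one logged event (hypothetical fields).
events = [
    {"time": 1, "component": "ce01", "status": "up", "user": "alice", "cpu_hours": 2.0},
    {"time": 2, "component": "se01", "status": "up", "user": "bob", "cpu_hours": 0.5},
    {"time": 3, "component": "ce01", "status": "down", "user": None, "cpu_hours": 0.0},
    {"time": 4, "component": "ce01", "status": "up", "user": "alice", "cpu_hours": 1.5},
]


def current_status(event_space):
    """Monitoring view: the most recent status of each component."""
    status = {}
    for event in sorted(event_space, key=lambda e: e["time"]):
        status[event["component"]] = event["status"]  # later events overwrite earlier ones
    return status


def usage_by_user(event_space):
    """Accounting view: total resource consumption attributed to each user."""
    totals = defaultdict(float)
    for event in event_space:
        if event["user"] is not None:  # ignore events not tied to a user request
            totals[event["user"]] += event["cpu_hours"]
    return dict(totals)


print(current_status(events))
print(usage_by_user(events))
```

The monitoring view only cares about the latest state and can discard history, whereas the accounting view must see every usage event exactly once, which is why dependability and accuracy weigh more heavily on it.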
2.1.3 Current status of accounting in OSG
The OSG software stack does not have an accounting system. The various monitoring and information systems (MonALISA, ACDC, and GridCat) track the utilization of the same computing resources (jobs executed, jobs failed, CPU hours, files transferred, etc.), but there are often large discrepancies in the reported information, and there is no way to link resource consumption to grid users.
2.1.4 Other grid accounting systems
Other grid projects have addressed the problem of accounting; some of the most interesting systems are listed in the following table.
System Name / Developed by Project/Grid / Included in the Grid middleware of Project/Grid
DGAS / EGEE
APEL / LCG / LCG
GSAX / GGF / None
RU / GGF / LCG, gLITE
RUS / GGF / None
SGAS / SweGrid / SweGrid, Globus (4.01)
GASA / Gridbus / Gridbus
AMIE + TGAccounting / TeraGrid / TeraGrid
An in-depth analysis of these solutions, to determine whether we can reuse any existing system or component, will be part of this project’s activities.
2.2 Accounting System Scope
The accounting system’s main goals are:
· To track the computing resources consumed by a service while satisfying a user’s request.
· To give OSG users a clear and dependable view of the resources consumed by users.
· To guarantee the security of the accounting information.
The accounting system will be responsible for:
1) Collecting accounting data from the grid services at the resource providers’ sites. Considering the status of the various services, the initial focus will be on collecting accounting information from Computing and Storage Elements.
2) Processing and formatting the collected accounting data according to a well-defined data model. The GGF Usage Record standard will be our starting point for defining a general but simple data model.
3) Storing the accounting information in databases at each site. In our model, the resource providers are the owners of the accounting data. Having each site store its own accounting data gives the site full control over who can access the data, when and how. Furthermore, a distributed repository of accounting records is the only viable solution in a highly distributed and heterogeneous grid environment like the one that the OSG consortium is building.
4) Publishing accounting information on the grid. The system must provide a simple mechanism for the resource providers to publish the accounting records contained in local databases. Given the nature of the published information, authentication, authorization, confidentiality, and security in general are major requirements.
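The four responsibilities above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the project's design: the record fields are hypothetical and only loosely inspired by the GGF Usage Record (they are not its actual schema), SQLite stands in for the per-site database, and a simple allow-list of VO names stands in for the real authentication and authorization mechanism.

```python
import sqlite3
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class UsageRecord:
    # Hypothetical fields; NOT the actual GGF Usage Record schema.
    record_id: str
    user_dn: str         # grid identity of the user
    vo: str              # Virtual Organization the user acted for
    service: str         # "CE" (Computing Element) or "SE" (Storage Element)
    cpu_seconds: float
    wall_seconds: float


def normalize(raw):
    """Step 2: map a raw, site-specific entry (step 1 output) onto the data model."""
    return UsageRecord(
        record_id=str(raw["id"]),
        user_dn=raw["user"].strip(),
        vo=raw["vo"].lower(),
        service=raw["service"].upper(),
        cpu_seconds=float(raw["cpu_s"]),
        wall_seconds=float(raw["wall_s"]),
    )


def open_site_store(path=":memory:"):
    """Step 3: open the site-local database that owns the accounting data."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS usage (
               record_id TEXT PRIMARY KEY, user_dn TEXT, vo TEXT,
               service TEXT, cpu_seconds REAL, wall_seconds REAL)"""
    )
    return conn


def store(conn, record):
    # INSERT OR IGNORE keeps re-collection of the same record idempotent.
    conn.execute("INSERT OR IGNORE INTO usage VALUES (?, ?, ?, ?, ?, ?)", astuple(record))
    conn.commit()


def publish(conn, requester_vo, authorized_vos):
    """Step 4: release per-user CPU totals, but only to an authorized VO
    and only for that VO's own records (a stand-in for real authorization)."""
    if requester_vo not in authorized_vos:
        raise PermissionError(f"VO {requester_vo!r} may not read this site's records")
    rows = conn.execute(
        "SELECT user_dn, SUM(cpu_seconds) FROM usage WHERE vo = ? GROUP BY user_dn",
        (requester_vo,),
    ).fetchall()
    return dict(rows)
```

A site would call normalize and store for every entry it collects, and answer each outside request through publish, so the decision about who sees which records stays with the resource provider.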
2.3 Out-of-Scope
The OSG accounting system will not have mechanisms or interfaces for supporting any economic model. The system will not support pricing or charging. We also consider out of scope any mechanisms or interfaces for performing real-time resource allocation: resource consumption will be calculated and reported after the fact, not in real time.
2.4 Objectives
1) Compile a list of requirements on which all the stakeholders can agree.
2) Evaluate the existing grid and distributed systems accounting solutions to see what systems, components, architecture or ideas we can adopt or reuse.
3) Define and document the OSG Accounting System architecture.
4) Build a prototype to become familiar with the chosen technologies and to validate and explore the architecture.
5) Set up a development/integration/testing environment as needed.
6) Begin the development/integration/testing process to reach the release of a solution.
7) Ensure that the OSG accounting solution is interoperable with all the grid systems run by the OSG consortium’s partners: TeraGrid, LCG, EGEE, etc.
8) Ensure the OSG accounting solution is extensible and well documented.
It is worth noting that because the OSG, like the other grids, does not have a strictly defined accounting data model, the project effort will focus mainly on creating a dependable and extensible infrastructure for accounting. In other words, the OSG accounting solution will be able to adapt to an evolving OSG accounting data model.
Finally, the OSG Accounting solution will initially focus on gathering data for three grid services: the job execution service (Computing Element), the storage service (Storage Element) and the file transfer service.
2.5 Timeline and Deliverables
Time / Deliverables
Jun-Sep. 2005 / Requirement gathering, research and analysis of the existing accounting solutions.
Requirements document version 1.0
Oct-Dec. 2005 / Architecture definition and prototyping.
Architecture overview document version 1.0
Jan-Mar. 2006 / Development/integration/testing environment definition and setup as needed.
Development/integration/testing environment document version 1.0
Apr-Sep. 2006 / Development/integration/testing of first version.
Design document version 1.0
Oct-Nov. 2006 / First release
Figure 1: Accounting, monitoring, auditing and logging model
3 Project Environment
3.1 Assumptions
None.
3.2 Constraints
In order to minimize the requirements for joining OSG and to promote openness and reusability, we will use only open-source technologies where possible.
The OSG accounting system will be distributed through VDT. Therefore, the accounting system will conform to the VDT’s standards.
The OSG consortium aims to deploy a grid system that is interoperable with TeraGrid and LCG. Therefore, we plan to maintain a continuous dialogue with the TeraGrid and LCG teams. We will make sure the OSG accounting solution is interoperable with the TeraGrid and LCG accounting solutions.
The OSG security infrastructure is based on GSI. The accounting system will use GSI for security.
3.3 Risks
By nature, a grid environment is highly distributed and highly heterogeneous. Furthermore, many of the technologies used to build it are not yet mature or well tested. Creating an accounting solution for such an environment is complex and difficult because many requirements and subsystems are still evolving.
The team lacks development experience with many of the Grid technologies (Web Services, GSI, Globus, Condor, Tomcat…) that will most likely be involved in the project.
The topic of accounting in distributed systems is fairly new and has never been deeply investigated in the literature.
The OSG architecture is a “work in progress”. Many systems and components are in development, lack documentation or do not have clear APIs.
3.4 Issues
To have a unified view of resource consumption across the entire OSG grid, the resource providers have to agree on the metrics used to measure the consumption of computing resources.
In order to guarantee portability of accounting records across grids, a standard data format must be defined. The GGF Usage Record (UR) is today the best candidate for a standard usage record. The problem is that the UR definition leaves plenty of room to define metrics and record properties, making coordination between grid builders a necessity.
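To make the need for agreed metrics concrete, the sketch below converts CPU time reported by different sites in different units into one common metric. The choice of CPU hours as the agreed metric, the unit labels and the function name are all assumptions made for illustration; no such agreement is recorded in this charter.

```python
# Hypothetical unit labels that sites might report, with their length in seconds.
SECONDS_PER_UNIT = {"s": 1, "min": 60, "h": 3600}


def to_cpu_hours(value, unit):
    """Convert a site-reported CPU time into the (assumed) agreed metric, CPU hours."""
    try:
        return value * SECONDS_PER_UNIT[unit] / 3600
    except KeyError:
        # A unit outside the agreement cannot be compared across sites.
        raise ValueError(f"no agreed conversion for unit {unit!r}") from None


# Three sites reporting the same kind of consumption in different units.
reports = [(7200, "s"), (90, "min"), (2, "h")]
total = sum(to_cpu_hours(value, unit) for value, unit in reports)
print(total)
```

Without a shared table of this kind, totals from different sites could not be added together, which is the coordination problem this section describes.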
4 Project Organization
4.1 Stakeholders
The project stakeholders are the members of the OSG consortium and the OSG partners: TeraGrid, EGEE, gLITE, etc. (More information about OSG members and partners can be found at www.opensciencegrid.org).
4.2 The Project Team
Matteo Melani from SLAC and Philippe Canal from Fermilab are the two chairs of the OSG Accounting Activity. They will lead the project and be responsible for the deliverables.
The following table summarizes the people working on the project.
Name / Institution/Sponsor / FTE / Comment
Philippe Canal / Fermilab CD / 0.5 / ~0.1 FTE until ROOT backfill available
Matteo Melani / SLAC, PPDG / 1.0