Management of Real-Time Quality of Service in Distributed Systems:

A Survey of Techniques

Charles Cavanaugh

CSE 6306-501

Abstract

This paper describes various techniques for managing real-time quality of service (QoS) in distributed systems. Typically, the technique involves a model of the application and QoS requirements, a runtime QoS monitoring and constraint checking system, and a mechanism for adjusting the application to improve the QoS. The approach taken here is to survey the various related research projects originating in universities and research institutes. Each project’s technique will be explained in terms of its application model, QoS monitoring and constraint checking system, and resource allocation mechanism. In addition, each technique’s shortcomings will be explained.

1 Introduction

Real-time distributed systems require monitoring and management of their quality of service (QoS). Many techniques have been proposed for doing this by projects at various universities and research institutes. Because real-time distributed systems have many applications, the generic term “application” will be used henceforth to refer to one or more real-time programs distributed across computers. Each technique for monitoring and managing QoS typically has a model of the application and its QoS requirements, a runtime monitoring and constraint checking system, and a mechanism for improving QoS by changing the characteristics of the application in some way. Each technique will be explained in terms of those critical points, and its disadvantages will be identified.

2 ART Real-Time Monitor

Overview

(Tokuda, Kotera, and Mercer 1989) describes an early approach to monitoring distributed real-time systems. The ART real-time monitor is the monitoring facility for the ARTS distributed real-time OS. One important feature is that it is object-oriented: it encapsulates timing requirements in “artobjects”. It requires kernel-level monitoring and debugging primitives.

Model

The ART kernel supports the object-oriented view of the entire distributed system. An artobject is a distributed abstract data type used for encapsulating data, methods, and the timing requirements on those methods. The artobject specification is the external interface to the object, consisting of method declarations, timing requirements for the methods, and exceptions for timing constraint violations. The artobject body consists of the method definitions, including recovery methods. For a passive artobject, the body also contains an initial process that handles requests from other objects; the object passively awaits these requests, which other objects issue as remote procedure calls.

The artobject's worst-case timing requirement is called a time fence. The designer of the object must provide this and its time exception handling routine by specifying "within time except recovery-opr()". The notion of time encapsulation means keeping each object's timing error within that object. This is done by putting the timing requirement in the artobject.
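
As a minimal illustration of time encapsulation, the fragment below sketches how a time fence and its recovery routine live inside a single object. It is written in Java for readability; the class, method, and constant names are hypothetical and do not reproduce ARTS's actual "within ... except ..." syntax.

    // Hypothetical sketch of an artobject-style time fence: the worst-case bound
    // and the recovery routine are encapsulated within the object that owns them.
    public class FrameProcessorObject {
        private static final long TIME_FENCE_MS = 20;   // assumed worst-case bound

        public void processFrame(byte[] frame) {
            long start = System.currentTimeMillis();
            decodeAndRender(frame);                      // normal method body
            long elapsed = System.currentTimeMillis() - start;
            if (elapsed > TIME_FENCE_MS) {
                recoveryOpr();                           // timing error handled locally
            }
        }

        private void decodeAndRender(byte[] frame) { /* application work */ }

        private void recoveryOpr() {
            // Designer-supplied recovery, e.g. drop the frame or lower the resolution.
        }
    }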

Monitoring and Constraint Checking

An event tap receives the process state changes. These events are used to calculate the timing measurements for the object method invocations. The timing measurements are then checked against the time fence upon method invocation. Violations of the timing constraints raise exceptions.

Handling of Constraint Violations

A recovery operation is invoked when an exception occurs. The recovery method is flexible; the action taken to recover depends on what the designer requires. In addition, the ART kernel has an integrated scheduler that uses the constraint-checking results to determine schedulability under rate monotonic scheduling.

Conclusions

The artobject encapsulates timing requirements within each distributed object. The ART kernel provides the support for constraint checking and process scheduling; thus, a special operating system (ARTS) is required. There is a recovery mechanism for constraint violations; however, there is no mechanism for dynamic adaptation. Static worst-case timing requirements are used, and timing is the only type of requirement supported.

3 ERDoS

Overview

(Chatterjee et al. 1997) describes a method of modeling applications for adaptive QoS-based management.

Model

The logical application stream model (LASM) models the application’s structure, resource requirements, and end-to-end QoS parameters. The LASM is used on startup of an application. The resource manager (RM) uses the model to initially structure the application, allocate resources to it, and schedule it on the resources. RM also uses the model to perform dynamic reallocation when needed. The LASM does not include system attributes or a user’s QoS requirements. Rather, the application invocation model (AIM) has the user’s QoS requirements. The benefit function (BF) has the QoS preferences of the user.

Below is a formal specification of the LASM:

LASM = (LUoW, LE, LC, LP)

An LUoW is a logical unit of work, which is an atomic task that transforms data and QoS. For example, it transforms data by converting compressed mpeg video to uncompressed video data. It transforms QoS because it uses resources, which affects the overall QoS of the system. An LE is a set of logical edges connecting the nodes. The LC set is the set of logical constraints. These are for mandatory and optional LUoW attributes. They specify application QoS relationships. The LP set is the set of logical parameters, which are QoS parameters that must be specified by the user at startup. The format of the LP set is as follows: {variable name, string, unit, variable type}. For example, an LP set may contain {VL, “Max. End-to-End Latency”, SEC, FLOAT}. The user will be prompted to enter the value of VL, the maximum end-to-end latency, at startup. A formal constraint appears below:

i, i0, PADi(display video)-PETi(fetch video)V.

Here, it is stated that the time from fetching the video from storage to displaying the video is to be less than or equal to the deadline V.
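
As a rough illustration, the LP entry and the latency constraint above might be represented along the following lines. The Java types and field names below are hypothetical and are not taken from ERDoS.

    import java.util.List;

    // Hypothetical representation of one LP entry, e.g. {VL, "Max. End-to-End Latency", SEC, FLOAT}.
    record LogicalParameter(String name, String description, String unit, String type) {}

    // Hypothetical end-to-end latency constraint: for every frame i, the display
    // time minus the fetch time must not exceed the user-supplied bound V.
    class LatencyConstraint {
        private final double maxEndToEndLatencySec;   // value entered by the user at startup

        LatencyConstraint(double maxEndToEndLatencySec) {
            this.maxEndToEndLatencySec = maxEndToEndLatencySec;
        }

        boolean holds(List<Double> fetchTimesSec, List<Double> displayTimesSec) {
            for (int i = 0; i < fetchTimesSec.size(); i++) {
                if (displayTimesSec.get(i) - fetchTimesSec.get(i) > maxEndToEndLatencySec) {
                    return false;                      // constraint violated for frame i
                }
            }
            return true;
        }
    }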

The authors also describe a recursive version of their LASM. In this model, the graph nodes are logical services (LS) rather than LUoW. Each LS is realized (performed) by a logical realization of service (LROS). For example, a video I/O LS has two LROS: jpeg and mpeg. Either can be used, but each LROS has advantages and disadvantages in terms of how it uses various computing and communication resources. In addition, some platforms will only be able to support one of the LROS. The LROS that is actually used is determined at runtime. The user does not need to specify the type of video compression used at the LASM layer.
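
The runtime binding of a realization to a logical service could be pictured roughly as follows; the Java interfaces and names here are hypothetical, not ERDoS code. The key point is that the LASM layer names only the logical service, and a concrete LROS is chosen later based on what the platform supports.

    import java.util.List;

    // Hypothetical logical service with multiple realizations (LROS).
    interface VideoCodecRealization {
        String name();                              // e.g. "mpeg" or "jpeg"
        boolean supportedOn(String platform);       // some platforms support only one LROS
    }

    class VideoIoService {
        private final List<VideoCodecRealization> realizations;

        VideoIoService(List<VideoCodecRealization> realizations) {
            this.realizations = realizations;
        }

        // Bind a realization at runtime; the user never names the codec in the LASM.
        VideoCodecRealization bind(String platform) {
            return realizations.stream()
                    .filter(r -> r.supportedOn(platform))
                    .findFirst()
                    .orElseThrow(() -> new IllegalStateException("no usable LROS on " + platform));
        }
    }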

The system-specific application stream model (SASM) consists of physical units of work (PUoW) that contain the LUoW and the load that the LUoW will place on the resource. The system manager program searches for the LUoW name in each resource’s PUoW set. If there is a match, the LUoW can execute on the resource.
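
A toy version of that lookup might look like the following; the names are assumed, and the real system manager of course does more than a string match.

    import java.util.Map;
    import java.util.Set;

    // Hypothetical check: an LUoW can execute on a resource only if that resource's
    // PUoW set contains an entry whose LUoW name matches.
    class SystemManager {
        boolean canExecute(String luowName, String resource,
                           Map<String, Set<String>> puowNamesByResource) {
            return puowNamesByResource.getOrDefault(resource, Set.of()).contains(luowName);
        }
    }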

Finally, the BF is used when the system cannot provide the required level of QoS. If this is the case, the service should degrade gracefully. The BF is multidimensional and specifies the benefit the application user receives as a function of the QoS provided. Thus, the resource management system can better decide which QoS parameters should be degraded first. An example of a BF appears below:

Benefit = bf(frame jitter, frame size), where frame jitter and frame size are QoS parameters.
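
One way to picture such a benefit function is sketched below. The shape and weights are invented for illustration and are not from the ERDoS paper; the only point is that benefit is a multidimensional function of QoS parameters, which lets the resource manager see which parameter can be degraded at the smallest loss of benefit.

    // Hypothetical two-dimensional benefit function over frame jitter and frame size.
    // Lower jitter and larger frames yield more benefit under these assumed weights.
    class BenefitFunction {
        double benefit(double frameJitterMs, double frameSizePixels) {
            double jitterScore = 1.0 / (1.0 + frameJitterMs);            // assumed shape
            double sizeScore = Math.min(1.0, frameSizePixels / 1.0e6);   // assumed shape
            return 0.6 * jitterScore + 0.4 * sizeScore;                  // assumed weights
        }
    }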

Resource Management

RM takes the LASM, AIM, and BF to determine what physical resources are needed and how to schedule the application according to the QoS requirements. The RM's output goes into the SASM. The SASM has all of the system-specific and QoS requirement-specific information. The user specifies the QoS requirements at application startup.

Conclusions

In conclusion, the ERDoS project employs a collection of models that capture the various aspects of the application, the user’s QoS requirements and preferences, and the expected resource usage. The LASM models the application’s structure, resource requirements, and end-to-end QoS parameters. The AIM models the user’s QoS requirements. The BF models the user’s QoS preferences for graceful degradation. Finally, the SASM contains the PUoWs that enable the system manager program to assign LUoWs to the various resources. One drawback to this approach is that it assumes there are multiple algorithms for doing the same task: the system degrades performance by switching to a different algorithm or by reducing the accuracy of the computations. An option that is not considered is scaling up the application for parallel speedup or changing the allocation of resources.

4 QuO

Overview

(Loyall et al. 1998) describes a way of specifying and measuring QoS in distributed object systems. The purpose of their quality objects (QuO) project is to allow “specification of QoS contracts between client and service providers, runtime monitoring of contracts, and adaptation to changing system conditions.” This allows QoS to be included in distributed object applications, which current distributed object middleware does not support.

The quality description language (QDL) is used for describing states of QoS, the system elements to monitor, and what to do when the QoS state changes. A component of QDL is the contract description language (CDL), which is used for describing QoS contracts.

Model

A QuO application consists of the client, the ORB, the object, and the following additional components:

  • The local delegate of the remote object has the same functional interface as the remote object, but it triggers contract evaluation upon each method call and return.
  • The QoS contract between client and object “describes level of service desired by the client, the level of service the object expects to provide, operating regions indicating possible measured QoS, and actions to take when the level of QoS changes.”
  • System condition objects (sysconds) “interface between contract and resources, mechanisms, objects, and orbs in the system.” These are used for measurement of QoS and for controlling QoS.

Developing a QuO application requires QoS developers to “develop QoS contracts, system condition objects, callback mechanisms, and object delegate behavior.” QuO has a framework for doing this, which consists of the following components:

  • QDLs for describing contracts, sysconds, and adaptive behavior of objects and delegates
  • QuO kernel for coordinating contract evaluation and monitoring of sysconds
  • Code generators for combining QDL descriptions, QuO kernel code, and client code.

QDL components consist of the CDL, structure description language (SDL), and resource description language (RDL). The CDL, emphasized here, is used to describe the client-object contract, desired and expected QoS, regions of possible QoS levels, sysconds to monitor, and what to do when conditions change. The CDL description goes into the CDL code generator, which generates Java code for contract classes and instances.

Monitoring and Constraint Checking

When a QuO application makes a remote method call, eight steps happen:

  1. The client makes the remote method call, which goes to the delegate.
  2. The delegate evaluates the contract.
  3. The contract retrieves the values of sysconds and determines the current region or regions.
  4. The contract returns the current region or regions.
  5. The delegate decides what to do based on the current regions. The default action is to pass the method call through to the remote object.
  6. The remote object is invoked, the method is executed, and the value is returned.
  7. The delegate evaluates the contract again, the contract retrieves the values of the sysconds, and the contract returns the current region or regions.
  8. The delegate decides what to do with the return value. By default, it will return the value back to the client.

From steps 2, 3, and 7, it can be seen that the delegate triggers contract evaluation, getting the measurements of sysconds.
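
A rough sketch of this call path appears below, written in Java with hypothetical names; QuO's generated delegates differ in detail, but the shape is the same: the delegate exposes the remote object's interface and wraps each call and return in a contract evaluation.

    import java.util.List;

    // Remote interface that both the real object and the delegate implement (hypothetical).
    interface RemoteVideoServer {
        byte[] getFrame(int index);
    }

    // Hypothetical contract: returns the regions whose predicates over syscond values hold.
    interface Contract {
        List<String> evaluate();
    }

    // Hypothetical QuO-style delegate: same functional interface as the remote object,
    // but every call and return triggers contract evaluation.
    class VideoServerDelegate implements RemoteVideoServer {
        private final RemoteVideoServer remote;
        private final Contract contract;

        VideoServerDelegate(RemoteVideoServer remote, Contract contract) {
            this.remote = remote;
            this.contract = contract;
        }

        @Override
        public byte[] getFrame(int index) {
            List<String> regions = contract.evaluate();  // steps 2-4: read sysconds, find active regions
            if (regions.isEmpty()) {
                // Step 5: the delegate could block or buffer here; the default is to pass through.
            }
            byte[] frame = remote.getFrame(index);       // step 6: remote invocation
            contract.evaluate();                         // step 7: contract re-evaluated on return
            return frame;                                // step 8: default is to return the value
        }
    }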

The contract consists of nested regions describing the relevant possible states of QoS in the system. Each region is defined by a predicate on the values of system condition objects; a region is active when its predicate is true. The contract determines which regions are active by evaluating their predicates and passes the list of active regions to the delegate.

Resource Management

The delegate decides to block, buffer, or pass through the call or return value. Blocking and buffering are done whenever QoS is low. Changes in syscond objects can trigger contract evaluation, if the contract observes those objects. The transition from one QoS region to another can trigger transition behavior. Transition behavior includes client callbacks or method calls on syscond objects. In other words, a transition can trigger behavior such as reallocation of resources.
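
A small sketch of such transition behavior follows; the callback interface and names are assumed for illustration rather than taken from QuO.

    // Hypothetical transition behavior: moving from one QoS region to another can
    // trigger a client callback or a method call on a syscond object, for example
    // to request that resources be reallocated.
    interface ClientCallback {
        void qosRegionChanged(String fromRegion, String toRegion);
    }

    class RegionTransitionHandler {
        private final ClientCallback callback;

        RegionTransitionHandler(ClientCallback callback) {
            this.callback = callback;
        }

        void onTransition(String fromRegion, String toRegion) {
            callback.qosRegionChanged(fromRegion, toRegion);  // let the client adapt
            // A syscond control method could also be invoked here to trigger reallocation.
        }
    }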

Conclusions

In conclusion, the purpose of QuO is to allow contracts to be specified between clients and servers, to monitor those contracts, and to adapt the system when conditions change. QuO allows QoS to be included in distributed object applications. A problem with the QuO approach is that clients must be modified to respond to callback requests from the middleware. Another problem is that the granularity of the specification language is method-level; this fine granularity requires domain-specific knowledge. Furthermore, the approach requires one or more QoS developers who are knowledgeable in the areas of distributed objects and the three languages (CDL, SDL, and RDL). These developers must design a complete system for monitoring and managing a specific application. It would be simpler if the language granularity were coarser and if only one language were needed for specification of structure, resources, and QoS requirements.

5 RT-ARM project

Overview

(Rosu et al. 1997) describes the method by which the RT-ARM project’s system adaptively allocates resources in complex real-time applications.

Model

The application is modeled using the internal application model. This model consists of a resource usage model (RUM) and an adaptation model (AM). The RUM describes which resources an application will need, how much of each, and how that need varies at runtime. A component is represented as a (component, event stream) tuple. The AM describes an application’s useful configurations in terms of its needs and the overhead of configuration.

There are two types of RUM: static and dynamic. The static RUM describes expected computation and communication requirements. Its parameters include parallelism level, execution time, and processor speed factor. When the application makes an explicit request for resources, it specifies the static RUM. The parameters can be estimated with algorithm analysis or code profiling. The processor speed factor describes the performance of the node used for profiling, such as the machine's FLOPS rate. The dynamic RUM contains information about how the requirements vary at runtime with respect to the static RUM. The dynamic RUM parameters are the varying parameters of the static RUM, such as the execution factor. Each factor is a ratio between the corresponding static RUM parameter and the maximum value of that metric monitored over an application-specific time interval. The adaptive resource allocation (ARA) mechanism uses the static RUM and the deadline to analyze the schedulability of programs on each resource. In addition, it uses the static RUM to reserve resources. The dynamic RUM allows the ARA to make changes to the configuration when the resource needs exceed the specified resource requirements.
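
A rough sketch of how an execution factor might be computed appears below. The direction of the ratio and the names are assumptions made for illustration (consistent with the scaling described later, where usage larger than the static RUM yields scaled requirements above the initial specification); the paper itself gives no code.

    // Hypothetical dynamic-RUM execution factor: the maximum execution time observed
    // over a monitoring window divided by the execution time declared in the static RUM.
    class DynamicRumFactor {
        double executionFactor(double staticExecutionTimeMs, double[] monitoredExecutionTimesMs) {
            double max = 0.0;
            for (double t : monitoredExecutionTimesMs) {
                max = Math.max(max, t);
            }
            return max / staticExecutionTimeMs;   // > 1.0 means needs exceed the static specification
        }
    }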

Resource Management

The ARA uses the RUM to determine how to meet the application's performance requirements given its resource requirements. The ARA uses the AM to decide how to reallocate resources with no negotiation overhead. Before describing RT-ARM's AM, some definitions will be useful. Reconfiguration in this case means scaling an application up or down, and it has overhead of two types: application-independent and application-dependent. The application-independent overhead includes the overhead of starting a new replica and of reserving resources. The application-dependent overhead for starting a new application is the adaptation overhead, which is determined by the requirements of the specific components when the application is reconfigured.

The AM describes the usable configurations and the adaptation overheads for each component. An acceptable configuration specification consists of a configuration id, a static RUM, and adaptation overheads (startup and shutdown procedures). Adaptation overhead is specified by the amount of state to transfer and the execution time. Whenever an application makes an explicit request for resources, it specifies the AM.
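
In data-structure terms, an acceptable configuration entry might look roughly like the following; these Java record types and field names are hypothetical, not RT-ARM's.

    // Hypothetical entries of an adaptation model (AM): each acceptable configuration
    // carries a configuration id, its static RUM, and its adaptation overheads.
    record StaticRum(int parallelismLevel, double executionTimeMs, double processorSpeedFactor) {}

    record AdaptationOverhead(long stateToTransferBytes, double executionTimeMs) {}

    record AcceptableConfiguration(int configurationId, StaticRum staticRum,
                                   AdaptationOverhead startupOverhead,
                                   AdaptationOverhead shutdownOverhead) {}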

Monitoring and Constraint Checking

The AM and dynamic RUM are used in the following way. An application specifies an AM in order to request its initial resources. The ARA configures the application based on the request and current resource availability. At runtime, components are described by current RUMs. The current static RUM is the RUM for the acceptable configuration chosen in the latest allocation. The monitoring information and the current static RUM are combined to produce the current dynamic RUM. When a usage threshold is exceeded, or performance comes close to it, the ARA may reallocate resources. The static RUMs are scaled by the corresponding dynamic RUM parameters. If the current usage is larger than its current static RUM, then the scaled static RUMs describe requirements that are greater than the initial specifications; the opposite applies when current usage is smaller than the current static RUM. The estimation of computational needs is done in the following way: