Resource Estimation in Active Networks
A Discussion of Active Networks at the Node, Routing, and System Level
Phuong T. Tran
TSYS Department of Computer Science
Columbus State University
Columbus, GA, USA
Abstract— The active network architecture provides a model in which a network can be extensible, programmable, and offer extensibility to the increasing demands of processing data in a customizable way for applications. Traditional networks cannot do this because they were only intended to transmit data from one end point to another and nodes are passive transfer stations. In contrast, active nodes would act as providers of CPU processing and bandwidth resources and be able to execute programs in packets. The major limitation of active networks is the resource demand, possible congestion, and degraded performance at active nodes when multiple applications try to access a node simultaneously. Devi et al. propose a solution to this through an estimation model and also a CPU scheduling algorithm. With their model they achieve a “Fairness” load at the nodes; however, their model only takes into account the nodes themselves and not any routing paths or the possible heterogeneity of nodes. We propose an extension of their model based on Sun et al.’s routing model for mobile networks to take into account both nodes and paths as well as a systems analysis similar to Rolia et al.’s.
I. Introduction
In the traditional network model, packets are transported from one end point to another with no modification and any computation is limited to header processing and signaling. The network nodes themselves are passive and their sole purpose is to transfer packets. Nodes do not possess the processing power to allow programmability and customization for user applications [1]. In addition, management of a traditional network is limited to the end nodes typically uses the Simple Network Management Protocol (SNMP). The drawbacks of the traditional network and SNMP include the inability to handle high volumes of processed network data and inefficiency during network congestion. Due to the central management model of SNMP commands may take time to arrive or may be lost [2].
The active network architecture model seeks to address these limitations of the current, traditional network. Its goal is to create a more extensible network through the use of “active” nodes in contrast to the “passive” nodes of traditional networks. These nodes can perform computations on the data that they are transferring, and users can also inject programs into the network. The latter allows the user to customize node processing and make it application-specific [1]. This creates an infrastructure that can be customized for application requirements by allowing for extensibility, modifications, and upgrades [2].
Naturally, active nodes also change the management of a network. By allowing active nodes, the distribution of management functions and management applications and tools can also be distributed among the nodes instead of just being managed centrally at the end nodes. For example, management rules can be relocated from management centers to programs at the nodes. Despite these advantages, active nodes also increase the complexity of managing the network. Management must now include the added functionalities of active nodes as well as the customizations injected into the active network. The traditional SNMP management was not designed to handle these complexities. The network management of an active network is another area of research, but that discussion is beyond the scope of this paper [2].
The active network architecture, however, is not without its limitations and a major issue is the problem of resource allocation. The two main resources discussed here are CPU processing and network bandwidth. When multiple applications access the actives node services simultaneously, degradation in performance can occur. In terms of security, “ill-behaved” sources can contend with and dominate other active packets for CPU processing and network bandwidth by consuming too many resources and denying them resources. This issue of resource allocation is the subject of Devi et al.'s paper. First, they address the issue of resource prediction, and secondly they address the issue of resource management by a CPU scheduling algorithm [2].
In regards to processing and consumption of resources, Devi et al. describe two types of processing: header processing and payload processing. In header processing applications, read and write operations occur in the header of a packet. Therefore the size of the packet itself usually does not affect the processing complexity. Header processing includes such services as IP forwarding, transport layer classification, and QoS routing. In payload processing, however, operations are executed over the entire packet and therefore packet size does affect processing complexity and will vary according to packet size. Examples of payload processing services include IPSec Encryption, packet compression or fusion, and packet content transcoding [2]. One of Devi et al.'s goal is to assess and estimate the resource consumption in payload processing. They note that the estimation process needs to be adaptive and take into account variations in packet size as well as processing load and operating system scheduling. Their first set of experiments address the estimation of resources. They follow this with a resource allocation method through a novel CPU scheduling algorithm, in which an estimation model can play a key role [2].
To determine an estimation model, Devi et al. evaluate three existing methods for estimating payload processing of a packet: single exponential smoothing (SES), adaptive-response rate single exponential smoothing (ARRSES), and Holt’s two-parameter estimation. ARRSES is a modified form of SES wherein the parameter, alpha, is modified in response to data patterns, but the specific mathematical basis of these methods will not be discussed here. All three methods were compared to actual CPU processing time and bandwidth. The next few figures from Devi et al. show the results that they achieved.
In Figure 1, Devi et al.'s graphs show the comparison between actual processing time (shown left) versus the SES estimate shown right). As can be seen, SES is a good estimation of CPU processing time, as determined by Devi et al. Figure 2 is similar to Figure 1, but instead of CPU processing time, it shows bandwidth comparison instead. On the left is the actual bandwidth use versus the actual use on the right.
Figure 1. Actual processing time (Left) vs. SES estimate (Right)
Figure 2. Actual bandwidth (Left) vs. SES estimate (Right)
From these results, the authors found that both SES and ARRES both adequately estimate the actual processing time and bandwidth. In regards to Holt's two-parameter equation, the authors found that this method was not adequate for estimation purposes as shown in Figure 3.
Figure 3. Holt’s two-parameter estimation
Having established an estimation method, Devi et al. next propose a CPU scheduling algorithm for an active node. The active node model they propose consists of four parts: estimation model, CPU scheduler, output scheduler, and a feedback element. Figure 4 is a diagram of this active node architecture. CPU flows into the input scheduler and the CPU picks a packet from the input queue and processes it. After processing, it sends it into the output flows to the output scheduler, which then chooses a packet in the queue for transmission. Feedback from the output scheduler informs the CPU scheduler about the output flow, and the CPU scheduler uses this information to schedule packets for processing. Note the interaction between the estimation model and the input scheduler for the CPU. Though they do not include the estimation model in their experiments, the model becomes an important, extensible part of their active node architecture. An example of an estimation model that can be incorporated there is the SES or ARRSES method that they determined in their other experiments.
Figure 4. Active node architecture
They test two algorithms: CPU scheduling with feedback and CPU scheduling without feedback. The main difference in the algorithm is the use of the Output Scheduler in the CPU scheduling with feedback. This integrates the resource allocation algorithms in the Input Scheduler and the Output Scheduler into a composite algorithm and handles the input and output flows to achieve fairness. The measure that they use for analyzing these algorithms is Fairness, which is a measure of resource usage by a flow in a certain interval of time. In an ideal system, perfect Fairness is 2/N, where N is the number of active flows in a certain time interval. Fairness in the nodes in each algorithm is compared to the perfect Fairness value. Figure 5 is a comparison of the CPU scheduling with feedback algorithm and the CPU scheduling without feedback algorithm to the ideal Fairness. As shown in the figure, the CPU scheduling with feedback shows very little deviation from ideal Fairness.
Figure 5. Fairness measure
In their conclusion, Devi et al. extend their CPU processing algorithm by suggesting the inclusion of an estimation model (as shown in their active node architecture) to analyze the delay in packets and also to estimate the processing requirements of the packets. This information can then be input into the Input Scheduler. With this scheduling algorithm based method, Devi et al. achieve near ideal fairness with the CPU scheduling with feedback algorithm. However, the authors give little basis for their choice of Fairness as a measurement and do not give evidence to support that the measure is a good one for measuring the resources used by a flow at nodes. A comparison with other algorithms and methods using their Fairness method would have made their argument more compelling. Nor did they take into account the broader network and the possibility of heterogeneity in the active nodes in terms of resource availability. Their level of analysis is limited to the individual node itself and does not take into account other nodes or the paths between nodes. The next section will discuss other methods of resource management and prediction, followed by an analysis of the problem and proposed solution and concluding with a discussion of future research.
II. Related Work
Resource management can be determined on several levels from the CPU scheduling algorithms proposed by Devi et al. in an active node, to studying the paths that data traverse between nodes, to predicting the resource demands of an entire system. This section will look at related research in resource management and prediction.
In regards to the paths between nodes and routing, Sun et al. studied routing in multi-hop, heterogeneous mobile wireless networks. In these mobile networks, nodes can vary in their quality of service (QoS) since by nature they provide service to networks that may have little infrastructure, while maintaining a high level of flexibility. An example of such a network would be a university campus wireless network. Here resources are shared and this makes resource estimation more difficult. Sun et al.'s focus is on multimedia streaming in a network that may be limited by battery life, bandwidth, node movement (and subsequent link breaks), and a distributed network with no centralized node. To maintain QoS Sun et al. proposes a routing protocol that adapts to the heterogeneity of the nodes. In their proposed model, traffic was routed according to node capabilities and mobility. For example, multimedia streaming would require more resource-powerful devices and nodes accept and forward requests based on its resource availability and the resource demand of the packet and also the mobility of the node. Nodes would check its “power” (availability) against the “power” (demand) of the packet before forwarding a packet. In addition, mobile nodes do not respond to a route request unless it is the destination. Route requests are handled by fixed nodes [3].
In addition to looking at a node or path, one can also study at the systems level a distributed network. Rolia et al. proposed a model for estimating the load in a distributed system, a model that can be useful for resource estimation of the system and in designing a distributed network. The model they propose is called Demand Estimation with Confidence. The purpose of the model is to transform a “customer’s” use of a system’s capabilities into an estimate for the corresponding demands of the service. The model also seeks to combine the usage of these capabilities into a “workload mix” in order to estimate resource demands. Their model uses a series of benchmarks based on the service, its capabilities, the system functions it uses, and the resource demands of that function to derive the “workload mix” resource demand. Their model is an alternative to traditional regression models of resource demand which had statistical limitations such as a wide confidence interval. Obviously, such demand prediction cannot be used in real time, but it is very important for system sizing and capacity planning [4].
III. Proposed Solution
Devi et al.'s active node architecture and scheduling algorithm is an elegant model at the node level. The limitation of their model or rather the scope of their paper is considering the performance of only the node and only homogeneous nodes. As Sun et al. notes, the routing paths of a network can affect performance as well as the quality of the nodes themselves, such as whether they are fixed or mobile. Moreover, the design of a system is also important, as in Rolia et al.
Since the performance of an active network depends greatly on the varied computations at the node at a time interval, Rolia et al.'s Demand Estimation with Confidence of resources could potentially be a good basis for an actual design and implementation of an active network because that estimation is focused more on the “workload mix” as a whole rather than from a statistical regression of per function demands. Active nodes are particularly susceptible to packets that would compete for resources. A study of resource demands by looking at controlled benchmarks of performance and analyzing combinations of functions would be beneficial and necessary to the functioning of an active node.
Because of the distributed and decentralized nature of a multi-hop, heterogeneous, and mobile network, an active network may actually be very beneficial to that type of network. With active nodes, management of each node would be easier because each node can share in the management functions and tools, and the decentralized nature of would not be an issue with an active network because the active node could perform such computations. The routing protocol could potentially also be modified to better route packets due to that added computation ability. On the flip side, the routing protocol described in Sun et al, would be an excellent extension of Devi et al.'s active node architecture. As necessarily limited by the scope of their paper, Devi et al. do not discuss the incoming flows to the active node. By using a routing protocol similar to Sun et al.'s the incoming flow can be controlled to limit flows to active nodes that may not be able to handle the packets. This would decrease the chance of packets completely consuming the resources of an active node at the expense of other packets.