Supercharged PlanetLab Platform

Michael Williamson

Washington University in St. Louis

Background

Internet Routing

The modern internet is a massive interconnection of millions of computers situated all around the globe. As the name suggests, any computer connected to the internet has the ability to communicate with any of the other millions of machines. How does this remarkable process take place? At the backbone of the modern internet are packet-switched networks. Anytime a machine wants to send a piece of information, it takes that data, adds various control headers, and transmits the resulting “packet” onto whatever link it is connected to. Each link can carry only a limited amount of data and may be shared by any number of users.
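
To make the idea concrete, the following sketch (in C) builds a packet out of a payload and a few control headers. The header layout here is a simplified stand-in invented for illustration, not a real IP header.

    /* A minimal sketch of packet encapsulation, using a simplified,
     * hypothetical header format rather than a real IP header. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    struct header {            /* control information added to the data */
        uint32_t src_addr;     /* who sent the packet                   */
        uint32_t dst_addr;     /* where it should be delivered          */
        uint16_t length;       /* size of the payload in bytes          */
    };

    struct packet {
        struct header hdr;
        char payload[64];      /* the application data being sent       */
    };

    int main(void) {
        struct packet pkt;
        const char *data = "hello";

        /* the sender attaches control headers to its data ...          */
        pkt.hdr.src_addr = 0x0A000001;              /* e.g. 10.0.0.1 */
        pkt.hdr.dst_addr = 0x0A000002;              /* e.g. 10.0.0.2 */
        pkt.hdr.length   = (uint16_t)strlen(data);
        memcpy(pkt.payload, data, pkt.hdr.length);

        /* ... and transmits the resulting packet onto its link.        */
        printf("packet of %u bytes from %08x to %08x\n",
               pkt.hdr.length, pkt.hdr.src_addr, pkt.hdr.dst_addr);
        return 0;
    }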

Once a packet leaves its source machine, the internet infrastructure, through a seemingly magical process known as “routing,” ensures the data reaches the proper destination. The pieces of machinery that make this essential process possible are known as routers. From a fundamental perspective, a router is nothing more than a sophisticated direction giver. When a packet arrives, the router scans the packet’s control headers. After comparing the headers to internally stored route information, the router transmits the packet onto the appropriate link.
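
The sketch below illustrates the lookup a router performs, using a longest-prefix match over a small forwarding table. The table, the addresses, and the link numbers are invented for illustration; real routers use far larger tables and specialized hardware, but the decision being made is the same.

    /* A minimal sketch of a router's lookup step: compare the packet's
     * destination to stored routes and pick the outgoing link. */
    #include <stdint.h>
    #include <stdio.h>

    struct route {
        uint32_t prefix;       /* network address            */
        uint32_t mask;         /* which bits of it matter    */
        int      out_link;     /* link to forward matches on */
    };

    static const struct route table[] = {
        { 0x0A000000, 0xFF000000, 1 },   /* 10.0.0.0/8    -> link 1 */
        { 0x0A010000, 0xFFFF0000, 2 },   /* 10.1.0.0/16   -> link 2 */
        { 0x00000000, 0x00000000, 0 },   /* default route -> link 0 */
    };

    /* Longest-prefix match: of all routes that cover the destination,
     * choose the most specific one. */
    static int lookup(uint32_t dst)
    {
        int best = -1;
        uint32_t best_mask = 0;
        for (unsigned i = 0; i < sizeof table / sizeof table[0]; i++) {
            if ((dst & table[i].mask) == table[i].prefix &&
                (best < 0 || table[i].mask > best_mask)) {
                best = table[i].out_link;
                best_mask = table[i].mask;
            }
        }
        return best;
    }

    int main(void) {
        printf("10.1.2.3 -> link %d\n", lookup(0x0A010203)); /* link 2 */
        printf("10.9.9.9 -> link %d\n", lookup(0x0A090909)); /* link 1 */
        printf("8.8.8.8  -> link %d\n", lookup(0x08080808)); /* link 0 */
        return 0;
    }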

With internet use exploding, routers are becoming more and more important in the everyday lives of just about everyone. Thus, any technology that makes a router faster, more efficient, or more useful can be extremely valuable. At Washington University in St. Louis, that is precisely the goal of the Applied Research Laboratory, or ARL, whose researchers are constantly working to create the next generation of internet router technology.

Overlay Networks

When two networked computers want to communicate, they do not necessarily need to use the internet. If the two nodes are located close to one another, all they need is a wire directly connecting the two as shown in Figure 1.

On the other hand, when two computers are separated by thousands of miles, it might be extremely costly for a user to run a direct cable between the two. In this situation, using the internet is much more practical. Instead of running a several thousand mile wire, all the user of Node A has to do is connect a cable to the nearest internet connection as shown in Figure 2. Then, through the magic of internet routing, the data Node A sends is automatically delivered to Node B.

An overlay network is the combination of these two concepts. It gives the illusion of direct links between nodes over the public internet. Naturally, these “direct links” are not actual physical wires running between the machines. Instead, through various packet addressing tricks, an application on Node A has the illusion that it is directly communicating with Node B. Physically, the network topology looks like Figure 2, but logically, it looks like Figure 1.

An easy way to implement an overlay network is to use an IPv4 tunnel. A programmer assigns one logical IP address to every node within the network. Then, he or she writes a kernel module that keeps a mapping of logical IP addresses to physical ones. Whenever an application tries to send a packet to an address that belongs to the overlay network, the module encapsulates the data inside a packet that contains the correct physical address corresponding to the logical destination node. In this scenario, the application is not concerned with the physical layout of the network. Any member node could have multiple physical addresses, and the application will not care. All it needs to know are the logical addresses of the nodes it is communicating with, and the kernel module will take care of the physical addressing for it.
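
The following user-space sketch illustrates the idea (an actual implementation would live in a kernel module, as described above). The mapping table, the addresses, and the packet layout are all invented for illustration.

    /* A user-space sketch of the tunnel idea: look up the physical
     * address for a logical destination and wrap the data in an outer
     * packet that the real internet can route. */
    #include <stdint.h>
    #include <stdio.h>

    struct mapping {
        uint32_t logical;      /* overlay address the application uses */
        uint32_t physical;     /* real internet address of that node   */
    };

    static const struct mapping tunnel_map[] = {
        { 0xC0A80001, 0x80FF0001 },   /* 192.168.0.1 -> invented public addr */
        { 0xC0A80002, 0x80FF0002 },   /* 192.168.0.2 -> invented public addr */
    };

    struct outer_packet {
        uint32_t physical_dst;   /* header the real internet routes on  */
        uint32_t logical_dst;    /* inner (encapsulated) overlay header */
        const char *payload;
    };

    static int encapsulate(uint32_t logical_dst, const char *data,
                           struct outer_packet *out)
    {
        for (unsigned i = 0; i < sizeof tunnel_map / sizeof tunnel_map[0]; i++) {
            if (tunnel_map[i].logical == logical_dst) {
                out->physical_dst = tunnel_map[i].physical;
                out->logical_dst  = logical_dst;
                out->payload      = data;
                return 0;
            }
        }
        return -1;   /* destination is not part of the overlay */
    }

    int main(void) {
        struct outer_packet pkt;
        if (encapsulate(0xC0A80002, "hello overlay", &pkt) == 0)
            printf("send to physical %08x, inner dest %08x\n",
                   pkt.physical_dst, pkt.logical_dst);
        return 0;
    }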

Recently, researchers have found an ever-increasing number of applications that make use of overlay network technologies. One of the more popular uses of such a network is the PlanetLab research platform.

PlanetLab Platform

Ideally, any software developer wants to be able to test an application in a real-world situation using real data. In the case of a distributed application, this is not an easy task. The developer would need multiple machines at separate locations, possibly thousands of miles apart, connected through the internet. This sort of real-world testing is essential so that the developer can see how the application reacts to factors such as real-world packet delay and communication loss.



Unfortunately, most computer scientists do not have access to machines at several widely spread locations. For this reason, researchers developed and deployed the PlanetLab Platform. Essentially, PlanetLab is a network of several hundred “nodes” located at various places all over the world and connected by the public internet.

A PlanetLab “node” is simply a standard server running the PlanetLab software (a modified version of Linux). A user of the platform is allocated a “slice” on a subset of these nodes. He or she can then log in to any of those nodes and run applications that are capable of communicating with any of the other nodes within that user’s subset. Recently, PlanetLab has become a useful tool for running large-scale internet experiments as well as for the development of new networking protocols.

Performance Limitations

As PlanetLab has grown in popularity, so has the need for a system with better performance. The problem with the existing platform is that each node is only a standard server. The isolation between user slices is only at the application level; there is no hardware to back it up. As a result, node delay can be considerable. Take the simple example of a user trying to route traffic from Node A in Figure 4 through Node B to Node C. To do any processing on packets at Nodes B and C, the user’s application has to wait for the operating system to give it time on the CPU. Depending on the timing granularity of the operating system, this can take as long as tens of milliseconds. As the number of users and the amount of traffic using PlanetLab nodes increases, this problem is only going to get worse. High-throughput and latency-sensitive applications will become increasingly impractical or even impossible.

Supercharged PlanetLab Platform

Introduction

The Supercharged PlanetLab Platform or SPP is the answer to the performance limitations of the original technology. The SPP is not a standard server. In addition to one or more general purpose machines (also known as general processing engines or GPEs), it consists of any number of line cards (LCs), any number of specialized network processors (also known as network processing engines or NPEs), and a control processor (CP). The SPP uses this extra hardware to allow a user to allocate resources on an NPE, also known as allocating a “fastpath”. From his or her slice running in a virtual machine on a GPE, the user can then manipulate the fastpath, thus controlling the traffic flowing through it. The end result is a system that is capable of processing a user’s networking traffic with little or no intervention from the general processing machine. While the original PlanetLab platform only supports software resource isolation between users at the application level, the SPP provides this as well as hardware resource isolation. Naturally, this drastically improves the performance of the PlanetLab Platform.

Implications for Routers



Virtualization is one of the main focuses of current computer science research. The applications of this technology have proved extremely useful in industry. For example, platform virtualization involves separating an operating system from the underlying system resources. It has allowed server operators to install multiple operating systems that run concurrently in isolation on a single set of physical components. As one might expect, this makes server infrastructure significantly less expensive. In the past, one needed to buy separate physical machines to support multiple operating systems. Now, that is no longer necessary.


While the SPP can be used to improve the performance of current PlanetLab applications, that is not its intended purpose. The goal of the technology is to virtualize a router. Normally, a router resembles the box in Figure 5. Packets enter the unit on one of the physical interfaces to the left. Next, they are queued until the network processor becomes available. When it is free, the processor does a lookup into the forwarding table based on the packet’s headers. As a result of that lookup, the packet gets copied to the appropriate egress queue where it waits until it gets transmitted onto the corresponding physical interface.
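
The sketch below walks through that same path in miniature: an ingress queue, a forwarding-table lookup, and per-interface egress queues. The two-entry lookup rule and the fixed-size queues are invented for illustration.

    /* A minimal sketch of the forwarding path: packets are pulled off an
     * ingress queue, looked up in a forwarding table, and copied to the
     * egress queue of the chosen interface. */
    #include <stdint.h>
    #include <stdio.h>

    #define NUM_IFACES 2
    #define QUEUE_LEN  8

    struct packet { uint32_t dst; };

    struct queue {
        struct packet slots[QUEUE_LEN];
        int count;
    };

    static struct queue ingress;
    static struct queue egress[NUM_IFACES];

    /* toy forwarding table: 10.0.0.0/8 -> interface 0, everything else -> 1 */
    static int forwarding_lookup(uint32_t dst)
    {
        return (dst >> 24) == 10 ? 0 : 1;
    }

    static void enqueue(struct queue *q, struct packet p)
    {
        if (q->count < QUEUE_LEN)
            q->slots[q->count++] = p;   /* otherwise the packet is dropped */
    }

    /* one pass of the network processor over the ingress queue */
    static void process(void)
    {
        for (int i = 0; i < ingress.count; i++) {
            int out = forwarding_lookup(ingress.slots[i].dst);
            enqueue(&egress[out], ingress.slots[i]);
        }
        ingress.count = 0;
    }

    int main(void) {
        enqueue(&ingress, (struct packet){ 0x0A000001 });   /* 10.x -> if 0 */
        enqueue(&ingress, (struct packet){ 0xC0A80001 });   /* else -> if 1 */
        process();
        printf("egress 0: %d packet(s), egress 1: %d packet(s)\n",
               egress[0].count, egress[1].count);
        return 0;
    }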


Usually, there is a single router instance implementing a solitary network layer protocol per physical device. With the SPP, this is no longer true. As shown in Figure 6, a fastpath can also be thought of as a meta-router. The queues, the forwarding table, the switching fabric, and the network processor normally associated with a router are all available within a fastpath. Because the SPP supports multiple fastpaths, it also supports multiple meta-routers, all running in isolation from one another. Naturally, there will only be a limited number of physical interfaces connected to the SPP, meaning that each meta-router will probably not be able to own its own set of physical links. The SPP gets around this dilemma by supporting meta-interfaces, which are simply logical interfaces overlaid on the physical ones. Each fastpath can request some of the available bandwidth on any of the physical interfaces, giving a meta-interface the illusion of having its own dedicated link with a guaranteed minimum bandwidth.
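
The admission check behind such a bandwidth guarantee might look like the sketch below. The link capacities, the Mb/s units, and the simple first-come accounting are assumptions made for illustration, not the SPP's actual policy.

    /* A minimal sketch of meta-interface bandwidth sharing: each physical
     * interface has a capacity, and a fastpath may reserve a portion of it
     * only if enough is left to honor the guarantee. */
    #include <stdio.h>

    #define NUM_PHYS 2

    static int capacity[NUM_PHYS] = { 1000, 1000 };  /* total Mb/s per link */
    static int reserved[NUM_PHYS] = { 0, 0 };        /* already promised    */

    /* Try to create a meta-interface with a guaranteed minimum bandwidth on
     * the given physical interface.  Returns 0 on success, -1 on failure. */
    static int reserve_meta_interface(int phys, int mbps)
    {
        if (phys < 0 || phys >= NUM_PHYS)
            return -1;
        if (reserved[phys] + mbps > capacity[phys])
            return -1;                /* would overcommit the physical link */
        reserved[phys] += mbps;
        return 0;
    }

    int main(void) {
        printf("fastpath A, 600 Mb/s on link 0: %s\n",
               reserve_meta_interface(0, 600) == 0 ? "granted" : "denied");
        printf("fastpath B, 600 Mb/s on link 0: %s\n",
               reserve_meta_interface(0, 600) == 0 ? "granted" : "denied");
        printf("fastpath B, 600 Mb/s on link 1: %s\n",
               reserve_meta_interface(1, 600) == 0 ? "granted" : "denied");
        return 0;
    }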

In the platform world, virtualization allows multiple operating systems to run on the same physical resources. The SPP brings the same functionality to the router world. The platform allows users to run multiple configurable routers, possibly implementing different routing protocols, on one physical device without interfering with one another, while still guaranteeing minimum bandwidth requirements.

A Look inside the SPP

Figure 7: The internal components of the SPP. The hardware components are the gray rectangles while the small ovals represent the various pieces of control software. There can be multiple NPEs, multiple GPEs, and multiple line cards, but only a single CP.

The SPP is meant to be easily reproducible at a reasonable cost. For that reason, it uses only off-the-shelf hardware components that can be purchased by anyone. For an in-depth discussion of the actual components used, see [JT07].

What makes the platform work is the control software, which allows the hardware to cooperate in ways that have never been done before. Specifically, as shown by the ovals in Figure 7, there are three main software components: the Resource Management Proxy or RMP, the System Resource Manager or SRM, and the Substrate Control Daemon or SCD.

Arguably, the most important piece of software in the platform is the System Resource Manager. Responsible for all resource allocations, the SRM is the only component that has global knowledge of the system state. Because of its role, there is only a single SRM running on a single control processor in every SPP. The other software daemons, the SCD and the RMP, simply implement the mechanisms necessary to enforce the resource constraints imposed by the SRM.

On the user side of the SPP sits the RMP. Every time a slice application makes a request, the message has to pass through the RMP. For this reason, an instance of the RMP runs on every General Processing Engine. When an application wants to allocate resources or collect data from its fastpath, it forwards the request to the RMP, which then takes the necessary actions on the slice’s behalf. If the request deals with resources (allocating a fastpath, queues, filters, etc.), the RMP sends a message to the SRM. If the request manipulates the user’s fastpath, the RMP forwards it directly to the SCD on the NPE where that fastpath resides.
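
The dispatch decision the RMP makes can be pictured as in the sketch below. The request types and the two stub functions are hypothetical; they only illustrate the split between resource requests (sent to the SRM) and fastpath manipulation (sent to the SCD).

    /* A minimal sketch of the RMP's routing of slice requests. */
    #include <stdio.h>

    enum request_type {
        REQ_ALLOC_FASTPATH,    /* resource requests -> SRM            */
        REQ_ALLOC_QUEUE,
        REQ_ALLOC_FILTER,
        REQ_WRITE_FILTER,      /* fastpath manipulation -> SCD        */
        REQ_READ_COUNTERS
    };

    struct request {
        enum request_type type;
        int slice_id;
    };

    static void send_to_srm(const struct request *r)
    {
        printf("slice %d: forwarded to SRM on the CP\n", r->slice_id);
    }

    static void send_to_scd(const struct request *r)
    {
        printf("slice %d: forwarded to SCD on the slice's NPE\n", r->slice_id);
    }

    /* The RMP's dispatch of a slice request to the right daemon. */
    static void rmp_dispatch(const struct request *r)
    {
        switch (r->type) {
        case REQ_ALLOC_FASTPATH:
        case REQ_ALLOC_QUEUE:
        case REQ_ALLOC_FILTER:
            send_to_srm(r);        /* resource allocation: the SRM decides */
            break;
        case REQ_WRITE_FILTER:
        case REQ_READ_COUNTERS:
            send_to_scd(r);        /* manipulating an existing fastpath    */
            break;
        }
    }

    int main(void) {
        rmp_dispatch(&(struct request){ REQ_ALLOC_FASTPATH, 7 });
        rmp_dispatch(&(struct request){ REQ_READ_COUNTERS, 7 });
        return 0;
    }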

The purpose of the RMP is two-fold. First, it provides a layer of security between user level applications and the rest of the system. Second, the RMP serves as a level of abstraction between users and the internal workings of the SPP. As a result, a slice does not need to know any global state information. Instead, it only worries about itself. As a simple example, consider filter management on the SPP. Normally, there are many slices allocated on any given GPE, and each may request any number of filters. The control software identifies a filter using a unique 16-bit integer. The RMP allows a slice to have a local set of filter identifiers that it automatically translates into global ID numbers. Without this level of indirection, a slice would have to deal with filters containing possibly bizarre identification numbers. More importantly, without the RMP, one of the other software components would have to provide an enforcement mechanism that would prevent the manipulation of a filter by a slice that does not own that filter.
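
The sketch below illustrates that indirection: each slice sees small local filter numbers, while the RMP keeps the mapping to globally unique 16-bit identifiers and refuses translations for filters the slice does not own. The counter-based allocation scheme is an invented simplification.

    /* A minimal sketch of local-to-global filter ID translation. */
    #include <stdint.h>
    #include <stdio.h>

    #define MAX_FILTERS_PER_SLICE 16

    struct slice_filters {
        uint16_t global_id[MAX_FILTERS_PER_SLICE];  /* local index -> global */
        int count;
    };

    static uint16_t next_global = 1;   /* next unused global identifier */

    /* Allocate a filter: the slice receives a small local ID, while the RMP
     * remembers which global ID it corresponds to. */
    static int alloc_filter(struct slice_filters *s)
    {
        if (s->count >= MAX_FILTERS_PER_SLICE)
            return -1;
        s->global_id[s->count] = next_global++;
        return s->count++;             /* local ID handed back to the slice */
    }

    /* Translate, refusing local IDs the slice does not own. */
    static int to_global(const struct slice_filters *s, int local)
    {
        if (local < 0 || local >= s->count)
            return -1;
        return s->global_id[local];
    }

    int main(void) {
        struct slice_filters a = { {0}, 0 }, b = { {0}, 0 };
        int fa = alloc_filter(&a);     /* slice A's filter 0 */
        int fb = alloc_filter(&b);     /* slice B's filter 0 */
        printf("slice A local %d -> global %d\n", fa, to_global(&a, fa));
        printf("slice B local %d -> global %d\n", fb, to_global(&b, fb));
        printf("slice B local 5 -> %d (not owned)\n", to_global(&b, 5));
        return 0;
    }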

The SCD lies on the other side of the SPP, serving as the controller for the NPEs and the line cards. Normally, network processing engines are not designed to be shared. They have one set of hardware resources (usually a TCAM, SRAM, and a number of network processors) that are only meant to be used by a single application. The SCD running on every NPE allows the SPP to break this paradigm. It divides the available resources into chunks which can then be assigned to multiple user fastpaths. The SCD knows what assignments to make by communicating with the RMP and the SRM. Essentially, this gives each user the ability to “own” hardware for specialized processing, a feature that was impossible on the original PlanetLab Platform. Every line card also runs a version of the SCD. The software provides the mechanisms necessary to allow users to install filters for directing packets. Whenever a packet enters the SPP, the SCD in the line card ensures the data is forwarded to the correct location.
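
A minimal sketch of that chunking idea follows, using SRAM as the example resource. The 4 MB size and the bump-pointer allocation are assumptions made for illustration; the real SCD manages TCAM entries, SRAM, and processor resources under its own policies.

    /* A minimal sketch of dividing an NPE's SRAM into non-overlapping
     * chunks that are handed out to individual fastpaths. */
    #include <stdio.h>

    #define SRAM_BYTES (4 * 1024 * 1024)   /* pretend the NPE has 4 MB */

    struct chunk {
        int offset;    /* where in SRAM this fastpath's region begins */
        int length;    /* how many bytes it was granted               */
    };

    static int sram_used = 0;

    /* Grant a fastpath a private region of SRAM, or fail if the NPE is full. */
    static int assign_chunk(int bytes, struct chunk *out)
    {
        if (sram_used + bytes > SRAM_BYTES)
            return -1;
        out->offset = sram_used;
        out->length = bytes;
        sram_used += bytes;
        return 0;
    }

    int main(void) {
        struct chunk fp1, fp2;
        if (assign_chunk(1024 * 1024, &fp1) == 0)
            printf("fastpath 1: SRAM [%d, %d)\n",
                   fp1.offset, fp1.offset + fp1.length);
        if (assign_chunk(2 * 1024 * 1024, &fp2) == 0)
            printf("fastpath 2: SRAM [%d, %d)\n",
                   fp2.offset, fp2.offset + fp2.length);
        return 0;
    }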

Capabilities of a Slice

Overview

Similar to other PlanetLab nodes, each user of the SPP is allocated a slice that runs in a virtual machine on the general processing engine. The GPE supports a standard Linux environment, making application development relatively straightforward. To access the capabilities of the SPP, an application uses a Unix Domain Socket (UDS) located in the /tmp directory of the slice environment. When a slice is created, the RMP opens the UDS on behalf of the slice, and the socket remains open for the duration of the slice’s life.
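
A slice application might open that socket roughly as in the sketch below. The socket name "/tmp/rmp.sock" and the request string are hypothetical; the actual path and message format are defined by the SPP control software.

    /* A minimal sketch of a slice application connecting to the Unix
     * domain socket provided by the RMP.  Path and request contents are
     * assumptions for illustration. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    int main(void) {
        struct sockaddr_un addr;
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0) {
            perror("socket");
            return 1;
        }

        memset(&addr, 0, sizeof addr);
        addr.sun_family = AF_UNIX;
        strncpy(addr.sun_path, "/tmp/rmp.sock", sizeof addr.sun_path - 1);

        /* connect to the socket the RMP opened when the slice was created */
        if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
            perror("connect");
            close(fd);
            return 1;
        }

        /* send a request on the slice's behalf (format is hypothetical) */
        const char *req = "get_fastpath_stats";
        if (write(fd, req, strlen(req)) < 0)
            perror("write");

        close(fd);
        return 0;
    }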