Open Source Networking Solutions

“99% of the people who reject using the software until it gets open sourced will never even look at its source code when it’s done.”

“Most of the people are not planning to use airbags in cars, but they want them anyway.”

From the conversation between Yakov and Murat.

Introduction

The selection of a communication protocol can be as crucial for the success of your RIA as a professionally designed UI. LiveCycle Data Services (LCDS) is an excellent solution for building enterprise-grade scalable RIAs, but some enterprises just don’t have the budget for it. Many smaller IT organizations still use the more familiar HTTP or SOAP Web Services because they offer an easy route into the world of RIA with only minor changes on the back end.

Now there’s a faster, more powerful open-source option: in February 2008, Adobe released BlazeDS in conjunction with open sourcing the specification of the Action Message Format (AMF) communication protocol. Offering many of the same capabilities as LCDS, BlazeDS is a Java-based open-source implementation of AMF, which sends the data over the wire in a highly compressed binary form.

Large distributed applications benefit greatly from working with strongly typed data. Sooner or later developers will need to refactor the code, and if no data type information is available, changing the code in one place may break the code in another, and the compiler may not help you identify such newly introduced bugs.

This chapter will unleash the power of AMF and provide illustrations of how to create a robust platform for development of modern RIA without paying hefty licensing fees. It will discuss polling and server-side push techniques for the client-server communications, as well as how to extend the capabilities of BlazeDS to bring it closer to LCDS.

BlazeDS vs. LCDS

Prior to Adobe’s BlazeDS, Flex developers who wanted to use the AMF protocol to speed up data communication between Flex and the server side of their application had to select one of the third-party libraries, such as OpenAMF, WebORB, or GraniteDS. The release of the open-source BlazeDS, however, brought a lot more than just support of AMF. You can think of BlazeDS as a scaled-down version of LCDS. Unlike LCDS, BlazeDS doesn’t support the RTMP protocol, Data Management Services, or PDF generation, and it has limited scalability. But even with these limitations, its AMF support, its ability to communicate with Plain Old Java Objects (POJOs), and its support of messaging via integration with the Java Message Service (JMS) make BlazeDS a highly competitive player in the world of RIA. These features alone make it a good choice for architecting RIA data communication compared to any AJAX library or a package that just implements the AMF protocol.

Figure 6-1 provides a capsule comparison of BlazeDS and LiveCycle functions. The items shown in grey represent the features available only in LCDS. The features of BlazeDS are highlighted in black.

Figure 6-1. Comparing functionality of BlazeDS and LCDS

One limitation of BlazeDS is that its publish-subscribe messaging is implemented over HTTP using long-running connections rather than supporting RTMP as in LCDS. Under the HTTP approach, the client opens a connection with the server, which allocates a thread that holds this connection on the server. The server thread gets the data and flushes it down to the client but then continues to hold the connection.

You can see the limit right there: because creating each thread has some overhead, the server can hold only a limited number of threads. By default, BlazeDS is configured to hold 10 threads, but this number can be increased to several hundred depending on the server being used. Even so, this may not be enough for enterprise-grade applications that need to accommodate thousands of concurrent users.
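The thread ceiling can be sketched in plain Java. This is an illustration only, not BlazeDS code: the class name ConnectionLimitSketch and the method acceptedConnections() are hypothetical, and a java.util.concurrent thread pool stands in for the server’s threads. Each simulated client holds its thread for as long as its connection stays open, so once every thread is taken, further clients are turned away.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of one-thread-per-connection; not BlazeDS code.
final class ConnectionLimitSketch {

    /** Returns how many of the requesting clients obtained a server thread. */
    static int acceptedConnections(int maxThreads, int clients) {
        // No waiting queue: a task either gets a free thread or is rejected.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads, maxThreads, 0L, TimeUnit.SECONDS,
                new SynchronousQueue<>());
        CountDownLatch connectionsClosed = new CountDownLatch(1);
        int accepted = 0;
        try {
            for (int i = 0; i < clients; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            // The thread is held while the connection is open.
                            connectionsClosed.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                    accepted++;
                } catch (RejectedExecutionException e) {
                    // Every thread is busy holding a connection.
                }
            }
        } finally {
            connectionsClosed.countDown(); // close all simulated connections
            pool.shutdown();
        }
        return accepted;
    }

    public static void main(String[] args) {
        // 15 clients compete for 10 threads.
        System.out.println(acceptedConnections(10, 15));
    }
}
```

With 10 threads and 15 clients, only the first 10 clients are served; raising maxThreads moves the ceiling but does not remove it.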

Real-Time Messaging Protocol (RTMP) is not HTTP based. It works like a two-way socket channel without having the overhead of the AMF that is built on top of HTTP. One data stream goes from the server to the client, and the other goes in the opposite direction. Because the RTMP solution requires either a dedicated IP address or port, it is not firewall-friendly, which may become a serious drawback for enterprises that are very strict about security. Adobe has announced their plans to open source RTMP.

With a little help, however, BlazeDS can handle this level of traffic, as well as close some of the other gaps between it and LCDS. For example, the “Networking Architecture of BlazeDS” section offers a scalable solution based on a BlazeDS/Jetty server. Later in this chapter, you’ll learn how to enhance BlazeDS to support data synchronization, PDF generation, and scalable real-time data push. In addition to feature support, you’ll also examine the other piece of the puzzle: increasing the scalability of the AMF protocol in BlazeDS.

Why Is AMF Important?

You may ask, “Why should I bother with AMF instead of using standard HTTP, REST, SOAP, or similar protocols?”

The short answer is because the AMF specification is open sourced and publicly available (

The longer answer begins with the fact that AMF is a compact binary format that is used to serialize ActionScript object graphs. An object can include both primitive and complex data types, and the process of serialization turns an object into a sequence of bytes that contains all required information about the structure of the original object. Because AMF’s format is open to all, Adobe as well as third-party developers can implement it in various products to de-serialize such pieces of binary data into an object in a different VM, which does not have to be Flash Player. For example, both BlazeDS and LCDS implement the AMF protocol to exchange objects between Flash Player and the Java VM. There are third-party implementations of AMF that support data communication between Flash Player and such server-side environments as Python, PHP, .NET, Ruby, and others.

Some of the technical merits of this protocol when used for the enterprise application are:

Serialization and de-serialization with AMF is fast.

The client side of the AMF implementation used with BlazeDS (and LCDS) is done in C and is native to the platform where Flash Player runs. Because of this, AMF has a small memory footprint and is easy on the CPU. Objects are created in a single pass; there is no need to parse the data (e.g., XML or strings of characters), which is common for non-native protocols.

AMF data streams are small and well compressed (in addition to GZip).

AMF tries to recognize the common types of data and group them by type, so that every value doesn’t have to carry the information about its type. For example, if there are numeric values that fit in two bytes, AMF won’t use the four bytes that the variable’s declared data type would require.

AMF supports the native data types and classes.

You can serialize and de-serialize any object with complex data types, including instances of custom classes. Flex uses AMF in such objects as RemoteObject, SharedObject, ByteArray, and LocalConnection, in all messaging operations, and in any class that implements the IExternalizable interface.

Connections between the client and the server are used much more efficiently.

The connections are more efficient because the AMF implementation in Flex uses automatic batching of requests and built-in failover policies, providing robustness that does not exist in HTTP or SOAP.
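The compactness point in the list above can be made concrete with a small Java sketch. It assumes the variable-length 29-bit integer layout (U29) described in the AMF 3 specification: the first three bytes carry seven bits of data each plus a continuation flag, and a fourth byte, when present, carries eight bits. The class and method names are hypothetical, and this is not the BlazeDS or Flash Player code.

```java
// Hypothetical sketch of AMF 3 style variable-length integers (U29).
final class Amf3U29Sketch {

    /** Encodes a value in the range 0..0x1FFFFFFF using one to four bytes. */
    static byte[] encode(int value) {
        if (value < 0 || value > 0x1FFFFFFF) {
            throw new IllegalArgumentException("value does not fit in 29 bits");
        }
        if (value < 0x80) {            // fits in 7 bits: one byte
            return new byte[] {(byte) value};
        }
        if (value < 0x4000) {          // fits in 14 bits: two bytes
            return new byte[] {
                (byte) ((value >> 7) | 0x80),
                (byte) (value & 0x7F)};
        }
        if (value < 0x200000) {        // fits in 21 bits: three bytes
            return new byte[] {
                (byte) ((value >> 14) | 0x80),
                (byte) (((value >> 7) & 0x7F) | 0x80),
                (byte) (value & 0x7F)};
        }
        return new byte[] {            // up to 29 bits: four bytes
            (byte) ((value >> 22) | 0x80),
            (byte) (((value >> 15) & 0x7F) | 0x80),
            (byte) (((value >> 8) & 0x7F) | 0x80),
            (byte) (value & 0xFF)};
    }

    /** Decodes a value produced by encode(). */
    static int decode(byte[] bytes) {
        int result = 0;
        for (int i = 0; i < 4; i++) {
            int b = bytes[i] & 0xFF;
            if (i == 3) {
                return (result << 8) | b;   // the fourth byte uses all 8 bits
            }
            result = (result << 7) | (b & 0x7F);
            if ((b & 0x80) == 0) {
                return result;              // continuation flag is clear
            }
        }
        throw new IllegalStateException("unreachable");
    }

    public static void main(String[] args) {
        for (int v : new int[] {100, 300, 2_000_000, 0x1FFFFFFF}) {
            System.out.println(v + " takes " + encode(v).length + " byte(s)");
        }
    }
}
```

A small value such as 100 travels as a single byte, while a fixed-width 32-bit integer would always cost four.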

The remainder of the chapter will focus on how you can leverage these merits for your own applications, as well as contrast AMF and the technologies that use it to traditional HTTP approaches.

AMF Performance Comparison

AMF usually consumes half the bandwidth of text-based data transfer technologies and has an execution time that is three to ten times shorter, depending on the amount of data you are bringing to the client. It also usually takes several times less memory than other protocols that use untyped objects or XML.

If your application has a server that just sends to the client a couple of hundred bytes once in a while, AMF performance benefits over text protocols are not obvious.

To see for yourself, visit a useful Web site that enables you to compare the data transfer performance of various protocols. Created by James Ward, a Flex evangelist at Adobe, the test site lets you specify the number of database records you’d like to bring to the client, then graphs the performance times and bandwidth consumed for multiple protocols.

Figure 6-2. James Ward’s benchmark site

Figure 6-2 shows the results of a test conducted for a medium result set of 5000 records using out of the box implementations of the technologies using standard GZip compression.

Visit this Web site and run some tests on your own. The numbers become even more favorable toward AMF if you run these tests on slow networks and low-end client computers.

The other interesting way to look at performance is to consider what happens to the data when it finally arrives at the client. Since HTTP and SOAP are text-based protocols, they include a parsing phase, which is pretty expensive in terms of time. The RIA application needs to operate with native data types, such as numbers, dates, and Booleans. Think about the volume of data conversion that has to be performed on the client after the arrival of 5,000 1 KB records.

Steve Souders, a Yahoo! expert in performance tuning of traditional (DHTML) Web sites, stresses that major improvements can be achieved by minimizing the amount of data processing performed on the client in an HTML page (see High Performance Web Sites, O’Reilly, 2007). Using the AMF protocol allows you to substantially lower the need for such processing because the data arrives at the client already strongly typed.
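The difference between parsing text and reading typed binary data can be sketched in Java. In this illustration, java.io.DataOutputStream is only a stand-in for a typed binary format; it is not AMF, and the record layout (an id, a price, and a timestamp), the class name, and the method names are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical comparison of a text encoding and a typed binary encoding.
final class TextVsBinarySketch {

    /** Encodes the records as XML text; every value becomes characters. */
    static byte[] asXml(int records) {
        StringBuilder sb = new StringBuilder("<quotes>");
        for (int i = 0; i < records; i++) {
            sb.append("<quote><id>").append(i)
              .append("</id><price>").append(100.25 + i)
              .append("</price><time>").append(1_200_000_000_000L + i)
              .append("</time></quote>");
        }
        return sb.append("</quotes>").toString()
                 .getBytes(StandardCharsets.UTF_8);
    }

    /** Encodes the same records as typed binary: 4 + 8 + 8 bytes each. */
    static byte[] asBinary(int records) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (int i = 0; i < records; i++) {
                out.writeInt(i);
                out.writeDouble(100.25 + i);
                out.writeLong(1_200_000_000_000L + i);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the first price as a double, with no string parsing step. */
    static double firstPrice(byte[] binary) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(binary))) {
            in.readInt();              // skip the id
            return in.readDouble();    // the value is already typed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("XML bytes:    " + asXml(5000).length);
        System.out.println("Binary bytes: " + asBinary(5000).length);
        System.out.println("First price:  " + firstPrice(asBinary(5000)));
    }
}
```

The text version must be tokenized and each number converted from characters before the application can use it, while the binary version hands back typed values directly.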

AMF and the Client-Side Serialization

AMF is crucial for all types of serialization and communications. All native data serialization is customarily handled by the class ByteArray. When serialized, the data type information is marked out by the name included in the metadata tag RemoteClass.

Here is a small example from the Flex Builder project NetworkingSamples that comes with the book. It includes the application RegisteredClassvsUnregistered.mxml and two classes, RegisteredClass and UnregisteredClass:

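package
{
    // The alias is written into the AMF stream as the type name
    [RemoteClass(alias="com.RegisteredClass")]
    public class RegisteredClass {
    }
}

package
{
    // No RemoteClass tag: no type name is written for this class
    public class UnregisteredClass {
    }
}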

<?xml version="1.0" encoding="utf-8"?>
<mx:Application xmlns:mx="
creationComplete="test()">
    <mx:Script>
    <![CDATA[
        import flash.utils.ByteArray;

        // Writes the object to a ByteArray in AMF and reads it back
        private function serializeDeserialize(a:Object):void {
            var ba:ByteArray = new ByteArray();
            ba.writeObject(a);
            ba.position = 0;    // rewind before reading
            var aa:Object = ba.readObject();
            trace(aa);
        }

        private function test():void {
            serializeDeserialize(new RegisteredClass());
            serializeDeserialize(new UnregisteredClass());
        }
    ]]>
    </mx:Script>
</mx:Application>

Example 6-1. RegisteredClassvsUnregistered.mxml

Example 6-2. Serialization with and without RemoteClass metatag

In the example above, the function serializeDeserialize() serializes the object passed as an argument into a ByteArray, and then reads it back into a variable aa of type Object. The application makes two calls to this function. During the first call, it passes an object that contains the metadata tag marking the object with a data type RegisteredClass; the second call passes the object that does not use this metadata tag. Running this program through a debugger displays the following output in the console:

[SWF] /NetworkingSamples/NetworkingSamples.swf - 798,429 bytes after decompression
[object RegisteredClass]
[object Object]

Annotating a class with the RemoteClass metadata tag allows Flash Player to store, send, and restore information in a predictable, strongly typed format. If you need to persist this class, say in AIR disconnected mode, or communicate with another SWF locally via the class LocalConnection, following the rules of AMF communications is crucial. In the example, RemoteClass ensures that the information about the class is preserved during serialization.

HTTP Connection Management

To really appreciate the advantages of binary data transfers and persistent connection to the server, take a step back and consider how Web browsers in traditional Web applications connect to servers.

For years, Web browsers would allow only two connections per domain. Since Flash Player uses the browser’s connection for running HTTP requests to the server, it shares the same limitations as all browser-based applications.

The latest versions of IE and Mozilla increased the default number of simultaneous parallel HTTP requests per domain/window from two to six. It’s probably the biggest news in the AJAX world in the last three years. For the current crop of AJAX sites serving real WAN connections, it means faster loading and fewer timeout and reliability issues. By the way, most of the performance gains that Opera and Safari enjoyed over IE and Mozilla in the past are attributed to the fact that they allowed and used four connections, ignoring the two-connection limit recommended by the HTTP 1.1 specification.

The fact that increasing the number of parallel connections increases network throughput is easy to understand. Today’s request/response approach to browser communications is very similar to the village bike concept. Imagine that there are only a couple of bikes that serve the entire village. People ride and come back to give the bike to the next person in line. People wait for their turns, keeping their fingers crossed that the person in front of them won’t get lost in the woods during his ride. Otherwise, they need to wait till all hopes are gone (called a timeout) and the village authorities provide a new bike circa 1996.

Pretty often, by the time the new bike arrives it’s too late: the person has decided to get engaged in a different activity (abandon this site). As the travel destinations become more distant (WAN), you are exposed to the real-world troubles of commuting: latency (500 ms for a geostationary satellite network), bandwidth limitations, jitter, errors, unrecoverable losses, and so on. Besides that, the users may experience congestion caused by the fact that your ISP decided to make some extra cash by trying to become a TV broadcaster and a VoIP phone company, but lacks the required infrastructure. The applications that worked perfectly on local/fast networks will crumble in every imaginable way.

Obviously, more bikes (read: browser connections) mean that with some traffic planning you can offer a lot more fun to the bikers and get much better performance and reliability. You might even allocate one bike to the sheriff, fireman, or village doctor so he can provide information on conditions and on lost or damaged goods carried by the bikers. You can route important goods in parallel so they will not get lost or damaged that easily.
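A back-of-the-envelope model shows why more connections help. The following Java sketch makes a deliberately rough assumption: requests are served in waves, with one request per connection per wave, and each wave costs one full round trip. It ignores bandwidth limits, server time, and keep-alive effects, and the class and method names are hypothetical.

```java
// Hypothetical, simplified model of page load time versus connection count.
final class ParallelRequestSketch {

    /**
     * Estimates the time to complete all requests when at most
     * `connections` of them can be in flight at once.
     */
    static long loadTimeMillis(int requests, int connections,
                               long roundTripMillis) {
        if (requests < 0 || connections <= 0 || roundTripMillis < 0) {
            throw new IllegalArgumentException("invalid arguments");
        }
        // Ceiling division: a partly filled last wave still costs a round trip.
        long waves = (requests + connections - 1L) / connections;
        return waves * roundTripMillis;
    }

    public static void main(String[] args) {
        // 12 requests over a 500 ms satellite link.
        System.out.println(loadTimeMillis(12, 2, 500)); // two connections
        System.out.println(loadTimeMillis(12, 6, 500)); // six connections
    }
}
```

Under these assumptions, 12 requests over a 500 ms link take 3,000 ms with two connections and 1,000 ms with six.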

You can really start utilizing long-running connections for real data push now. But first, let’s go ten years back and try to figure out how the early adopters of RIA developed with AJAX were surviving.

Even though the term AJAX was coined only in 2005, the authors of this book have been using the DHTML/XMLHttpRequest combo (currently known as AJAX) since the year 2000.

The Hack to Increase Web Browser’s Performance

At the beginning of this century, most of the enterprises we worked with quietly rolled out browser builds or service packs that increased the number of allowed HTTP connections. This was just a hack. For Internet Explorer, the following changes to Windows registry keys would increase the number of browser connections to 10:

HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings

MaxConnectionsPer1_0Server 10

MaxConnectionsPerServer 10

With Mozilla’s Firefox you have to recompile the source code of the entire browser.

This hack does solve most performance and reliability issues, but only for a short while. The main reason is that, without imposed limits, software increases in size faster than Moore’s law predicts for electronics. And unlike private networks in enterprises, the Internet has no proper “city framework”, so rampant requests will cause an overall Internet meltdown, because the initial rollout of more capable browsers will give them an unfair advantage in terms of bandwidth share.

If a server receives eight connection requests, it’ll try to allocate the limited available bandwidth accordingly, and, say, Mozilla’s requests will enjoy better throughput than those of Internet Explorer, which on older and slower networks will cause quality of service (QoS) problems. In other words, this solution has a very real potential to cause more of the same problems it’s expected to solve.

Other Ways of Increasing Web Browser’s Performance

Most enterprises have to control the QoS of their clients’ communications. For example, a company that trades stock has a service level agreement (SLA) with its clients promising to push new price quotes twice a second. To keep such a promise, enterprises should create and adopt a number of point-to-point solutions that provide more efficient communication models, which fall into three categories: