Introduction to Java Servlets

Table of Contents

Introduction to Server-Side Technologies

Dynamic Generation of Web Pages

Basic Technology Behind Dynamic Web Pages

Getting Started with Java Servlets

Installing Apache Tomcat

Compiling and Deploying Java Servlets in Tomcat

Template Servlet Structure

Hello World Servlet

Handling Requests with Servlets

HTTP Request Headers

Handling HTML forms

Bibliography

Introduction to Server-Side Technologies

Dynamic Generation of Web Pages

Web-based information systems support remote access to Web pages. The Web pages served by such systems come in two flavours: they are either static Web pages that are stored and served as files from the server file system, or they are generated dynamically on the server side. Dynamic Web pages may be generated in response to a search query, database query, or purchase request submitted by a user of the system. Generally, there are a few reasons for dynamic generation of Web pages [HallBrown2003]:

  • Web pages are based on user queries, e.g. result pages of search engines, shopping carts in online shops, and so on.
  • The data managed by the system changes often, e.g. weather forecast sites, news tickers, and similar.
  • Web pages are based on data from a database system, e.g. student data from university databases, online flight reservation, etc.

A number of server-side technologies exist for dynamic generation of Web pages. The basic server-side technology is CGI (Common Gateway Interface). It is actually a standardized specification [CGI1995] of communication between a Web server and external programs. Thus, an external CGI program runs directly on the server side and generates a dynamic Web page. The Web server invokes the external program and passes parameters to it (e.g. a user query). The generated Web page is passed back from the external CGI program to the Web server, which in turn forwards the page to the user. The communication between the Web server and the external program is carried out using environment variables and the standard input and output streams.
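The exchange described above can be sketched as a tiny CGI program written in Java. This is only an illustration under stated assumptions: the class name CgiHelloSketch and the page it prints are made up, and the sketch relies on the CGI convention that the query of a GET request arrives in the QUERY_STRING environment variable while the response (header lines, an empty line, then the page) is written to standard output.

```java
public class CgiHelloSketch {

    // Escapes the characters that are significant in HTML so that user
    // input can be echoed back safely.
    public static String escapeHtml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    // Builds the complete CGI response: header line, empty line, then the page.
    public static String respond(String queryString) {
        String query = (queryString == null) ? "" : queryString;
        return "Content-Type: text/html\r\n"
             + "\r\n"
             + "<html><body><p>Query: " + escapeHtml(query) + "</p></body></html>\n";
    }

    public static void main(String[] args) {
        // For GET requests the Web server passes the query in QUERY_STRING.
        String queryString = System.getenv("QUERY_STRING");
        // Whatever is written to standard output goes back to the Web server.
        System.out.print(respond(queryString));
        System.out.flush();
    }
}
```

Note that the Web server would have to start a new Java virtual machine for every request to such a program, which is exactly the per-request process cost that CGI is criticised for.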

The simplicity of the CGI specification and the fact that CGI programs can be written in any programming language led to the widespread use of CGI-based applications. However, CGI programs have a number of serious drawbacks. The major drawback is performance. Since CGI programs are external programs, the operating system starts a new process for each request to a CGI program, which incurs a performance cost. Further, a CGI program cannot keep database connections open over a number of requests. For each request a new database connection must be established, which leads to significant performance costs in database-centric applications.

Sun's answer to the CGI technology is Java servlets. Java servlets are Java programs running on the side of a Web server and producing dynamic Web pages. Similar to the CGI specification, the Java Servlet Specification [JavaServlet2003] standardizes how Java programs run and communicate with a Web server to produce dynamic Web pages. The result of the specification is the Java Servlet Application Programming Interface (API), which is a Java library with the classes needed to write Java servlets that produce dynamic Web pages. The current version of the Java Servlet Specification and Java Servlet API is 2.3. Version 2.4 currently has the status of a proposed draft. The reference implementation of the Java Servlet Specification is Apache Tomcat [Tomcat1999]. Version 4.1.27 of Tomcat implements the Java Servlet Specification 2.3. Tomcat is a so-called servlet engine, that is, a Java program that provides an execution context for a number of Java servlets, which run within separate Java threads inside the Tomcat process. Tomcat provides all the communication functions between servlets and the Web server, as specified in the Java Servlet Specification.

Java servlets have a number of advantages over traditional CGI technologies for generating dynamic content on the Web. These advantages include [HallBrown2003]:

  • Efficiency. As mentioned before, for each request to a CGI program the operating system starts a new process to handle that request. In the case of Java servlets, there is only one process, that of the servlet engine. Each Java servlet is just a thread running within the context of the servlet engine process. Starting and stopping a thread costs far less than starting and stopping an operating system process, especially if the execution time is short. Java servlets are also more efficient in terms of memory usage: for multiple concurrent requests to a single servlet, the servlet code is loaded only once and multiple threads are started with the same code. In contrast, for multiple requests to a CGI program the operating system must load the code as many times as there are requests. Finally, servlets can keep track of consecutive requests and store internal data that helps improve performance. For instance, servlets can keep database connections open and thus greatly reduce the cost of establishing a database connection with each new request.
  • High capability. Servlets can share their data, which makes it possible to implement database connection pools, thus greatly reducing the cost of establishing database connections, not only within one servlet but among a number of servlets. Further, servlets can keep information from one request to another, making it easy to track a user's session, cache results from previous computations, and so on. Finally, servlets can talk directly to the Web server and access its data stored in standard places.
  • Portability. Servlets are Java programs written using a standardized API. Thus, servlets written for one platform may be easily ported to another without any need for changing them. Servlet engines are available for all major platforms and all major Web server products.
  • Java software libraries. Since servlets are Java programs, they can use the standard Java API or any other Java library available. Thus, libraries for managing connections to database management systems (JDBC libraries), libraries for manipulating digital images, Java XML libraries, etc. are all accessible in Java servlets.
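The difference in connection handling can be made concrete with a small simulation. This is a sketch, not a benchmark: FakeConnection is a hypothetical stand-in for a real database connection, and the two loops only imitate the "one process per request" and "one long-lived process" models by counting how often the expensive setup step runs.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ConnectionReuseSketch {

    // Counts how often the (simulated) expensive connection setup runs.
    static final AtomicInteger opened = new AtomicInteger();

    // Hypothetical stand-in for a database connection.
    static class FakeConnection {
        FakeConnection() { opened.incrementAndGet(); }
        String query(String sql) { return "result of " + sql; }
    }

    // CGI style: every request runs in a fresh process, so it must connect again.
    public static int cgiStyle(int requests) {
        opened.set(0);
        for (int i = 0; i < requests; i++) {
            new FakeConnection().query("SELECT 1");
        }
        return opened.get();
    }

    // Servlet style: the connection is opened once and kept across requests.
    public static int servletStyle(int requests) {
        opened.set(0);
        FakeConnection shared = new FakeConnection();
        for (int i = 0; i < requests; i++) {
            shared.query("SELECT 1");
        }
        return opened.get();
    }

    public static void main(String[] args) {
        System.out.println("CGI style connections:     " + cgiStyle(5));
        System.out.println("Servlet style connections: " + servletStyle(5));
    }
}
```

For five requests the first variant opens five connections and the second opens one. A connection pool generalises the second variant to several shared connections.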

Basic Technology Behind Dynamic Web Pages

The Web utilizes a classical client-server architecture. A Web client sends a request over the Internet to a Web server, asking it for a specific Web page. To address a Web page the client specifies its Uniform Resource Locator (URL) to the server. Communication between the client and the server is carried out by means of the HyperText Transfer Protocol (HTTP). HTTP is a simple text-based protocol. In the simplest case, an HTTP request consists of a single request line containing only the GET keyword together with the URL of a Web page. The server responds to this simple request by sending the requested page back to the client.

In the general case, an HTTP request consists of:

  • A request line, which contains an HTTP method (usually GET or POST) and a URL. The GET method just retrieves data from the server, whereas the POST method implies that the client sends data to the server.
  • A number of HTTP headers, which set properties of the connection between the client and the server. For example, the Accept header specifies which MIME types are preferred by the client, the Accept-Language header specifies which language is preferred by the client if the server has versions of the requested page in different languages, the User-Agent header specifies the type of the client (browser), and so on. Almost all HTTP headers are optional. The Content-Length header is required for the POST method to specify how much data is sent from the client to the server, and HTTP/1.1 additionally requires the Host header.
  • Content, that is, the data sent by the client to the server in the case of the POST method.
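To make the three parts tangible, the following sketch splits a raw request text into its request line, headers, and content. It is a simplified illustration rather than a complete HTTP parser: the class name HttpRequestSketch and the sample request are assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HttpRequestSketch {
    public final String method;
    public final String url;
    public final Map<String, String> headers = new LinkedHashMap<>();
    public final String content;

    public HttpRequestSketch(String raw) {
        // An empty line separates the request line and headers from the content.
        int split = raw.indexOf("\r\n\r\n");
        String head = (split >= 0) ? raw.substring(0, split) : raw;
        content = (split >= 0) ? raw.substring(split + 4) : "";

        String[] lines = head.split("\r\n");
        // Request line: METHOD URL [PROTOCOL]
        String[] requestLine = lines[0].split(" ");
        method = requestLine[0];
        url = requestLine[1];

        // Header lines: Name: value
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                headers.put(lines[i].substring(0, colon).trim(),
                            lines[i].substring(colon + 1).trim());
            }
        }
    }

    public static void main(String[] args) {
        String raw = "POST /Form HTTP/1.0\r\n"
                   + "Accept-Language: de\r\n"
                   + "Content-Length: 9\r\n"
                   + "\r\n"
                   + "name=Anna";
        HttpRequestSketch request = new HttpRequestSketch(raw);
        System.out.println(request.method + " " + request.url);
        System.out.println(request.headers);
        System.out.println(request.content);
    }
}
```

The same class also accepts the minimal form mentioned earlier, a single line such as "GET /index.html", in which case the header map and the content stay empty.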

Mostly, dynamic Web pages are generated in accordance with parameters sent by users to a Web server. For instance, to submit a search query to a search engine, users type their search terms into an HTML form and press the submit button to send the data. HTTP offers two possibilities to send such parameters from the client to the server:

  • With the GET method, parameters are encoded in the URL included in the request line. Parameters are submitted as key-value pairs connected with an equals sign '='. Multiple parameters are separated with an ampersand '&'. Spaces are encoded as a plus sign '+', and special characters are encoded as a hexadecimal value preceded by a percent sign '%'. The encoded parameters are preceded by a question mark '?' and attached to the original URL. A typical example of encoding user parameters in a URL looks as follows.

Example 1. Encoding Parameters in URL

With the above URL, parameters for a search query are sent to the Google search engine. The parameters come in key-value pairs: q=Java Servlets, ie=UTF-8, oe=UTF-8, hl=de, btnG=Google Suche, and the meta parameter has no value.

Since the length of a URL is limited in practice (by some older clients and servers to about 1024 bytes), this method allows only a limited number of parameters to be transmitted.
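The encoding rules for GET parameters can be tried out with the standard java.net.URLEncoder class. The following is a minimal sketch: the class name QueryStringSketch is made up, the host and path in main() are placeholders rather than the URL of the example, and only some of the parameters named above are used.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryStringSketch {

    // Joins key-value pairs with '=' and '&', URL-encoding keys and values.
    public static String encode(Map<String, String> params)
            throws UnsupportedEncodingException {
        StringBuilder query = new StringBuilder();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (query.length() > 0) {
                query.append('&');
            }
            query.append(URLEncoder.encode(entry.getKey(), "UTF-8"))
                 .append('=')
                 .append(URLEncoder.encode(entry.getValue(), "UTF-8"));
        }
        return query.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "Java Servlets");
        params.put("ie", "UTF-8");
        params.put("hl", "de");
        params.put("meta", "");
        // The host and path are placeholders, not taken from the text.
        System.out.println("http://www.example.com/search?" + encode(params));
    }
}
```

The space in "Java Servlets" becomes a plus sign, and the empty meta parameter is sent as "meta=" with nothing after the equals sign.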

  • With the POST method, parameters are sent as the content of the request. The length of the content is specified in the Content-Length HTTP header. The POST method allows sending binary data as content, or even mixed binary and text data. This is very often used for uploading binary data, such as digital images, compressed files, etc. to the server. On the client side, the POST method is usually applied within HTML forms. Here is a typical example of such an HTML form.

Example 2. POST Method with HTML Form

<form action="http://coronet.iicm.edu/Form" method="POST">
  First Name:
  <input type="text" name="name" size="20" maxlength="50">
  Second Name:
  <input type="text" name="second_name" size="20" maxlength="50">
  Matriculation Number:
  <input type="text" name="nr" size="20" maxlength="50">
  Study Field:
  <select name="study_field">
    <option value="F874">Telematics</option>
    <option value="F860">Technical Mathematics</option>
    <option value="F033523">Software Development</option>
    <option value="F033211">Telematics Bachelor</option>
    <option value="F033221">Geomatics</option>
  </select>
  <input type="submit" value="Register">
</form>

Thus, parameters sent to the server are name, second_name, nr, and study_field.
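By default a browser sends such a form as URL-encoded key-value pairs in the content of the POST request. The following sketch shows how a receiver could split that content back into parameters using the standard java.net.URLDecoder class. The class name FormBodySketch and the field values are invented for illustration; only the field names come from the form.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class FormBodySketch {

    // Splits "key=value&key=value" content and decodes '+' and %XX sequences.
    public static Map<String, String> parse(String body)
            throws UnsupportedEncodingException {
        Map<String, String> params = new LinkedHashMap<>();
        if (body.isEmpty()) {
            return params;
        }
        for (String pair : body.split("&")) {
            int eq = pair.indexOf('=');
            String key = (eq >= 0) ? pair.substring(0, eq) : pair;
            String value = (eq >= 0) ? pair.substring(eq + 1) : "";
            params.put(URLDecoder.decode(key, "UTF-8"),
                       URLDecoder.decode(value, "UTF-8"));
        }
        return params;
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Field names come from the form; the values are made up.
        String body = "name=Maria+Anna&second_name=Muster&nr=0123456&study_field=F874";
        // The Content-Length header would carry the number of bytes in the content.
        System.out.println("Content-Length: " + body.getBytes("UTF-8").length);
        for (Map.Entry<String, String> entry : parse(body).entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

In a real servlet this work is done by the servlet engine, so application code normally never parses the content by hand.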

Getting Started with Java Servlets

Installing Apache Tomcat

The first step in working with Java servlets is installing the software, a so-called servlet engine, that implements the Java Servlet Specification and provides the Java Servlet API. The reference implementation of a servlet engine is Apache Tomcat [Tomcat1999]. The current version of Apache Tomcat is 4.1.27. Apache Tomcat is an open source software product released under the Apache Software Licence [Apache2000].

Installing Apache Tomcat is quite simple. Firstly, the appropriate version for the operating system must be downloaded. All Apache Tomcat versions can be obtained as source files or as precompiled binary files. Building the system from source files requires some additional software, which can be obtained from the Apache Java Web site [Jakarta2003]. The complete instructions for building and installing Apache Tomcat from the source files can be found on the Apache Tomcat Web site.

On the other hand, installing the binary version of the system can be accomplished in only a few steps. A similar installation procedure applies to both Linux and Windows operating systems. Here the Linux procedure is explained. To install the system on Windows some small modifications are needed (e.g. *.bat files instead of *.sh files, c:\tomcat instead of /tomcat, etc.).

  • Decompress the downloaded binary archive into a directory in your file system, let's say the "/tomcat" directory.
  • Change to /tomcat/bin directory and make the start and stop scripts (startup.sh and shutdown.sh) executable.
  • Invoke the start and stop scripts to start/stop the system.

Example 3. Installing Tomcat on a Linux machine

# installation in directory /tomcat
cd /tomcat
tar xzf <path-to-tomcat-binary-archive>/jakarta-tomcat-4.1.27.tar.gz
cd jakarta-tomcat-4.1.27

# make scripts executable
chmod +x bin/*.sh

# start tomcat (windows: use bin/startup.bat)
bin/startup.sh

# stop tomcat (windows: use bin/shutdown.bat)
bin/shutdown.sh

A Windows installer version is available for the Windows operating system. To install the system with this binary distribution, just double-click the downloaded executable and follow the instructions on the screen. A nice feature of this distribution for the Windows XP/2000/NT operating systems is that the system is automatically installed as a Windows service, which may be controlled from the Management Console available in the Control Panel.

Once the system is running, it can be accessed with a standard Web browser.

Compiling and Deploying Java Servlets in Tomcat

Since Java servlets are ordinary Java programs, you compile them with a standard Java compiler. The CLASSPATH environment variable must include the Java Servlet API, which comes with Apache Tomcat. The library is stored under common/lib/servlet.jar in the Tomcat installation directory (e.g. /tomcat/common/lib/servlet.jar). After all necessary Java source files have been compiled, the servlets need to be deployed in Tomcat.

Apache Tomcat works with so-called Web applications. A Web application is a collection of one or more servlets combined with external Java libraries and static resources such as digital images, static HTML pages, etc. to provide a specific functionality. For instance, an online shopping application might be realised as a Tomcat Web application.

Each Tomcat Web application has the same predefined structure:

  • All Web applications are stored as directories under the webapps directory of the Tomcat installation (e.g. /tomcat/webapps). The name of a Web application is identical to the name of its directory, e.g. a Web application called "online-shop" is stored in the directory called online-shop (e.g. /tomcat/webapps/online-shop) and is accessible under the /online-shop path of the server.
  • Static resources of a Web application (e.g. HTML pages, images, etc.) are stored in the Web application directory.
  • There is a special subdirectory of the Web application directory called WEB-INF, e.g. /tomcat/webapps/online-shop/WEB-INF. The WEB-INF directory contains two subdirectories: classes (e.g. /tomcat/webapps/online-shop/WEB-INF/classes) and lib (e.g. /tomcat/webapps/online-shop/WEB-INF/lib). The classes directory contains all Java class files required to run a particular Web application. The lib directory contains external Java libraries (e.g. Java archive, or jar, files) needed to run the Web application. For example, if the Web application connects to a database management system, the Java library (JDBC driver) needed to establish the connection is placed in the lib directory.
  • There is a special file called web.xml in the WEB-INF directory. This file includes all configuration directives for a particular Web application, in the form of key-value parameters that are passed to a servlet when it is initialized. For example, the username and password for a user of the backend database management system might be defined in the web.xml file. Further, this file provides a description of all Java servlets of a particular Web application. This description includes the unique servlet name, the name of the Java servlet class, and a number of additional servlet attributes, such as the URL mapping for the servlet, and so on.

Example 4. Template Structure of Tomcat Web Application

tomcat
|-- webapps
    |-- online-shop
        |-- WEB-INF
        |   |-- web.xml
        |   |-- lib
        |   |   |-- *.jar (e.g. mysql-connector.jar)
        |   |-- classes
        |       |-- *.class (e.g. ShoppingCartServlet.class)
        |-- *.html (e.g. navigation.html)
        |-- *.gif, *.jpg (e.g. shopping_cart.jpg)

Example 5. Typical web.xml file

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE web-app PUBLIC
  "-//Sun Microsystems, Inc.//DTD Web Application 2.2//EN"
  "
<web-app>
  <servlet>
    <servlet-name>Shopping Cart</servlet-name>
    <description>Keeps track of items that user bought</description>
    <servlet-class>ShoppingCartServlet</servlet-class>
    <init-param>
      <param-name>database-username</param-name>
      <param-value>dhelic</param-value>
    </init-param>
  </servlet>
  <servlet-mapping>
    <servlet-name>Shopping Cart</servlet-name>
    <url-pattern>/Basket</url-pattern>
  </servlet-mapping>
</web-app>
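To see how the pieces of such a descriptor relate to each other, the following sketch reads a trimmed-down copy of the file with the standard Java XML parser and prints the servlet class, the initialization parameters, and the URL mapping. It only illustrates the structure of the file and is not how Tomcat itself processes web.xml. The class WebXmlSketch and its helper methods are made up, the DOCTYPE declaration is left out so that no DTD has to be loaded, and the URL pattern is written with a leading slash, as the servlet specification expects.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class WebXmlSketch {

    // Trimmed-down descriptor without the DOCTYPE declaration.
    public static final String XML =
          "<web-app>"
        + "<servlet>"
        + "<servlet-name>Shopping Cart</servlet-name>"
        + "<servlet-class>ShoppingCartServlet</servlet-class>"
        + "<init-param><param-name>database-username</param-name>"
        + "<param-value>dhelic</param-value></init-param>"
        + "</servlet>"
        + "<servlet-mapping>"
        + "<servlet-name>Shopping Cart</servlet-name>"
        + "<url-pattern>/Basket</url-pattern>"
        + "</servlet-mapping>"
        + "</web-app>";

    // Returns the text of the first descendant element with the given tag name.
    static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    // For every <element>, maps the text of <keyTag> to the text of <valueTag>.
    public static Map<String, String> pairs(String xml, String element,
                                            String keyTag, String valueTag)
            throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Map<String, String> result = new LinkedHashMap<>();
        NodeList nodes = doc.getElementsByTagName(element);
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            result.put(text(e, keyTag), text(e, valueTag));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(pairs(XML, "servlet", "servlet-name", "servlet-class"));
        System.out.println(pairs(XML, "init-param", "param-name", "param-value"));
        System.out.println(pairs(XML, "servlet-mapping", "servlet-name", "url-pattern"));
    }
}
```

The servlet name is the key that ties the servlet element to its servlet-mapping element, which is why it has to be unique within one Web application.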

Template Servlet Structure

The Java Servlet API [JavaServletAPI2003] provides a number of Java classes which are used for developing Java servlets. This API consists of two Java packages: javax.servlet and javax.servlet.http. The first package contains Java classes and interfaces which implement generic servlet behaviour, whereas the second package, javax.servlet.http, provides Java classes and interfaces that handle more specific servlet behaviour in an HTTP-based environment. Thus, most of the time programmers work with the classes from the second package. These two packages provide an object-oriented abstraction of the underlying networking technology. For example, to handle the HTTP GET method programmers only need to implement a method in a Java class, to obtain parameters sent by a user they call methods on a high-level Java object representing the request sent by the user, and so on.

A Java servlet is a normal Java class which is defined as a subclass of the abstract class HttpServlet from the javax.servlet.http package. The abstract HttpServlet class has a number of methods, each of them corresponding to an HTTP method, such as the GET or POST method. A subclass of the HttpServlet class must override at least one of these methods to handle the corresponding HTTP method. Here are some of the most important methods of the HttpServlet class:

  • doGet() method, for handling HTTP GET requests
  • doPost() method, for handling HTTP POST requests
  • doDelete() method, for handling HTTP DELETE requests
  • doPut() method, for handling HTTP PUT requests
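The idea of one overridable method per HTTP method can be illustrated without the servlet library. The sketch below is a deliberately simplified model, not the real API: MiniServlet is a hypothetical stand-in for HttpServlet, its methods take no request or response objects, and the status strings are chosen for this example only.

```java
public class DispatchSketch {

    // Simplified stand-in for HttpServlet: one method per HTTP method.
    static abstract class MiniServlet {
        // Default behaviour: the HTTP method is not supported.
        protected String doGet()  { return "405 Method Not Allowed"; }
        protected String doPost() { return "405 Method Not Allowed"; }

        // Called by the (simulated) servlet engine for each request.
        public final String service(String httpMethod) {
            switch (httpMethod) {
                case "GET":  return doGet();
                case "POST": return doPost();
                default:     return "501 Not Implemented";
            }
        }
    }

    // A concrete servlet overrides only the methods it wants to handle.
    static class HelloServlet extends MiniServlet {
        @Override
        protected String doGet() { return "200 OK: Hello"; }
    }

    // Plays the role of the servlet engine passing a request to the servlet.
    public static String handle(String httpMethod) {
        return new HelloServlet().service(httpMethod);
    }

    public static void main(String[] args) {
        System.out.println("GET  -> " + handle("GET"));
        System.out.println("POST -> " + handle("POST"));
    }
}
```

Because HelloServlet overrides only doGet(), a POST request falls through to the default implementation and is rejected.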

Usually, the doGet() and/or doPost() methods are implemented for generation of dynamic Web pages. These methods are called by the servlet engine whenever an HTTP request with the corresponding HTTP method is issued to the server. Normally, the servlet engine maintains only a single instance of a particular Java servlet. For each new request to this servlet the servlet engine starts a new thread and invokes the corresponding method of the servlet within the execution context of the new thread.
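One practical consequence of this model is that the fields of a servlet are shared by all request threads and therefore have to be thread-safe. The following sketch imitates the situation with plain Java threads. HitCounter is a hypothetical class playing the role of the single servlet instance, and a fixed thread pool is used here instead of one new thread per request, which is an assumption made to keep the example short.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedInstanceSketch {

    // Hypothetical handler playing the role of the single servlet instance.
    static class HitCounter {
        // Shared by all request threads, so it must be thread-safe.
        private final AtomicInteger hits = new AtomicInteger();

        int handleRequest() {
            return hits.incrementAndGet();
        }

        int total() {
            return hits.get();
        }
    }

    // Sends the given number of requests to ONE instance from several threads.
    public static int run(int requests, int threads) throws InterruptedException {
        HitCounter servlet = new HitCounter();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < requests; i++) {
            pool.execute(servlet::handleRequest);
        }
        pool.shutdown();
        if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
            throw new IllegalStateException("requests did not finish in time");
        }
        return servlet.total();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Requests counted: " + run(1000, 8));
    }
}
```

With a plain int field and hits++ instead of the AtomicInteger, concurrent requests could overwrite each other's updates and the total could come out too low.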