Contents

1.  Introduction

2.  Digital Signatures

3.  XML Signature Fundamentals

4.  Types of XML Signatures

5.  XML Signature Processing

6.  XML Signature Elements

7.  XML Canonicalization

8.  Algorithms

9.  Conclusion

10.  References

1. Introduction

Security in web applications is a major concern for today’s businesses. Developers of business applications try to offer solutions that support secure transactions for the companies deploying them. Likewise, companies look for applications that secure their transactions not only in business-to-customer flows but also in business-to-business exchanges.

XML Signature is a developing technology that helps make web applications more secure for businesses. XML has emerged as the predominant language for web applications. Among its strengths are its portability across platforms and its semi-structured storage model, which is well suited to interaction with current database management systems. XML is a text-based language that describes the content of data, leaving formatting aside.

XML Encryption and XML Signature are fundamental to the next generation of emerging technologies that use these two standards as building blocks, like WS-Security, XML Key Management Specification (XKMS) or SAML.

2. Digital Signatures

Digital signatures serve to identify the origin of a document. Beyond that, digital signatures protect the integrity of data and can detect any changes made to it while it is en route to its recipient. Furthermore, authenticity is established by tying the signature to the sender’s identity. A message is typically signed using the private key of the sender and verified using the sender’s public key. This prevents an attacker from forging a message and pretending to be the sender.

In a digital signature process, a message is usually taken through a series of phases where algorithms sign the message. First, the message is hashed using a cryptographic hash function returning a hash value of the message. Then the hash value is signed using a signing algorithm and the sender’s private key to produce a signature value.

The receiver then starts the verification process, in which the received message is hashed with the same hash function used when signing. The signature value is then verified by passing it, along with the public key and the computed hash, to the verification algorithm. If the computed hash matches the hash recovered from the signature, then the signature is valid. The signature process is explained in detail in section five.

3. XML Signature Fundamentals

XML Signature is a joint standard of the W3C and the IETF for digitally signing all of an XML document, part of an XML document, or even an external object. Because the signed item is identified by a Uniform Resource Locator, you can sign almost anything a URL can point to, from regular text to images.

A standard XML document is formed of a set of one or more elements enclosed by a root element. A schema, such as a DTD or an XML Schema, defines the structure of an XML document. This gives business-to-business communication an error-free way of establishing the well-formed documents used in applications. Thus, business applications communicate through XML documents based on schemas agreed upon by both parties.

An XML Signature is itself a piece of XML with its corresponding schema determining how this XML document will be structured. Within the XML Signature itself are references to sources that will be digitally signed. The source indication is part of the Reference element which has an attribute URI (Uniform Resource Identifier) that points to an internal or external object. A single XML document can contain multiple XML Signatures each referring to a different object.

Let’s take a look at what an XML Signature looks like.

4. Types of XML Signatures

XML Signature allows signing an internal or an external resource. An internal resource might be an XML node, while an external resource can be a binary or non-XML file (an image or text document), another XML document, or a node within another XML document. The type of an XML Signature depends on whether the resource is internal or external. There are three types of XML Signatures: enveloping, enveloped and detached.

Enveloping Signature

An enveloping signature wraps the item that is being signed. The reference is to an XML element within the signature element itself. The following diagram depicts this signature.

Enveloped Signature

An enveloped signature is contained within the item that is being signed. The Reference element of the signature points to a parent XML element.

Detached Signature

A detached signature points to an XML element or binary resource outside the signature element’s hierarchy. The item being pointed to is neither a child nor a parent. It could be an element within the same document or a resource completely outside the current XML document.

An XML Signature can be enveloping, enveloped and detached all at the same time. The signature element can contain more than one Reference element which can be enveloping, enveloped or detached.
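The distinction between the three types comes down to where the signed item sits relative to the Signature element. The following Python sketch illustrates that rule under simplifying assumptions: the `classify` helper and the sample document are hypothetical, namespaces are omitted, and an external resource is represented by `None`.

```python
import xml.etree.ElementTree as ET

def classify(signature, target):
    """Return the signature type for one signed item (hypothetical helper).

    `target` is None when the reference points outside the document.
    """
    if target is None:
        return "detached"
    if target is not signature and target in signature.iter():
        return "enveloping"   # the signed item is inside the Signature
    if signature in target.iter():
        return "enveloped"    # the Signature is inside the signed item
    return "detached"         # neither a child nor a parent

# Simplified sample document (no namespaces, illustrative content only).
doc = ET.fromstring(
    "<order>"
    "<item>book</item>"
    "<Signature><Object><note>hello</note></Object></Signature>"
    "</order>"
)
sig = doc.find("Signature")
note = sig.find("Object/note")
item = doc.find("item")

for label, target in [("note", note), ("order", doc), ("item", item), ("external", None)]:
    print(label, "->", classify(sig, target))
```

A signature with several Reference elements would simply yield one classification per reference.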

5. XML Signature Processing

How XML Signature Works

The XML Signature standard is composed of two processes: the signing process takes place on the sender’s end and the verification process on the recipient’s end. Before we describe the processes, let’s define a message digest. A message digest is a short representation of the full message, usually 20 bytes long. It is created by applying a hash function to the message, and it can be used as a proxy for the original message. The hash function needs to be fast because it is run on both the sending and receiving ends of the communication. The process is as follows:

Signing Process

1.  Create a message digest by hashing the entire plaintext message.

2.  Encrypt the message digest using the sender’s private key.

3.  Send original plaintext message and the encrypted message digest along with the sender’s public key to any recipients.

Verification process

1.  Recipient receives the plaintext message and the encrypted message digest from the sender.

2.  Recipient receives the sender’s public key. (Public key may or may not be sent with the signature)

3.  Recipient runs the original plaintext message through the same hash algorithm (for example SHA1) that the signer used.

4.  Recipient uses the sender’s public key to decrypt the message digest.

5.  Finally, a bit-by-bit comparison is done between the message digest computed on the receiver’s end and the one decrypted from the signature.
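The two processes above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: it uses textbook RSA with a tiny, well-known example key (the constants `P`, `Q`, `N`, `E`, `D` and the helper names are not from any standard), no padding, and a digest reduced modulo `N` so that it fits the toy key. Real signatures rely on vetted cryptographic libraries and much larger keys.

```python
import hashlib

# Toy RSA key pair: textbook values, far too small for real use.
P, Q = 61, 53
N = P * Q      # modulus (3233), shared by both keys
E = 17         # public exponent
D = 2753       # private exponent, (E * D) % ((P - 1) * (Q - 1)) == 1

def digest_as_int(message: bytes) -> int:
    # Hash the message, then reduce the digest so it fits the toy modulus.
    return int.from_bytes(hashlib.sha1(message).digest(), "big") % N

def sign(message: bytes) -> int:
    # Signing steps 1-2: hash, then "encrypt" the digest with the private key.
    return pow(digest_as_int(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    # Verification steps 3-5: re-hash, "decrypt" with the public key, compare.
    return pow(signature, E, N) == digest_as_int(message)

message = b"Transfer 100 units to account 42"
signature = sign(message)
print(verify(message, signature))             # True
print(verify(message, (signature + 1) % N))   # False: altered signature
```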

The task of the hash function is to prevent two messages from producing the same message digest. If that occurred, an attacker could substitute a new message for the original and fool the recipient into thinking that the fraudulent message is the correct one. Therefore, excellent collision avoidance is the fundamental property of hash functions used to create a message digest. Examples of such hash functions are MD4, MD5 and SHA1; the first two are avoided because of the weaknesses found in them, while SHA1 is the algorithm currently used by security systems and web services security.

When a message of any length less than 2^64 bits is input, SHA1 produces a 160-bit message digest as output. The message digest is then input to the DSA (Digital Signature Algorithm), which computes the signature of the message. Signing a message digest instead of the entire message improves efficiency, since the message digest is much shorter. The verifier or receiver of the signature should obtain the same message digest when the received version of the message is used as input to SHA1. The SHA1 hash algorithm, which stands for Secure Hash Algorithm, will be discussed in detail later.
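The digest properties described above are easy to observe with Python’s standard `hashlib` module. The two sample messages are made up for illustration.

```python
import hashlib

original = b"Pay Bob 100 dollars"
tampered = b"Pay Bob 900 dollars"   # a single character changed

digest_original = hashlib.sha1(original).digest()
digest_tampered = hashlib.sha1(tampered).digest()

# SHA1 always yields 20 bytes (160 bits), whatever the input length.
print(len(digest_original) * 8)
print(len(hashlib.sha1(original * 1000).digest()) * 8)

# A small change in the message gives a different digest.
print(digest_original.hex())
print(digest_tampered.hex())
```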

6. The XML Signature Elements

Like any regular XML document, an XML digital signature is composed of elements. The Signature tag is the root element and contains four child elements: SignedInfo, SignatureValue, KeyInfo and Object, the last two being optional. Each of these elements contains further elements, forming the complex structure that makes up an XML Signature. We will describe the syntax as well as the functionality of each of these elements.

Format of a signature

XML digital signatures use a single namespace that must be declared in each document.

Within an XML Signature, URIs identify resources, algorithms and semantics.

Signature Element

The top-level element is the Signature element. It contains information about what is being signed, the signature, the keys used to create the signature, and a place to store arbitrary information.

An XML digital signature is represented by the Signature element, which possesses the following structure. (Note: “?” denotes zero or one occurrence, “+” denotes one or more occurrences, “*” denotes zero or more occurrences, and elements between parentheses are optional.)
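The nesting of these elements can be sketched in Python with the standard `xml.etree.ElementTree` module. This is an illustrative skeleton only: the `Id` and `URI` values and the `q` helper are hypothetical, the elements are left empty, and the `ds` prefix is just a common convention for the XML Signature namespace.

```python
import xml.etree.ElementTree as ET

DS = "http://www.w3.org/2000/09/xmldsig#"   # XML Signature namespace
ET.register_namespace("ds", DS)

def q(name: str) -> str:
    """Qualify a local element name with the XML Signature namespace."""
    return f"{{{DS}}}{name}"

signature = ET.Element(q("Signature"), {"Id": "sig-1"})

# SignedInfo: what is signed and how.
signed_info = ET.SubElement(signature, q("SignedInfo"))
ET.SubElement(signed_info, q("CanonicalizationMethod"))
ET.SubElement(signed_info, q("SignatureMethod"))
reference = ET.SubElement(signed_info, q("Reference"), {"URI": "#order"})
ET.SubElement(reference, q("Transforms"))      # optional
ET.SubElement(reference, q("DigestMethod"))
ET.SubElement(reference, q("DigestValue"))

ET.SubElement(signature, q("SignatureValue"))  # the Base-64 signature
ET.SubElement(signature, q("KeyInfo"))         # optional
ET.SubElement(signature, q("Object"))          # optional, may repeat

ET.indent(signature)                           # requires Python 3.9+
print(ET.tostring(signature, encoding="unicode"))
```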

In order to guarantee a generic structure for an XML digital signature, the signature document is validated against its schema. The following is the schema for the Signature element.

The Id attribute of the Signature element allows a document to have multiple signatures and provides a way to identify particular instances.

SignedInfo Element

SignedInfo is the most complex element. It contains information about how the SignatureValue element is computed and about the application content, and it holds the information that is actually signed.

It is in this element where the canonicalization process takes place. Canonicalization, or C14N, is the process of picking one path through all the possible output options, so that sender and receiver can generate the exact same byte value, no matter what intermediate XML software might be involved.

The SignatureMethod element specifies what type of signature algorithm (for example DSA or RSA) is used to create the signature. Taken together, these two elements (CanonicalizationMethod and SignatureMethod) tell us how to create the digest and how to protect it from modification.

Reference Element

The Reference element is contained inside the SignedInfo element. A signature can have multiple references to objects such as all parts in a MIME message, an XML file and the XSLT script that converts it to HTML, and so on. The power and flexibility of URIs to point to just about any type of resource are critical to the power and flexibility of XML Signature. The Reference element has the following schema:

The URI attribute of the Reference element defines the location of the resource object to be signed.

The Transform child element of the Reference element specifies how to process the data before hashing. It gives control over the signed content by allowing you to modify the data for a reference before the hash value for that data is generated. For example, in an enveloped signature, the Transform element removes the Signature node from the XML document before signing it.
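The effect of the enveloped-signature transform can be approximated in Python as follows. This is a simplified sketch, not a conforming implementation: the `digest_without_signature` helper and the sample documents are hypothetical, `ElementTree` drops any text that directly follows the removed element, and `ET.canonicalize` (Python 3.8+) implements Canonical XML 2.0 rather than the version a given signature may name.

```python
import hashlib
import xml.etree.ElementTree as ET

DS = "http://www.w3.org/2000/09/xmldsig#"
SIG_TAG = f"{{{DS}}}Signature"

def digest_without_signature(xml_text: str) -> bytes:
    """Digest a document after removing every Signature element from it."""
    root = ET.fromstring(xml_text)
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == SIG_TAG:
                parent.remove(child)   # the transform: drop the Signature node
    canonical = ET.canonicalize(ET.tostring(root, encoding="unicode"))
    return hashlib.sha1(canonical.encode("utf-8")).digest()

unsigned = "<order><item>book</item></order>"
signed = (
    "<order><item>book</item>"
    f'<Signature xmlns="{DS}"><SignatureValue>abc</SignatureValue></Signature>'
    "</order>"
)

# Both documents hash to the same value once the Signature node is removed,
# so embedding the signature does not invalidate the digest it contains.
print(digest_without_signature(unsigned) == digest_without_signature(signed))
```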

SignatureValue Element

The SignatureValue element contains the actual signature encoded in Base-64 form. Base-64 encoding is used pervasively in XML-related applications because it is a convenient, well-defined mechanism for creating a unique, printable representation of arbitrary binary data. The following is the schema for the SignatureValue element with its corresponding example.
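As a small illustration of why Base-64 is used here, the following Python snippet turns arbitrary bytes into printable text and back. The byte values stand in for a real signature and are purely hypothetical.

```python
import base64

# Hypothetical raw signature bytes, including non-printable values.
raw_signature = bytes([0x00, 0xFF, 0x10, 0x80, 0x7F])

# Encode to printable ASCII text suitable for an XML text node.
encoded = base64.b64encode(raw_signature).decode("ascii")
print(encoded)

# Decoding recovers exactly the original bytes.
decoded = base64.b64decode(encoded)
print(decoded == raw_signature)
```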

Object Element

The Object element represents an item to be signed as part of the signature element. Think of the Object element as the place to put the thing that is being signed when you have an enveloping reference. The following is its schema.

KeyInfo Element

The optional KeyInfo element lets the recipient obtain the key needed to validate the signature. It contains specific information used to verify an XML Signature.

7. XML Canonicalization

Digital signatures only work if the verification calculations are performed on exactly the same bits as the signing calculations. If the surface representation of the signed data can change between signing and verification, then some way to standardize the changeable aspect must be used before signing and verification. For example, even for simple ASCII text there are at least three widely used line ending sequences. If it is possible for signed text to be modified from one line ending convention to another between the time of signing and signature verification, then the line endings need to be canonicalized to a standard form before signing and verification or the signatures will break.
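The line-ending example can be made concrete with a short Python sketch. The `canonical_digest` helper and the choice of a single line feed as the standard form are assumptions made for illustration.

```python
import hashlib

def raw_digest(text: str) -> str:
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

def canonical_digest(text: str) -> str:
    # Normalize all three line-ending conventions to a single line feed.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return raw_digest(normalized)

unix = "line one\nline two\n"
windows = "line one\r\nline two\r\n"
old_mac = "line one\rline two\r"

# Without canonicalization the three texts hash differently.
print(len({raw_digest(unix), raw_digest(windows), raw_digest(old_mac)}))

# After canonicalization they all produce one digest.
print(len({canonical_digest(unix), canonical_digest(windows), canonical_digest(old_mac)}))
```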

XML is subject to surface representation changes and to processing which discards some surface information. For this reason, XML digital signatures have a provision for indicating canonicalization methods in the signature so that a verifier can use the same canonicalization as the signer.

The kinds of changes in XML that may need to be canonicalized can be divided into four categories. First, there are those related to basic XML. Second, there are those related to processing models such as DOM and SAX. Third, there is the possibility of coded character set conversion, such as between UTF-8 and UTF-16, both of which all XML-compliant processors are required to support. Fourth, there are changes that relate to namespace declarations and XML namespace attribute context.

8. Algorithms

Algorithms are identified by URIs that appear as an attribute of the element that identifies the algorithm’s role. The four algorithms used in an XML digital signature are:

1- CanonicalizationMethod

2- SignatureMethod

3- Transform

4- DigestMethod
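As an illustration of how a URI selects an algorithm, the sketch below pairs each of the four roles with one identifier defined by the XML Signature specification and maps the digest URI to an implementation. The `ALGORITHMS` and `DIGESTS` tables and the `digest` helper are hypothetical names, and only one example URI per role is shown.

```python
import hashlib

DS = "http://www.w3.org/2000/09/xmldsig#"

# One example identifier per role; other algorithms exist for each role.
ALGORITHMS = {
    "CanonicalizationMethod": "http://www.w3.org/TR/2001/REC-xml-c14n-20010315",
    "SignatureMethod": DS + "dsa-sha1",
    "Transform": DS + "enveloped-signature",
    "DigestMethod": DS + "sha1",
}

# Map digest algorithm URIs to implementations known to this sketch.
DIGESTS = {DS + "sha1": hashlib.sha1}

def digest(algorithm_uri: str, data: bytes) -> bytes:
    """Hash `data` with the algorithm named by the Algorithm attribute."""
    try:
        return DIGESTS[algorithm_uri](data).digest()
    except KeyError:
        raise ValueError(f"unsupported digest algorithm: {algorithm_uri}")

print(len(digest(ALGORITHMS["DigestMethod"], b"<order/>")))
```

Rejecting unknown URIs explicitly, rather than falling back to a default, keeps the signer and the verifier from silently using different algorithms.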

1-Canonicalization Algorithm

The CanonicalizationMethod identifies the algorithm that is used to canonicalize the SignedInfo element before it is digested as part of the signature operation. Canonicalization is how the process deals with different byte streams that can represent the same data. For instance, there could be two different ways to represent the same text. Canonicalization interprets the raw data in one fixed way, for example so that a space is always represented as a literal space rather than as a character code. It is used to ensure that XML is handled consistently by different XML processors in light of white space and other variations.

Digest algorithms require content to be exactly the same to produce the same digest. Even a minor change that does not change the meaning, such as adding an extra space, will invalidate the digest. XML, on the other hand, allows some variation in the syntax of the XML text without changing the document. In other words, two XML documents may be considered the same even if they do not have exactly the same text. For example, one XML document may use single quotes for an attribute and another may use double quotes. These are the same to an XML parser, but very different to a digest algorithm. There is an entire list of such potential issues for digests. To get around this problem, a canonicalization transform may be used, one that converts any XML document to a form using a single set of rules, such as always using a certain type of quote for attributes.
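The quoting example can be reproduced with Python’s standard library. Note the assumption: `xml.etree.ElementTree.canonicalize` (Python 3.8+) implements Canonical XML 2.0, which may differ from the canonicalization algorithm named in a particular signature, so this is a sketch of the idea rather than of a specific CanonicalizationMethod. The sample elements are made up.

```python
import hashlib
import xml.etree.ElementTree as ET

# The same element written two ways: quote style, attribute order and
# spacing inside the tag all differ.
single = "<item id='42'   type='book'>XML</item>"
double = '<item type="book" id="42">XML</item>'

def sha1_hex(text: str) -> str:
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

# The raw texts differ, so their digests differ.
raw_equal = sha1_hex(single) == sha1_hex(double)

# After canonicalization both serialize identically and hash the same.
canon_equal = sha1_hex(ET.canonicalize(single)) == sha1_hex(ET.canonicalize(double))

print(raw_equal, canon_equal)
```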