Guidelines for using XML for Electronic Data Interchange

Version 0.05

25th January 1998

Editor: Martin Bryan, The SGML Centre

Contributors: Members of the XML/EDI working group, including Benoît Marchal, Norbert H Mikula, Bruce Peat and David RR Webber.

XML/EDI Group Home Page URL:

Copyright © 1998. XML/EDI Group. All rights reserved, no part of this document may be commercially reproduced in part or in whole without consent and prior approval.

Changes made to this version

Addition of figures used in presentation to W3C in January 1998.

Brief explanation of differences in business processes between client-centric electronic business transactions and server-centric web retailing.

Rules templates are now linked to messages via a processing instruction rather than an XLL simple link (for conformance with the way in which style sheets are linked to the message).

The examples in Annex A have been updated to show an XML book order that can be displayed using Microsoft's MSXSL beta add-on to Internet Explorer 4.0. An on-line link to demonstration software is now provided.

Contents

  1. Purpose & Goal of the XML/EDI Guidelines
  2. Definitions for XML/EDI
  3. The standards involved in XML/EDI
  4. Scope of XML/EDI
  5. Business-to-business Electronic Data Interchange
  6. Electronic business transactions
  7. Base Technologies of XML/EDI
  8. Why use XML?
  9. Integrating XML with EDI
  10. XML/EDI Components
  11. Types of applications
  12. Lexicon Repositories
  13. XML/EDI Data Manipulation Agents (DataBots)
  14. XML/EDI Business Objects
  15. XML/EDItors
  16. XML/EDI extensions for message stores
  17. Search Agents
  18. Trading Partner Pages
  19. The Implementation Process
  20. Using XML for Electronic Data Interchange
  21. Identifying data sets
  22. Developing DTDs
  23. Application specific extensions
  24. Creating message instances
  25. Validating messages
  26. Exchanging messages
  27. Processing messages
  28. Activating rules
  • Appendix A1: Using XML/EDI for Book Ordering
  • Applying XML/EDI to Book Ordering
  • Glossary
  • Bibliography

1. Purpose & Goal of the XML/EDI Guidelines

Put simply, the goal of XML/EDI is to deliver unambiguous and durable business transactions via electronic means.

Associated with this is a goal to establish a standard for commercial electronic data interchange that is open and accessible to all, and which delivers a broad spectrum of capabilities suitable to meet the full breadth of business needs.

Achieving this requires the use of a methodology that is not only extensible enough to meet future requirements but also adaptable enough to incorporate new technologies and requirements as they emerge. To ensure broad adoption the technology selected needs to be widely and freely available. The Extensible Markup Language (XML) developed by the World Wide Web Consortium (W3C) provides such a freely available, widely transportable methodology for well-controlled data interchange.

XML was designed principally for the exchange of information in the form of computer displayable "documents". Not all commercial data is interchanged in a displayable format. In particular data designed for electronic data interchange typically needs to be processed before it can be displayed. For this to be possible the data must be mapped, using some form of template, to a set of processing rules. These XML/EDI guidelines provide a standardized way in which such rules templates can be added to interchanged data.

These XML/EDI guidelines begin by formally defining the terms used in the text. This is followed by an impact statement that makes predictions from various viewpoints. The guidelines then give a background on the tools and standards on which XML/EDI is built.

Note: These guidelines form the basis for development work on XML/EDI. They form a precursor to a formal "Specification of an EDI Application for XML". As a document designed to be a lightning rod for ideas, this working document has been, and will continue to be, released in draft form. Comments on this draft should be sent to the XML/EDI working group at .

2. Definitions for XML/EDI

Electronic commerce has been defined in the European Workshop for Open Systems' Technical Guide on Electronic Commerce (EWOS ETG 066) as "Electronic exchange of data to support business transactions, i.e. the exchange of value through the delivery of a product from a seller to a buyer". As such it encompasses much more than has been possible using traditional methods of Electronic Data Interchange (EDI) such as EDIFACT. Electronic commerce is defined by EWOS as covering activities such as marketing, contract exchange, logistics support, settlement and interaction with administrative bodies (e.g. tax and customs data interchange). Electronic commerce covers all industrial and service operations, including services such as insurance, healthcare, travel and interactive home shopping.

Many people use the term EDI to refer to the set of messages developed for business-to-business communication as part of the United Nations Standard Messages Directory for Electronic Data Interchange for Administration, Commerce and Transport (EDIFACT). EDIFACT messages are transmitted in compressed form, using predefined field identifiers, which must occur in a predefined sequence. While EDI is, strictly speaking, wider in scope than EDIFACT, for the purposes of these guidelines EDI will be used in this restricted sense when not otherwise qualified.

The basic unit of information in an EDI message is the data element. For an EDI invoice, each item being invoiced would be represented by a data element. Data elements can be grouped into compound data elements, and data elements and/or compound data elements may be grouped into data segments. Data segments can be grouped into loops; and loops and/or data segments form business documents.
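The grouping described above can be illustrated with a minimal sketch. It assumes the default EDIFACT separators (`'` ends a segment, `+` separates data elements, `:` separates the components of a compound data element) and ignores release (escape) characters. The sample line item and the function names are made up for illustration and are not taken from any message directory.

```python
def parse_segment(segment: str) -> dict:
    """Split one EDIFACT-style segment into its tag and data elements."""
    # Assumption: default separators, no release characters in the data.
    tag, *elements = segment.split("+")
    # A compound data element becomes a list of its components.
    parsed = [e.split(":") if ":" in e else e for e in elements]
    return {"tag": tag, "elements": parsed}


def parse_message(message: str) -> list:
    """Split a message into segments and parse each one."""
    return [parse_segment(s) for s in message.split("'") if s.strip()]


# Hypothetical order line: item 1, an ISBN, and an ordered quantity of 5.
sample = "LIN+1++0201403943:IB'QTY+21:5'"
segments = parse_message(sample)

for seg in segments:
    print(seg["tag"], seg["elements"])
```

Each parsed segment keeps simple data elements as strings and compound data elements as lists, which mirrors the element / compound element / segment hierarchy.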

The EDIFACT standards define whether data segments are mandatory, optional, or conditional, and indicate whether, how many times, and in what order a particular data segment can be repeated. For each EDI message, a field definition table exists. For each data segment, the field definition table includes a key field identifier string to indicate the data elements to be included in the data segment, the sequence of the elements, whether each element is mandatory, optional, or conditional, and the form of each element in terms of the number of characters and whether the characters are numeric or alphabetic. Similarly, field definition tables include data element identifier strings to describe individual data elements. Element identifier strings define an element's name, a reference designator, a data dictionary reference number specifying the location in a data dictionary where information on the data element can be found, a requirement designator (either mandatory, optional, or conditional), a type (such as numeric, decimal, or alphanumeric), and a length (minimum and maximum number of characters). A data element dictionary gives the content and meaning for each data element.
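As a rough illustration of how an entry in such a field definition table might be applied, the sketch below checks a single data element value against a requirement designator, a type and a length range. The class name, field names and the codes "M", "O" and "C" are assumptions chosen for the example; they do not reproduce the layout of any real EDIFACT directory.

```python
from dataclasses import dataclass


@dataclass
class ElementDefinition:
    name: str
    requirement: str  # assumed codes: "M" mandatory, "O" optional, "C" conditional
    kind: str         # assumed codes: "numeric", "alphabetic", "alphanumeric"
    min_length: int
    max_length: int


def validate(definition: ElementDefinition, value: str) -> list:
    """Return a list of problems; an empty list means the value conforms."""
    if value == "":
        if definition.requirement == "M":
            return [f"{definition.name}: mandatory element is missing"]
        return []
    problems = []
    if not definition.min_length <= len(value) <= definition.max_length:
        problems.append(f"{definition.name}: length {len(value)} out of range")
    if definition.kind == "numeric" and not value.isdigit():
        problems.append(f"{definition.name}: expected numeric characters")
    if definition.kind == "alphabetic" and not value.isalpha():
        problems.append(f"{definition.name}: expected alphabetic characters")
    return problems


# Hypothetical definition: a mandatory numeric quantity of 1 to 5 characters.
quantity = ElementDefinition("Quantity", "M", "numeric", 1, 5)
print(validate(quantity, "12"))
print(validate(quantity, "12a"))
```

Conditional elements are treated like optional ones here; a fuller check would need the conditions stated in the implementation guidelines.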

Originally, EDI translation software was developed to support a variety of private system formats. Most often, the sender and receiver were required to contract in advance for a tailored software program that would be dedicated to mapping between their two types of datasets. Each time a new sender or receiver was added to the client list, a new translation program would be needed by the new party to format their data to conform to the standards in use by the participants. Of course, this becomes expensive. Such static systems do not easily allow synchronization of business transactions in distributed business processes that involve global rules, but with participants and actions that are not predetermined. To solve these issues it is desirable to develop automated tools and techniques that are easy to use and that allow transactions to be decomposed into actions to be performed locally, and local actions to be mapped onto efficient protocol exchanges.

The Electronic Enterprise

The concept of the Electronic Enterprise requires a transition away from paper-form-based EDI. Key concepts that are required are the encapsulation of agreed sets of business rules (in EDI parlance the Implementation Guidelines) and mechanisms to handle state and flow control (such as those provided by hyperlink anchors in HTML files). Message sets must also be able to handle partial information, where the complete information is not yet available, or simply is not required for the particular business process. This allows different parts of an enterprise to selectively contribute only the information that is germane to their business functions.

A fundamental difference between the proposals in these XML/EDI Guidelines and those found in other proposals for XML-based web retailing, such as those covered in the Open Trading Protocol (OTP), is the client-centric nature of the business processes, as contrasted with the server-centric nature of electronic retailing. To distinguish the two, we use the term "Electronic Business" to refer to the processes of fulfilling customer requirements through the application of negotiated business processes leading to the supply of manufactured goods to retailers and service providers, and "Web Commerce" to describe the process of selling manufactured goods to consumers.

Electronic business is client-centric in that it starts with a specification of a client's requirements, rather than a statement of what the supplier has to offer. The specification of requirements is sent to a number of potential suppliers, who are asked to tender for the business by a predefined date/time. The purchaser is, as a result of this process, provided with more than one choice, and must determine which quotation to accept. This may require a period of contract negotiation to ensure that adequate terms and conditions, including delivery criteria, are met. The negotiation may in turn require the processes to be repeated, with a need to cross-refer between successive documents.

Once the purchaser has selected a supplier the business processes involved are very similar to those involved in web commerce, but there are subtle differences. For example, electronic payment before delivery is unlikely to be required for electronic business transactions. Instead of being an integral part of the negotiation phase, with payment being made at the time the order is placed, payment in the electronic business scenario is a separate process that occurs immediately after delivery. This introduces concepts such as statements, which do not occur in web commerce scenarios.

The standards involved in XML/EDI

XML, the Extensible Markup Language, is a subset of ISO's Standard Generalized Markup Language (SGML). It was developed by the World Wide Web Consortium (W3C) SGML on the Web working party during the latter half of 1996 and early 1997. The formal recommendation was submitted for approval by W3C members on 8th December 1997.

On 10th September 1997 a proposal for a new form of XML Style Language (XSL), which incorporates the ECMAScript standardized variant of JavaScript, was published by a consortium led by Microsoft, ArborText and the Inso Corporation. This version of the XML/EDI specification uses the power provided by this new advanced language combination to show how control of XML/EDI document processes can be achieved in a distributed manner.

In October 1997 a specification for a formal Document Object Model (DOM) for XML documents was published by W3C. This model provides a standardized API for XML-based tools.

Combining XML and EDI to develop XML/EDI suggests that the main method of capturing and coding EDI information will be through XML-coded electronic forms. At present the form handling characteristics of XML are yet to be fully agreed (agreement is expected during 1998). To allow interaction with existing systems the XML/EDI Guidelines show how EDIFACT messages can be generated from XML/EDI forms, and vice versa.
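To give a feel for what generating EDIFACT-style output from an XML-coded form could involve, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The element names (`order`, `line`, `isbn`, `quantity`) and the mapping onto `LIN` and `QTY` segments are assumptions made for this example only; a real mapping would follow the relevant message directory and the rules template agreed between trading partners.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML order form; the element names are illustrative only.
ORDER_XML = """
<order>
  <line number="1">
    <isbn>0201403943</isbn>
    <quantity>5</quantity>
  </line>
</order>
"""


def order_to_segments(xml_text: str) -> str:
    """Map each order line onto EDIFACT-style LIN and QTY segments."""
    root = ET.fromstring(xml_text)
    segments = []
    for line in root.findall("line"):
        number = line.get("number")
        isbn = line.findtext("isbn")
        quantity = line.findtext("quantity")
        # Assumed mapping: item number with an ISBN qualifier, then quantity.
        segments.append(f"LIN+{number}++{isbn}:IB'")
        segments.append(f"QTY+21:{quantity}'")
    return "".join(segments)


edifact_text = order_to_segments(ORDER_XML)
print(edifact_text)
```

The reverse direction would split the segments on the same separators and rebuild the XML elements from the resulting values.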

XML/EDI isn't creating a new standard. XML/EDI is defining how companies can use current standards to solve their business problems.

3. Scope of XML/EDI

Details of the scope of XML/EDI, and of the impact it is expected to have on business communities, are covered in Introducing XML/EDI.... To help readers of this document to appreciate the differences in practice between traditional EDIFACT-based web transactions and XML/EDI this section discusses some of the differences between traditional business-to-business electronic data interchange systems and the new breed of interactive electronic business tools being provided through the Internet.

Business-to-business Electronic Data Interchange

Electronic Data Interchange (EDI) has been used for business-to-business communication for almost a quarter of a century. Initial efforts involved inter-company agreements on how to exchange commercial data, initially as information stored on tape and later as messages sent over dedicated data lines. To avoid having to use different protocols to move data between different companies, various industry groups identified sets of data that could form the basis of individual agreements. The industry groups also sought to agree the format in which fields in such data sets were interchanged so that a company only needed to develop one methodology for decoding information received without recourse to human intervention.

The Achilles heel of this approach has always been twofold. Firstly, companies require flexibility in, and wish to deviate from, doctrinaire standards that do not fully meet their business needs. Secondly, because the standards are pre-ordained there is no mechanism provided to transfer processing rules and associated information. It is assumed that the data meets the defined constraints and, if not, has been duly modified to conform. This means that companies must conduct exacting analysis to determine precisely how they are going to move their business data to and from the predefined EDI formats. The cost of these constraints has been borne as excessively long and complex implementation cycles for traditional EDI systems.

The world has changed over the last thirty years, and now requires more dynamic and vibrant services that match the organized yet ad hoc nature of modern business practice, and particularly of its manifestations on the Internet. The Internet is re-writing the rules on how people interact, buy and sell, and exchange goods and services. In particular the Internet is showing us that EDI is not only relevant for business-to-business communications. The same concepts are also relevant for all consumer-to-supplier relationships, whether the consumer is an end-user, a manufacturer, a service organization such as a hospital or a hotel, a governmental organization or a virtual organization.

Electronic Commerce in 1997

Electronic business transactions

With the arrival of the Internet in the last decade of the 20th century the pattern of electronic commerce has dramatically changed. In particular, the Internet has introduced many new ways of trading, allowing interaction between groups that previously could not economically afford to trade with one another.

Whereas previously commercial data interchange involved mainly the movement of data fields from one computer to another, without human intervention, the new model for web-based commerce introduced by the Internet is typically dependent on human interaction for the transaction to take place. The new model is based principally on the use of interactive selection of a set of options, and on the completion of "electronic forms", to specify user requirements.

As this new model develops there has been a fundamental shift in how data used for commerce should be processed. The original create-->transmit-->receive-->process cycle of information processing, using individual programs, is beginning to be replaced by the concept of active objects which have inherent processes associated with them, based on the class of information they contain. Today an invoice may no longer contain a copy of the information stored in the database it was generated from: instead it contains a pointer that says where it expects to get the data from, and this data will be fetched from its managed source each time the invoice is processed.
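The difference between copying data into an invoice and pointing at its managed source can be sketched as follows. This is a minimal illustration, assuming an in-memory dictionary stands in for the managed database; the class name, customer key and field names are hypothetical.

```python
# Hypothetical managed source: in practice this would be a database.
customers = {"C042": {"name": "Example Books Ltd", "city": "Cheltenham"}}


class Invoice:
    """An invoice that stores a pointer to customer data, not a copy of it."""

    def __init__(self, number: str, customer_ref: str, source: dict):
        self.number = number
        self.customer_ref = customer_ref  # the pointer, not the data itself
        self._source = source

    def process(self) -> str:
        # The data is fetched from its managed source on every call.
        customer = self._source[self.customer_ref]
        return f"Invoice {self.number} for {customer['name']}, {customer['city']}"


invoice = Invoice("INV-1", "C042", customers)
first = invoice.process()

# The managed source changes after the invoice was created.
customers["C042"]["city"] = "Gloucester"
second = invoice.process()

print(first)
print(second)
```

Because the invoice resolves its pointer each time it is processed, the second result reflects the updated address without the invoice itself being regenerated.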

Such interactive programs require us to review the underlying philosophy of electronic commerce. What are the characteristics of a system designed for "electronic business transactions" in an international marketplace?

To be truly interactive you need to be able to:

  1. Understand the business concepts represented in the interchanged data.
  2. Apply business-specific rules to the interchanged data to identify what class(es) of data it contains and formulate appropriate responses.

To do this you need to be able to:

  • identify the role and syntax of each piece of interchanged data
  • identify the source of each shared piece of information
  • identify which pieces of information should occur in each interchanged set of data and, if relevant, the order in which they occur in a particular message stream
  • identify who is responsible for creating, transmitting, receiving and processing each message, and which programs should be used to control each of these processes
  • identify when a message should be moved from one stage of the interchange process to another
  • identify which rules should be used to check that the relevant forms of interchange have taken place and to move data from one presentation template to another.

Because these interactions can be complex, and potentially require specialized knowledge, the rule templates can be supplemented by XML/EDI data manipulation agents (DataBots) to ensure that users can express their requirements in high-level, natural language, terms. DataBots automatically create appropriate rule templates and XML syntax to match user requirements and broker the entire interchange.

When DataBots are being used XML/EDI is identified as being robot generated by adding an R to its name to become XML/EDI-R.

At this point in time the ECMAScript standardized variant of JavaScript provides the vehicle that permits the DataBots to be deployed and received along with XML/EDI messages.