RELEASE 1.00
SAE ATIS DataBase GUIDE DOCUMENT
A training document developed to teach the
ATIS message set standards
Society of Automotive Engineers
400 Commonwealth Drive
Warrendale, PA 15096-0001
This Guide is available for download at:
- Link tbd
The Primary ATIS Users Guide can be found at
Current technical support and information exchange about using the ATIS standards in deployment and with other ITS standards can be obtained at the ITS Standards Forum, a community resource available
There is a set of forums dedicated to ATIS related issues at
ATIS Database Guide – Table of Contents
Foreword
Part I Considerations when designing ATIS databases
An Overview of the DB translation process
Goals and Steps in the Process
Mapping the Event Message
Customizing ATIS for Local Use
Other Supporting Guides
Part II Considerations for implementing ITIS in legacy systems
ITIS Codes, a brief history
Major and Minor Legacy Phrases
Goals and Steps in the Process
Some Real World Examples in use
A Harder Translation case
Issues of Specificity
Final Thoughts
Part III Implementing filtering of ATIS messages
Requesting Information ATIS
The Information Request Message itself, a review
The Filter Element
The Data Types to Filter on
Individual ITIS groupings of data types
Building Complex Queries using the filter elements
A Last Word of Advice
Part IV Supporting Work
The ATIS schema, reduced
An Example WSDL system
Foreword
Unlike the style used in the other “mini” ATIS guides, this guide is not divided into a first part dealing with abstract concepts followed by a second part with concrete working XML examples of those concepts. This guide is more abstract, in part because each deployment tends to have its own wants and needs when creating an internal database for its own uses. The ITS standards process itself, focusing on interfaces rather than systems or black boxes, provides little insight in this area, although there is frequently an unstated common set of assumptions regarding the general shape and structure of the supporting tables. Various “starting points” taken from the design patterns discussed here can be found on-line and downloaded into popular tools,[1] but these are simply rough sketches, and you would need to invest considerable effort to use them in a product.
To benefit most from this document, the reader should have some basic familiarity with both XML schemas (XSDs) and with practical database applications such as those found in Microsoft Access, SQL, or MySQL. The concepts discussed in this paper can be implemented in any of these, and in other popular DB systems. The key concept of using a foreign key to index between tables as a means to resolve nested structures should be reviewed if needed, because it is frequently used here. Because of the optional presence of many repeating structures[2] in the XML messages themselves, it is also not possible to reduce the typical ATIS XML message into the simpler Nth normal form styles[3] of database normalization. The general concept of referential integrity should also be reviewed if needed,[4] and we will generally presume that the tables we construct here can be used in this way.
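As a refresher on foreign keys and referential integrity, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (event, event_desc, event_key) are hypothetical and are not taken from the ATIS schema. SQLite enforces foreign keys only after the PRAGMA shown is issued; other database products handle this differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

# Hypothetical tables: a parent "event" and a child "event_desc" that refers to it.
conn.executescript("""
CREATE TABLE event (
    event_key INTEGER PRIMARY KEY,
    headline  TEXT
);
CREATE TABLE event_desc (
    desc_key  INTEGER PRIMARY KEY,
    event_key INTEGER NOT NULL REFERENCES event(event_key),  -- foreign key
    phrase    TEXT
);
""")

conn.execute("INSERT INTO event (event_key, headline) VALUES (1, 'Lane closed')")
conn.execute("INSERT INTO event_desc (event_key, phrase) VALUES (1, 'roadwork')")

def insert_orphan():
    """Try to store a child row whose parent does not exist."""
    try:
        conn.execute("INSERT INTO event_desc (event_key, phrase) VALUES (99, 'orphan')")
        return True
    except sqlite3.IntegrityError:
        return False  # the database refused the dangling reference

orphan_accepted = insert_orphan()
```

With enforcement on, the row that points at the non-existent event 99 is rejected, which is the behavior the rest of this guide presumes of its tables.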
Part I Considerations when designing ATIS databases
Part I of this guide covers design issues to be considered when creating a database structure that will store ATIS style XML messages as records. A general approach to mapping is shown that will support any valid ATIS message and which is applied to the event message as an example. Part II of this guide deals with the practical mapping of legacy descriptive codes and phrases about incidents and events into the national standard ITIS codes and how a local mapping set can be built and managed. Part III of this guide examines the issues of filtering and subscriptions for messages. The elements of the ATIS message set information request are translated into appropriate queries for use in SQL and other similar languages to extract the needed message records based on user requests. Part IV of this guide provides links to supporting resources.
An Overview of the DB translation process
When developing a database, the goal of deployment developers is to design a database in the preferred technology (MS Access, SQL, etc.) that faithfully reflects the ATIS message content that will be handled. The database itself typically resides on a server of some sort, where it is used to manage the active information as message content and to send that data out to others based on requests. Some deployments, aware of the self-imposed limits[5] that they will use when creating such messages, can “cut corners” and make smaller, simpler sets of supporting tables that still meet all the design requirements. Other deployments, especially those that seek to be able to accept any valid ATIS message, must do more. Typically this latter case is found with data repositories that seek to support wider regional deployments, where multiple data sources may have minor variances from the national standards to support local needs. Regrettably, there is no exact or 1:1 methodology to “map” a database design to an XML message set that will please all. Rather, many minor variations exist with differing pros and cons, and the decision process involves a considerable amount of planning regarding the precise uses intended for the resulting database. The goal of this section of the document is to review the various mapping approaches a deployment must consider and to recommend choices for some of the more major decision points.
Many commercial tools exist today to automatically convert and translate between XML and databases. The vast majority of these will produce tables and code stubs that at first glance may be overwhelming and that will also most certainly be sub-optimal for your stated requirements. Most database technologies today have good support for XML as well, although none follow the styles[6] found in ITS in any exact way. Most deployments that this author has dealt with use such tools to auto-generate a starting point (which is in itself very educational and highly recommended) and then, once the tool has shown what default naming and styles it comes up with, go back and re-edit by hand the tables and relationships created to reflect a greater understanding of how the data will be used. To use such tools effectively you will need to spend a considerable amount of time mapping the XSD of the message set to the tables you choose to develop. If conditions allow, using the naming defaults that such tools impose for tables, rows, and columns can save considerable time in coding and debugging later. Some of the more complex tools (such as MapForce[7]) allow you to manage the mapping set, adding translation details to a basic project to suit your needs, and can then export it into multiple code-based libraries (C and Java) or XSLT style systems. A tool like this can also be of great value if you have a legacy message in XML that you need to convert to a standards-conformant style.
Let us begin the discussion by dismissing the novice’s question: why can’t we just develop a simple flat (rectangular) table and store our information in that? You could, if your data is very simple, you will not be getting data from anyone else, and you promise not to change your mind later. Very few people can meet these criteria. Under such conditions, you might be able to decide upon a more “rectangular” form for the data structures and successfully render this into a table. It is much more typical to concede that there will be times when an unknown number of nested elements may be present in the message, and that the need to support such messages will drive your database design to use sets of tables to reflect the arbitrary complexity of the nesting. Once you have made the initial leap that these nested tables are a fundamental part of the database design, use the native ability of your chosen database technology to its fullest and craft tables that will allow you to easily re-join the contents to create complete XML messages. Careful table design will allow rapid message reconstruction, as well as effective searching of the contents needed for supporting filters (considered in Part III).
When approaching the design of nested tables, the most critical design aspect is the way in which the various complementary tables are joined to each other. In an ITS type system, where the relationships are simply used to reflect the full content of a message, a single direction of linking is normally sufficient. This is typically done by replacing any sub-structure from the message with a reference key value that can be used to select the sub-structure contents found in another table. This is called a foreign key in most database systems. It is vitally important that you understand how this process works in the database technology you select. Many database technologies have simple ways to provide a unique key for every record in a table, and this can be used to ensure that the integrity of your record set is maintained. If the mapping involves multiple table entries, then the effective key may be this unique value combined with other data (such as an index or sequence value). The relationships between the keys and the tables that use them will ultimately set the practical limits of what your database can do in terms of data support and filtering abilities.
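The single-direction linking described above can be sketched as follows. This is only an illustration under stated assumptions: the table names, column names, and phrases are hypothetical and are not drawn from the ATIS schema. The “top” table holds one reference key, and the effective key in the linked table is that reference key combined with a sequence value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event (
    event_key INTEGER PRIMARY KEY,   -- unique key supplied by the database
    headline  TEXT,
    desc_key  INTEGER                -- reference key standing in for the sub-structure
);
CREATE TABLE description (
    desc_key INTEGER NOT NULL,       -- shared by every row of one sub-structure
    seq      INTEGER NOT NULL,       -- position within the original sequence
    phrase   TEXT,
    PRIMARY KEY (desc_key, seq)      -- effective key: reference key plus sequence
);
""")

conn.execute("INSERT INTO event (headline, desc_key) VALUES ('Main St incident', 10)")
# Rows are deliberately inserted out of order; seq restores the original order.
conn.executemany(
    "INSERT INTO description (desc_key, seq, phrase) VALUES (?, ?, ?)",
    [(10, 2, "right lane blocked"), (10, 1, "accident")],
)

def phrases_for(event_key):
    """Re-join the sub-structure rows for one event, in message order."""
    rows = conn.execute(
        """SELECT d.phrase
           FROM event AS e
           JOIN description AS d ON d.desc_key = e.desc_key
           WHERE e.event_key = ?
           ORDER BY d.seq""",
        (event_key,),
    ).fetchall()
    return [r[0] for r in rows]

first_event_phrases = phrases_for(1)
```

The join yields the linked rows in their original order, which is the step needed when a complete XML message is rebuilt from the tables.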
TIP: It is generally a very good idea to have the database technology itself provide the unique key for any message or entry and NOT to use any data element in the content model of the message for this purpose. This avoids key collisions that can easily be fatal to your design. Even apparently unique combinations such as an event ID coupled to an issuing agency may not in practice always result in unique values. As a database designer, never trust implied relationships in the incoming data to hold true.
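A minimal sketch of this tip, assuming SQLite through Python's sqlite3 module; the table name, column names, and sample values are hypothetical. The database issues the key, and the agency and event ID from the message are stored as ordinary data, so two messages carrying the same pair do not collide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE message (
    msg_key  INTEGER PRIMARY KEY AUTOINCREMENT,  -- key issued by the database
    agency   TEXT,                               -- message content, not used as a key
    event_id TEXT,                               -- message content, not used as a key
    body     TEXT
)
""")

def store(agency, event_id, body):
    """Insert one message and return the key the database assigned to it."""
    cur = conn.execute(
        "INSERT INTO message (agency, event_id, body) VALUES (?, ?, ?)",
        (agency, event_id, body),
    )
    return cur.lastrowid

k1 = store("AgencyA", "E-100", "first report")
k2 = store("AgencyA", "E-100", "same agency and event ID, different content")
```

Had the pair (agency, event_id) been declared the primary key, the second insert would have failed; with a database-issued key both records are kept and can be reconciled later.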
Goals and Steps in the Process
With the above description of goals and issues, a general process will now be outlined to convert the XML structures of the message set into database table forms.
- You will need to map the complete XML message in most cases. This means starting with the topmost atis:atisMessage or the informationResponse elements from the schema. However, most deployments are more interested in the information at the ResponseGroup level and prefer to start the design there.
- For each “chunk” of useful data (in this case each data frame; if you start at the ResponseGroup level, then each item such as the event data frame) you will develop a detailed table in the DB that represents the data conveyed in the XML schema structure. For each part of that structure, the table can either hold the data (i.e. flatten and include it) or link to it (i.e. have a pointer to another table). Each element in the data frame becomes one or more columns in this table.
- For each element in the structure the design choice is fundamentally either to flatten and expand it or to link to it in another table devoted to that particular sub-structure. Making this choice is the hard part.
- For the simple elements, i.e. those with only one instance of short data (text, integers, Booleans, etc.), you simply flatten the XML data element into a similar representational form in the database. In this way an INT (0..255) becomes an unsigned byte in the DB, a string limited to ten characters in the XML becomes a similar 10-character string in the DB, and so on. Be sure that any character lengths in XML are expanded in the DB as needed for cases such as escaped characters like “&lt;”.
- For longer string instances, the precise DB technology used may affect the choice. The “memo field” method may be employed when long strings may occur. If the deployment decides to curtail the length of such a field from that found in the standard, the local variant of the schema being used should reflect this (by having the length facet changed).
- For structures which are complex (i.e. indented, and/or with further nested content), or which may occur multiple times in a sequence style, linking to another table becomes the only practical option. This is also a good way to separate design concepts of other groups working on a different part of the overall design (such as LRMS profiles). When you don’t know how many items may occur, but you will be allowing the possibility of a great many, link to it. If such a structure is optional in the XSD, then typically the link key is left as zero to indicate the lack of the structure in an instance. Note that in this system you will have ONE key that may point to multiple entries in the other (foreign) table. In cases with multiple items in a sequence, the original order is often semantically important, so your tables should preserve this as well. Typically this is done with an index value in the table.[8]
- If the object in question is merely a single instance of another structure which is itself simple (just a sequence of other simple content), then it may be more effective to simply flatten it in place. Excessive use of this can cause large tables.
- If the object in question is merely a very short sequence of another structure which is itself simple (just a sequence of other simple content) then it may be more effective to simply flatten it in place.[9]
- If the same structure is reused multiple times in multiple places (such as the Head or ITIScodesAndText elements), then it is best to make this a table. How each instance is linked to in such a common table may vary (using different key systems) depending on your sorting needs.
- For each element that will be handled in another table, you need to consider structure, indexing, and any ordering needs when more than one instance is mapped. Repeat these rules for that table. The precise way in which the two tables will share a relationship is the key design goal to nail down. Typically, a key will be kept in the “top” table that can be used to extract the needed content in the “lower” table and allow join operations. When this key is empty or null, the element in question does not exist.
- A few elements will need to be stored in more than one form of representation to aid in sorting and filtering the data later.[10] The alternative forms of such elements need to be considered and space in the table structures allocated at this time. For the event message example, this is limited to reducing the ITIS phrases into some of the numerical equivalents.[11] Other similar needs may involve various schemes to pre-sort LRMS profiles or to reduce the size of text storage to indexed values. Computing and storing security hashes is also a typical consideration here.
- Some elements (or groups of elements) you may choose to store as “blobs” of well-formed XML fragments. This occurs in two typical cases. The first is when the entire message is to be received from some other source, but some of the content is to be processed in a later phase of the project. The goal is to be able to acquire the content, but defer processing until resources allow. The second case is that of a data exchange center dealing with local element content it may not understand (or even have a schema for). In such a system[12] it is reasonable to store the local content as a blob and simply pass it on. Of course, this type of handling generally precludes any filtering on the content.
- Finally, some content in the structure may be lost and not stored because it is not needed by the receiving system. Specialty ADUS users are good examples of this.
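Several of the choices listed above (flattening simple elements, linking a repeating structure with an order index, and storing unrecognized local content as a blob) can be sketched together. The XML fragment, element names, and table layout are hypothetical and are not the ATIS schema. In this variant the lower table carries the key of the top table, which is one common way to realize the linking described; other key arrangements are equally possible.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical fragment; the element names are invented for illustration.
SAMPLE = """<event>
  <severity>3</severity>
  <headline>Bridge work</headline>
  <description><phrase>roadwork</phrase></description>
  <description><phrase>left lane closed</phrase></description>
  <localExtension><camera id="7"/></localExtension>
</event>"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event (
    event_key  INTEGER PRIMARY KEY,
    severity   INTEGER,            -- simple element, flattened into a column
    headline   TEXT,               -- simple element, flattened into a column
    local_blob TEXT                -- unparsed XML fragment, stored as-is
);
CREATE TABLE event_description (
    event_key INTEGER NOT NULL REFERENCES event(event_key),
    seq       INTEGER NOT NULL,    -- preserves the original order
    phrase    TEXT,
    PRIMARY KEY (event_key, seq)
);
""")

def store_event(xml_text):
    """Flatten, link, and blob one event fragment; return its database key."""
    root = ET.fromstring(xml_text)
    local = root.find("localExtension")
    blob = ET.tostring(local, encoding="unicode").strip() if local is not None else None
    cur = conn.execute(
        "INSERT INTO event (severity, headline, local_blob) VALUES (?, ?, ?)",
        (int(root.findtext("severity")), root.findtext("headline"), blob),
    )
    key = cur.lastrowid
    # Repeating structure: one linked row per occurrence, numbered in message order.
    for seq, desc in enumerate(root.findall("description"), start=1):
        conn.execute(
            "INSERT INTO event_description (event_key, seq, phrase) VALUES (?, ?, ?)",
            (key, seq, desc.findtext("phrase")),
        )
    return key

key = store_event(SAMPLE)
```

Note that the blob column can be passed on to another system unchanged, but, as stated above, nothing inside it is available for filtering.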
The above steps must be performed for each level of the schema that will be stored. Once you have a conceptual design of the database itself, this needs to be mapped to the XSD files. This mapping process will inevitably involve some name translations and a few data model translations (such as converting the ITIS codes to their numerical equivalents). When you have completed this, the tools you are using will typically also be able to provide a great amount of customized “canned code” that will in turn implement this mapping into and out of the DB you have selected.
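The name and data model translations mentioned above can be kept in one small mapping set. The following is a hedged sketch only: the element names, column names, and the numeric values in PHRASE_TO_CODE are invented placeholders and are NOT real ITIS codes or ATIS element names.

```python
import xml.etree.ElementTree as ET

# Placeholder lookup: these numbers are invented and are NOT real ITIS codes.
PHRASE_TO_CODE = {"accident": 1001, "roadwork": 1002}

# XML element name -> (database column name, conversion function)
ELEMENT_MAP = {
    "headline": ("headline_txt", str),
    "severity": ("severity_num", int),
    "phrase":   ("phrase_code", PHRASE_TO_CODE.__getitem__),
}

def to_row(xml_text):
    """Translate one XML fragment into a column-name -> value dictionary."""
    root = ET.fromstring(xml_text)
    row = {}
    for child in root:
        if child.tag in ELEMENT_MAP:  # elements without a mapping are skipped
            column, convert = ELEMENT_MAP[child.tag]
            row[column] = convert((child.text or "").strip())
    return row

row = to_row(
    "<event><headline>Crash on I-5</headline><severity>2</severity>"
    "<phrase>accident</phrase><note>ignored</note></event>"
)
```

Keeping the mapping in a single table-like structure makes it easier to review against the XSD and to regenerate if a tool later takes over the translation.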
TIP: If you have not worked with complex data structures rendered into table forms before (and therefore are not comfortable with foreign key use), it is strongly suggested you spend a day or two playing with the output products that many tools can automatically produce from the XML schema links in the back of this document. Once you have seen first hand how the linking process works, you will then be in a position to design a table to fit your own project needs.