Guide to WMO Table Driven Code Forms:

FM 94 BUFR

and

FM 95 CREX

Layer 1: Basic Aspects of BUFR and CREX

and

Layer 2: Layout, Functionality and Application of BUFR and CREX

Geneva, 1 January 2002

Preface

This guide has been prepared to assist experts who wish to use the WMO Table Driven Data Representation Forms BUFR and CREX.

This guide is designed in three layers to accommodate users who require different levels of understanding.

Layer 1 is a general description designed for those who need to become familiar with the table driven code forms but do not need a detailed understanding. Layer 2 focuses on the functionality and application of BUFR and CREX, and is intended for those who must use software that encodes and/or decodes BUFR or CREX, but will not actually write the software.

Layer 3 is intended for those who must actually write BUFR or CREX encoding and/or decoding software, although those wishing to study table driven codes in depth will find it equally useful.

The WMO gratefully acknowledges the contributions of the experts who developed this guidance material. The Guide was prepared by Dr. Clifford H. Dey of the U.S.A. National Centre for Environmental Prediction. Contributions were also received, in particular, from Charles Sanders (Australia), Eva Cervena (Czech Republic), Chris Long (U.K.), Jeff Ator (U.S.A.) and Milan Dragosavac (ECMWF).

Contents

Layer 1: Basic Aspects of BUFR and CREX

1.1 Overview

1.2 General Description

1.2.1 Self-description

1.2.2 Code Structures

1.2.3 BUFR and CREX Tables

1.2.4 Features common to BUFR and CREX

1.2.5 Differences

1.2.6 CREX Examples

1.3 Updating Procedures

1.3.1 General Procedures

1.3.2 Updating the Structures

1.3.3 Updating the Tables

1.3.4 Validation of Updates

1.4 Migration Guidance

1.4.1 Training

1.4.2 Technical Issues

1.4.3 Encoding vs. interpretation

Layer 2: Layout, Functionality and Application of BUFR and CREX

Layer 3: Detailed Description of the Code Forms

(See separate Volume Layer 3 for programmers of encoder/decoder software)

Layer 1: Basic Aspects of BUFR and CREX

1.1 Overview

The table driven code forms BUFR (Binary Universal Form for the Representation of meteorological data) and CREX (Character form for the Representation and EXchange of data) offer the great advantages of flexibility and expandability compared with the traditional alphanumeric code forms. These beneficial attributes arise because BUFR and CREX are self-descriptive. The term "self-descriptive" means that the form and content of the data contained within a BUFR or CREX message are described within the BUFR or CREX message itself. In addition, BUFR offers condensation, or packing, while the alphanumeric code CREX provides human readability.

BUFR was first approved for operational use in 1988. Since that time, it has been used for satellite, aircraft, wind profiler, and tropical cyclone observations, as well as for archiving of all types of observational data. In 1994, CREX was approved as an experimental code form by the WMO Commission on Basic Systems (CBS-Ext. 94). In 1998, CBS (CBS-Ext. 98) recommended that CREX be approved as an operational data representation code form as of 3 May 2000. In 1999, this recommendation was endorsed by the WMO Executive Council (EC-LI (1999)). CREX is already used among centres for exchange of ozone, radiological, hydrological, tide gauge, tropical cyclone, and soil temperature data. BUFR should always be the first choice for the international exchange of observational data. CREX should be used only when BUFR cannot. BUFR and CREX are the only code forms the WMO needs for the representation and exchange of observational data and are recommended for all present and future WMO applications.

This guide to Table Driven Code Forms is designed in three layers to accommodate users who require different levels of understanding. Layer 1 is a general description designed for those who need to become familiar with the table driven code forms but do not need a detailed understanding. Layer 2 focuses on the functionality and application of BUFR and CREX, and is intended for those who must use software that encodes and/or decodes BUFR or CREX, but will not actually write the software. Layer 3 is intended for those who must actually write BUFR or CREX encoding and/or decoding software, although those wishing to study table driven codes in depth will find it equally useful.

1.2 General Description

1.2.1 Self-description

How do we know what the following character string means in an alphanumeric code?

32325 11027

First, we need to know the code form within which this character string falls. We assume it comes from a bulletin of synoptic observation reports, thus the code form is FM 12 SYNOP. Second, we need to know the position within the SYNOP code form of the two groups above (the second and third mandatory groups in Section 1). Third, we need to refer to the WMO Manual on Codes, Volume I.1 (International Codes), Part A (Alphanumeric Codes) for the description of these two groups in the SYNOP code form (unless we have committed the SYNOP code form to memory). Upon doing this, we find the two groups above have the following symbolic form:

Nddff 1snTTT,

where N = total cloud cover, dd = wind direction, ff = wind speed, 1 is a group indicator, and TTT = air temperature, whose sign is given by sn. However, only after looking further at the code book to find the full meanings and coding conventions of this symbolic form can we determine that the sky is 3/8 covered with clouds, the wind is blowing from 230 degrees at 25 knots, and the air temperature is -2.7 °C. Thus, the position within the report and the coding convention (in this example, the symbolic form Nddff 1snTTT) assigned to that position of the report define the data contained within traditional alphanumeric code forms. Furthermore, if a new group of information were to be inserted before the second and third mandatory groups in Section 1, the positions of these two groups would change. Such a modification would require a corresponding update to all software programs that encode or decode such reports; otherwise the software would either give incorrect values or fail completely. The reason is that the coding conventions used to describe the data are built into the processing software, not included with the data. It is this fact that renders the traditional alphanumeric code forms incapable of accommodating new types of data.
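The dependence on position can be made concrete with a minimal Python sketch. The function name and dictionary keys below are hypothetical, and the sketch assumes the conventions stated above (dd in tens of degrees, ff in knots, TTT in tenths of a degree, sn = 1 for a negative temperature). Every slice index is hard-wired into the program, which is why inserting a new group ahead of these two would break it.

```python
def decode_nddff_1snttt(groups: str) -> dict:
    """Decode the two SYNOP groups 'Nddff 1snTTT' purely by character position."""
    nddff, temp_group = groups.split()
    if len(nddff) != 5 or len(temp_group) != 5 or temp_group[0] != "1":
        raise ValueError("expected two groups of the form Nddff 1snTTT")
    # Assumption: sn = 1 marks a negative temperature, any other value positive.
    sign = -1 if temp_group[1] == "1" else 1
    return {
        "cloud_cover_oktas": int(nddff[0]),          # N
        "wind_direction_deg": int(nddff[1:3]) * 10,  # dd, tens of degrees
        "wind_speed_kt": int(nddff[3:5]),            # ff
        "air_temperature_c": sign * int(temp_group[2:5]) / 10,  # TTT, tenths
    }


print(decode_nddff_1snttt("32325 11027"))
```

Nothing in the string itself says which digits are wind and which are temperature; that knowledge lives entirely in the code.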

In a table driven code form, there are also position rules, but they apply only to the shape of the «container» (or code structure) rather than to the content of the «container». The presence and form of the data are described within the «container» itself. This is the concept of self-description. In order to accomplish it, there is a section (the Data Description Section) in BUFR and CREX messages in which the type and form of the data contained within the message are defined. Here is an example of a simple self-described message:

Data Description:

Position / Element Reference Number / Parameter Name / Unit / Data Width (characters)
1 / B 01 001 / Block number / Numeric / 2
2 / B 01 002 / Station number / Numeric / 3
3 / B 04 004 / Hour / Hour / 2
4 / B 12 001 / Temperature / Tenth °C / 3
5 / B 11 002 / Wind speed / m/sec / 3
6 / B 11 003 / Wind direction / Degree / 3

Data:

07 444 06 154 003 230

We can see here that the station is 07444, the hour is 06, the temperature is 15.4 °C, the wind speed is 3 m/sec and the wind direction is 230 degrees. The first section of the message contains the data description, which is in itself very long relative to the data values. To make this more efficient, standards (unit, data width, scale, etc.) for coding the values are defined for various physical parameters and kept in the WMO Code Tables. Thus, instead of writing all the detailed definitions within the message, one just writes a number (called the Element Reference Number in this example) identifying the parameter together with its description. In that case the message would be:

Data Description: 001001 001002 004004 012001 011002 011003

Data: 07444 06 154 003 230

In WMO table driven codes, the Data Description Section contains a sequence of data descriptors, which act as a set of "pointers" to elements in predefined and internationally agreed tables (stored in the official WMO Manual on Codes). By definition these descriptors are six-digit reference numbers (or six characters for CREX); they are defined in the code tables that are explained further in section 1.2.3 below. Once the Data Description Section is read, the following section containing the data itself (the Data Section) can be understood. Indeed, the characteristics of the parameters to be transmitted must already be defined in the tables of the WMO Manual before data containing those parameters can be exchanged in BUFR or CREX messages.
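The pointer idea can be sketched in a few lines of Python. The table below is copied from the simplified example in this section, not from the WMO Manual on Codes. The names EXAMPLE_TABLE and decode_message are hypothetical, and the scale column is an assumption introduced here so that "tenths of a degree" can be expressed as a number.

```python
# Entries taken from the simplified example above (NOT the official Table B).
# Each entry: (name, unit, scale as a power of ten [assumed], width in characters)
EXAMPLE_TABLE = {
    "001001": ("Block number", "Numeric", 0, 2),
    "001002": ("Station number", "Numeric", 0, 3),
    "004004": ("Hour", "Hour", 0, 2),
    "012001": ("Temperature", "°C", 1, 3),
    "011002": ("Wind speed", "m/sec", 0, 3),
    "011003": ("Wind direction", "Degree", 0, 3),
}


def decode_message(descriptors, data):
    """Walk the descriptor list and cut the data string into described values."""
    values, pos = [], 0
    for descriptor in descriptors:
        name, unit, scale, width = EXAMPLE_TABLE[descriptor]
        raw = data[pos:pos + width]
        if len(raw) != width:
            raise ValueError(f"data too short for descriptor {descriptor}")
        pos += width
        value = int(raw) / 10 ** scale if scale else int(raw)
        values.append((name, value, unit))
    return values


description = ["001001", "001002", "004004", "012001", "011002", "011003"]
data = "07 444 06 154 003 230".replace(" ", "")
for name, value, unit in decode_message(description, data):
    print(f"{name}: {value} ({unit})")
```

The decoder contains no knowledge of any particular parameter. Adding a new parameter means adding a table entry, while the function stays unchanged.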

1.2.2 Code Structures

The structures of the BUFR and CREX code forms are the following:

BUFR

SECTION 0 Indicator Section
SECTION 1 Identification Section
SECTION 2 (Optional Section)
SECTION 3 Data Description Section
SECTION 4 Data Section
SECTION 5 End Section

CREX

SECTION 0 Indicator Section
SECTION 1 Data Description Section
SECTION 2 Data Section
SECTION 3 (Optional Section)
SECTION 4 End Section

The Indicator Sections and the BUFR Identification Section are short sections, which identify the message. The list of descriptors, pointing towards elements in predefined and internationally agreed tables that are stored in the official WMO Manual on Codes (described previously), is contained in the Data Description Section. These descriptors describe the type of data contained in the Data Section and the order in which the data appear there. The Optional Section can be used to transmit any information or parameters for national purposes. The End Section contains the four alphanumeric characters "7777" to denote the end of the BUFR or CREX message.

Since the data in a CREX message are laid out one after the other, and since the data values of the parameters in a CREX message are transmitted as a set of characters, it is very simple to read a CREX message. While the order of the data contained in a BUFR message is likewise described by the BUFR Data Description Section, the data values of the parameters in a BUFR message are encoded as a set of bits. Consequently, a BUFR message is not human readable, or is at best extremely difficult to decipher without the help of a computer program. CREX can be looked upon as the image in characters of BUFR bit fields.
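The contrast between character fields and bit fields can be illustrated with a small Python sketch. The helper names pack_bits and unpack_bits are hypothetical, the 12-bit widths are chosen only for illustration, and the values are not converted to BUFR units. The point is simply that the same three numbers are readable as characters but opaque once packed into bits.

```python
def pack_bits(fields):
    """Pack (value, width_in_bits) pairs back to back; pad the tail with zero bits."""
    bits = ""
    for value, width in fields:
        if not 0 <= value < 2 ** width:
            raise ValueError(f"{value} does not fit in {width} bits")
        bits += format(value, f"0{width}b")
    bits += "0" * (-len(bits) % 8)  # pad to a whole number of octets
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def unpack_bits(packed, widths):
    """Recover the values, given the same widths that were used for packing."""
    bits = "".join(format(octet, "08b") for octet in packed)
    values, pos = [], 0
    for width in widths:
        values.append(int(bits[pos:pos + width], 2))
        pos += width
    return values


values = [154, 3, 230]
as_characters = " ".join(f"{v:03d}" for v in values)  # character form
as_bits = pack_bits([(v, 12) for v in values])         # bit form, 12 bits each (illustrative)

print(as_characters, len(as_characters), "characters")
print(as_bits.hex(), len(as_bits), "octets")
print(unpack_bits(as_bits, [12, 12, 12]))
```

A reader can pick the values out of the character form by eye, whereas the octets only make sense once the widths from the data description are applied.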

When there is a requirement for transmission of new parameters or new data types, new elements are simply added to the WMO BUFR and CREX tables, after approval by the CBS. Since table driven code forms can thus describe any new parameter by the simple addition of a new entry to the appropriate code table, table driven code forms possess the flexibility to transmit an infinite variety of information. Therefore, definition of new «code forms» is no longer necessary. Furthermore, procedures and regulations are fixed. A new edition number is assigned every time the BUFR or CREX code structure is changed. Although these edition changes require an update to BUFR or CREX encoding or decoding software, such changes are infrequent (the BUFR Edition Number has changed only twice since 1988 – see Section 1.3). Likewise, a new version number is assigned every time additions are made to BUFR or CREX code tables. Although version number changes are more frequent than edition number changes, they do not require modifications to the processing software. The edition number of the format (structure of the message) and version number of the tables are transmitted in the message itself (in the Indicator and Identification sections for BUFR, in the Data Description section for CREX) and enable the treatment of old archived data.

1.2.3 BUFR and CREX Tables

Tables define how the parameters (or elements) shall be coded as data items in a BUFR or CREX message (i.e. units, size, scale). They are recorded in the WMO Manual on Codes, Volume I.2 (International Codes), Parts B (Binary Codes) and C (Common Features to Binary and Alphanumeric Codes). The Manual on Codes also comprises Volume I.1 (International Codes), Part A (Alphanumeric Codes) and Volume II: Regional Codes and National Coding Practices. These three volumes are collectively referred to as WMO Publication No. 306. The Tables defining BUFR and CREX coding are Tables A, B, C, and D.

Table A subdivides data into a number of discrete categories (e.g. Surface data – land, Surface data - sea, Vertical soundings (other than satellite), Vertical soundings (satellite), etc.). While not technically essential for BUFR or CREX encoding/decoding systems, the data categories in Table A are useful for telecommunications purposes and for storage of data in and retrieval of data from a data base.

Table B describes how individual parameters, or elements, are to be encoded and decoded in BUFR and CREX. For each element, the table lists the reference number (or element descriptor number, which is used in the description section of the code like a "pointer", as explained earlier), the element name, and the information needed to encode or decode the element. For BUFR, this information consists of the units to be used, the scale and reference values to apply to the element, and the number of bits used to describe the value of the element (the BUFR data width). For CREX, this information consists of the units to be used, the scale value to apply to the value of the element, and the number of characters used to describe the value of the element (the CREX data width). Although the same elements are found in both BUFR and CREX Tables B, their units may differ (BUFR units are SI, while CREX units are more user oriented). For example, the unit used for temperature is Kelvin in BUFR but Celsius in CREX. The data items transmitted in a report will have their descriptor numbers listed in the Data Description Section. As an example, extracts of BUFR and CREX Table B for Temperature are given below.

Table B is fundamental to encoding and decoding in both BUFR and CREX.


Class 12 - Temperature

F / X / Y / ELEMENT NAME / BUFR UNIT / BUFR SCALE / BUFR REFERENCE VALUE / BUFR DATA WIDTH (Bits) / CREX UNIT / CREX SCALE / CREX DATA WIDTH (Characters)
0 / 12 / 001 / Temperature/dry-bulb temperature / K / 1 / 0 / 12 / °C / 1 / 3
0 / 12 / 002 / Wet-bulb temperature / K / 1 / 0 / 12 / °C / 1 / 3
0 / 12 / 003 / Dew-point temperature / K / 1 / 0 / 12 / °C / 1 / 3
0 / 12 / 004 / Dry-bulb temperature at 2 m / K / 1 / 0 / 12 / °C / 1 / 3
0 / 12 / 005 / Wet-bulb temperature at 2 m / K / 1 / 0 / 12 / °C / 1 / 3
0 / 12 / 006 / Dew-point temperature at 2 m / K / 1 / 0 / 12 / °C / 1 / 3
0 / 12 / 007 / Virtual temperature / K / 1 / 0 / 12 / °C / 1 / 3
0 / 12 / 011 / Maximum temperature, at height and over period specified / K / 1 / 0 / 12 / °C / 1 / 3
0 / 12 / 012 / Minimum temperature, at height and over period specified / K / 1 / 0 / 12 / °C / 1 / 3

Note: To encode values in BUFR, the data (in the units specified in the UNIT column) must be multiplied by 10 to the power of SCALE, and then the REFERENCE VALUE must be subtracted from the result. In the example above, data will thus be encoded in tenths of a degree Kelvin in BUFR.

To encode values in CREX, the data (in the units specified in the UNIT column) must be multiplied by 10 to the power of SCALE. In the example above, data will thus be encoded in tenths of a degree Celsius in CREX.
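The two encoding rules in the note can be written out as a short Python sketch. The function names are hypothetical, rounding to the nearest integer is an assumption, and the handling of negative CREX values is deliberately left out because it is not described here.

```python
def encode_bufr(value, scale, reference, width_bits):
    """value * 10**SCALE - REFERENCE VALUE, checked against the bit width."""
    raw = round(value * 10 ** scale) - reference
    if not 0 <= raw < 2 ** width_bits:
        raise ValueError(f"{raw} does not fit in {width_bits} bits")
    return raw


def decode_bufr(raw, scale, reference):
    """Inverse of encode_bufr: add the reference back, then undo the scaling."""
    return (raw + reference) / 10 ** scale


def encode_crex(value, scale, width_chars):
    """value * 10**SCALE, written as a zero-padded character field."""
    raw = round(value * 10 ** scale)
    if raw < 0:
        raise ValueError("negative values are outside the scope of this sketch")
    text = str(raw).zfill(width_chars)
    if len(text) > width_chars:
        raise ValueError(f"{raw} does not fit in {width_chars} characters")
    return text


# Element 0 12 001 from the extract above: BUFR (K, scale 1, reference 0, 12 bits),
# CREX (°C, scale 1, 3 characters).
print(encode_bufr(288.5, scale=1, reference=0, width_bits=12))
print(decode_bufr(2885, scale=1, reference=0))
print(encode_crex(15.4, scale=1, width_chars=3))
```

Because scale, reference value and width all come from the table entry, the same three functions serve every element.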


TABLE C defines a number of operations that can be applied to the elements. Each such operation is assigned an operator descriptor. For example, BUFR Table C contains operator descriptors to change the scale value, the reference value, or data width listed for a parameter in BUFR Table B. Some of the operations defined in BUFR Table C are quite complex. Operator descriptors are described in Layer 2 and at length in Layer 3. Operator descriptors are also available in CREX, although their number and usage is rather limited.

Operator descriptors, although not essential for BUFR and CREX encoding and decoding, are useful in minimizing the number of new table entries and including quality assessment information.

TABLE D defines groups of elements that are always transmitted together (like a regular SYNOP or TEMP report) in what is called a common sequence. By using a common sequence descriptor, the individual element descriptors will not need to be listed each time in the data description section. This will reduce the amount of space required for a BUFR or CREX message. Common sequences are defined in BUFR and CREX Tables D. An example of BUFR Table D is shown below.

Sequence descriptors, although not essential for BUFR and CREX encoding and decoding, are useful in decreasing the space requirements for BUFR and CREX messages.

Meteorological sequences common to surface data

TABLE REFERENCE (F / X / Y) / TABLE REFERENCES (F / X / Y) / ELEMENT NAME
3 / 02 / 001 / 0 / 10 / 004 / Pressure (at station level)
 /  /  / 0 / 10 / 051 / Pressure reduced to mean sea level
 /  /  / 0 / 10 / 061 / 3-hour pressure change
 /  /  / 0 / 10 / 063 / Characteristic of pressure tendency
(High altitude station)
3 / 02 / 002 / 0 / 10 / 004 / Pressure (at station level)
 /  /  / 0 / 07 / 004 / Pressure level
 /  /  / 0 / 10 / 003 / Geopotential of pressure level
 /  /  / 0 / 10 / 061 / 3-hour pressure change
 /  /  / 0 / 10 / 063 / Characteristic of pressure tendency
3 / 02 / 003 / 0 / 11 / 011 / Wind direction (10 m)
 /  /  / 0 / 11 / 012 / Wind speed (10 m)
 /  /  / 0 / 12 / 004 / Temperature (2 m)
 /  /  / 0 / 12 / 006 / Dew point (2 m)
 /  /  / 0 / 13 / 003 / Relative humidity
 /  /  / 0 / 20 / 001 / Horizontal visibility
 /  /  / 0 / 20 / 003 / Present weather
 /  /  / 0 / 20 / 004 / Past weather (1)
 /  /  / 0 / 20 / 005 / Past weather (2)
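How a decoder might replace a sequence descriptor by the element descriptors it stands for is sketched below in Python. The dictionary holds only two of the sequences from the extract above, written as six-digit FXXYYY strings. The names TABLE_D and expand are hypothetical, and the leading "3" is used to recognise a sequence descriptor, as in the F column of the table.

```python
# Two sequences copied from the Table D extract above, keyed as FXXYYY strings.
TABLE_D = {
    "302001": ["010004", "010051", "010061", "010063"],
    "302003": [
        "011011", "011012", "012004", "012006", "013003",
        "020001", "020003", "020004", "020005",
    ],
}


def expand(descriptors):
    """Replace every sequence descriptor (F = 3) by its element descriptors."""
    expanded = []
    for descriptor in descriptors:
        if descriptor.startswith("3"):
            # Recurse so that a sequence containing another sequence also expands.
            expanded.extend(expand(TABLE_D[descriptor]))
        else:
            expanded.append(descriptor)
    return expanded


# Two descriptors in the message stand for thirteen element descriptors.
print(expand(["302001", "302003"]))
```

The saving is visible in the call at the end: the message carries two descriptors, and the decoder recovers the full list from the table.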

1.2.4 Features common to BUFR and CREX

Structure: CREX was intentionally designed to be an alphanumeric version of BUFR. It is therefore not surprising that the CREX and BUFR code forms have many structural similarities. Both achieve self-description by including a section within each message describing the form and content of the data included within that message. Both BUFR and CREX messages begin with an alphanumeric representation of the name of the code form, both have optional sections, and both have identical End Sections.

Tables: Table A is identical for BUFR and CREX. Furthermore, BUFR and CREX define the same set of elements using nearly identical descriptors: the first value in the descriptor, denoting the descriptor type, is binary in BUFR and alphanumeric in CREX, but the remainder of the descriptor is identical for identical elements. This made it possible to design a single Table B to serve both code forms. Finally, although BUFR and CREX Tables D are different, they are closely co-ordinated. Common sequences that can be transformed easily between BUFR and CREX are defined in only one of the two Tables D rather than in both. If a CREX Table D sequence is not defined in BUFR Table D, it has a number that is not used by any BUFR sequence. Similarly, BUFR Table D sequences without CREX counterparts have numbers that are not used by any CREX Table D sequence. In Tables A, B and D there are ranges of numbers for descriptors outside the internationally agreed range of numbers. These can be used to define special descriptors for national or local purposes and thus enable the domestic exchange of special national data.