© ISO/IEC
ISO/IEC 14496-1:2001/Amd.3

ORGANISATION INTERNATIONALE DE NORMALISATION

ISO/IEC JTC 1/SC 29/WG 11

CODING OF MOVING PICTURES AND AUDIO

ISO/IEC JTC 1/SC 29/WG11 N3725

October 2000

Source: MPEG-4 Systems
Title: ISO/IEC 14496-1:2001/PDAM 3
Editor: Michelle Kim (IBM)
Co-editors: Mikael Bourges-Sevenier (iVast), Rich Rafey (Sony), Steve Wood (IBM)
Status: Approved

Contents

Introduction

Organization of this document

1 Overview of the XMT Framework

1.1 Interoperability of XMT

1.2 Two-tier Architecture: XMT-A and XMT-Ω Formats

2 XMT-A Format

2.1 Introduction

2.2 XMT-A Document structure

2.3 XMT-A Representation of Nodes

2.3.1 Overview

2.3.2 XMT-A node elements

2.3.3 Schema and XML examples

2.4 XMT-A Routing

2.4.1 <ROUTE>

2.5 XMT-A Timing

2.5.1 <par>

2.6 XMT-A Representation of BIFS Commands

2.6.1 Overview

2.6.2 <Insert>

2.6.3 <Delete>

2.6.4 <Replace>

2.7 XMT-A Representation of Object Descriptors

2.7.1 Overview

2.7.2 <ObjectDescriptor>

2.7.3 <InitialObjectDescriptor>

2.7.4 <ES_Descriptor>

2.7.5 <DecoderConfigDescriptor>

2.7.6 Decoder Specific Info

2.7.7 <SLConfigDescriptor>

2.7.8 <ContentIdentificationDescriptor>

2.7.9 <SupplementaryContentIdentificationDescriptor>

2.7.10 <IPI_DescrPointer>

2.7.11 <IPMP_DescriptorPointer>

2.7.12 <IPMP_Descriptor>

2.7.13 <QoS_Descriptor>

2.7.14 <ExtensionDescriptor>

2.7.15 <RegistrationDescriptor>

2.7.16 Object Content Information Descriptors

2.7.17 <ExtensionProfileLevelDescriptor>

2.7.18 <SegmentDescriptor>

2.7.19 Object Descriptor Commands

2.8 XMT-A IPMP Streams

2.9 XMT-A OCI Streams

2.10 XMT-A MPEG-J Streams

2.11 XMT-A Elementary Stream Data

2.11.1 <StreamSource>

2.11.2 Object Descriptor Streams

2.11.3 Clock Reference Streams

2.11.4 Scene Description Streams

2.11.5 IPMP Streams

2.11.6 Visual Streams

2.11.7 Audio Streams

2.11.8 MPEG-7 Streams

2.11.9 IPMP Streams

2.11.10 OCI Streams

2.11.11 MPEG-J Streams

2.12 XMT-A Deterministic mapping

2.13 XMT-A Animation

2.13.1 Overview

2.13.2 Animation Nodes

2.14 XMT-A <authoring> Element

2.14.1 Specifying a <metaSet> of elements

2.14.2 Specifying an additional <metaProperty>

3 XMT-Ω Format

3.1 Re-Using SMIL for XMT-Ω

3.1.1 Re-using SMIL Modules

3.2 XMT-Ω Data Types

3.2.1 Basic data types

3.3 Extensible Media (xMedia) Objects

3.3.1 Generic xMedia object

3.3.2 <rectangle>

3.3.3 <circle>

3.3.4 <text>

3.3.5 <string>

3.3.6 <subtitles>

3.3.7 <img>

3.3.8 <video>

3.3.9 <points>

3.3.10 <lines>

3.3.11 <polygons>

3.3.12 <curve>

3.3.13 <audio>

3.3.14 <audioclip>

3.3.15 <box>

3.3.16 <cone>

3.3.17 <cylinder>

3.3.18 <elevationgrid>

3.3.19 <sphere>

3.3.20 <mesh>

3.3.21 <inline>

3.3.22 <applicationwindow>

3.4 Timing

3.4.1 <par>

3.4.2 <seq>

3.4.3 <excl>

3.4.4 XMT-Ω elements and timing attributes

3.5 Animation

3.5.1 <animate>

3.5.2 <set>

3.5.3 <animateMotion>

3.5.4 <animateColor>

3.5.5 Mapping animation to MPEG-4

3.6 Interactive behaviours

3.6.1 User interaction

3.6.2 XMT events

3.7 Structure

3.8 Scene level objects

3.9 Escape from XMT-Ω to XMT-A: the <scene> element

3.9.1 Functionality and semantics

3.9.2 Namespace and DTD

3.9.3 XMT-A features

3.10 Grouping

3.10.1 <group>

3.11 Content Control

3.11.1 <switch>

3.12 Linking

3.12.1 <a>

3.13 MPEG-J

3.14 MPEG-7

3.15 MetaInformation

3.16 Layout

3.17 Transitions

3.18 Encoding hints and Delivery hints

3.18.1 <EncodingHints>

3.18.2 <Delivery hints>

3.19 Miscellaneous

3.20 Examples

3.20.1 Circle with finite duration color animation on mouse press

3.20.2 An example to show BIFS, OD and media stream mapping

3.21 Node coverage

4 References

Annex A

Introduction

This document is a working draft of the ISO/IEC 14496-1 AMD 3. This document specifies:

  1. Extensible MPEG-4 Textual Format (XMT).

Organization of this document

This document is organized as follows:

Clause 1 provides an overview of the Extensible MPEG-4 Textual Format (XMT) Framework;
Clause 2 specifies the XMT-A construct;
Clause 3 specifies the XMT-Ω construct.

Annex A contains the XMT-A schema.

Information technology – Coding of audio-visual objects –

Part 1: Systems

AMENDMENT 3: Textual format (XMT)

1 Overview of the XMT Framework

The Extensible MPEG-4 Textual format (XMT) is a framework for representing MPEG-4 scene description using a textual syntax. XMT allows content authors to exchange their content with other authors, tools or service providers, and facilitates interoperability with both the Extensible 3D (X3D) format being developed by the Web3D Consortium and the Synchronized Multimedia Integration Language (SMIL) from the W3C.

1.1 Interoperability of XMT

The XMT format can be interchanged between SMIL players, VRML players, and MPEG-4 players. The format can be parsed and played directly by a W3C SMIL player, preprocessed to Web3D X3D and played back by a VRML player, or compiled to an MPEG-4 representation such as mp4, which can then be played by an MPEG-4 player. See below for a graphical description of interoperability of the XMT.

1.2 Two-tier Architecture: XMT-A and XMT-Ω Formats

The XMT framework consists of two levels of textual syntax and semantics: the XMT-A format and the XMT-Ω format. These are abbreviated as A and Ω, respectively, and the short and long forms are used interchangeably where there is no risk of confusion.

The XMT-A is an XML-based version of MPEG-4 content, which contains a subset of the X3D. Also contained in XMT-A is an MPEG-4 extension to the X3D to represent MPEG-4 specific features. The XMT-A provides a straightforward, one-to-one mapping between the textual and binary formats.

The XMT-Ω is a high-level abstraction of MPEG-4 features, designed on the basis of the W3C SMIL. Because there is no deterministic mapping between Ω and A, the XMT provides a default mapping from Ω to A. It also provides content authors with an escape mechanism from Ω to A.

2 XMT-A Format

2.1 Introduction

This section contains the XMT-A format definition. Its goals are to represent ISO/IEC 14496-1 binary constructs in a textual format, to provide an optional one-to-one deterministic mapping to ISO/IEC 14496-1 binary coding, and to be interoperable with the X3D. To facilitate such interoperability, XMT-A is designed to be compatible with the XML representation in X3D; MPEG-4 specific features are additions to this representation.

2.2 XMT-A Document structure

An XMT-A document has a single optional <Header> element followed by a single <Body> element. The <Header> element contains zero or more <meta> elements, as per X3D, and also contains the MPEG-4 specific element for the <InitialObjectDescriptor>.

Whereas an X3D document would now go directly into a <Scene> element, MPEG-4 can carry many media streams and can dynamically update the BIFS. For MPEG-4, therefore, the <Scene> element sits inside a <Replace> command within the single <Body> element, which holds the MPEG-4 representation of all the BIFS, commands, OD framework, etc.

An XMT-A document thus contains a <Header> and a <Body>, and inside the <Body> is the <Replace> <Scene> BIFS command. The table below compares the X3D and MPEG-4 representations to illustrate the high degree of compatibility and the small amount of change needed to go from X3D to MPEG-4 or vice versa (within the subset of elements that is contained in both standards, of course).

X3D:

<Header>
  <meta>
  </meta>
</Header>
<Scene>
  <!-- The scene contents -->
</Scene>

XMT-A:

<Header>
  <meta>
  </meta>
  <InitialObjectDescriptor/>
</Header>
<Body>
  <Replace>
    <Scene>
      <!-- The scene contents -->
    </Scene>
  </Replace>
</Body>

To fully convert the document from X3D to XMT-A, or vice versa, the outer <X3D> or <XMT-A> element with its schema namespace reference will need to be altered accordingly.

Note: An X3D <Scene> does not need to have a <Group> at the top level, whilst MPEG-4 requires a top-level node such as <Group>, <OrderedGroup>, <Layer2D> or <Layer3D> as the root of the scene graph. If the X3D scene does not have a single <Group> as the root, it will also be necessary to add one when converting to XMT-A. In addition, X3D image, video and audio sources are referred to directly by URLs. Whilst MPEG-4 can express the URLs in an identical manner, it is more likely that a conversion would create ObjectDescriptors for these media types and replace the source URL references by ObjectDescriptor IDs.
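The structural part of the conversion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: namespaces and schema references are ignored, the input is assumed to contain a <Scene> with no <ROUTE> elements, media URLs are left untouched rather than replaced by ObjectDescriptor IDs, and the function name x3d_to_xmta is hypothetical.

```python
import xml.etree.ElementTree as ET

# Top-level nodes that the text names as acceptable scene graph roots.
ROOT_NODES = {"Group", "OrderedGroup", "Layer2D", "Layer3D"}

def x3d_to_xmta(x3d_root):
    """Wrap an X3D-style document in the XMT-A Header/Body/Replace layout.

    Assumptions (not from the specification): no namespaces, no ROUTEs,
    and a <Scene> element is present in the input.
    """
    xmta = ET.Element("XMT-A")

    # Carry over any <meta> elements, then add the MPEG-4 specific element.
    header = ET.SubElement(xmta, "Header")
    old_header = x3d_root.find("Header")
    if old_header is not None:
        header.extend(list(old_header))
    ET.SubElement(header, "InitialObjectDescriptor")

    # The scene moves inside <Body><Replace>.
    replace = ET.SubElement(ET.SubElement(xmta, "Body"), "Replace")
    scene = ET.SubElement(replace, "Scene")

    nodes = list(x3d_root.find("Scene"))
    if len(nodes) == 1 and nodes[0].tag in ROOT_NODES:
        scene.append(nodes[0])
    else:
        # Add the single root <Group> that MPEG-4 requires.
        children = ET.SubElement(ET.SubElement(scene, "Group"), "children")
        children.extend(nodes)
    return xmta
```

Calling x3d_to_xmta(ET.fromstring(text)) on an X3D document with two top-level <Shape> elements yields a tree in which both shapes sit under Body/Replace/Scene/Group/children.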

2.3 XMT-A Representation of Nodes

2.3.1 Overview

This section provides a description of the XMT-A textual representation of MPEG-4 nodes. This representation follows the same rules as X3D and hence is compatible with X3D. MPEG-4 adds to this representation some extra attributes and elements for deterministic binary encoding and to augment authoring. These extra attributes and elements are however optional.

2.3.2 XMT-A node elements

2.3.2.1 MPEG-4 node/field to XML element/attribute mapping algorithm

The following algorithm is used to convert MPEG-4 nodes and fields to XMT-A elements and attributes.

  1. Each node is converted to an XMT-A element, with its name preserved.
  2. For each field of a node:
     a. If the field type is a node, i.e., the field can contain one or more children nodes, then the field is converted to an XMT-A element, with the element name identical to the field name. This element will appear as a child element.
     b. If the field type is non-node and is a plain field or an exposedField, then the field is made into an attribute of the element, preserving its name. (Fields with eventIn and eventOut types are omitted, as they are not encoded and thus cannot usefully be attributes of the element.)

An exception to the above rule for node/non-node field conversion is the <Conditional> node, where the buffer field, although a non-node field, is converted to an element so that it can contain one or more BIFS command elements in this XML representation.

A field without a default value is optional. Fields with MPEG-4 default binary values are given default XML attributes with the same values.
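The mapping steps above can be sketched as a short Python function. The in-memory node description used here (a dict with a name and a list of (field name, kind, value) tuples) is a hypothetical structure chosen for illustration; node-valued fields are assumed to hold a list of child node dicts, and all other values are assumed to be strings already. The <Conditional> buffer exception and default values are not handled.

```python
import xml.etree.ElementTree as ET

def node_to_element(node):
    """Convert a (hypothetical) node dict to an XML element.

    node = {"name": str, "fields": [(field_name, kind, value), ...]}
    kind is "field", "exposedField", "eventIn" or "eventOut".
    """
    # Step 1: the node becomes an element with the same name.
    elem = ET.Element(node["name"])

    # Step 2: walk the fields of the node.
    for field_name, kind, value in node["fields"]:
        if kind in ("eventIn", "eventOut"):
            # Events are not encoded, so they are omitted.
            continue
        if isinstance(value, list):
            # Node-valued field: becomes a child element named after the field.
            child = ET.SubElement(elem, field_name)
            for sub_node in value:
                child.append(node_to_element(sub_node))
        else:
            # Non-node field: becomes an attribute named after the field.
            elem.set(field_name, str(value))
    return elem
```

For example, a node dict named "OrderedGroup" with an order field of "1.2 6.5" and a children field holding one "Material" dict produces an <OrderedGroup> element with an order attribute and a <children> child element that contains <Material>.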

2.3.2.2 Common attributes and elements

Optional DEF and USE attributes, as per X3D, are present on all XMT-A node elements. MPEG-4 adds the following optional common attributes and elements: binaryID, for deterministic binary encoding; useName, to code the ID as a name; and an authoring augmentation to form collections (sets) and assign extra properties within an authoring framework for use at authoring time.

For more detail on the <authoring> element and its usage, see section 2.14 on the XMT-A <authoring> Element.

2.3.2.3 Element and attribute type classifications

Node elements and field attribute types will be classified according to MPEG-4 system node types as per the coding tables of ISO/IEC 14496-1 and the amendments.

2.3.3 Schema and XML examples

Given the algorithm described above, MPEG-4 nodes can easily be converted into XML. This section provides some examples to illustrate the representation. The full set of nodes from ISO/IEC 14496-1 and the amendments can be converted this way. The Schema for XMT-A, containing the full set of nodes, can be found in Annex A (XMT-A Schema).

The following example shows the MPEG-4 node Material converted to the XMT-A element <Material> (the XMT-A authoring augmentation constructs, i.e. the authoring element and metaSetGroup, are shown too). The Material node has no fields that are nodes, so all its fields have become attributes; DEF/USE is included as a predefined attribute group.

<element name="Material">
  <complexType>
    <all>
      <element ref="xmta:authoring" minOccurs="0"/>
    </all>
    <attribute name="ambientIntensity" type="xmta:SFFloat" use="default" value="0.2"/>
    <attribute name="diffuseColor" type="xmta:SFColor" use="default" value="0.8 0.8 0.8"/>
    <attribute name="emissiveColor" type="xmta:SFColor" use="default" value="0 0 0"/>
    <attribute name="shininess" type="xmta:SFFloat" use="default" value="0.2"/>
    <attribute name="specularColor" type="xmta:SFColor" use="default" value="0 0 0"/>
    <attribute name="transparency" type="xmta:SFFloat" use="default" value="0"/>
    <attributeGroup ref="xmta:metaSetGroup"/>
    <attributeGroup ref="xmta:DefUseGroup"/>
  </complexType>
</element>

Some examples of its use:

<Material ambientIntensity="0.6" emissiveColor="1.0 0.1 0.78"/>
<Material DEF="ABlue" emissiveColor="0.0 0.1 0.88"/>
<Material USE="ABlue"/>

The following example shows the MPEG-4 node OrderedGroup converted to the XMT-A element <OrderedGroup>. The OrderedGroup node has a children field that holds multiple nodes, so that field is converted to an element, whilst its other field (order) has become an attribute.

<element name="OrderedGroup">
  <complexType>
    <all>
      <element name="children" minOccurs="0" form="qualified">
        <complexType>
          <choice minOccurs="0" maxOccurs="unbounded">
            <group ref="xmta:SF3DNodesType"/>
          </choice>
          <attributeGroup ref="xmta:metaSetGroup"/>
        </complexType>
      </element>
      <element ref="xmta:authoring" minOccurs="0"/>
    </all>
    <attribute name="order" type="xmta:MFFloat" use="optional"/>
    <attributeGroup ref="xmta:metaSetGroup"/>
    <attributeGroup ref="xmta:DefUseGroup"/>
  </complexType>
</element>

An example of its use (the Shapes are left incomplete for clarity):

<OrderedGroup order="1.2 6.5">
  <children>
    <Shape>…</Shape>
    <Shape>…</Shape>
  </children>
</OrderedGroup>

2.4 XMT-A Routing

2.4.1 <ROUTE>

2.4.1.1 Description

The <ROUTE> element is the XMT-A representation of the ROUTE as described in ISO/IEC 14496-1:1999. The optional ID attribute names the ROUTE and allows the ROUTE to be deleted or replaced at a later time by referring to it via the atID attribute.

<element name="ROUTE">
  <complexType>
    <attribute name="binaryID" type="ID" use="optional"/>
    <attribute name="atID" type="IDREF" use="optional"/>
    <attribute name="ID" type="ID" use="optional"/>
    <attribute name="fromNode" type="IDREF" use="required"/>
    <attribute name="fromField" type="NMTOKEN" use="required"/>
    <attribute name="toNode" type="IDREF" use="required"/>
    <attribute name="toField" type="NMTOKEN" use="required"/>
    <attributeGroup ref="xmta:DefUseGroup"/>
  </complexType>
</element>

Like X3D, <ROUTE>s can be placed inside the <Scene> element before the closing </Scene>. (In MPEG-4, <Scene> is nested inside the <Replace> command to represent the binary MPEG-4 ReplaceScene command.) Like X3D (and unlike VRML), <ROUTE>s cannot be included inside other elements of the scene.

MPEG-4 also adds the ID and atID attributes to support managing <ROUTE>s with the <Insert>, <Delete> and <Replace> commands: <ROUTE>s with IDs can be created and referenced later to be deleted or replaced.

2.5 XMT-A Timing

The XMT-A uses one of the SMIL time containers, the <par> element, to group multiple commands.

2.5.1 <par>

The XMT-A allows only the "begin" attribute on the <par> element; it specifies the execution (begin) time of commands. <par> elements can also contain other <par> elements, and the begin time of a nested <par> element is relative to its parent time container. There is an implied top-level <par begin="0.0">. The <par> elements need not appear ordered in time; indeed, nesting of <par> elements will often preclude this. Begin times shall be >= 0.0 seconds. The begin attribute has the SFTime type to maintain uniformity with other time fields within the scene, in MPEG-4 node elements such as <TimeSensor> and <MovieTexture>.

The <par> element may contain:

  • <par>
  • BIFS Commands
  • Object Descriptor Commands
  • IPMP Messages
  • OCI Events
  • MPEG-J Stream Headers

All BIFS commands, for a given BIFS stream, to be executed at a given time will be coded into a single CommandFrame (and hence a single AU) in the order the commands appear in the document.

All OD commands, for a given OD stream, to be executed at a given time will be coded into a single AU in the order the commands appear in the document.

All IPMP messages, for a given IPMP elementary stream, to be executed at a given time will be coded into a single AU in the order the messages appear in the document.

All OCI events, for a given OCI elementary stream, to be executed at a given time will be coded into a single AU in the order the events appear in the document.

<par begin="">
<!-- Any number of commands/messages/events and/or <par> elements -->
</par>
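The timing rules above (begin times relative to the parent <par>, and commands with the same execution time grouped in document order) can be illustrated with a small Python sketch. It is a simplification under stated assumptions: all commands are treated as belonging to one stream, commands are identified only by their element name, a missing begin attribute is treated as 0.0, and the function name schedule is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def schedule(par, parent_begin=0.0, out=None):
    """Group commands by absolute begin time, preserving document order.

    Returns a mapping {absolute_time: [command element names]}.
    Assumption: every command belongs to the same stream.
    """
    if out is None:
        out = defaultdict(list)
    offset = float(par.get("begin", "0.0"))
    if offset < 0.0:
        raise ValueError("begin times shall be >= 0.0 seconds")
    begin = parent_begin + offset          # nested begin is relative to parent
    for child in par:
        if child.tag == "par":
            schedule(child, begin, out)    # recurse into nested containers
        else:
            out[begin].append(child.tag)   # same time: one access unit
    return out
```

Iterating over sorted(schedule(root)) then visits the access units in time order, even when the <par> elements are not ordered in time in the document.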

2.6 XMT-A Representation of BIFS Commands

2.6.1 Overview

This section provides a detailed description of the XMT-A encoding of the MPEG-4 BIFS commands. Commands in BIFS are timed using the <par> element construct.

There are three basic BIFS commands in XMT-A: <Insert>, <Delete> and <Replace>. The MPEG-4 ReplaceScene binary command is captured in XMT-A with <Replace> <Scene>… </Scene> </Replace>. The <Insert>, <Delete> and <Replace> commands can be used on nodes, values in multiple value fields, or routes. In addition, <Replace> can act on a whole multiple value field. <Replace> <Scene> replaces the entire scene, both nodes and routes.

2.6.2 <Insert>

The Insert command provides for node, indexed value and Route insertion. For Insert, atField defaults to the value ‘children’ and position defaults to the value ‘END’, making it easy to add a Node to a group.

<Insert atES_ID=""
atNode="" atField="children" position="BEGIN | END | n" value="">
<!-- Nodes (including sub-trees) may go here and/or Routes-->
</Insert>

2.6.2.1 Insert Node

This is the Insert Node version of the Insert command. When atField="children" (defaulted if the attribute is not present), the command will be encoded as a BIFS Update for Insert Node by nodeID. When atField is any other MFNode field, it will be encoded as a BIFS Update for Insert by IndexedValue.

<Insert atNode="" atField="children" position="BEGIN | END | n">
<!-- Node (tree) goes here -->
</Insert>

The following are examples of Insert used to insert a node

  • Inserts a new sub-tree at the END of a group

<Insert atNode="MyGroup">
<Group>
<children>...</children>
</Group>
</Insert>

  • the following is equivalent to above example (atField="children" is default)

<Insert atNode="MyGroup" atField="children">
<Group>
<children>...</children>
</Group>
</Insert>

  • Inserts a new shape at the END of a group (geometry and appearance content omitted for clarity)

<Insert atNode="MyGroup">
<Shape>
<geometry>...</geometry>
<appearance><Appearance DEF="MyShapeStyle">...</Appearance></appearance>
</Shape>
</Insert>

  • Inserts 2 new shapes, one at BEGIN and other at the END of a group

<Insert atNode="MyGroup" position="BEGIN, END">
<Shape DEF="MyRect">
<geometry>...</geometry>
<appearance>...</appearance>
</Shape>
<Shape USE="MyRect"/>
</Insert>
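The attribute defaults and the choice of encoding described for Insert can be sketched as follows. This is an illustrative reading of the text, not a normative procedure: the function name describe_insert and the returned labels ("InsertNode", "InsertIndexedValue", "InsertRoute") are hypothetical, and when a command carries both nodes and routes the label describes only the node part.

```python
import xml.etree.ElementTree as ET

def describe_insert(insert):
    """Return (atField, positions, kind) for an <Insert> element.

    Defaults follow the text: atField="children", position="END".
    The kind labels are hypothetical names used only in this sketch.
    """
    at_field = insert.get("atField", "children")
    positions = [p.strip() for p in insert.get("position", "END").split(",")]

    # Any child that is not a ROUTE is treated as a node (sub-tree).
    has_nodes = any(child.tag != "ROUTE" for child in insert)

    if has_nodes and at_field == "children":
        kind = "InsertNode"            # Insert Node by nodeID
    elif has_nodes or insert.get("value") is not None:
        kind = "InsertIndexedValue"    # other MFNode field, or simple values
    else:
        kind = "InsertRoute"           # only ROUTE children
    return at_field, positions, kind
```

For <Insert atNode="MyGroup"> with a single <Group> child, this returns the defaulted field and position together with the node-insertion label.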

2.6.2.2 Insert Indexed Value

This is the InsertIndexedValue version of Insert for simple non-Node fields:

<Insert atNode="" atField="" position="BEGIN | END | n" value=""/>

The following are examples of Insert used to insert simple field values (non-Node values)

  • Inserts a new colorIndex value 6 at the END of the colorIndex field

<Insert atNode="MyIndexedLineSet2D" atField="colorIndex" value="6" />

  • Inserts new colorIndex values 3,3,5,4,2 at positions 2,7,BEGIN,END and 4. Note that position 7 is the new position after the first value has been inserted at position 2, and so on; inserts are done in the order listed. (The following command will be encoded as 5 BIFS Update commands for IndexedValue field insertion.)

<Insert atNode="MyIndexedLineSet2D" atField="colorIndex"
position="2, 7, BEGIN, END, 4"
value="3, 3, 5, 4, 2" />
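Because each insert is applied to the field as left by the previous one, the effect of a multi-position command can be worked out with a short Python sketch. The numbering base of position n is not stated in this section, so the sketch assumes a 0-based index into the current list; the function name apply_inserts is hypothetical.

```python
def apply_inserts(values, positions, new_values):
    """Apply indexed inserts one at a time, in the order listed.

    positions: items are "BEGIN", "END" or a numeric string.
    Assumption: numeric positions are 0-based indices into the
    list as it stands after the preceding inserts.
    """
    result = list(values)
    for pos, val in zip(positions, new_values):
        if pos == "BEGIN":
            index = 0
        elif pos == "END":
            index = len(result)
        else:
            index = int(pos)
        result.insert(index, val)   # one BIFS update per inserted value
    return result
```

Calling apply_inserts(field, ["2", "7", "BEGIN", "END", "4"], [3, 3, 5, 4, 2]) mirrors the command above: the second value lands at position 7 of the list that already contains the first inserted value.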

2.6.2.3 Insert Route

This is the InsertRoute version of Insert:

<Insert>
<ROUTE ID="" fromNode="" fromField="" toNode="" toField=""/>
</Insert>

The following are examples of Insert used to insert a Route

  • Inserts 2 routes one without an ID and another with an ID for potential later deletion/replacement

<Insert>
<ROUTE fromNode="WhiteRect" fromField="emissiveColor"
toNode="ACircle" toField="emissiveColor"/>
<ROUTE ID="BlueRoute"
fromNode="BlueRect" fromField="emissiveColor"
toNode="ASquare" toField="emissiveColor"/>
</Insert>

ROUTE insertion is not dependent in any way on the attributes of the Insert element. Routes may therefore be inserted in the same Insert command as Nodes, much like ReplaceScene, and also in an indexed value insertion, as per the following examples:

  • Inserts a new node and a new Route

<Insert atNode="MyGroup" position="BEGIN">
<Shape DEF="MyRect">
<geometry>...</geometry>
<appearance>...</appearance>
</Shape>
<ROUTE fromNode="WhiteRect" fromField="emissiveColor"
toNode="ACircle" toField="emissiveColor"/>
</Insert>

  • Inserts a new value and Route in the same command

<Insert atNode="MyIndexedLineSet2D" atField="colorIndex" value="6">
<ROUTE fromNode="WhiteRect" fromField="emissiveColor"
toNode="ACircle" toField="emissiveColor"/>
</Insert>

2.6.3 <Delete>

The Delete command provides for node, indexed value and Route deletion. For Delete, atField defaults to the value ‘children’, but position has no default.