SMIL Technology
Definition and origins

SMIL (pronounced ‘smile’) is an abbreviation for the Synchronized Multimedia Integration Language. It is a W3C Recommendation for describing multimedia presentations using XML (Extensible Markup Language). Put simply, this means that all parties interested in developing SMIL can work to a recommended set of guidelines. It defines timing, layout, animations, visual transitions and media embedding, among other things.
SMIL 1.0 became an official W3C Recommendation in June 1998, followed by SMIL 2.0 in August 2001 and SMIL 2.1 in December 2005. SMIL 2.1 includes a small number of extensions based on practical experience gathered using SMIL in the Multimedia Messaging System on mobile phones.
Further technical details can be found elsewhere on this website at SMIL/Technical material.
In the beginning…

Way back in the early days of the internet (mid to late 1990s!), some people started experimenting with putting video onto the web. One of the leading proponents of ‘streamed media’ (as the practice of running video and/or audio on the web became known) was the BBC, where Huw Owen and Phil Stead, both at that time in senior roles at the BBC and now colleagues in Adastra Education, were at the very forefront of this developmental work.
Originally, when video first appeared on the net, it would play in a separate window, away from the website with which it was associated. At best, this was a cumbersome way of using rich media in an interactive environment, and it was only the tip of the iceberg. Sending people to a separate window by definition drove them away from the website they had originally been using; there was no opportunity for the website to interact directly with the rich media stream; and there were all sorts of problems regarding access to the proprietary media players required to run rich streamed media.
The main problem facing online multimedia producers who wanted to combine video with different elements, such as text or graphics, was the limited amount of bandwidth available to most users, who in the UK were almost exclusively using 56kbps modems to access the web. This resulted in poor-quality video, and there was a need for a way to synchronise the footage with other multimedia elements.
In this environment, SMIL was born…
SMIL was designed to overcome most of the difficulties encountered above – it was meant to give the user a full interactive experience with minimal intrusion. SMIL was created specifically to solve the problems of coordinating the display of a variety of media (multimedia) on websites. By using a single time-line for all of the media on a page, their display could be properly time-coordinated and synchronised.
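As a rough illustration of the single time-line idea, the Python sketch below works out when each item starts and stops inside SMIL's "seq" (one after another) and "par" (in parallel) containers. It is a simplified model rather than a player implementation: it assumes every media element carries an explicit "dur" in whole seconds, and the file names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical presentation: a title image, then a video with captions
# running in parallel, then a credits image.
SMIL_BODY = """
<body>
  <seq>
    <img src="title.png" dur="3s"/>
    <par>
      <video src="lesson.rm" dur="20s"/>
      <text src="captions.rt" dur="15s"/>
    </par>
    <img src="credits.png" dur="5s"/>
  </seq>
</body>
""".strip()


def schedule(node, start=0, out=None):
    """Return (end_time, entries); entries are (src, begin, end) tuples."""
    if out is None:
        out = []
    if node.tag in ("seq", "body"):
        # Children play one after another (body acts like a seq in SMIL 1.0).
        t = start
        for child in node:
            t, _ = schedule(child, t, out)
        return t, out
    if node.tag == "par":
        # Children start together; the group ends when the longest child ends.
        end = start
        for child in node:
            child_end, _ = schedule(child, start, out)
            end = max(end, child_end)
        return end, out
    # Media object. Simplification: dur must look like "<whole number>s".
    dur = int(node.get("dur").rstrip("s"))
    out.append((node.get("src"), start, start + dur))
    return start + dur, out


total, timeline = schedule(ET.fromstring(SMIL_BODY))
for src, begin, end in timeline:
    print(f"{src}: {begin}s to {end}s")
print(f"total: {total}s")
```

Because every item is placed on the same clock, the captions and the video both begin at 3 seconds, and the credits wait until the longer of the two has finished.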
But SMIL went further than that. Using SMIL, the media player could detect the connection speed of the user and supply the encoded video at a level that would maximise quality while ensuring a constant delivery. It could even detect the user’s nationality through the version of Windows that was being used, and offer content in the user’s own language.
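In SMIL 1.0 this kind of choice is written with a "switch" element whose children carry test attributes such as "system-bitrate" and "system-language"; the player takes the first child whose tests pass. The Python sketch below imitates that selection under simplifying assumptions: language matching is exact only, the file names and bitrates are invented, and how a real player discovers the connection speed or language setting is outside its scope.

```python
import xml.etree.ElementTree as ET

# Hypothetical alternatives, ordered from most to least demanding.
# The final <img> has no test attributes, so it acts as a fallback.
SWITCH = """
<switch>
  <video src="lesson_cy_hi.rm" system-bitrate="300000" system-language="cy"/>
  <video src="lesson_en_hi.rm" system-bitrate="300000" system-language="en"/>
  <video src="lesson_cy_lo.rm" system-bitrate="28000" system-language="cy"/>
  <video src="lesson_en_lo.rm" system-bitrate="28000" system-language="en"/>
  <img src="still.png"/>
</switch>
""".strip()


def acceptable(element, bitrate, language):
    """True if every test attribute present on the element passes."""
    required_rate = element.get("system-bitrate")
    if required_rate is not None and bitrate < int(required_rate):
        return False
    languages = element.get("system-language")
    if languages is not None:
        # Simplification: exact match against a comma-separated list.
        wanted = [code.strip() for code in languages.split(",")]
        if language not in wanted:
            return False
    return True


def choose(switch, bitrate, language):
    """Return the src of the first acceptable child, or None."""
    for child in switch:
        if acceptable(child, bitrate, language):
            return child.get("src")
    return None


switch = ET.fromstring(SWITCH)
print(choose(switch, 56000, "cy"))   # 56 kbps modem, Welsh
print(choose(switch, 512000, "en"))  # broadband, English
print(choose(switch, 14400, "fr"))   # very slow link, no matching language
```

Ordering matters in this model: because the first passing child wins, the richest encodings are listed first and the untested fallback last.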
Problems appear

Development was ongoing, but it was Real Networks who really championed the cause of SMIL, with some limited support from QuickTime. Unfortunately, Microsoft spurned the opportunity offered by SMIL 1.0 and instead pursued its own solution for the Windows Media Player, HTML+TIME, a relatively complicated solution, which never grabbed the imagination of multimedia developers. It was incompatible with any non-Microsoft media player and has generally fallen into disuse. Microsoft still dabble with HTML+SMIL, but with over a dozen SMIL players now available, developers prefer to use a language which will work across most platforms.
A very simplistic example of one type of SMIL application is given below, for illustration:
Island For Sale
Huw Owen’s close association with Real Networks at the BBC provided the foundation for Adastra Education’s interest in combining video, text and graphics to deliver training via the web. In 2001, Huw and Phil were described by Real Networks as “Northern Europe’s leading proponents of SMIL”.
Early SMIL development was done mainly through the Real Media player (as seen in the example above). While some degree of success was achieved using this player, problems also started to appear – for example, the heavy branding and links (which SMIL perversely allowed into the player, by definition!) were seen as intrusive and counter-productive.
The problems emerged incrementally. Real began to bury its free version of the player deep on its website. Clients became increasingly frustrated at the hard sales pitch that Real was using to encourage sales of its Premium player with its USA-interest ‘bonus’ content.
The video content in a SMIL presentation had to be stored on Real’s proprietary media servers to work effectively. These were expensive to purchase. Video hosting wasn’t cheap, but even worse was the refusal of hosting companies to offer a pre-defined pricing structure for video hosting. Video was priced on the basis of data downloaded by viewers. Clients were understandably reluctant to expose themselves to potentially unlimited costs.
But above all, video-streaming across the web remained an unsatisfactory experience, with narrow bandwidth allowing only the smallest video window, and only the lowest quality encoding available for most UK users.
Evolution of the SMIL concept
At this stage, Huw and Phil began looking for alternative solutions to deliver their SMIL content. After months of market research, they discovered that clients and users were left unsatisfied with the non-tactile nature of the product. While SMIL existed in cyberspace, students and staff liked to be able to hold the product, like a textbook. Without a tangible item, they felt short-changed.
A decision was made to focus development work on producing SMIL content for CD-ROM rather than the web. Initially, this simply meant holding SMIL files on the CD with the Real player included. But as this developed, it became apparent that users simply did not want to be forced to install third-party software as part of their studies/training.
The Real Player had to go, and SMIL became a generic term for any digital media piece that contained more than one medium working simultaneously on the same screen, in the same window – which is the interpretation of SMIL used by Adastra Education in its ESF bid.
Evolving the SMIL approach
In a nutshell, Adastra Education’s strategy was to develop standard programming techniques to create media without the delivery restrictions placed on SMIL. This enabled the products of the project to be delivered through much more ‘user friendly’ media. The approach remained constant: creating integrated multimedia learning modules that allow all the multimedia elements – video, sound, text and stills – to appear simultaneously in the same rich learning environment. The method was pragmatic: while research continued into the most recent incarnations of Real’s SMIL technology, real live work focused increasingly on delivery through not only the web but also CD-ROM and DVD.
There is a commonsense pragmatic edge to the approach we adopted. Since beginning the ESF project, we’ve become increasingly aware of the ‘Digital Divide’ which exists in Wales, re-affirmed recently by WAG figures for broadband, CD-ROM and DVD usage throughout Wales. To have continued to deliver predominantly via the web, at least for the time being, would have been self-defeating in trying to reach our prime target audiences.
So we’ve used various bits and pieces of SMIL, as we’d partially foreseen when we originally laid out the project, added some new technologies and developed some of our own, and come up with a series of products which hopefully are more accessible to the majority of people in Wales, especially in Objective 1 areas, where we know that broadband take-up is comparatively low and actual broadband usage is minimal.

Adastra Education’s ESF application stated:
“[The project] will develop a flexible tool for delivering interactive training and teaching content, online or via CD/DVD, using streamed media enhanced with interactive content programmed into the media player itself. The programming language used will be SMIL (Synchronized Multimedia Integration Language), a type of computer programming specifically related to the interactive enhancement of a streamed media player, which allows the beneficiaries a great deal of intuitive and direct interactivity through the media player itself. This represents a major breakthrough in the usability of computers for everyday training and educational interaction. However, the complex programming skills involved have been mastered so far by only a handful of companies in Europe. Work will be done with (a) sports coaches, players and students, and (b) performance art teachers, students and professionals to create, develop and test original video and interactive content in these two educational fields, and to transfer SMIL programming and production skills to ICT staff and students.
Digital/streamed video, either live or recorded, online or on CD/DVD, can play a major role in training and education. Its true potential lies in interactively integrating streamed video with text, graphical and audio content, all under the control of the end user – or, in a live scenario, of the lecturer/tutor.
The programming language SMIL (Synchronised Multimedia Integration Language), and the technologies derived from and associated with it, are powerful tools that allow creative people to manipulate the video stream and insert the coding required to drive the rich interactivity with other media and with the user. This represents a major breakthrough in the usability of computers, and massively boosts their potential usefulness for everyday training and educational interaction.
However, as a medium for e-learning and computer-based training, all of this remains in its infancy. First, there is the complex programming required to enhance streamed content with dynamic interactivity. The handful of SMEs around Europe who have successfully mastered SMIL programming have found it very difficult to exploit commercially because trained people to expand their skilled in-house teams have simply not been available. There is, in other words, a skills shortage. Colleges are not yet teaching people to become “rich interactive media producers” because there are no experts out there able to teach SMIL programming. Therefore there is no huge demand yet from the market, because there is a dearth of qualified suppliers creating the demand.
Another severe problem with the development of SMIL has been the failure of anyone to develop an effective but user-friendly WYSIWYG development tool. SMIL is still best coded by hand, a time-consuming and repetitive operation with no room for error.
And of course, until recently broadband was not widely available, and DVD remained a minority product. Both these scenarios are now changing rapidly, and the technical and commercial environment to exploit the benefits of this advanced technology has finally started to appear.
To sum up technically, digital video – predominantly DVD and mini DV formats – will be used to capture original footage, using PC-based editing configurations, predominantly Adobe Premiere and Media Cleaner, to edit, compress and encode the footage for publication. The learning materials will then be published on the internet, CD-ROM and DVD, as appropriate to the access needs of specific users. Director and Flash will be used to add interactive elements to the finished video. SMIL (Synchronised Multimedia Integration Language) programming will be used within the media player to achieve synchronicity between video and interactive text. To our best knowledge, nowhere else in Wales is this specific ICT configuration being used to achieve the sorts of applications that will be produced.
The benefits of video as a teaching tool are manifold. Considerable savings are possible because it is not physically necessary to put a teacher in front of a class to teach the same lesson over and over again. Using motion control techniques, video can be used to produce a better quality of learning in certain visually oriented teaching disciplines, such as sports and performing arts. The use of rich interactive media allows numerous teaching messages to be put across in a more comprehensive way than the traditional classroom lecture scenario. Notes can be downloaded online. They can be illustrated and emphasised by video, audio, interactive text and graphics. Students can become accountable in terms of every element of the course they have accessed and studied. And using this technology, tracking in minute detail of students’ work by tutors becomes a very simple exercise.
In a nutshell, the proposed technology is (1) innovative, (2) cost-saving and (3) an enhancement both to the quality of the pupil’s learning and to the tutor’s ability to track the process of learning.”

SMIL: TECHNICAL DATA

“SMIL (pronounced smile) stands for Synchronized Multimedia Integration Language. It is a markup language (like HTML) and is designed to be easy to learn and deploy on Web sites. SMIL was created specifically to solve the problems of coordinating the display of a variety of media (multimedia) on Web sites. By using a single time line for all of the media on a page their display can be properly time coordinated and synchronized.”

(from Streaming Media World)

This document specifies version 1 of the Synchronized Multimedia Integration Language (SMIL 1.0). SMIL allows integration of a set of independent multimedia objects into a synchronized multimedia presentation. Using SMIL, an author can:

  1. describe the temporal behaviour of the presentation
  2. describe the layout of the presentation on a screen
  3. associate hyperlinks with media objects
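
To make the three capabilities concrete, the Python sketch below holds a small SMIL 1.0 style document as a string and picks out its timing containers, its layout regions and its hyperlinks. The document is written for illustration only; the file names, region names and the example.org address are invented, and the checks are far looser than a real player's.

```python
import xml.etree.ElementTree as ET

# Hypothetical lesson: a video with linked notes shown beneath it.
DOC = """
<smil>
  <head>
    <layout>
      <root-layout width="320" height="300"/>
      <region id="video_area" left="0" top="0" width="320" height="240"/>
      <region id="notes_area" left="0" top="240" width="320" height="60"/>
    </layout>
  </head>
  <body>
    <par>
      <video src="lesson.rm" region="video_area" dur="20s"/>
      <a href="http://www.example.org/notes.html">
        <text src="notes.rt" region="notes_area" dur="20s"/>
      </a>
    </par>
  </body>
</smil>
""".strip()

root = ET.fromstring(DOC)
body = root.find("body")

# 1. Temporal behaviour: which timing containers sit directly in the body.
containers = [child.tag for child in body]

# 2. Layout: each region's size, keyed by its id.
regions = {
    r.get("id"): (int(r.get("width")), int(r.get("height")))
    for r in root.iter("region")
}

# 3. Hyperlinks: which media objects are wrapped in an <a> element.
links = {media.get("src"): a.get("href") for a in root.iter("a") for media in a}

# Sanity check: every media object should point at a declared region.
media_objects = [el for el in body.iter() if el.get("src")]
unplaced = [m.get("src") for m in media_objects if m.get("region") not in regions]

print(containers)
print(regions)
print(links)
print(unplaced)
```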

The "layout" element determines how the elements in the document's body are positioned on an abstract rendering surface (either visual or acoustic).

If a document contains no layout element, the positioning of the body elements is implementation-dependent.

A SMIL document can contain multiple alternative layouts by enclosing several layout elements within a "switch" element (defined in Section 4.3). This can be used, for example, to describe the document's layout using different layout languages.

The following example shows how CSS2 can be used as an alternative to the SMIL basic layout language:
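
A sketch along those lines, written here for illustration rather than copied from the Recommendation, is given below in Python. It holds a document with two alternative layouts inside a "switch" and picks the first one whose type the player understands. The set of supported types and the file name are assumptions; a "layout" element with no "type" attribute is treated as SMIL basic layout.

```python
import xml.etree.ElementTree as ET

# Two alternative layouts for the same region "r": one in CSS2, one in
# the SMIL basic layout language.
DOC = """
<smil>
  <head>
    <switch>
      <layout type="text/css">
        [region="r"] { top: 20px; left: 20px }
      </layout>
      <layout>
        <region id="r" top="20" left="20"/>
      </layout>
    </switch>
  </head>
  <body>
    <seq>
      <img region="r" src="slide.png" dur="10s"/>
    </seq>
  </body>
</smil>
""".strip()

# Type assumed when a layout element carries no type attribute.
BASIC = "text/smil-basic-layout"


def pick_layout(root, supported):
    """Return the first layout in the switch whose type is supported."""
    for layout in root.find("head/switch"):
        if layout.get("type", BASIC) in supported:
            return layout
    # No usable layout: positioning is left to the implementation.
    return None


root = ET.fromstring(DOC)

basic_only = pick_layout(root, {BASIC})
css_capable = pick_layout(root, {"text/css", BASIC})

print(basic_only.find("region").get("id"))
print(css_capable.get("type"))
print(css_capable.text.strip())
```

A player that understands only basic layout skips the CSS2 alternative and falls through to the second "layout" element; a CSS-capable player stops at the first.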

(from W3C Recommendation paper December 2005)

  • GRiNS (SMIL 1.0) by Oratrix
  • HPAS by Compaq
  • Lp player by Productivity Works
  • QuickTime 4.1 by Apple
  • RealPlayer 8 by RealNetworks
  • Soja, a Java-based SMIL player by Helio
  • S2M2, a Java applet-based SMIL player by NIST
  • Schmunzel, a Java player by SunTREC Salzburg
  • X-Smiles, a Java-based open browser by TML laboratory
  • Autometa RPXP, an open-source (LGPL v2.1) object-oriented Perl 5.005 script that generates SMIL 1.0 and RealText streaming media presentations

(From MetaData website)

01 Sep 2002

SMIL 2.0, the Synchronized Multimedia Integration Language, has begun to establish itself as an important new approach for integrating multimedia into web content. SMIL, which offers XML-based approaches for controlling the timing and presentation of multimedia elements, has begun to attract the support of many large software vendors and toolmakers, making it increasingly accessible for developers. In this article, Anne Zieger provides an overview of SMIL and describes several tools available to make SMIL coding simpler.

For developers outside the multimedia world, the Synchronized Multimedia Integration Language, or SMIL, may be something of an obscure technology. But at least among a few key players, SMIL has begun to establish itself as an important approach to presenting multimedia online.
SMIL support has crept into technologies backed by Adobe, Microsoft, and perhaps most prominently, media delivery leader Real Networks. A wide variety of smaller vendors have begun to provide SMIL authoring tools and players as well.
In days to come, as support for the current 2.0 specification grows, working with SMIL could become a standard strategy for any developer whose work requires some form of multimedia asset control. If the growing roster of tool creators is any indication, building presentations in SMIL should become easier as well.
SMIL history and overview
SMIL has been in development since March 1997, when the World Wide Web Consortium (W3C) established a working group on synchronized multimedia.
SMIL is an XML-based language that allows authors to write interactive multimedia presentations without using multimedia management tools such as Macromedia Director. Authors can describe the timing of multimedia presentations, associate hyperlinks with media objects and define the layout of the presentation onscreen. The SMIL 2.0 spec, for its part, is a series of markup modules defining semantics and XML syntax for certain SMIL functions.
The W3C released the first public draft of SMIL in November 1997, attracting a moderate level of industry attention, including some support from Real, Adobe, and Microsoft.
With the 2.0 version of SMIL, released in August 2001, these companies remain on board; in addition, more than a dozen independently crafted SMIL authoring platforms have arrived on the market. According to W3C documents, SMIL 2.0 has two main design goals: