Metadata and Ogg

I am really excited about the huge progress we made at FOMS with metadata and Ogg. The metadata specifications are actually not Ogg-specific – only their mapping into Ogg is. Here are the things that I expect will make for a very structured and sensible distributed handling of metadata on the Web.

At FOMS, we started improving CMML and are now specifying the next version of CMML. CMML is a timed text description language that can easily be multiplexed alongside audio or video data. It is very flexible with its fields and satisfies needs for hypermedia, captions, annotations and other time-aligned text. We took out the Ogg dependencies and it can now be used in any media container format. The specification is now also in an XML schema rather than a DTD, which enables us to reuse modules from XHTML and make it generally more extensible.

We introduced ROE, a description language (or a “manifest”) for multitrack media files. It describes media tracks and their dependencies and thus goes much further than the old stream and import elements in CMML, which have now been deprecated.

ROE can be used to author multitrack media files – in the Ogg case to author Ogg files with a Skeleton track and multiple media tracks. We are in the process of extending Skeleton to incorporate the description of dependencies between logical bitstreams. To complete this, we will be creating a description of how to map ROE into Ogg/Skeleton and vice versa.

ROE can also be used to negotiate with a Web client which media streams to send from the complete manifest that is available on the server. For example, a Web client could request a movie with the German sound track rather than the default English one, and with English subtitles added. This requires a small protocol for negotiation, which can easily be built using Web infrastructure. We are introducing some new HTTP request/response parameters and specific URLs, such as e.g.,A2,TT2.
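
To make the negotiation idea a bit more concrete, here is a minimal Python sketch of the server-side selection step. It is only an illustration under assumptions of mine: the Track class, the MANIFEST list, the select_tracks function and the track identifier V1 are all hypothetical, and so is the reading of A2 as the German audio track and TT2 as the English subtitle track. None of this is ROE syntax or the actual HTTP parameters, which are still being specified.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Track:
    """One entry of a hypothetical in-memory manifest (not the ROE format)."""
    track_id: str            # assumed identifier, e.g. "A2"
    kind: str                # "video", "audio" or "text"
    language: Optional[str]  # language code, or None if not applicable
    default: bool = False    # sent when the client expresses no preference


# Assumed manifest: which identifier maps to which language is invented here.
MANIFEST = [
    Track("V1", "video", None, default=True),
    Track("A1", "audio", "en", default=True),
    Track("A2", "audio", "de"),
    Track("TT1", "text", "de"),
    Track("TT2", "text", "en"),
]


def select_tracks(manifest: List[Track],
                  requested_ids: Optional[List[str]] = None) -> List[Track]:
    """Return the tracks to send for a request.

    With no explicit request, the default tracks are returned. Otherwise the
    requested identifiers are looked up, and an unknown identifier is an error.
    """
    if not requested_ids:
        return [t for t in manifest if t.default]
    by_id = {t.track_id: t for t in manifest}
    unknown = [i for i in requested_ids if i not in by_id]
    if unknown:
        raise KeyError("unknown track id(s): " + ", ".join(unknown))
    return [by_id[i] for i in requested_ids]


if __name__ == "__main__":
    # A client asking for the video with German audio and English subtitles.
    for track in select_tracks(MANIFEST, ["V1", "A2", "TT2"]):
        print(track.track_id, track.kind, track.language)
```

In a real deployment the list of requested identifiers would come from the request itself (a URL or a header), and the server would then multiplex only the selected tracks into the response.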

The set of ROE, Skeleton, CMML, and the HTTP and URI specifications will enable a very structured means of interacting with metadata-rich video on the Web. It will be distributed and integrated into the Web infrastructure, much like the Annodex set of technologies already is today.

Since I am a business owner as well as an open media enthusiast, let me add that I expect it to have a huge impact on online business around audio and video, enabling business processes and business models that are not possible today. Watch this space!