ginger's thoughts

Silvia's blog

W3C Media Annotations API standard

April 10, 2010, by silvia
Categories: code, Digital Media, open codecs, Open Source, standards
Tags: API, audio, HTML5, media annotations, media elements, media fragments, meta data, metadata, Ogg, skeleton, specification, video, vorbiscomment, W3C

Recently, I was asked to review the W3C Media Annotations specifications as they are about to go into Last Call (a state that comes before the request for implementations at the W3C).

The W3C Media Annotations group has defined a set of metadata that they believe is representative and common for media resources. The ontology consists of the following fields:

  • ma:identifier: a URI or string to identify a resource
  • ma:title: a string providing the title of the resource
  • ma:language: a language code describing the language used in the resource
  • ma:locator: the URI at which the resource can be accessed
  • ma:contributor: a URI or string identifying the contributor and the nature of the contribution
  • ma:creator: a URI or string identifying an author
  • ma:createDate: a date of creation or publication of the resource
  • ma:location: a string or geo code identifying where the resource has been shot/recorded
  • ma:description: a string describing the content of the resource
  • ma:keyword: a word or word combination providing a topic, keyword or tag representing the resource
  • ma:genre: a string providing the genre of the resource
  • ma:rating: rating value, including the rating scale
  • ma:relation: a URI and string identifying a related resource and the relationship
  • ma:collection: a URI or string providing the name of a collection to which the resource belongs
  • ma:copyright: a URI or string with the copyright statement.
  • ma:license: a string or URI with the usage license
  • ma:publisher: a string or URI with the publisher of the resource
  • ma:targetAudience: a URI and classification string providing the issuer of the classification and the classification value
  • ma:fragments: a list of string and URI values that identify media fragments and their type
  • ma:namedFragments: a list of string and URI values that provide names to media fragments
  • ma:frameSize: a width – height pair in pixels
  • ma:compression: a string providing the compression algorithm
  • ma:duration: a float to provide the resource duration in seconds
  • ma:format: a string providing the MIME type of the resource
  • ma:samplingrate: a float with the audio sampling rate
  • ma:framerate: a float with the video frame rate
  • ma:bitrate: a float providing the average bit rate in kbps
  • ma:numTracks: an int of the number of tracks

Note that some of these fields are not single values but simple constructs of multiple values. They are thus more complex than the name-value pairs typically used in, e.g., HTML meta headers or Dublin Core. I regard this as an issue for implementations.
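To make the difference concrete, here is a minimal Python sketch. The class names `Contributor` and `FrameSize`, their attribute names, and the dotted key scheme are hypothetical and chosen only for illustration; none of them come from the specification.

```python
from dataclasses import dataclass, asdict

# A flat name-value record, as in HTML meta headers or Dublin Core.
flat_metadata = {"title": "Example clip", "language": "en"}


@dataclass(frozen=True)
class Contributor:
    """Hypothetical shape for ma:contributor: who contributed, and how."""
    identifier: str  # URI or string identifying the contributor
    role: str        # nature of the contribution


@dataclass(frozen=True)
class FrameSize:
    """Hypothetical shape for ma:frameSize: a width/height pair in pixels."""
    width: int
    height: int


def flatten(prefix, value):
    """Spread one composite value over several name-value pairs."""
    return {f"{prefix}.{key}": part for key, part in asdict(value).items()}


contributor = Contributor("http://example.org/people/jane", "director")
frame_size = FrameSize(640, 360)

# A name-value store needs one entry per component of each composite field.
flat_metadata.update(flatten("ma:contributor", contributor))
flat_metadata.update(flatten("ma:frameSize", frame_size))
```

An implementation built around plain name-value pairs has to invent some convention like the dotted keys above, which is where the extra implementation effort comes from.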

The fields were chosen as metadata that is typically available about media resources. The media fragments fields are a bit dubious in this respect, but could be useful in future.

The metadata is determined either from within the resource itself or from a metadata collection about the resource. As such, the document maps several existing metadata and media resource formats to this interface, amongst them:

  • XMP
  • ID3
  • iTunes
  • QT
  • SearchMonkey
  • MediaRDF
  • LOM
  • METS
  • EXIF
  • CableLabs 1.1
  • CableLabs 2.0
  • DIG35
  • MIX
  • FRBR
  • MediaRSS
  • TXFeed
  • YouTube
  • VRA
  • IPTC
  • TVA
  • EBUCore
  • EBUP
  • MPEG7
  • SMPTE

As they didn’t have a mapping table for Ogg content, I offered the following:

MAWG | Relation | Ogg properties | How to do the mapping | Datatype
---- | -------- | -------------- | --------------------- | --------
Descriptive Properties (Core Set)
Identification
ma:identifier | exact | Name | Name field in skeleton header (new) | String
ma:title | exact | Title | TITLE field in vorbiscomment header | String
 | exact | Title | Title field in skeleton header (new) | String
 | related | Album | ALBUM title in vorbiscomment header | String
ma:language | exact | Language | Language field in skeleton header (new) | language code
ma:locator | exact | | file URI from system | URI
Creation
ma:contributor | exact | Artist, Performer | ARTIST and PERFORMER vorbiscomment headers | Strings
ma:creator | related | Organization | ORGANIZATION field in vorbiscomment header |
ma:createDate | exact | Date | DATE field in vorbiscomment header | ISO date format
ma:location | exact | Location | LOCATION field in vorbiscomment header | String
Content description
ma:description | exact | Description | DESCRIPTION field in vorbiscomment header | String
ma:keyword | N/A | | |
ma:genre | exact | Genre | GENRE field in vorbiscomment header | String
ma:rating | N/A | | |
Relational
ma:relation | related | Version, Tracknumber | VERSION (version of a title), TRACKNUMBER (CD track) fields in vorbiscomment header | Strings
ma:collection | related | Album | ALBUM field of vorbiscomment header | String
Rights
ma:copyright | exact | Copyright | COPYRIGHT field of vorbiscomment header | String
ma:license | exact | License | LICENSE field of vorbiscomment header | String
Distribution
ma:publisher | related | Organization | ORGANIZATION field of vorbiscomment header | String
ma:targetAudience | more specific | Role | Role field of Skeleton header (new) | String
Fragments
ma:fragments | N/A | | |
ma:namedFragments | N/A | | |
Technical Properties
ma:frameSize | exact | | extract from binary header of video track | int, int (width x height)
ma:compression | exact | Content-type | Content-type field of Skeleton header | MIME type
ma:duration | exact | | calculate as duration = last_sample_time - first_sample_time of OggIndex header of skeleton | Float (or rather: rational - rational)
ma:format | exact | Content-type | Content-type field of Skeleton header | MIME type
ma:samplingrate | exact | | calculate as granulerate = granulerate_numerator / granulerate_denominator of Skeleton header | Rational (or rather int / int)
ma:framerate | exact | | calculate as granulerate = granulerate_numerator / granulerate_denominator of Skeleton header | Rational (or rather int / int)
ma:bitrate | exact | | calculate as bitrate = length_of_segment / duration from OggIndex headers of skeleton | Float
ma:numTracks | exact | Tracknumber | TRACKNUMBER field of vorbiscomment header (track number on album) | Int
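A few rows of the table can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are made up, the vorbiscomment input is assumed to be already parsed into (name, value) pairs, and the bit rate conversion assumes the segment length is given in bytes and that kbps means 1000 bits per second.

```python
from fractions import Fraction

# (vorbiscomment field, ma: property, relation), a subset of the rows above.
VORBISCOMMENT_MAPPING = [
    ("TITLE", "ma:title", "exact"),
    ("ALBUM", "ma:title", "related"),
    ("ALBUM", "ma:collection", "related"),
    ("ARTIST", "ma:contributor", "exact"),
    ("PERFORMER", "ma:contributor", "exact"),
    ("ORGANIZATION", "ma:creator", "related"),
    ("ORGANIZATION", "ma:publisher", "related"),
    ("DATE", "ma:createDate", "exact"),
    ("GENRE", "ma:genre", "exact"),
    ("LICENSE", "ma:license", "exact"),
]


def map_comments(comments):
    """Map (name, value) vorbiscomment pairs to ma: properties.

    Returns {ma_property: [(value, relation), ...]}. Field names are
    compared case-insensitively.
    """
    result = {}
    for name, value in comments:
        for field, ma_property, relation in VORBISCOMMENT_MAPPING:
            if name.upper() == field:
                result.setdefault(ma_property, []).append((value, relation))
    return result


def granule_rate(numerator, denominator):
    """ma:samplingrate / ma:framerate, kept as an exact rational."""
    return Fraction(numerator, denominator)


def duration_seconds(first_sample_time, last_sample_time):
    """ma:duration = last_sample_time - first_sample_time, in seconds."""
    return Fraction(last_sample_time) - Fraction(first_sample_time)


def bitrate_kbps(segment_length_bytes, duration):
    """ma:bitrate = length_of_segment / duration.

    Assumption: length is in bytes; kbps means 1000 bits per second.
    """
    return float(Fraction(segment_length_bytes * 8, 1000) / Fraction(duration))
```

Keeping rates and durations as `Fraction` values until the last step reflects the "rational" datatype noted in the table; converting to a float only at the API boundary avoids rounding a rate such as 30000/1001 too early.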

You will notice that the table marks four skeleton fields as “new”. These are proposed fields, so a bit of coding will be necessary to introduce them into software. The space for them already exists in skeleton's message header fields, so they won’t require a change to the skeleton format.
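As a rough sketch of what reading the proposed fields could look like, the Python below assumes that skeleton's message header fields are serialized as CRLF-terminated `Name: value` lines of UTF-8 text. The function names and the example values (including the `Role` value) are made up for illustration.

```python
# Proposed skeleton fields (the "new" rows in the table) and their targets.
NEW_SKELETON_FIELDS = {
    "name": "ma:identifier",
    "title": "ma:title",
    "language": "ma:language",
    "role": "ma:targetAudience",
}


def parse_message_headers(raw):
    """Parse CRLF-terminated "Name: value" lines into a dict.

    Assumption: the fields are UTF-8 text; names are lower-cased for lookup.
    """
    fields = {}
    for line in raw.decode("utf-8").split("\r\n"):
        name, separator, value = line.partition(":")
        if separator:  # skip empty or malformed lines
            fields[name.strip().lower()] = value.strip()
    return fields


def new_fields_to_ma(raw):
    """Return the ma: properties carried by the proposed skeleton fields."""
    headers = parse_message_headers(raw)
    return {
        ma_property: headers[field]
        for field, ma_property in NEW_SKELETON_FIELDS.items()
        if field in headers
    }


# Hypothetical message header block for one track.
example = b"Content-Type: video/theora\r\nRole: video/main\r\nLanguage: en\r\n"
```

Because unknown header names are simply carried along by a parser like this, adding the four fields needs no change to the packet layout, only to the software that writes and interprets them.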

In the second specification of the Media Annotations WG, the group offers a standard API to access (i.e. read) the defined fields. They also intend to create an API to write the fields, but I doubt that will be easy because of the vast number of file types they intend to support.

There is basically a single function that allows the extraction of metadata:

MAObject[] getProperty(in DOMString propertyName, in optional DOMString sourceFormat, in optional DOMString subtype, in optional DOMString language, in optional DOMString fragment );
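The shape of this call can be sketched in Python as below. The in-memory `store`, the `MAObject` attributes, and the rule that an omitted optional argument means "do not filter on it" are assumptions made for illustration; the `subtype` and `fragment` parameters are left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MAObject:
    """Hypothetical record for one extracted property value."""
    property_name: str
    value: str
    source_format: Optional[str] = None
    language: Optional[str] = None


def get_property(
    store: List[MAObject],
    property_name: str,
    source_format: Optional[str] = None,
    language: Optional[str] = None,
) -> List[MAObject]:
    """Return every stored value for property_name that passes the filters.

    Assumption: an omitted optional argument means "no filter".
    """
    return [
        obj
        for obj in store
        if obj.property_name == property_name
        and (source_format is None or obj.source_format == source_format)
        and (language is None or obj.language == language)
    ]


# Hypothetical metadata gathered from two source formats.
store = [
    MAObject("ma:title", "Example clip", source_format="vorbiscomment", language="en"),
    MAObject("ma:title", "Beispielclip", source_format="skeleton", language="de"),
    MAObject("ma:genre", "Documentary", source_format="vorbiscomment"),
]
```

Calling `get_property(store, "ma:title")` returns both titles, while adding `language="de"` narrows the result to one, which is why the return type is an array rather than a single object.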

I proposed it may be possible to include this into HTML5 as follows:

interface HTMLMediaElement : HTMLElement {
...
getter MAObject getProperty(in DOMString propertyName, in optional unsigned long trackIndex);
...
}

This would extract the property either for a particular track in a media resource or, if no track index is given, for the complete resource. The only problem I see is that the returned object differs depending on the requested property: MAObject is only a parent class for the returned object types. I am therefore not sure that this can be specified easily in HTML5.
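The return-type problem can be illustrated with a small Python sketch. The subclasses `Title` and `FrameSize`, the `MediaElement` class, and its constructor arguments are hypothetical stand-ins, not part of either specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MAObject:
    """Parent class: all a caller can rely on without further checks."""
    property_name: str


@dataclass(frozen=True)
class Title(MAObject):
    title: str


@dataclass(frozen=True)
class FrameSize(MAObject):
    width: int
    height: int


class MediaElement:
    """Hypothetical stand-in for a media element with per-track metadata."""

    def __init__(self, resource_properties, track_properties):
        self._resource = resource_properties  # property name -> MAObject
        self._tracks = track_properties       # one such dict per track

    def get_property(self, property_name, track_index=None):
        # No track index: look at the complete resource.
        if track_index is None:
            source = self._resource
        else:
            source = self._tracks[track_index]
        return source.get(property_name)


def describe(obj):
    """The caller has to inspect the concrete type before using the value."""
    if isinstance(obj, FrameSize):
        return f"{obj.width}x{obj.height}"
    if isinstance(obj, Title):
        return obj.title
    return "unknown"


element = MediaElement(
    {"ma:title": Title("ma:title", "Example clip")},
    [{"ma:frameSize": FrameSize("ma:frameSize", 640, 360)}],
)
```

The `isinstance` checks in `describe` are the crux: a single getter declared as returning `MAObject` tells the caller nothing about which attributes are available for a given property name.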

Overall I thought the specification was a nice piece of work. I am not sure I agree with all the chosen fields, but that is always an issue with metadata. The most important fields are there and that’s what matters.
