Category Archives: code

Javascript libraries for <video> support

Now that Firefox 3.5 has been released with native HTML5 <video> tag support, it seems that there is a new javascript library every day that provides fallback mechanisms for older browsers or for those that do not support Ogg Theora.

This blog post collects the javascript libraries that I have found thus far, organised by purpose, so you can pick the one most appropriate for you. Be aware that the list will probably already be outdated by the time I post this article, so if you could help me keep it up to date through the comments, that would be great. 🙂

Before I dig into the libraries, let me explain how fallback works with <video>.

Generally, if you’re using the HTML5 <video> element, your fallback mechanism for browsers that do not support <video> is the HTML code you write inside the <video> element. A browser that supports the <video> element will ignore that content, while all other browsers will display it instead:


<video src="video.ogv" controls>
Your browser does not support the HTML5 video element.
</video>

To do more than just display text, you can provide a video fallback option. There are basically two options: you can fall back to a Flash solution:


<video src="video.ogv" controls>
<object width="320" height="240">
<param name="movie" value="video.swf">
<embed src="video.swf" width="320" height="240">
</embed>
</object>
</video>

or, if you are using Ogg Theora and don’t want to create a video in a different format, you can fall back to using the Java applet player Cortado:


<video src="video.ogv" controls width="320" height="240">
<applet code="com.fluendo.player.Cortado.class" archive="http://theora.org/cortado.jar" width="320" height="240">
<param name="url" value="video.ogv"/>
</applet>
</video>

Now, even if your browser supports the <video> element, it may not be able to play the video format of your choice. For example, Firefox and Opera only support Ogg Theora, Safari/Webkit supports MPEG4 and the other codecs that the QuickTime framework supports, and Chrome supports both Ogg Theora and MPEG4. For this situation, the <video> element has a built-in selection mechanism: you do not put a “src” attribute onto the <video> element, but instead include <source> elements inside <video>, which the browser will try one after the other until it finds one it can play:


<video controls width="320" height="240">
<source src="video.ogv" type="video/ogg" />
<source src="video.mp4" type="video/mp4" />
</video>

You can of course combine all the methods above to optimise the experience for your users, which is what has been done in this and this (Video For Everybody) example without the use of javascript. I actually like these approaches best and you may want to check them out before you consider using a javascript library.
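
To make this more concrete, here is a minimal sketch of such a combined approach (file names and dimensions are placeholders): the <source> elements serve browsers with native support for one of the two formats, and the nested Cortado applet catches everything else:

<video controls width="320" height="240">
<source src="video.ogv" type="video/ogg" />
<source src="video.mp4" type="video/mp4" />
<applet code="com.fluendo.player.Cortado.class" archive="http://theora.org/cortado.jar" width="320" height="240">
<param name="url" value="video.ogv"/>
</applet>
</video>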

But now, let’s look at the promised list of javascript libraries.

Firstly, let’s look at some libraries that let you support more than just one codec format. These allow you to provide the video in the format best suited to the given browser, media framework and OS combination. Note that you will need to encode and provide your videos in multiple formats for these to work.

  • mv_embed: this is probably the library that has been around the longest to provide <video> fallback mechanisms. It has evolved heaps over the last years and now supports Ogg Theora and Flash fallbacks.
  • several posts that demonstrate how to play flv files in a <video> tag.
  • html5flash: provides, on top of Ogg Theora and MPEG4 codec support, Flash support in the HTML5 video element through a chromeless Flash video player. It also exposes the <video> element’s javascript API to the Flash content.
  • foxyvideo: provides a fallback flash player and a JavaScript library for HTML5 video controls that also includes a nearly identical ActionScript implementation.

Finally, let’s look at some libraries that are focused solely on Ogg Theora support in browsers (a sketch of the kind of detection they perform follows after the list):

  • Celt’s javascript: a minimal javascript that checks for native Ogg Theora <video> support and the VLC plugin, and falls back to Cortado if nothing else works.
  • stealthisfilm’s javascript: checks for native support, VLC, liboggplay, Totem, any other Ogg Theora player, and cortado as fallback.
  • Wikimedia’s javascript: checks for QuickTime, VLC, native, Totem, KMPlayer, Kaffeine and MPlayer support before falling back to Cortado.
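
To give an idea of what these libraries do under the hood, here is a minimal sketch of the kind of detection involved. It is my own illustration, not code from any of the libraries above, and the element id and file names are placeholders: it checks whether the browser can natively play Ogg Theora and, if not, swaps the <video> element for the Cortado applet.

<script type="text/javascript">
// Hypothetical sketch: if the browser cannot play Ogg Theora natively,
// replace the <video id="v"> element with the Cortado applet.
function fallbackToCortado() {
  var video = document.getElementById("v");
  var probe = document.createElement("video");
  if (probe.canPlayType &&
      probe.canPlayType('video/ogg; codecs="theora, vorbis"') !== "") {
    return; // native Ogg Theora playback is available
  }
  var applet = document.createElement("applet");
  applet.setAttribute("code", "com.fluendo.player.Cortado.class");
  applet.setAttribute("archive", "http://theora.org/cortado.jar");
  applet.setAttribute("width", "320");
  applet.setAttribute("height", "240");
  var param = document.createElement("param");
  param.setAttribute("name", "url");
  param.setAttribute("value", "video.ogv");
  applet.appendChild(param);
  video.parentNode.replaceChild(applet, video);
}
window.onload = fallbackToCortado;
</script>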

First draft of a new media fragment URI addressing standard

Those who know me well know that a few years ago (in fact, almost 10 years now) we developed the Annodex set of technologies at the CSIRO in a project called “Continuous Media Web”.

The idea was to make time-continuous data (read: audio and video) an integral part of the Web. It would be possible to search for media through standard search engines. It would be possible to link into and out of media as we link into and out of Web pages. It would be possible to mash up video from different Web servers into a single media stream, just like we are able to mash up images, text and other Web resources from different Web servers.

As you are all aware, we have made huge steps towards this vision in the last 10 years. We now have what is called “universal search” – search engines like Google and Yahoo no longer return only links to HTML pages, but links to videos and images just as well.

But it doesn’t go far enough yet – even now we still cannot link into a long-form video at exactly the fragment that provides the context of what we have been searching for.

In the Annodex project we implemented a working version of such a deep universal search engine back in 2003, on top of the Panoptic search engine (an enterprise search engine developed by CSIRO, later spun out and now sold as Funnelback).

The basis for our implementation was the combination of specifications that we developed around Ogg:

  • An extension to Ogg that makes it possible to create valid Ogg streams from subparts of Ogg streams – this is now part of Ogg as Skeleton.
  • A means of annotating Ogg streams with time-aligned text that could be interleaved with the Ogg media stream to produce streams that knew more about themselves – the format was called CMML for Continuous Media Markup Language.
  • And an extension to the URI addressing of Ogg streams using temporal URIs.

I am very proud that in the last 2 years the development of a generic media fragment URI addressing approach has been taken up by the W3C, and that Conrad Parker and I are invited experts on the Working Group.

I am even more proud that the Working Group has just published a First Public Working Draft of a document called “Use cases and requirements for Media Fragments”. It contains a large collection of examples of situations in which users will want to make use of media fragments. It defines the key dimensions of fragmentation that need to be specified:

  1. Temporal fragmentation
  2. Spatial fragmentation
  3. Track fragmentation
  4. Name fragmentation

Beyond mere use cases and requirements, the document also contains a survey of technologies that address multimedia fragments.

In a first step towards the development of a Media Fragments W3C Recommendation, this document also discusses a proposed syntax for media fragment URI addressing and proposes different processing approaches. These sections will eventually be moved into the recommendation and are the most incomplete sections at this point.

To explain some of the approaches that are being proposed in more detail, here are some examples of media fragment URIs that are proposed through this WD:

  • http://www.example.com/example.ogv#t=10s,20s – addresses the fragment of example.ogv that lies between the 10s and the 20s offset
  • http://www.example.com/example.ogv#track='audio' – addresses the track called “audio” in the example.ogv file
  • http://www.example.com/example.ogv#track='audio'&t=10s,20s – addresses the track called “audio” on the subpart between the 10s and 20s offset in the example.ogv file
  • http://www.example.com/example.ogv#xywh=pixel:160,120,320,240 – addresses the example.ogv file but with the video track cropped to a region of 320x240px positioned at a 160x120px offset
  • http://www.example.com/example.ogv#id='chapter-1' – addresses the named fragment called “chapter-1” which is specified through some mechanism, e.g. Kate or CMML in Ogg

Note that the latter example only works if the encapsulation format provides a means of specifying a name for a fragment. Such a means is available, e.g., in QuickTime through chapter tracks, or in Flash through cuepoints.

We know from our experience with Ogg that temporal fragmentation can be realised. For track addressing it is possible to use the recently developed ROE specification: the id tags used there could be included in Skeleton and then be used to address tracks by name. As for spatial fragmentation of Ogg Theora, I don’t think it can be achieved for an arbitrary rectangular selection without transcoding.
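
To illustrate how the temporal dimension could be handled on the client side, here is a minimal sketch (my own illustration, not part of the Working Draft) that reads a #t=10s,20s style fragment from the page URI and seeks an HTML5 video element accordingly; the element id is a placeholder.

<script type="text/javascript">
// Illustration only: interpret a #t=10s,20s style fragment on the current
// page URI by seeking a <video id="v"> element to the start offset and
// pausing at the end offset.
function applyTemporalFragment() {
  var match = /[#&]t=(\d+(?:\.\d+)?)s,(\d+(?:\.\d+)?)s/.exec(window.location.hash);
  if (!match) return;
  var start = parseFloat(match[1]);
  var end = parseFloat(match[2]);
  var video = document.getElementById("v");
  var seek = function() {
    video.currentTime = start;
    video.play();
  };
  // wait for the metadata to be loaded so that seeking is possible
  if (video.readyState >= 1) { seek(); }
  else { video.addEventListener("loadedmetadata", seek, false); }
  video.addEventListener("timeupdate", function() {
    if (video.currentTime >= end) { video.pause(); }
  }, false);
}
window.onload = applyTemporalFragment;
</script>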

The next tasks of the Working Group are to create implementations of these specifications for diverse formats and thus find out which processes work best.

FOMS 2009 Awesomeness

I am a slacker, I know – sorry. FOMS happened almost 4 weeks ago and I have neither blogged about it nor uploaded the videos.

So, you will have to take my word for it for the moment: it was a totally awesome and effective workshop that led to a lot of work being started during LCA and having an impact far beyond FOMS.

Every year, the discussions we have at FOMS are captured in so-called community goals. These are the activities that we see as top priorities to address in order to improve the use and uptake of open media software.

You can read up on our 2009 community goals here in detail. They fall into the following 10 sections:

  1. Patent and legal issues around codecs
  2. Ogg in Firefox: liboggplay
  3. Authoring tools for open media codecs
  4. Server Technology for open media
  5. Time-aligned text and accessibility challenges
  6. FFmpeg challenges
  7. GStreamer challenges
  8. Dirac challenges
  9. Jack challenges
  10. OpenMAX challenges

In this post, I’d just like to point out some cool activities that have already emerged since FOMS.

I’ve already written about the patents issue and how OpenMediaNow will hopefully be able to make a difference here.

Liboggplay provides a simple API for decoding and playback of Ogg codecs and is therefore used for baseline Ogg Theora support in Firefox 3.1. A bunch of bugs were found around it, and the opportunity of having Shane Stephens, its original developer, together with Viktor Gal, its new maintainer, in the same room made for a whole lot of bug fixes. The $100K Mozilla grant towards the work of Xiph developers that was announced at FOMS will further help to mature this and other Xiph software. Conrad Parker, Viktor Gal, and Timothy Terriberry, the Xiph developers who will cut code under this grant, were incidentally all present at FOMS.

The discussion about the need for authoring software support for open media codecs is always a difficult one. We all know that it is important to have usable and graphically attractive authoring tools in order to get adoption. However, looking at reality, it is really difficult to design and implement a GUI authoring tool such as a video editor to a competitive quality. In other areas it has also taken quite some time to gain good authoring software, such as the Gimp or Inkscape. Plus there is the additional need to make such tools cross-platform. With video, the underlying editing functionality is often missing from the media frameworks. Ed Hervey explained how he extended gstreamer with the required subroutines and included them in the gstreamer python plugin, so now he will be able to focus on user interface work in PiTiVi rather than on the underlying video editing functionality.

The authoring discussion smoothly led over to the server technology discussion. Robin Garvin explained how he implemented a server-side video editor through EDLs. Michael Dale showed us the latest version of his video editor in the Mediawiki Metavid plugin. And Jan Gerber showed us the Firefogg Firefox plugin for transcoding to Ogg. Web-based tools are certainly the future of video authoring and will make a huge difference in favor of Ogg.

Then there were the accessibility discussions. During FOMS I was in the process of writing up my final report on the Mozilla video accessibility project, and it was really important to get input from the FOMS community – in particular from Charles McCathieNevile from Opera, Michael Dale from Metavid/Wikipedia/Archive.org, and Jan Gerber. In the end we basically agreed that a lot of work still needs to be done, and that a standard way of providing srt support in HTML5 – through Ogg, but also out-of-band – will be a great step forward, though by far not the final one.

The remaining topics were focused discussions on how to improve support, uptake or functionality of specific tools. Peter Ross took FOMS concerns about ffmpeg to the ffmpeg community and it seems there will be some changes, in particular an upcoming ffmpeg release. Ed Hervey took home a request for new API functions for gstreamer. Anuradha Suraparaju talked with Jan Gerber about support of Dirac in firefogg and with Viktor Gal about support in liboggplay. Further, the idea of libfisheye was born to have a similar abstraction library for Ogg video codecs as libfishsound is for Ogg audio codecs.

As can be seen, there are already some awesome outcomes from FOMS 2009. We are looking forward to a FOMS 2010 in Wellington, New Zealand!

Attaching subtitles to HTML5 video

During the last week, I made a proposal to the HTML5 working group about how to support out-of-band time-aligned text in HTML5. What I mean by that is basically: how to link a subtitle file to a video tag in HTML5. This would mirror the way in which desktop players let you load separate subtitle files by hand to go alongside a video.

My suggestion is best explained by an example:


<video src="http://example.com/video.ogv" controls>
<text category="CC" lang="en" type="text/x-srt" src="caption.srt"></text>
<text category="SUB" lang="de" type="application/ttaf+xml" src="german.dfxp"></text>
<text category="SUB" lang="jp" type="application/smil" src="japanese.smil"></text>
<text category="SUB" lang="fr" type="text/x-srt" src="translation_webservice/fr/caption.srt"></text>
</video>

  • “text” elements are subelements of the “video” element and therefore clearly related to one video (even if it comes in different formats).
  • the “category” attribute allows us to specify what text category we are dealing with and allows the web browser to determine how to display it. The idea is that there would be a default display for the different categories, and CSS would allow these defaults to be overridden.
  • the “lang” attribute allows the specification of alternative resources based on language, which allows the browser to select one by default based on browser preferences, and also to turn on by default those tracks that a particular user requires (e.g. because they are blind and have preset the browser accordingly).
  • the “type” attribute allows specification of the actual time-aligned text format being used in this instance; again, it will allow the browser to determine whether it is able to decode the file and thus make it available through an interface or not.
  • the “src” attribute obviously points to the time-aligned text resource. This could be a file, a script that extracts data from a database, or even a web service that dynamically creates the data based on some input.

This proposal provides for a lot of flexibility and is somewhat independent of the media file format, while still enabling the Web browser to deal with the text (as long as it can decode it). Also note that this is not meant as the only way in which time-aligned text would be delivered to the Web browser – we are continuing to investigate how to embed text inside Ogg as a more persistent means of keeping your text with your media.

Of course you are now aching to see this in action – and this is where the awesomeness starts. There are already three implementations.

First, Jan Gerber independently thought out a way to provide support for srt files that would be conformant with the existing HTML5 tags. His solution is at http://v2v.cc/~j/jquery.srt/. He is using javascript to load and parse the srt file and map it into HTML and thus onto the screen. Jan’s syntax looks like this:


<script type="text/javascript" src="jquery.js"></script>
<script type="text/javascript" src="jquery.srt.js"></script>

<video src="http://example.com/video.ogv" id="video" controls>
</video>
<div class="srt"
data-video="video"
data-srt="http://example.com/video.srt" />

Then, Michael Dale decided to use my suggested HTML5 syntax and add it to mv_embed. The example can be seen here – it’s the bottom one of the two videos. You will need to click on the “CC” button on the player and click on “select transcripts” to see the different subtitles in English and Spanish. If you click on a text element, the video will play from that offset. Michael’s syntax looks like this:


<video src="sample_fish.ogg" poster="sample_fish.jpg" duration="26">
<text category="SUB" lang="en" type="text/x-srt" default="true"
title="english SRT subtitles" src="sample_fish_text_en.srt">
</text>
<text category="SUB" lang="es" type="text/x-srt"
title="spanish SRT subtitles" src="sample_fish_text_es.srt">
</text>
</video>

Then, after a little conversation with the W3C Timed Text working group, Philippe Le Hegaret extended the current DFXP test suite to demonstrate use of the proposed syntax with DFXP and Ogg video inside the browser. To see the result, you’ll need Firefox 3.1. If you select the “HTML5 DFXP player prototype” as test player, you can click on the tests on the left and it will load the DFXP content. Philippe actually adapted Jan’s javascript file for this. And his syntax looks like this:


<video src="example.ogv" id="video" controls>
<text lang='en' type="application/ttaf+xml" src="testsuite/Content/Br001.xml"></text>
</video>

The cool thing about these implementations is that they all work by mapping the time-aligned text to HTML – and for DFXP the styling attributes are mapped to CSS. In this way, the data can be made part of the browser window and displayed through traditional means.
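
As a rough sketch of what that mapping can look like, assuming the subtitle file has already been parsed into a list of cues with start time, end time and text (which is roughly what Jan’s javascript does), the cues can be rendered into an ordinary HTML element and updated from the video’s timeupdate event:

<script type="text/javascript">
// Minimal sketch: display parsed cues in a <div id="subtitle"> placed
// below a <video id="v">. The cue list here is a hand-written placeholder;
// a real implementation would parse it from the srt or DFXP file.
var cues = [
  { start: 0.0, end: 4.0, text: "First subtitle line" },
  { start: 4.5, end: 9.0, text: "Second subtitle line" }
];
window.onload = function() {
  var video = document.getElementById("v");
  var display = document.getElementById("subtitle");
  video.addEventListener("timeupdate", function() {
    var now = video.currentTime;
    var text = "";
    for (var i = 0; i < cues.length; i++) {
      if (now >= cues[i].start && now <= cues[i].end) {
        text = cues[i].text;
        break;
      }
    }
    display.innerHTML = text;
  }, false);
};
</script>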

For time-aligned text that is multiplexed into a media file, we just have to do the same and we will be able to achieve the same functionality. Video accessibility in HTML5 – we’re getting there!

Resurrecting old Maaate code

Have you ever been haunted by an old open source package that you wrote once, published, and then forgot about?

The BSD community has just reminded me of the MPEG audio analysis toolkit Maaate that I wrote at CSIRO when I first came to Australia and that was then published through the CSIRO Mathematical and Information Sciences division.

The BSD guys were going to remove it from their repositories because, after I left CSIRO more than 2 years ago, CSIRO took down the project pages and the code, so there was no active project page available any longer. I’m glad they contacted me before they did so.

Since it is an open source project, I have now resurrected the old pages at Sourceforge. They are available from http://maaate.sourceforge.net/. I have re-instated the relevant web pages and documentation and updated all the links. I discovered that we did some cool things back then and that it may indeed be worth preserving for the future. I expect Sourceforge is up to the task.

Thanks very much, BSD community and welcome back, MPEG Maaate!

FOMS submission deadline extended

The Foundations of Open Media Software workshop has just extended its deadline for the submission of registration requests with travel sponsorship.

FOMS addresses hot topics – such as the new <video> and <audio> tags in HTML5, the uptake and development of open video codecs like Ogg Theora, BBC’s Dirac and SUN’s OMS codec and their native support in Firefox, open audio & media frameworks and players such as gstreamer, ffmpeg, vlc or xine, or the standardisation of audio APIs across platforms. Further topics are listed in the CFP.

In previous years, FOMS has stimulated heated technical discussions and amazing new developments in open media software, such as the creation of libsydneyaudio, the uptake of liboggplay, the creation of Xiph ROE, or the creation of the new Ogg CELT codec.

Video proceedings of previous years’ workshops are here. There are also community goals that were set in 2008 and 2007 and that provide ongoing challenges.

You should definitely attend, if you are an open media software hacker. This is a chance to get to know others in the community personally and clear up those long-standing issues that need a face-to-face to get solved. Also, it’s a great social event not to be missed. As a bonus, you can spend the week after FOMS at LCA, the world-famous Australian Linux hackers conference, and deepen your relationships in the community. Come and join in the fun in January 2009, Summer in Hobart, Tasmania.

Date & Time selection plugins for Rails

I just tried comparing the different date & time selection plugins that are available for Rails and because the wiki page at rubyonrails.org is dated and I cannot edit it, I decided to write this brief blog post to save you the 20 min it took me to locate the currently available plugins and their demos:

So now you can compare and pick the one that suits you best. Enjoy! 🙂

UPDATE: John just told me that there is a list of other plugins at the bottom of the CalendarDateSelect plugin page – doh!

W3C Video in the Web activity

The W3C has just released a set of proposed charters for a new W3C Video in the Web activity with a request for feedback.

The following working groups are proposed:

  1. Timed Text Working Group
  2. Media Fragments Working Group
  3. Media Annotations Working Group

Two further ones under investigation are:

  1. Codecs and containers
  2. Best practices for video and audio content

It is worth checking out the site and the three different working groups they are planning to create. Sure – the codec discussion is a big one. But it’s not as big as some of the other activities when it comes to new functionality for video on the Web.

to_bool rails plugin

In our Rails application we do a lot of string conversions to other data types, including Boolean. Unfortunately, ruby does not provide a to_bool conversion method (which I find rather strange, to be honest).

Based on a blog post by Chris Roos from October 2006, we developed a rails plugin that enables the “to_bool” conversion.

“to_bool” works on the strings “true” and “false” in any capitalisation, on numbers, and on nil. Other strings raise an ArgumentError.

Examples are as follows:

'true'.to_bool #-> true
'TrUe'.to_bool #-> true
true.to_bool #-> true
1.to_bool #-> true
5.to_bool #-> true
-9.to_bool #-> true
nil.to_bool #-> false
'false'.to_bool #-> false
'FaLsE'.to_bool #-> false
false.to_bool #-> false
0.to_bool #-> false
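
If you are curious how such a conversion can be implemented, the following is a minimal sketch along the lines of what the plugin does (an illustration only, not the exact plugin code):

# Sketch of the to_bool behaviour described above, implemented by
# re-opening the relevant core classes.
class String
  def to_bool
    return true  if self =~ /\Atrue\z/i
    return false if self =~ /\Afalse\z/i
    raise ArgumentError, "invalid value for Boolean: \"#{self}\""
  end
end

class Numeric
  def to_bool
    self != 0   # any non-zero number counts as true
  end
end

class NilClass
  def to_bool
    false
  end
end

class TrueClass
  def to_bool
    true
  end
end

class FalseClass
  def to_bool
    false
  end
end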

You can find the plugin here as a tarball. To install it, simply decompress the to_bool directory into your vendor/plugins directory.

Google Developer Day in Sydney

Today’s Google Developer day in Sydney was quite impressive. There were about 500 developers (and other random folks) there, curious to learn more about the services Google offers. With three parallel breakout rooms for the talks / code labs, there was plenty to choose from.

The introduction was well done, providing a quick overview of all the services and APIs that were the topic of the day – enough to understand what they are and tempt you to attend the in-depth talks.

I attended the Google App Engine talk first – not because I am a fan of python, but because I have an AppEngine account and my son Ben codes in python. I’d really like to play with AppEngine and get Ben to develop something useful (and learn some more python on the side myself). The talk gave a great introduction, which really enthused me. They built a little shout-out app on the fly and published it within the first 10 min. Now I am collecting ideas for Ben to code up – if you have a neat little one, leave me a note.

I then went on to attend the YouTube talk. It was touted as a 201 presentation, but in the end just provided a cursory overview of the YouTube API. It was a good overview, but since we’ve been working with the API for a long time at Vquence, there was nothing new for me.

At the end of the talk, a developer asked YouTube to provide an API to access the new annotation feature. Since that is still in beta, they will wait to harden the technology for a bit before introducing an API. I suggested they look at CMML as the XML API. I explained that it can hold any annotation at any time point in any language. The on-screen placement is not currently covered by a tag in CMML, but could be added to the meta tags of the clips. I also suggested that if they found anything to improve on CMML, that would be possible since it’s not a finalised standard. I really hope they will check it out.

After lunch, I attended the two sessions about OpenSocial. I was considering using it for the new Vquence metrics site to do the widget layout. I quickly understood that this is not about layouts, but really about social applications. It would be cool if Ben thought up a social application that he could implement in python and OpenSocial and host on AppEngine. Something that he and his friends could share, maybe? Any ideas for an 11-year-old who learnt python on the OLPC?

At the end of the day, I was curious to learn a bit about Android before having to head home. But I only had a few minutes, and the speaker had a slow start (repeating his slides from the introduction session), so I quickly decided to leave and make sure I was at after-school care on time!

Overall a worthwhile day – I met some friends, made some contacts, got to ask some questions, and had an awesome lunch with fresh sushi, hmmm. Google really knows how to spoil their developers!