Quick Press: the awesome guys from FFmpeg have made an official release this week. The days of pain for compiling and packaging FFmpeg have come to an end. FFmpeg is being used in many Web video sites to provide backend transcoding – AFAIK that includes YouTube. I use FFmpeg for all my transcoding needs and it has never let me down. Open media software for the win!
Category Archives: Digital Media
Progress on captions for HTML5 video
Paul Rouget this week published another example implementation for using srt with HTML5 video with a javascript library. This is at least the fourth javascript implementation that I know of for attaching srt subtitles to the video element.
It is great to see such a huge need for this. At the same time, I am worried about the number of incompatible implementations of this feature. It will prevent search engines from realising which text relates to and describes a particular video. It will also prevent accessibility technology such as screen readers or braille devices from realising that there is text that needs to be rendered.
A standard means of associating srt (or other format) subtitle files with the video tag is really necessary. So, where are we at with this?
Recently, Greg Millam from Google posted a proposal to the WHATWG that shares a lot of elements with the proposal previously discussed between Mozilla, Xiph, and Opera, the current state of which is summarised in the Mozilla wiki. No browser implementation has been made yet, but initial implementations in javascript exist. I think that we will ultimately come out with a harmonised solution between the browser vendors. It just needs implementation work and continuous improvement.
At the same time, in-band captions that come multiplexed within the Ogg file are also being progressed. At Xiph we are now focusing on using Ogg Kate for these purposes – it really doesn’t make much sense to invent another codec when Ogg Kate is already so close to solving most problems. So, between the developer of Ogg Kate and myself, we are preparing a Google Summer of Code project that should see an implementation for Firefox 3.1 that is capable of extracting the text from an Ogg file that has a Kate track and displaying that track as though it were an srt file. If you are interested, shoot me an email!
UPDATE: Firefox 3.1 is apparently now called Firefox 3.5 – sorry guys. 🙂
ANOTHER UPDATE: My post seemed to imply that Firefox 3.5 will have Ogg Kate support. This is not the case. There is a patch for Firefox and liboggplay that adds Ogg Kate support to Firefox, and this patch will be the basis of the Summer of Code project. The student will then work mostly on implementing a comprehensive javascript library to display Ogg Kate encoded time-aligned text (read: captions, karaoke etc.) in the Web browser. This is a proof-of-concept and a first step towards standardising the handling of time-aligned text in Web browsers that support the HTML5 video tag.
Professional Tool support for open media codecs
Michael Dale from Metavid has posted an article on why we are about to hit the tipping point for professional video producers to move to open media codecs. His statement is that it’s not just because the H.264 licensing grace period is about to end, but has a lot to do with the support that open media codecs are increasingly seeing on the Web, where the next big professional video market will happen. I totally agree.
The increasing number of open tools on the Web for open codecs was stimulated by the HTML5 <video> element and is based on years of effort that have gone into Annodex and applications using Annodex technology such as Metavid. There is now Firefox 3.1’s native support for Ogg Theora/Vorbis through liboggplay, there is the mod_annodex Apache plugin and the oggzchop CGI tool to serve time ranges of Ogg media over HTTP from Web servers, and there is the new firefogg Firefox plugin for Ogg Theora/Vorbis encoding. Together, these close the tool-chain from encoding to publishing to viewing.
Native editing of Ogg Theora/Vorbis video is still a challenge, but no professional video producer will want to move away from their favorite video editing tool anyway, so it is a matter of having an export function included in these professional editors. While such export functions will take some time to emerge in these proprietary editors, ffmpeg2theora and similar transcoding tools will be perfectly sufficient to fulfill these needs in the meantime.
If you want to see why open source codecs and open video technology make such a difference, just go and check out Metavid, the best software around for wiki-style editing of time-aligned annotations for long-form video. I look forward to all the cool new applications that will emerge with open media software on the Web – applications that are not possible with proprietary video technology because of their lack of flexibility, interoperability, and adaptability.
Website madness of marketing agencies
I have spent a lot of time recently researching Sydney-based agencies to invite to the upcoming Launch of our Vquence VQmetrics service. This involved finding their websites, finding out about their target business (do they do online video?), finding a relevant contact, and emailing an invitation to them.
I am close to institutional confinement!
I do understand that agencies need to show off their creativity on their Website. The result of this is that most agency Websites are completely written in Flash. Fortunately I have the latest version of Flash installed, so I can load them all. But my Web browser and MacBook do not deal well with having more than about 5 tabs open with Flash content – my machine almost grinds to a halt. So, there goes the idea of opening multiple tabs at the same time while waiting for the lengthy Flash intros of the sites to load…
Then, once the pages are loaded, it is always a surprise to see what the agency has come up with. At the beginning of the exercise it was a surprise. Later it became a nuisance. Now, I am utterly terrified before opening another agency Website. Will it break my browser? Will it start playing a video? Will it start playing music so loud that it blasts off my ears? Will I feel really stupid because I cannot navigate the site? Will I be able to locate the “Contact Us” section? Will they have bothered to publish an email address or do I have to fill in a stupid contact form that I know nobody will look at? Will the contact email work or just bounce?
It almost feels like the creation of the Website is a competition between the agencies as to who can create the maddest, most unusual, and most unusable Website.
Please, please! Can I just have a simple, usable site with obvious navigation, a simple and fast loading list of reference work, and a list of key people working at the agency with their email contacts?
Oh, and Mumbrella has just published a post that gives me scientific proof that this is a conspiracy against me by the agencies! No, stop that – I am not ready to be locked up yet!
FOMS 2009 Awesomeness
I am a slacker, I know – sorry. FOMS happened almost 4 weeks ago and I have neither blogged about it nor uploaded the videos.
So, you will have to take my word for it for the moment: it was a totally awesome and effective workshop that led to a lot of work being started during LCA and having an impact far beyond FOMS.
Every year, the discussions we have at FOMS are captured in so-called community goals. These are the activities that we see as top priorities for improving the use and uptake of open media software.
You can read up on our 2009 community goals here in detail. They fall into the following 10 sections:
- Patent and legal issues around codecs
- Ogg in Firefox: liboggplay
- Authoring tools for open media codecs
- Server Technology for open media
- Time-aligned text and accessibility challenges
- FFmpeg challenges
- GStreamer challenges
- Dirac challenges
- Jack challenges
- OpenMAX challenges
In this post, I’d just like to point out some cool activities that have already emerged since FOMS.
I’ve already written on the patents issue and how OpenMediaNow will hopefully be able to make a difference here.
Liboggplay provides a simple API for decoding and playback of Ogg codecs and is therefore in use for baseline Ogg Theora support in Firefox 3.1. A bunch of bugs were found around it, and the opportunity of having Shane Stephens, its original developer, together with Viktor Gal, its new maintainer, in the same room made for a whole lot of bug fixes. The $100K Mozilla grant towards the work of Xiph developers that was announced at FOMS will further help to mature this and other Xiph software. Conrad Parker, Viktor Gal, and Timothy Terriberry, the Xiph developers who will cut code under this grant, were incidentally all present at FOMS.
The discussion about the need for authoring software support for open media codecs is always a difficult one. We all know that it is important to have usable and graphically attractive authoring tools in order to get adoption. However, looking at reality, it is really difficult to design and implement a GUI authoring tool such as a video editor to a competitive quality. In other areas, it has also taken quite some time to gain good authoring software such as the GIMP or Inkscape. Plus there is the additional need to make it cross-platform. With video, the underlying editing functionality is often missing from media frameworks. Ed Hervey explained how he extended GStreamer with the required subroutines and included them in the GStreamer Python plugin, so now he will be able to focus on user interface work in PiTiVi rather than the underlying video editing functionality.
The authoring discussion led smoothly into the server technology discussion. Robin Garvin explained how he implemented a server-side video editor through EDLs. Michael Dale showed us the latest version of his video editor in the Mediawiki Metavid plugin. And Jan Gerber showed us the Firefogg Firefox plugin for transcoding to Ogg. Web-based tools are certainly the future of video authoring and will make a huge difference in favor of Ogg.
Then there were the accessibility discussions. During FOMS I was in the process of writing up my final report on the Mozilla video accessibility project and it was really important to get input from the FOMS community – in particular from Charles McCathieNevile from Opera, Michael Dale from Metavid/Wikipedia/Archive.org and Jan Gerber. In the end we basically agreed that a lot of work still needs to be done and that a standard way of providing srt support in HTML5, both in-band through Ogg and out-of-band, will be a great step forward, though by far not the final one.
The remaining topics were focused discussions on how to improve support, uptake or functionality of specific tools. Peter Ross took FOMS concerns about FFmpeg to the FFmpeg community and it seems there will be some changes, in particular an upcoming FFmpeg release. Ed Hervey took home a request for new API functions for GStreamer. Anuradha Suraparaju talked with Jan Gerber about support of Dirac in firefogg and with Viktor Gal about support in liboggplay. Further, the idea of libfisheye was born, to provide an abstraction library for Ogg video codecs similar to what libfishsound is for Ogg audio codecs.
As can be seen, there are already some awesome outcomes from FOMS 2009. We are looking forward to a FOMS 2010 in Wellington, New Zealand!
Cool things you can do with theora in Firefox
In support of spreadfirefox.com, OZprod (no, they are not Australian, but French) this week released a video on Dailymotion that shows the cool stuff you can do with a video tag and Theora support in Firefox. It’s super-cool!
You will have to install Firefox 3.1 to try it out for yourself here.
$100K towards Xiph developers
Today, Wikimedia and Mozilla announced a grant provided by the Mozilla Corporation towards maturing the support of Ogg in the Firefox Web browser. I’m happy to have helped in making the proposal become concrete and now we have the following three Xiph developers working on it:
- Viktor Gal – the maintainer of liboggplay
- Conrad Parker – the key developer of multiple Ogg support libraries, in particular liboggz
- Tim Terriberry – the key developer of Ogg Theora
Viktor will work towards stabilising the current Ogg Theora support in Firefox, Conrad will work towards Ogg network seeking, language selection and improved library support, and Tim will include the new Thusnelda Theora encoder improvements into Theora mainstream.
Looking forward to awesome Firefox video technology!
UPDATE – Other posts on this topic:
LCA 2009 talk on video accessibility
During the LCA 2009 Multimedia Miniconf, I gave a talk on video accessibility. Videos have been recorded, but haven’t been published yet. But here are the talk slides:
I basically gave a very brief summary of my analysis of the state of video accessibility online and what should be done. More information can be found on the Mozilla wiki.
The recommendation is to first support the most basic, and possibly the most widely used online, of all subtitle/captioning formats: srt. This will help us explore how to relate out-of-band subtitles to an HTML5 video tag – a proposal for which has been made to the WHATWG and is presented in the slides. It will also help us create Ogg files with embedded subtitles – a means of encapsulation has been proposed in the Xiph wiki.
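To make the "most basic" claim concrete, here is a minimal sketch in Python of what reading srt and finding the text to show at a given playback time involves. It is not taken from any of the proposals or javascript libraries mentioned; the names `Cue`, `parse_srt` and `active_cues` are made up for illustration, and it assumes plain srt without styling tags.

```python
import re
from dataclasses import dataclass

# srt timing lines look like: 00:01:02,500 --> 00:01:05,000
_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Cue:
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


def _seconds(h, m, s, ms):
    """Convert the four captured timestamp fields to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000.0


def parse_srt(data):
    """Parse srt text into a list of Cue objects, skipping malformed blocks."""
    cues = []
    # Cue blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", data.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        # Look for the timing line; the numeric counter before it is ignored.
        for i, line in enumerate(lines):
            match = _TIMING.search(line)
            if match:
                g = match.groups()
                text = "\n".join(lines[i + 1:]).strip()
                cues.append(Cue(_seconds(*g[:4]), _seconds(*g[4:]), text))
                break
    return cues


def active_cues(cues, t):
    """Return the cues that should be on screen at playback time t (seconds)."""
    return [c for c in cues if c.start <= t < c.end]
```

A player would call something like `active_cues` whenever the playback position changes and render the returned text over the video. The point of the sketch is only that the format is small enough to handle in a few dozen lines, which is why it makes a good first target.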
Once we have experience with these, we should move to a richer format that will also allow the creation of other time-aligned text formats, such as ticker text, annotations, karaoke, or lyrics.
Further, there is non-text accessibility data for videos, e.g. sign language recordings or audio annotations. These can also be multiplexed into Ogg through creating secondary video and audio tracks.
Overall, we aim to handle all such accessibility data in a standard way in the Web browser to achieve a uniform experience with text for video and a uniform approach to automating the handling of text for video. The aim is:
- to have a default styling of time-aligned text categories,
- to allow styling override of time-aligned text through CSS,
- to allow the author of a Web page with video to serve a multitude of time-aligned text categories and turn on ones of his/her choice,
- to automatically use the default language and accessibility settings of a Web browser to request appropriate time-aligned text tracks,
- to allow the consumer of a Web page with video to manually select time-aligned text tracks of his/her choice, and
- to do all of this in the same way for out-of-band and in-band time-aligned text.
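The selection rules in this list can be sketched as a small piece of Python. This is only an illustration under stated assumptions: the `TextTrack` fields, the category labels and the `select_tracks` function are hypothetical and not part of any proposal, and the language matching is a simple prefix comparison rather than a full implementation of language-tag matching.

```python
from dataclasses import dataclass


@dataclass
class TextTrack:
    category: str          # e.g. "subtitles", "captions", "karaoke" (assumed labels)
    language: str          # e.g. "en", "fr", "de-AT"
    source: str            # "in-band" or "out-of-band"; deliberately ignored below
    default: bool = False  # True if the page author turned this track on


def select_tracks(tracks, preferred_languages, wanted_categories, manual_choice=None):
    """Pick at most one track per wanted category.

    Priority: the viewer's manual choice, then the browser's language
    preferences in order, then a track the page author marked as default.
    """
    if manual_choice is not None:
        return [manual_choice]

    selected = []
    for category in wanted_categories:
        candidates = [t for t in tracks if t.category == category]
        choice = None
        for lang in preferred_languages:
            primary = lang.split("-")[0].lower()
            # Prefer an exact language match, then fall back to the primary subtag.
            exact = [t for t in candidates if t.language.lower() == lang.lower()]
            loose = [t for t in candidates
                     if t.language.split("-")[0].lower() == primary]
            if exact or loose:
                choice = (exact or loose)[0]
                break
        if choice is None:
            # No language match: use the page author's default, if any.
            choice = next((t for t in candidates if t.default), None)
        if choice is not None:
            selected.append(choice)
    return selected
```

Note that `source` never enters the decision: a track is chosen the same way whether it arrived inside the Ogg file or as a separate file, which is the last aim in the list.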
At the moment, none of this is properly implemented. But we are working on a liboggtext library and are further discussing how to include out-of-band text with the video in the Webpage – e.g. should it go into the Webpage DOM or into a separate browsing context.
If you feel strongly about video a11y, get involved at http://lists.xiph.org/mailman/listinfo/accessibility.
Top 10 commercials for 2008 on YouTube
I spent the last few days doing some nice research for Vquence, where I was able to watch lots of videos on YouTube. Fun job this is! 🙂 The full article is on the Vquence metrics blog.
One of the key things that I’ve put together is a list of top 10 commercials for 2008:
Rank | Video | Views | Added |
1 | Pepsi – SoBe Lifewater Super Bowl 2008 | 3,652,217 | February 02, 2008 |
2 | Cadbury – Gorilla | 3,338,011 | August 31, 2007 |
3 | Nike – Take it to the NEXT LEVEL | 3,184,329 | April 28, 2008 |
4 | Macbook Air | 2,648,717 | January 15, 2008 |
5 | Centraal Beheer Insurance – Gay Adam | 2,512,425 | May 30, 2008 |
6 | Vodafone – Beatbox | 2,380,237 | March 17, 2008 |
7 | E*Trade – Trading Baby | 2,061,818 | February 01, 2008 |
8 | Guitar Hero – Heidi Klum | 1,068,055 | November 03, 2008 |
9 | Bridgestone – Scream | 980,406 | January 30, 2008 |
10 | Bud Light – Will Ferrell | 966,177 | February 04, 2008 |
Favorable mention | OLPC – John Lennon | 527,953 | December 25, 2008 |
Favorable mention | Blendtec – iPhone 3G | 2,711,195 | July 11, 2008 |
Favorable mention | Stride Gum – Where the hell is Matt? | 15,859,204 | June 20, 2008 |
There are many more details over at vquence.com.
Enjoy! And let me know in the comments if you know of any other video ad released in 2008 with a view count in the same ballpark that is an actual TV-style commercial.
NOTE: I just had to change the list, because the SoBe Lifewater Super Bowl ad of 2008 actually came out ahead. It’s difficult to discover an ad that has neither ad nor commercial in its annotations!
OSDC 2008 talks
The “Open Source Developer Conference” 2008 took place in Sydney from 2nd to 5th December. I gave two talks at it:
As requested by the organisers, I just uploaded the slides to Slideshare, which incidentally can now also synchronise audio recordings of your talk to your slides. Here are my slides – even if they don’t actually give you much without the demo:
I had lots of fun giving the talks. The “YouTube” one talks about the Fedora Commons document repository and how we turned it into a video transcoding, keyframing, publication and sharing system. The one on Metavidwiki shows off the video wiki that uses Annodex technology and is in use by Wikipedia. Of course, I also mentioned that open source CMSs now have video extensions. However, they are not video-centric sites in general.
Of all the open source Web video technology, I find Fedora Commons and MetaVidWiki the most exciting. The former is exciting for its ability to archive and publish video and its metadata in a way that integrates with document management. The latter is even more exciting for using Ogg and the open Annodex technologies to create a completely open source system using open codecs, and for being the world’s second video wiki (just after CMMLwiki), but the first one to achieve wide uptake.