Category Archives: open codecs

Video Streaming from Linux.conf.au

You have probably heard it already: Linux.conf.au is live streaming its video in a proprietary Microsoft format.

Fortunately, there is now a re-broadcast that you can get in an open format from http://stream.v2v.cc:8000/ . It comes from a server in Europe, but relies on transcoding here in New Zealand, so it may not be completely reliable.

UPDATE: A second server is now also available from the US at http://repeater.xiph.org:8000/.

Today, the down-under open source / Linux conference linux.conf.au in Wellington started with the announcement that every talk and mini-conf will be live streamed to the Internet and later published online. That’s an awesome achievement!

However, minutes after the announcement, I was very disappointed to find out that the streams are actually provided in a proprietary format and through a proprietary streaming protocol: a Microsoft streaming service that provides Windows media streams.

Why stream an open source conference in a proprietary format with proprietary software? If we cannot use our own technologies for our own conferences, how will we get the rest of the world to use them?

I must say, I am personally embarrassed, because I was part of several audio/video teams of previous LCAs that have managed to record and stream content in open formats and with open media software. I would have helped get this going, but wasn’t aware of the situation.

I am also the main organiser of the FOMS Workshop (Foundations of Open Media Software) that ran the week before LCA and brought some of the core programmers in open media software to Wellington, most of whom are also attending LCA. We have the brains here and should be able to get this going.

Fortunately, the published content will be made available in Ogg Theora/Vorbis. So, it’s only the publicly available stream that I am concerned about.

Speaking with the organisers, I can somewhat understand how this came to be. They took the “easy” way out and delegated the video work to an external company. Even though this company is an expert in open source and networking, its media streaming customers all use Flash or Windows Media software, which are the current de-facto standards and provide extra features such as DRM. It seems that, apart from linux.conf.au, they had not yet received any requests for Ogg Theora/Vorbis streaming. Their existing infrastructure includes CDN distribution, and CDN providers typically don’t provide Ogg Theora/Vorbis support or Icecast streaming.

So, this is really a problem rooted in setting up streaming through a professional service rather than through the community. At other events, streaming was set up by bringing together a group of volunteers who provided streaming reflectors for free. In this way, a community-created CDN is built that can deal with the streams. That there are as yet no professional CDN providers offering Icecast support is a sign that there is a gap in the market.

But phear not – a few of the FOMS folk got together to fix the situation.

It involved setting up an Icecast stream for each room’s video stream. Since there is no access to the raw video streams, the video has to be transcoded from the proprietary codecs to the open Ogg Theora/Vorbis format.

To do this legally, the codec libraries had to be purchased from Fluendo – a whopping EUR 28, which covers all the necessary patent licenses. The glue that turns the MMS streams into Icecast streams is a GStreamer pipeline, which I will leave others to describe in detail.
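
For the curious, the idea looks roughly like this. This is a minimal sketch rather than the exact pipeline used at the conference – it assumes a GStreamer installation with the Fluendo Windows Media decoders and an Icecast server on localhost, and the MMS URL, mount point and password are placeholders:

   gst-launch mmssrc location=mms://wms.example.com/room1 ! decodebin name=d \
     d. ! queue ! ffmpegcolorspace ! theoraenc bitrate=300 ! mux. \
     d. ! queue ! audioconvert ! audioresample ! vorbisenc bitrate=48000 ! mux. \
     oggmux name=mux ! shout2send ip=127.0.0.1 port=8000 \
       password=hackme mount=/room1.ogv

decodebin decodes the Windows Media audio and video, the two branches re-encode them to Theora and Vorbis, and oggmux hands the multiplexed stream to shout2send, which feeds the Icecast mount point.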

Now that we have all the streams from the conference available as Ogg Theora/Vorbis, we can also publish them in HTML5 video elements. Check out this Web page, which has all the video streams together on a single page. Note that the connections may be a bit dodgy and some drop-outs may occur.
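
Embedding one of these streams in a page is then trivial. For example (the mount point here is a placeholder – substitute one of the actual stream URLs):

   <video src="http://stream.v2v.cc:8000/room1.ogv" controls autoplay>
     Sorry – your browser cannot play Ogg Theora in an HTML5 video element.
   </video>

Browsers with native Ogg support, such as Firefox 3.5 and newer, play the stream directly; everything else displays the fallback content.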

Further, let me recommend the Multimedia Miniconf at linux.conf.au, which will take place tomorrow, Tuesday 19th January. The Miniconf has decided to add a talk about “How to stream your conference with open codecs” to help educate potential future conference organisers and point out the software that helps solve these issues.

UPDATE: I should have stated that I didn’t actually do any of the technical work: it was all done by Ralph Giles, Jan Gerber, and Jan Schmidt.

HTML5 Video element discussions at TPAC meetings

Last week’s TPAC (2009 W3C Technical Plenary / Advisory Committee) meetings were my second TPAC, and I found myself becoming highly involved with the progress on accessibility for the HTML5 video element. Two meetings were of particularly high relevance: the Video Accessibility workshop and Friday’s HTML5 breakout group on the video element.

HTML5 Video Accessibility Workshop

The week started on Sunday with the “HTML5 Video Accessibility workshop” at Stanford University, organised by John Foliot and Dave Singer. They brought together a substantial number of people all representing a variety of interest groups. Everyone got their chance to present their viewpoint – check out the minutes of the meeting for a complete transcript.

The list of people and their discussion topics were as follows:

Accessibility Experts

  • Janina Sajka, chair of WAI Protocols and Formats: represented the vision-impaired community and expressed requirements for a deeply controllable access interface to audio-visual content, preferably in a structured manner similar to DAISY.
  • Sally Cain, RNIB, Member of W3C PF group: expressed a deep need for audio descriptions, which are often overlooked besides captions.
  • Ken Harrenstien, Google: has worked on captioning support for video.google.com and YouTube and shared his experiences, e.g. http://www.youtube.com/watch?v=QRS8MkLhQmM, as well as work on automated caption translation.
  • Victor Tsaran, Yahoo! Accessibility Manager: joined for a short time out of interest.

Practitioners

  • John Foliot, professor at Stanford Uni: showed a captioning service that he set up at Stanford University to enable lecturers to publish more accessible video – it uses humans for transcription but automated tools for time-alignment, and provides a Web interface for the staff.
  • Matt May, Adobe: shared what Adobe learnt about accessibility in Flash – in particular that an in-stream-only approach to captions is naive and that external captions are much more flexible, extensible, and fit into current workflows.
  • Frank Olivier, Microsoft: attended to listen and learn.

Technologists

  • Pierre-Antoine Champin from Liris (France), who was not able to attend, sent a video about their research work on media accessibility using automatic and manual annotation.
  • Hironobu Takagi, IBM Labs Tokyo, general chair for W4A: demonstrated a text-based audio description system combined with a high-quality, almost human-sounding speech synthesizer.
  • Dick Bulterman, researcher at CWI in Amsterdam, co-chair of SYMM (the group at W3C doing SMIL): reported on 14 years of experience with multimedia presentations and SMIL (slides) and on the need to make temporal and spatial synchronisation explicit in order to support complex use cases.
  • Joakim Söderberg, Ericsson Research.

FOMS and LCA Multimedia Miniconf

If you haven’t proposed a presentation yet, go ahead and register yourself for:

FOMS (Foundations of Open Media Software workshop) at
http://www.foms-workshop.org/foms2010/pmwiki.php/Main/CFP

LCA Multimedia Miniconf at
http://www.annodex.org/events/lca2010_mmm/pmwiki.php/Main/CallForP

It’s already November and there’s only Christmas between now and the conferences!

I’m personally hoping for many discussions about HTML5 <video> and <audio>, including what to do with multitrack files, with cue ranges, and with captions. These should also be relevant to other open media frameworks – e.g. how should we all handle sign language tracks in multitrack files?

But there are heaps of other topics to discuss, and anyone doing any work with open media software will find fruitful discussions at FOMS.

Cortado 0.5.0 released

Cortado is a Java applet that provides Ogg Theora/Vorbis support to Web publishers. It’s particularly useful to publishers who want to use Ogg Theora/Vorbis in browsers that do not yet support the HTML5 video element with Ogg.
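
The typical deployment pattern is to put the applet inside the video element as fallback content. A sketch, with file names and dimensions as placeholders:

   <video src="video.ogv" controls width="352" height="288">
     <!-- Browsers that do not implement the HTML5 video element
          render its contents instead, so they get the Java applet. -->
     <applet code="com.fluendo.player.Cortado.class"
             archive="cortado.jar" width="352" height="288">
       <param name="url" value="video.ogv"/>
     </applet>
   </video>

This way, a single page serves both HTML5-capable browsers and older ones with a Java plugin.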

Cortado was originally developed by Fluendo SA under an LGPL license and contains a re-implementation of Theora and Vorbis in Java (jheora and jcraft). After a few years of low maintenance, the Wikimedia Foundation took it upon themselves to dust off the code for use in Wikimedia Commons, where only unencumbered open video formats are acceptable.

As Ralph states in his announcement of the new release: earlier this year, Xiph.org took over maintenance of the Cortado Java applet to help concentrate interest and expertise on this important component of the free media codec infrastructure. Accordingly, the official website for Cortado is now part of the Xiph.org site. [If somebody could update the Wikipedia article – that would be awesome!]

So, I am very happy to point to the first Cortado release in three years. Source and sample builds are available from the Xiph.org download site.

Ralph writes further:

The new version is tagged 0.5.0 to indicate both the change in hosting and the significant new support for files from the new libtheora encoder implementation and Kate embedded subtitles.

In particular, 0.5.0 has:

  • Support for files encoded with Theora 1.1
  • Faster YUV to RGB conversion with better results
  • Basic support for embedded Ogg Kate streams
  • Seeking fixed for files with an Ogg Skeleton track
  • Maintained compatibility with the Microsoft VM

This is an awesome example of the power of open source and what a group of people can achieve. Congratulations to everyone at Xiph.org, the Wikimedia Foundation, and anyone else who contributed to the release!

Web Directions South 2009 talk on HTML5 video

Yesterday, I gave a talk on the HTML5 video element at Web Directions South.

The title was “Taking HTML5 <video> a step further” and the abstract goes as follows:

This talk focuses on the efforts undertaken by the W3C to improve the new HTML5 media elements with mechanisms that allow people with disabilities to access multimedia content, including audio and video. Such developments are also useful beyond accessibility needs and will lead to a general improvement of the usability of media, making media discoverable and generally a prime citizen on the Web.

Silvia will discuss what is currently technically possible with the HTML5 media elements, and what is still missing. She will describe a general framework of accessibility for HTML5 media elements and present her work for the Mozilla Corporation that includes captions, subtitles, textual audio annotations, timed metadata, and other time-aligned text with the HTML5 media elements. Silvia will also discuss work of the W3C Media Fragments group to further enhance video usability and accessibility by making it possible to directly address temporal offsets in video, as well as spatial areas and tracks.
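
To give a flavour of the Media Fragments work: the group’s draft URI syntax makes it possible to address pieces of a media resource directly in the URI, roughly like this (the exact syntax was still under discussion at the time):

   http://example.com/video.ogv#t=10,20                the section from 10s to 20s
   http://example.com/video.ogv#xywh=160,120,320,240   a spatial region of the frames
   http://example.com/video.ogv#track=audio            a single track of the resource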

Here are my slides:

Download the pdf from here.

There was also a video recording and I will add that here as soon as it is published.

UPDATE:
The video is available on Tinyvid:

I’m not going to try and upload this 50-minute video to YouTube – with its 10 min limit, I won’t get very far.

WebJam 2009 talk on video accessibility

On Wednesday evening I gave a 3 min presentation on video accessibility in HTML5 at the WebJam in Sydney. I used a video as my presentation medium and explained things while playing it back. Here is the video, without my oral descriptions, but probably still useful to some. Note in particular how you can experience the issues of deaf and hard-of-hearing (HoH), blind and vision-impaired (VI), and foreign-language users:

The Ogg version is here.

New proposal for captions and other timed text for HTML5

The first specification for how to include captions, subtitles, lyrics, and similar time-aligned text with HTML5 media elements has received a lot of feedback – probably because there are several demos available.

The feedback has encouraged me to develop a new specification that addresses the concerns raised and makes it easier to associate out-of-band time-aligned text (i.e. subtitles stored in files separate from the video/audio file). A simple example of the new specification using srt files is this:

<video src="video.ogv" controls>
   <itextlist category="CC">
     <itext src="caption_en.srt" lang="en"/>
     <itext src="caption_de.srt" lang="de"/>
     <itext src="caption_fr.srt" lang="fr"/>
     <itext src="caption_ja.srt" lang="ja"/>
   </itextlist>
</video>

By default, the charset of the itext file is UTF-8, and the default format is text/srt (incidentally, a MIME type that still needs to be registered). Also by default, the browser is expected to select for display the track that matches the browser’s default language setting. This has proven to work well in previous experiments.
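
In the JavaScript demo implementations, this selection amounts to a simple loop. A simplified sketch of the idea – the element names follow the itext proposal, while the function name and matching rules are illustrative only:

   <script type="text/javascript">
     // Return the itext track whose lang matches the browser's
     // language setting, falling back to the first track.
     function selectDefaultTrack(itextlist) {
       var lang = (navigator.language || "en").substring(0, 2);
       var tracks = itextlist.getElementsByTagName("itext");
       for (var i = 0; i < tracks.length; i++) {
         if (tracks[i].getAttribute("lang") == lang) {
           return tracks[i];
         }
       }
       return tracks[0];
     }
   </script>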

Check out the new itext specification, read on to get an introduction to what has changed, and leave me your feedback if you can!

The itextlist element
You will have noticed that, in comparison to the previous specification, this specification contains a grouping element called “itextlist”. This is necessary because we have to distinguish between time-aligned text tracks that are alternatives to each other and those that are additional, i.e. can be displayed at the same time. In the first specification this was done by inspecting each itext element’s category and grouping them together, but that resulted in much repetition and unreadable specifications.

Also, it was not clear which itext elements were to be displayed in the same region and which in different ones. Now, their styling can be controlled uniformly.

The final advantage is that the association of callbacks for entering and leaving text segments, as extracted from the itext elements, can now be controlled from the itextlist element in a uniform manner.
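
To illustrate – the event names and element ids below are made up for this example, not taken from the specification – the point is that a single pair of handlers on the itextlist can serve all of its tracks:

   <script type="text/javascript">
     // Hypothetical wiring: one handler pair on the grouping element
     // covers whichever of its itext tracks is currently active.
     var list = document.getElementById("cc-tracks");
     list.addEventListener("enter", function (evt) {
       // evt.target: the text segment that just became active
       document.getElementById("caption-area").textContent = evt.target.textContent;
     }, false);
     list.addEventListener("leave", function (evt) {
       document.getElementById("caption-area").textContent = "";
     }, false);
   </script>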

This change also makes it simple for a parser to determine the structure of the menu that is created and included in the controls element of the audio or video element.

Incidentally, a patch for Firefox already exists that makes this part of the browser. It does not yet support this new itext specification, but here is a screenshot that Felipe Corrêa da Silva Sanches provided:

W3C Workshop/Barcamp on HTML5 Video Accessibility

Web accessibility veteran John Foliot of Stanford University and Apple’s QuickTime EcoSystem Manager Dave Singer are organising a W3C Workshop/Barcamp on Video Accessibility on the Sunday before the W3C’s annual combined technical plenary meeting TPAC.

The workshop will take place on 1st November at Stanford University – see the details on the Workshop page.

If you read the announcement, you will see that this is about understanding all the issues around video (and audio) accessibility, understanding existing approaches, and trying to find solutions for HTML5 that all browser vendors will be able to support.

The workshop is run under the W3C Hypertext Coordination Group and registration is required.

W3C membership is not required in order to participate in the gathering. However, you are required to contribute your knowledge actively and constructively to the Workshop. You must come prepared to present on one of the questions in this document to help inform the discussion and make progress on proposing solutions.

I am very excited about this workshop because I think it is high time to move things forward.

If I can get my travel sorted, I will present my results on the video accessibility work that I did for Mozilla. It will cover both out-of-band accessibility data for video elements and in-line accessibility data, as well as how to expose a common API for both in the Web browser. I have recently experimented with encoding srt and lrc files in Ogg and displaying them in Firefox using the patches that OggK and Felipe contributed to Firefox. More about this soon.

Tracking Status of Video Accessibility Work

Just a brief note to let everyone know about a new wiki page I created for my Mozilla work on video accessibility, where I want to track the status and outcomes of the work. You can find it at https://wiki.mozilla.org/Accessibility/Video_a11y_Aug09. It lists the following sections: Test File Collection, Specifications, Demo implementations using JavaScript, Related open bugs in Mozilla, and Publications.