Google Summer of Code

If you’re a student and keen to bring more open media technology to the Web, apply for a Google Summer of Code project with Xiph. There are also a few Annodex-style projects in the mix, which bring annotations and metadata to Ogg.

Whether your interest lies with JavaScript, Ruby, PHP, XML, or C doesn’t matter – you will find a project at Xiph to suit your favorite programming language.

Of the proposed projects, my personal favorite is OggPusher – a browser plugin for transcoding video to Theora. Imagine an online service that transcodes video to Ogg Theora without you having to worry about installing all the libraries.

You also have the chance to propose your own project to the Xiph/Annodex guys – you just need to find somebody who is willing to mentor you, so hop onto the IRC channel #xiph on freenode.net and start discussing.

Incidentally, Google provides a financial reward for the successful completion of a project – but don’t let that be your only motivation. If your passion isn’t in it, don’t do a GSoC project. This is about engaging with an open source community whose goals you can identify with. Become involved!

Choosing a Web charting library

At Vquence, until now we have used the Thumbstacks charting library, TSChartlib, for our graphs. It is a simple open source charting library that uses the HTML canvas (with an emulation hack for IE, which lacks native canvas support) to create its graphs.
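
Drawing on the canvas directly is simple enough – here is a toy sketch of how a line graph gets drawn (illustrative only; this is not TSChartlib’s actual API):

    // Toy sketch: drawing a line graph on an HTML canvas.
    // Assumes <canvas id="chart" width="300" height="100"></canvas> in the page.
    var canvas = document.getElementById("chart");
    var ctx = canvas.getContext("2d");

    // Data points as (x, y) pixel coordinates; a real library would
    // scale data values to the canvas size and draw axes and labels.
    var points = [[0, 80], [60, 40], [120, 55], [180, 20], [240, 35], [300, 10]];

    ctx.strokeStyle = "#36c";
    ctx.lineWidth = 2;
    ctx.beginPath();
    ctx.moveTo(points[0][0], points[0][1]);
    for (var i = 1; i < points.length; i++) {
      ctx.lineTo(points[i][0], points[i][1]);
    }
    ctx.stroke();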

Vquence is now getting really serious about charts and graphs, and we were thus looking for a more visually compelling and more flexible alternative. If you do a Google search for “online charting library”, you get a massive number of links to proprietary systems (admittedly, some of them offer the source code if you pay a premium). I will not be listing them here – go find them for yourself. However, the world of decent open source charting libraries is relatively small, so I want to share the outcome of my search.

There is the Open Flash Chart library, which provides charting functionality for Flash or Flex programmers. The charts look rather nice and have overlays on the data points, which is something I sorely missed in TSChartlib.

There is an open source Flex component called FlexLib which only does line charts, IIUC.

There is PlotKit, a Chart and Graph Plotting Library for JavaScript. It supports both HTML canvas and SVG (via the Adobe SVG Viewer or native browser support) and looks quite sexy.

Then there is JFreeChart, a 100% Java chart library that makes it easy for developers to display professional quality charts in their applications. Another Java charting library is JOpenChart. Incidentally, there’s a whole swag of further Java libraries that do charts and graphing. However, we are not too keen on Java for Web technologies.

Outside our area of interest, there are also open source chart libraries in C#, but C#/.NET is not a platform we intend to support, so these were out of the question.

Our choice came down to the “Open Flash Chart” library vs “PlotKit”. Of the two, the Flash library and technology seems more mature, is easier to use, and creates sexier charts. Also, we can sensibly expect all Vquence users to have Flash installed, while we cannot expect the same of SVG. However, I was fascinated by the flexible use of SVG and HTML canvas and will certainly come back to it later, when I expect it to have matured a bit more.

Our choice of Open Flash Chart was further helped along by the existence of a Rails plugin for it. Score!

Of course, I might have totally missed some obvious or better alternatives. So leave me a reply if you think I did!

Australian Startup Carnival – Winners Announced

As mentioned earlier, Vquence took part in the Australian Startup Carnival, and the winners have now been announced.

The feedback we got from the judges is encouraging. It’s great to see that Vquence is indeed providing a useful tool. But we are also aware that the service offering is not complete and needs a lot more technical development.

I’d like to address the concern of one judge that we are dependent on YouTube’s goodwill to keep their access open. This is not the case. For one, YouTube has just in the last few days opened up their API even further, so I don’t see a risk there. More generally, closing off access to content is not what the Web is about – on the contrary: Yahoo is just opening up its search platform, and Tim Berners-Lee’s Semantic Web will enable an even more open exchange of data between different sites. In any case, Vquence does not rely solely on the availability of such data interfaces. That would be dumb. Where we cannot use APIs, RSS feeds, or other data interfaces, we can always parse plain video Web pages, just like Google’s search engine parses Web pages. In short: our life would be harder without open interfaces, but not impossible.

As for the result: Vquence came in 5th out of 28 participants, which is great, in particular since we are currently in a transition phase towards video metrics.

Xiph MIME Types and File Extensions

Late last year at Xiph, we reworked our MIME types and file extensions for Xiph content. The new set avoids using .ogg for everything: Xiph audio files get a .oga (audio/ogg) extension and Xiph video files a .ogv (video/ogg) extension, while .ogx (application/ogg) is used for more generic multiplexed content in Ogg. It’s important to distinguish between audio-only and video files – the codecs inside matter less when it comes to selecting the appropriate application for opening the file.
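
For example, on an Apache Web server the new mappings come down to three mod_mime directives like these (assuming your distribution’s mime.types file doesn’t already ship them):

    # Xiph recommended mappings: audio, video and generic multiplexed Ogg
    AddType audio/ogg .oga
    AddType video/ogg .ogv
    AddType application/ogg .ogx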

Today I read Fabian’s blog entry – one up for Ubuntu for getting behind it: https://bugs.edge.launchpad.net/ubuntu/+bug/201291. Rock!

Extended Blog and Content

I have spent this weekend giving my blog a work-over and extending the set of pages about myself – something I had been putting off for 5 years!

I now have a proper Website for Gingertech and I finally have the list of my publications up.

I have quite an extensive list of publications, so I have been wondering for a while how to publish it in a way that is easy to manage. I have used a nice little WordPress plugin called “List Subpages” to get the hierarchical set of pages onto the site in a nicely structured way. I’m actually missing an RSS feed for these pages, but that’s not a major problem. It will make those people happy who have been bugging me for my publications and to whom my sad reply has always been an extract of my CV.

I still need to add the abstracts of some publications to the pages and complete the links to where you can download them. More work for future weekends. But this is a good start for now and I am happy with it.

Metavidwiki gone public

The revolution is here and now! If you thought you’d seen it all with video technology on the Web, think again.

Michael Dale and Aphid (Abram Stern) have published a plugin for MediaWiki called Metavidwiki which is simply breathtaking.

It provides all of the following features:

  • wiki-style timed annotations, including links to other resources
  • a cool navigation interface from video to annotated clips
  • plain-text search for keywords in the annotations
  • display of search results as video segments related to the keywords, with inline video playback
  • semantic search using speaker and other structured information
  • embedding of full videos or selected clips into e.g. blogs
  • Web authoring of mashups from selected clips of diverse videos
  • embedding of these mashups (represented as XSPF playlists)
  • integration with Miro by providing media RSS feeds

Try it out and be amazed! It should work in any browser – provide feedback to Michael if you discover any issues.

All of Metavidwiki is built using open standards, open APIs, and open source software. This gives us a taste of how far we can take open media technology and how much of a difference it will make to Web video, in comparison to today’s mostly proprietary and non-interoperable Web video applications.

The open source software that Metavidwiki uses is very diverse. It builds on Wikipedia’s MediaWiki, the Xiph Ogg Theora and Vorbis codecs, a standard LAMP stack and AJAX, and the Annodex Apache server extension mod_annodex, and it can provide the annotations as CMML, ROE, or RSS. On the client side it uses the capabilities of your specific Web browser: if you run the latest Firefox with Ogg Theora/Vorbis support compiled in, it will make use of that; if you have a VLC browser plugin installed, it will use that to decode Ogg Theora/Vorbis; the fallback is the Java Cortado player for Ogg Theora/Vorbis.
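
To illustrate the client-side fallback logic, here is a hypothetical sketch of such a decision chain (not Metavidwiki’s actual code):

    // Hypothetical sketch of Metavidwiki-style playback selection;
    // not the actual Metavidwiki code.
    function selectPlaybackStrategy() {
      // 1. Native Ogg Theora/Vorbis support, e.g. Firefox builds with
      //    the <video> element and Ogg decoding compiled in.
      var video = document.createElement("video");
      if (video.canPlayType &&
          video.canPlayType('video/ogg; codecs="theora, vorbis"') !== "") {
        return "native";
      }
      // 2. A VLC browser plugin, found by scanning the plugin list.
      for (var i = 0; i < navigator.plugins.length; i++) {
        if (navigator.plugins[i].name.indexOf("VLC") !== -1) {
          return "vlc-plugin";
        }
      }
      // 3. Fall back to the Java Cortado applet.
      return "cortado-applet";
    }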

Now just imagine for a minute the type of applications we will be able to build with open video APIs, interchangeable video annotation formats, and direct addressing of temporal and spatial fragments of media across sites. Finally, video and audio will be able to become a key part of the Semantic Web picture that Tim Berners-Lee is painting – a picture of open and machine-readable information about any and all content on the Web. We certainly live in exciting times!!!

The nature of CMML

Today, for the millionth time, I had to listen to a statement along the following lines: “CMML technology is not ideal for media annotations, because the metadata is embedded with the object rather than separate”.

Once and for all, let me shout it out: THIS IS UTTER BULLSHIT!

I am so sick of hearing this statement from people who criticise CMML from a position of complete lack of understanding. So, let me put it straight.

While it is true that CMML has the potential to be multiplexed as a form of timed text inside a media file, the true nature of CMML is that it is versatile and by no means restricted to this representation.

In fact, the specification document for CMML is quite clearly the specification of an XML document. In that respect, CMML is more like RSS than a timed text format.
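
To give you an idea of what that XML document looks like, here is a small hand-written sketch (abridged and purely illustrative – consult the CMML specification for the authoritative element set):

    <?xml version="1.0" encoding="UTF-8"?>
    <cmml>
      <head>
        <title>Interview on open media</title>
        <meta name="author" content="Jane Doe"/>
      </head>
      <!-- a clip annotates the stream from its start time onwards -->
      <clip id="findings" start="npt:83.5">
        <a href="http://example.com/related-paper.html">related paper</a>
        <desc>The interviewee summarises the key findings.</desc>
      </clip>
    </cmml>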

Further, I’ll let you in on a little secret: CMML can be stored in databases. Yes!! In fact, CMMLWiki, one of the first online media applications implemented using Annodex, uses a MySQL database to store CMML data. The format in which the data is extracted depends on your needs: you can get out the content of single fields, you can export it as an interchangeable XML file (i.e. CMML), or you can multiplex it with the media data into an Annodex file.

The flexibility of CMML is its beauty! It was carefully designed to transform easily between these different representations. It is powerful precisely because it can appear in all these different formats. By no means is this “not ideal”.

Australian Startup Carnival

Vquence was presented today on the “Australian Startup Carnival” site – go check it out.

There are 28 participants in the Startup Carnival, and each is introduced through an interview conducted electronically. The interview questions were varied and detailed, covering technical and system backgrounds as well as our use of open source software.

All the questions you have always wanted to ask about Vquence, and a few more. 😉

UPDATE: The Startup Carnival has announced the prizes and they are amazing – the first prize is an exhibition package at CeBIT. Good luck to us all!!

Vquence: Measuring Internet Video

I have been so busy with my work as CEO of Vquence since the end of last year that I’ve neglected blogging about Vquence. It’s on my list of things to improve on this year.

I get asked frequently what it is that we actually do at Vquence. So here’s an update.

Let me start with a bit of history. At the beginning of 2007, Vquence was totally focused on building a social video aggregation site. The site now lives at http://www.vqslices.com/ and is useful, but it lacks some of the key features we had envisaged for achieving a breakthrough.

As the year grew older and we tried to build a corporate business and an income with our video aggregation, search and publication technology, we discovered that we had something of much higher value than the video handling technology: the aggregated metadata gave us quantitative usage information about videos on social video sites. In addition, our “crawling” algorithms are able to supply up-to-date quantitative data instantly.

In fact, I should not simply call our data acquisition technology a “crawler”, because in the strict sense of the word it is not. Bill Burnham describes in his blog post about SkyGrid the difference between the crawlers of traditional search engines and the newer “flow-based” approach built on RSS/ping servers. At Vquence we are embracing the new flow-based approach and extending it by using REST APIs where available. A limitation of the flow-based approach is that only a very small part of the Web is accessible through RSS and REST APIs. We therefore complement flow-based search with our own new types of data-discovery algorithms (or “crawlers”) as we see fit. In particular, locating the long tail of videos stored on YouTube is a challenge that we have mastered.
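
To illustrate the flow-based pattern with a toy example (entirely hypothetical code – our production system looks nothing like this): instead of walking a link graph, discovery reduces to diffing successive snapshots of a feed and handing any new entries on to a details-fetching step:

    // Hypothetical sketch of flow-based discovery: instead of crawling a
    // link graph, diff successive snapshots of a feed to find new items.
    function newVideoIds(previousSnapshot, currentSnapshot) {
      var seen = {};
      for (var i = 0; i < previousSnapshot.length; i++) {
        seen[previousSnapshot[i].id] = true;
      }
      var fresh = [];
      for (var j = 0; j < currentSnapshot.length; j++) {
        if (!seen[currentSnapshot[j].id]) {
          fresh.push(currentSnapshot[j].id);
        }
      }
      return fresh; // these ids then go to a REST API for their details
    }

    // Example: two snapshots of a feed, newest entries first.
    var before = [{ id: "abc" }, { id: "def" }];
    var after  = [{ id: "xyz" }, { id: "abc" }, { id: "def" }];
    newVideoIds(before, after); // => ["xyz"]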

But I digress…

So we have all this quantitative data about social videos, which we update frequently. With it, we can graph the development of view counts, comment counts, video replies and the like. See, for example, the image below: it compares the aggregate view count of the videos published by the main political parties in Australia during last year’s federal election, over the final 2.5 months before the election in 2007.

Aggregate Viewcount Graph Federal Election Australia
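
Computing each curve of such a graph is conceptually trivial – here is a toy sketch (not our actual code) of aggregating the view counts of one party’s videos for a single sampling date:

    // Toy sketch: aggregate the view counts of a set of videos for one
    // sampling date; not our actual code.
    function aggregateViewCount(videos, date) {
      var total = 0;
      for (var i = 0; i < videos.length; i++) {
        // viewCounts maps a sampling date to the count recorded that day
        total += videos[i].viewCounts[date] || 0;
      }
      return total;
    }

    var partyVideos = [
      { viewCounts: { "2007-11-01": 120000, "2007-11-02": 125000 } },
      { viewCounts: { "2007-11-01":  80000, "2007-11-02":  86000 } }
    ];
    aggregateViewCount(partyVideos, "2007-11-02"); // => 211000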

At first glance you will notice that Labor started far above everyone else. Unfortunately we didn’t start recording view counts early enough to capture why, but we assume it is due to the Kevin07 Website that was launched on 7th August. In the graph, you will notice a first increase in the Coalition’s view count on 2nd September – that’s when Howard published the video for the APEC meeting of 2-9 September 2007. Then there’s another bend on 14th September, when Google launched its federal election site and we saw the first videos of the Nationals going up on YouTube. The dip in the Nationals’ curve a little after that is due to a software bug. Then, on 14th October, the Federal Election was actually announced, and you can see the massive increase in view counts from there on for all parties, ending with Labor holding a huge advantage over everybody else. Interestingly enough, this also mirrors the actual outcome of the election.

So, this is the kind of information that we are now collecting at Vquence and focusing our business around.

Against that background, check out a recent blog post by Judah Phillips on “Thinking about Measuring Internet Video?”. It is a wonderful description of the kinds of things we are either offering or working on.

Using his vocabulary: we can currently provide a mix of instream and outstream KPIs to the video advertising market. Our larger aim is to provide exceptional outstream audience metrics, and we know how to get them regardless of where a video travels on the Internet. Our technology plan centres on a mix of a panel-based approach (through a browser plugin), a census-based approach (through a social network plugin for Facebook et al., also using OpenID), and video duplicate identification.

This information isn’t published on our corporate Website yet, which still mostly focuses on our capabilities in video aggregation, search, and publication. But we have a replacement in the making. Watch this space… 🙂

Activities for a possible Web Video Working Group

The report of the recent W3C Video on the Web workshop has come out, and it recommends forming a Video Metadata Working Group or, even more generally, a Web Video Working Group.

I have had some discussions with people who have a keen interest in this space, and we have come up with a list of topics that a W3C Video Working Group should look into. I want to share this list here. It goes into somewhat more detail than the topics raised at the W3C Video on the Web workshop. Feel free to add any further concerns or suggestions in the comments – I’d be curious to get feedback.

First, there are the fundamental issues:

  • Choice of royalty-free baseline codecs for audio and video
  • Choice of encapsulation format for multi-track media delivery

Both of these really require the generation of a list of requirements and use cases, then an analysis of existing formats with respect to these requirements, and finally a decision on which ones to use.

Requirements for codecs would encompass, amongst others, the need to cover different delivery and receiving devices – from mobile phones on 3G bandwidth, through Web video, to full-screen TV video over ADSL.

Here are some requirements for an encapsulation format:

  • usable for live streaming and for canned delivery,
  • the ability to easily decode from any offset in a media file,
  • support for temporal and spatial hyperlinking, and the partial delivery that these require,
  • the ability to dynamically create multi-track media streams on a server and to deliver requested tracks only,
  • the ability to compose valid streams by composing segments from different servers based on a (play)list of temporal hyperlinks,
  • the ability to cache segments in the network,
  • and the ability to easily add a different “codec” track into the encapsulation (as a means of preparing for future improved codecs or other codec plugins).

The decisions for an encapsulation format and for a/v codecs may additionally require a specification of how to map specific codecs into the chosen encapsulation format.

Then we have the “Web” requirements:

The technologies that have created what is known as the World Wide Web are fundamentally a hypertext markup language (HTML), a hypertext transfer protocol (HTTP) and a resource addressing scheme (URIs). Together they define the distributed nature of the Web. We need to build an infrastructure for hypermedia that builds on the existing Web technologies so we can make video a first-class citizen on the Web.

  • Create a URI-compatible means of temporal hyperlinking directly into time offsets of media files (see the sketch after this list).
  • Create a URI-compatible means of spatial hyperlinking directly into picture areas of video files.
  • Create an HTTP-compatible protocol for negotiating and transferring video content between a Web server and a Web client. This also includes a definition of how video can be cached in HTTP network proxies and the like.
  • Create a markup language for video that also enables hyperlinks from any time and region in a video to any other Web resource. Time-aligned annotations and metadata need to be part of this, just like HTML annotates text.
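
Just to make the first point concrete, a temporal hyperlink could hypothetically look like this (illustrative syntax only, along the lines of what we have experimented with in Annodex):

    http://example.com/keynote.ogv?t=npt:12.5               (start playback 12.5 seconds in)
    http://example.com/keynote.ogv?t=smpte-25:00:02:30:00   (address via a SMPTE timecode)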

All of these measures together will turn ordinary media into hypermedia, ready for a distributed usage on the Web.

In addition to these fundamental Web technologies, integrating into modern Web environments would also require:

  • a standard JavaScript API to interact with the media data (a sketch of what this might look like follows this list),
  • an event model,
  • a DOM integration of the textual markup,
  • and possibly the use of CSS or SVG to define layout, effects, transitions and other presentation issues.
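
To make the first of these concrete: the WHATWG HTML5 draft is already experimenting with this kind of JavaScript media API (a sketch of the flavour only – the details are still very much in flux):

    // Sketch of the kind of JavaScript media API the WHATWG HTML5 draft
    // is experimenting with; names and details are still in flux.
    var video = document.getElementsByTagName("video")[0];

    // Property-based interaction with the media data.
    video.currentTime = 42.0;   // seek to 42 seconds into the media
    video.play();               // start playback

    // Event model: react to playback progress.
    video.addEventListener("timeupdate", function () {
      document.title = "now at " + video.currentTime + "s";
    }, false);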

Then there are the Metadata requirements:

We all know that videos have a massive amount of metadata – i.e. data about the video. There are different types of metadata and they need to be handled differently.

  • Time-aligned text, such as captions, subtitles, transcripts, karaoke and similar text.
  • Header-type metadata, such as the ID3 tags for mp3 files, or the VorbisComment tags for Ogg files.
  • Manifest-type descriptions of the relationships between different media file tracks, similar to what SMIL enables – like the recent ROE format under development at Xiph.

The time-aligned text should actually be regarded as a codec, because it is time-aligned just like audio or video data. If we want to be able to live-stream annotated media content and receive all the data as a multiplexed stream over one connection, we need to be able to multiplex the text codec into the binary stream just as we do with audio and video. Thus, the definition of a time-aligned text codec has to ensure that it can be multiplexed.

Header-type metadata should be machine-accessible and available for human consumption as required. It can be used to manage copyright and other rights-related information.

The manifest is important for dynamically creating multi-track media files through a client-server interaction as required – for example, a request for the audio track in a specific language rather than the default.

Other topics of interest:

There are two more topics that I would like to point out that require activities.

  • “DRM”: We need to analyse what the real need is here. Is it to encrypt the media file so that it can only be read by specific recipients? If so, an encryption scheme with public and private keys might provide this functionality. Or is it to retain copyright and licensing information with the media data? In that case, encapsulating metadata inside the media file may already be a good solution, since this information stays with the file through delivery or copying.
  • Accessibility: We need to ascertain that the chosen encapsulation format makes it possible to associate captions, sign language, video descriptions and the like with the video in a time-aligned fashion. A standard time-aligned format for specifying sign language would also be needed.

This list of required technologies has been built through years of experience with the seamless integration of video into the World Wide Web in the Annodex project, and through recent discussions at the W3C Video on the Web workshop and elsewhere.

This list just provides a structure for what needs to be addressed to make video a first-class citizen on the Web. There are many difficult detail problems to solve in each of these areas. Understanding the complexity of the problem is a challenge in itself, but I hope this structure can help break down some of that complexity and help us start attacking the issues.