All posts by silvia

Editing video for LCA

I’ve just finished writing a small script that will help us edit and transcode video recorded at LCA 2007. Since we will record directly to DVD, all we need for editing and transcoding, before uploading to a Web server, is a simple laptop with a DVD drive and with gmplayer and ffmpeg2theora installed. This means that just about all LCA participants are potential helpers for making sure the video material gets published on the same day.

If you happen to be an LCA participant and want to help make sure that video publishing happens, please walk up to the video guys and offer your editing & transcoding help.

Ah yes, and here is the script – in case anyone is interested:


#!/bin/bash
# bash is required: the script uses substring expansion and shell arithmetic

# usage function
function func_usage () {
  echo "Usage: VOB2Theora.sh <starttime> <endtime> <filename>"
  echo "  starttime/endtime  given as HH:MM:SS"
  echo "  filename           input file for conversion"
}

# convert from SMPTE to seconds; the result is left in $tseconds
function func_convert2sec () {
  tspec=$1
  tlen=${#tspec} #strlen

  # parse seconds out of string
  tsecstart=$(( tlen - 2 ))
  tsec=${tspec:$tsecstart:2} #substr

  # parse minutes
  tminstart=$(( tlen - 5 ))
  if test $tminstart -ge 0; then
    tmin=${tspec:$tminstart:2} #substr
  else
    tmin=0
  fi

  # parse hours
  thrsstart=$(( tlen - 8 ))
  if test $thrsstart -ge 0; then
    thrs=${tspec:$thrsstart:2} #substr
  else
    thrs=0
  fi

  # calculate number of seconds from hrs, min, sec
  # (10# forces base 10, so that "08" and "09" are not read as invalid octal)
  tseconds=$(( 10#$tsec + ((10#$tmin + (10#$thrs * 60)) * 60) ))
}

# test number of parameters of script
if test $# -lt 3; then
  func_usage
  exit 1
fi

# convert start time
func_convert2sec "$1"
tstart=$tseconds

# convert end time
func_convert2sec "$2"
tstop=$tseconds

# input file
inputfile=$3
if test -e "$inputfile"; then
  echo "Converting $inputfile from $tstart sec to $tstop sec ..."
  echo ""
else
  echo "File $inputfile does not exist"
  exit 1
fi

# convert using ffmpeg2theora
strdate=$(date)
strorga="LCA"
strcopy="LCA 2007"
strlicense="Creative Commons BY SA 2.5"
# the input file is quoted so that file names with spaces survive the sh -c call
strcommand="ffmpeg2theora -s $tstart -e $tstop --date '$strdate' --organization '$strorga' --copyright '$strcopy' --license '$strlicense' --sync '$inputfile'"

echo "$strcommand"

sh -c "$strcommand"

Ralph Giles made an improved version of the VOB2Theora script. It can be found at http://mirror.linux.org.au/linux.conf.au/2007/video/VOB2Theora_v2.sh.
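For illustration, here is a more compact take on the time conversion step of the script above. It is only a sketch, not part of VOB2Theora.sh or of Ralph’s version: the helper name to_seconds is made up, and it assumes bash and time specs of the form HH:MM:SS, MM:SS or SS.

```shell
#!/bin/bash
# Hypothetical helper: convert HH:MM:SS, MM:SS or SS to a number of seconds.
to_seconds () {
  local h=0 m=0 s=0 a b c
  # split the time spec at the colons into up to three fields
  IFS=: read -r a b c <<< "$1"
  if [ -n "$c" ]; then
    h=$a; m=$b; s=$c        # HH:MM:SS
  elif [ -n "$b" ]; then
    m=$a; s=$b              # MM:SS
  else
    s=$a                    # SS
  fi
  # 10# forces base 10, so fields such as "08" are not treated as octal
  echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

to_seconds 00:08:09   # prints 489
to_seconds 01:00:00   # prints 3600
```

Splitting at the colons avoids the index arithmetic on the string length, at the price of relying on bash’s here-strings.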

Why we need an open media developer conference

Have you ever been stuck with a video file that does not play in any of your video players or in your Web browser? It happens frequently because the media technology landscape is still a very fragmented one, where a lot of energy is put into the creation of proprietary compression technologies. But consumers are unwilling to follow every new encoding format and to pay for codecs which they may only need for this one file.

Just as the use of free and unencumbered text encoding formats (ASCII, UTF-8) is a prerequisite to the development of novel applications and an enabler of email, the Web, and many other common applications, free video and audio formats enable the creation of novel applications with media.

Free and unencumbered codecs are starting to become mature. The codecs from Xiph.org cover audio (Vorbis, Speex, FLAC) and video (Theora), and are readily available and supported on many platforms. The BBC’s next-generation video codec, Dirac, is still in the labs, but it is one of the few cutting-edge codecs built on wavelets, a novel transform that promises higher compression rates with fewer artefacts, and it is free and unencumbered.

However, the availability of codecs is not all that matters. Audio-visual applications that make use of these codecs need to be developed, too. Applications such as video editors, desktop audio/video players, players embedded in Web browsers, and streaming technology are fundamental to enable the full production-to-publishing chain. And then there are the higher-level applications such as playlist and collection managers (iTunes-like), video Web hosting, video search, or Internet video conferencing applications, which provide the real value to people.

Foundations of Open Media Software is the first conference ever to bring together the architects of open media software systems from around the world to address technical issues and further the development of an open media ecology where the focus is on the development of new high-value applications rather than a tiring and unproductive competition of formats.

FOMS furthers the development of media technology on Linux, addresses support of open media codecs across platforms, and works towards the creation of an ecosystem of rich media applications.

The principles of Creative Commons content, built around a free exchange of ideas through digital media, require adequate licenses to be attached to media files, which in turn will only work in an environment where the media formats of such content are unrestricted and unencumbered, too.

Foundations of Open Media Software takes place in Sydney, Australia, on 11th-12th January 2007. Since it is a conference organised by developers for developers, donations are highly welcome. There are also still some spaces available for professional delegates. Details are at http://www.annodex.org/events/foms2007/ .

FOMS: Foundations of Open Media Software

From Thursday 11 – Friday 12 January 2007 we will have some of the world’s top open media software developers gather in Sydney at a workshop titled “Foundations of Open Media Software” (FOMS).

The workshop takes place in the week before linux.conf.au (LCA), thus enabling developers to cross-pollinate with the developers and attendees of LCA. FOMS is supported by LCA with venue and other logistics.

I’m happy to be one of the core organisers of this workshop and very excited about the vibe that this will bring to the open media software developer community. FOMS creates a venue for a community that has thus far not had its own gathering place.

In January: Sydney will rock the FOSS world doubly!!

$1.65bn for YouTube – will Google now finally offer video search?

No, Google do not offer a video search service, don’t be blinded. Time and time again I have to explain that Google’s video.google.com is a video hosting service, not a horizontal video search service. They do not follow their own mission with Google video, but offer search only on their own collection of content, i.e. they offer vertical search and not search on “the world’s” video, which is what horizontal search is about.

And now they have acquired YouTube (btw: this was a really cheap deal, too, through a masterly financial strategy). But I digress: I am not a market analyst, but a technologist. And I want to share what I see as an immense opportunity for Google in this deal.

Let me go back in history: Google started video.google.com because there was not enough video content on the Web and thus a dedicated video search engine didn’t make much sense. So they ran a dual strategy to get content on the Web: they made it simple for consumers to upload their content thus starting the wave of consumer-created (and consumer-mediated) content. And they mediated content from the old media industry to go online. This instantly put Google into the video hosting business.

Fast-forward a year and you find YouTube did a better job at providing consumers with a video hosting service. So, Google buys them. With what intention? To have a second video hosting business? Maybe… but to be quite honest, I have a different take on this.

This is the chance for Google to turn the Google brand away from video hosting and back to horizontal video search. With YouTube they have a channel to move their existing corporate customers and their upload users to a more successful hosting site. Then they can get their core brand back into search.

Bah – gotta get back to coding, so our company is ready for when the time comes!

Best practice in Web video publication

I’ve spent a lot of time recently analysing video on the Web.

YouTube and Google Video introduced what seems to be the standard now: videos get published on what I like to call a “host page”. This is one webpage completely dedicated to this video.

Why are there still so many videos out there that get published through a hyperlink behind two words of text instead of giving them proper recognition?

Think about it: the creation of a video usually costs a lot of effort and when it’s done, it needs a proper presentation. Hiding it behind a hyperlink is like putting your blog up on an ftp server in pdf format.

So, what information has to be on a video host page?

Best practice is to have an embeddable video player on the host page that displays a keyframe.

Other information that typically resides on a host page is a short textual description of the video, its duration, who published it, who created it, license rights (check out Creative Commons for this), tags & category attributions, comments from viewers, number of page views, and a description of how to use this thing in other environments, such as how to embed it in blogs or how to download it to the iPod or PSP.

We don’t need Google or YouTube to do this for us. We can publish video in that way ourselves. Well, maybe apart from the bit about transcoding to the iPod or PSP. Incidentally, is there any open source SW around to do that?

We can transcode our videos to Ogg Theora using ffmpeg2theora and then publish them with the embedded Java Theora player Cortado. Then we just need to create our own host page in HTML.
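To make this a bit more concrete, here is a rough sketch of a shell function that writes such a host page. The function name make_host_page, the file names, and the page layout are made up for illustration. The applet class name and the url parameter are as I recall them from the Cortado documentation, so please check them against your copy of cortado.jar. Note that the script does no HTML escaping of its arguments.

```shell
#!/bin/bash
# Hypothetical helper: print a minimal host page for one Ogg Theora video.
# Arguments: video URL, title, short description.
# Assumes cortado.jar sits in the same directory as the generated page.
make_host_page () {
  local url=$1 title=$2 desc=$3
  cat <<EOF
<html>
<head><title>$title</title></head>
<body>
<h1>$title</h1>
<applet code="com.fluendo.player.Cortado.class" archive="cortado.jar" width="352" height="288">
  <param name="url" value="$url"/>
</applet>
<p>$desc</p>
<p><a href="$url">Download the Ogg Theora file</a></p>
</body>
</html>
EOF
}

# Transcode first, e.g. with: ffmpeg2theora talk.vob
# (the name of the output file depends on your ffmpeg2theora version)
make_host_page "talk.ogg" "My talk" "A short description of the talk."
```

Redirect the output into a file such as talk.html and upload it together with the video and cortado.jar. Duration, license, and tags from the list above would go into further paragraphs of the same page.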

All we need now are a few more plugins for common Web content management systems like WordPress or drupal to simplify this process even more. Here’s your Friday afternoon challenge. 🙂

World progress by branching

If you’ve ever worked with a version control system like cvs or svn, you will understand what branches are: They are the places – away from the trunk – where a new idea is tested and developed to maturity until it can either merge with the trunk or even replace the trunk.

It occurred to me today that all of the world’s progress happens in exactly this way. We have a new idea, but it is not really ripe yet to be introduced into the trunk of thinking (mainstream), or the trunk of technology (status quo), or the trunk of a company (core business).

It is our choice

We can now take one of two routes: we can either try and grow it from within the trunk. Or we can branch.

Depending on how flexible the trunk is and how open and embracing the people that are part of it, building from within may well be the best means.

However, if our idea is a little more revolutionary and replaces some of the traditional thinking, we will have a hard time. We might as well be brave and start a new branch to develop and demonstrate our ideas. And if enough people adopt the branch over the trunk, the branch will eventually be merged back into it or replace it completely.

This is a healthy process and helps make the best possible progress.

Don’t be afraid of branching

I have many experiences from standards bodies, companies, research groups, and social groups, where this idea of healthy branching has not been applied and thus the groups have failed.

One way to fail is to succumb to social pressure and just forget about our idea, i.e. not daring to start a branch.

In a sports team, such behaviour has led to the loss of games, because it is destructive to our own psyche not to follow our conviction – and thus it is destructive to the team.

In R&D teams it has led to the loss of IP because somebody else then had the same novel idea and demonstrated it elsewhere.

In contrast, one persistent ski jumper has managed to bring a revolution to the technique of ski jumping and eventually caused a change of rules (changed the trunk).

Trunks are vital

Another way to fail is to completely ignore the necessity of a trunk and start developing in branches only, with the hope that at some point the branches will grow together and magically create a trunk.

I’ve seen this happen over and over in research teams and in large standards bodies. It is amazing how much high intelligence can be wasted in this way!

Heroes

It takes a lot of courage to start a branch, because such activities often lack the support of the trunk people.

Trunk people should learn to value branch activities more highly, and not belittle or demonise them. After all, they were often branch people earlier and should know about the value such activities present to progress.

Branch people should learn to value the healthiness of a strong trunk and not wave it aside as a small issue that will solve itself eventually as all branches grow together. Because – branches don’t usually grow together – and if they do, the resulting tree is usually a cripple.

Making your video discoverable

Videos will be everywhere on the Web! Yes, cope with it: soon the majority of videos won’t be with some hosting site like YouTube, but will reside on our private servers, on company servers, in fact on any and all Web servers. And there will be interesting stuff, but it will be hard to find.

Yes, history will repeat itself, and finding those videos on the Web that satisfy our need, be it for information or entertainment, will be a nightmare. Why? Because Google’s pagerank (like many other ranking algorithms) relies on Web pages pointing to the videos to give them a higher rank. However, the way in which videos are currently published is through embedding them into Web pages (let’s call such a page the “embedding page”). Thus, the link analysis will actually return the pagerank for the embedding page, but not for the video itself!

Now, if the embedding page can actually be seen as representative for the video because the only reason that the webpage exists is to publish the video and its annotations, then the pagerank for the embedding page is actually the same as the pagerank for the video. This is the case for google video and for youtube and for many other hosting sites.

However, you and I mostly publish our videos in blogs or on Web pages that describe more than just the video – some will even have several videos embedded. This is where the chaos for a Web search engine for videos begins. And this is where the discoverability of your videos through video search engines ends.

Here is the solution.

Just as we do with normal Web pages, we have to introduce SEO (search engine optimisation) for videos. That means, we have to make it easier for the search engines to find out information about our videos, i.e. to index and rank them.

Because videos are binary data, a common Web search engine cannot extract information about this Web resource directly from it (let’s ignore signal analysis and automatic content analysis approaches for the moment). We have to help the search engine.

The solution is to have a text file sitting “next” to the actual video file which contains indexable text about the video. It will have all the annotations, meta data, tags, copyright information and other textual meta information that search engines require to index and rank it better. This text file is an indexable textual representation of the video.

So, whenever a video search engine reaches a video in a crawl, it will check out this text file for its indexing work. If this text file is HTML, then people may link directly to it and it will be included in the pagerank calculations again. If it is an XML file, there should be a simple way to transcode it to HTML, e.g. via an XSLT script, so links can go there directly again.

So much for the theory: here comes the practice.

For every video file (and incidentally it would work for audio, too), you should start writing a CMML file and publish it on your Web server together with the original. Here is an XSLT script that you can use to transcode CMML to HTML. If you actually use Ogg Theora as your video publishing format, you can even publish Annodex videos and, by using the Apache Annodex module, make direct access possible to the clips that you defined in CMML and to time offsets. Try using it in your blog with the external embedding of the Annodex Firefox extension.
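As a rough idea of what such a file can look like, here is a small shell function that writes a minimal CMML file. Treat it as a sketch only: the function name make_cmml and the file names are made up, and the element and attribute names are written down as I recall them from the CMML draft, so please check them against the CMML specification before relying on them. The arguments are not XML-escaped.

```shell
#!/bin/bash
# Hypothetical helper: print a minimal CMML file for one video.
# Arguments: video file name, title, description of the first clip.
make_cmml () {
  local video=$1 title=$2 desc=$3
  cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<cmml>
  <stream>
    <import src="$video"/>
  </stream>
  <head>
    <title>$title</title>
    <meta name="license" content="Creative Commons BY SA 2.5"/>
  </head>
  <clip id="intro" start="npt:0">
    <desc>$desc</desc>
  </clip>
</cmml>
EOF
}

make_cmml "talk.ogg" "My talk" "Introduction of the speaker."

# The HTML version could then be produced with an XSLT processor, e.g.
#   xsltproc cmml2html.xsl talk.cmml > talk.html
# where cmml2html.xsl stands for the XSLT script mentioned above
# (the file name is an assumption).
```

Each further clip element, with its own id and start time, adds another indexable description for a section of the video.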

When we’ve done this, all that remains is to encourage the video search engines to exploit the CMML data in their crawls. 🙂

An “explosion” of online video?

So you think we’re in the middle of an “explosion” of online video clips, in particular consumer-created video clips? Think again. How many videos have you published online so far? Compare that to the number of web pages you have written or contributed to.

It’s still only very few people who upload clips. The “masses” haven’t even decided to start yet.

The “mass” consists of all the people who see something useful in uploading, making accessible, and finding video clips (and no, that’s not just pr0n). It took the Web a few years before companies started having a Web presence and using the Web as a marketing instrument. It took private people even longer before they started having blogs and publishing their CVs and photo collections.

Videos can be used as a marketing instrument just as much as a Web page. In a convergent world, videos will be even more important than text because they reach the couch potato. People will start making videos about their success story in gardening, about their home-grown cooking recipe, about the way to repair a special valve on their car, about how to train pets, or children (“be your own super-nanny”). Small companies will make videos about their products, the corner-shop will advertise its services to the neighbourhood, the medical centre will present its doctors and procedures through online videos, the computer shop its software, the travel agency its best locations etc. The video explosion on the Web hasn’t even started yet.

Running flumotion on Ubuntu

Flumotion is a streaming server product developed by Fluendo. Flumotion runs in a distributed environment, where the video capture, encoding, and transmission can be run on different computers, so the load can be better balanced.

I have found it rather difficult to find introductory help on how to get flumotion set up and running, so I’ll share my insights with you here.

Imagine a setup where you want machine A to capture and encode the video from a DV camera, machine B to relay the stream onto the Internet to several clients, and machine C to get the stream off machine B and write it to disk. The software that you need to run on each of these machines is the following:

  1. Run flumotion-manager on machine B. flumotion-manager is the central component of a flumotion streaming setup, which connects up all the components and makes sure that everything works. It has to run before anything else can happen.
  2. Run flumotion-worker on every machine where you want work to be done, i.e. on machines A, B, and C. The workers are daemons that connect to the manager and wait for commands to do something.
  3. Run flumotion-admin on any machine to set up the details of the flumotion streaming setup.

So, here are the commands that I use to get it running using the default setup:

  1. flumotion start
    (which will run flumotion-manager -D -n default /etc/flumotion/managers/default/planet.xml for you).
  2. flumotion-worker -u pants -p off &
    (yes, these are the default user name and password :).
  3. flumotion-admin
    (and go through the GUI setup wizard).
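The following sketch sums up which of the commands above go on which machine. It only prints the commands (a dry run) and does not start anything. The function name flumotion_steps and the role names are made up for illustration. A worker on machine A or C also has to be told where the manager on machine B runs. I have not shown that option here, so please look it up in the flumotion-worker man page.

```shell
#!/bin/bash
# Hypothetical helper: print the commands to run for one role in the setup.
# Roles: manager (machine B), worker (machines A and C), admin (any machine).
flumotion_steps () {
  case "$1" in
    manager)
      # the manager has to run before anything else; B also runs a worker
      echo "flumotion start"
      echo "flumotion-worker -u pants -p off &"
      ;;
    worker)
      # default user name and password as given above
      echo "flumotion-worker -u pants -p off &"
      ;;
    admin)
      echo "flumotion-admin"
      ;;
    *)
      echo "unknown role: $1" >&2
      return 1
      ;;
  esac
}

flumotion_steps manager
```

Run the manager steps on machine B first, then the worker step on machines A and C, and finally the admin step on whichever machine you want to configure the setup from.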

… and you should be up and going with either your DV camera, your Webcam or your TV tuner card. Watch the cute smileys go happy! And connect to the stream using your favorite media player that can decode Ogg Theora/Vorbis, e.g. totem, vlc, xine.

I’ve found the online man pages of flumotion-manager, flumotion-worker, and flumotion-admin helpful, because the flumotion package that came with my Ubuntu Dapper installation did not include them. You might actually be better off using Jeff Waugh’s packages for each of the flumotion commands if you are setting up on Ubuntu Dapper. Another hint: use the library theora-mmx to get better performance.

Flumotion is an excellent solution for setting up video streaming. I have found that the following conferences have used it before:

  • GUADEC, June 2006, http://guadec.org/GUADEC2006/Live
  • DebConf, May 2006, http://technocrat.net/d/2006/5/12/3384
  • Linux Audio Conference, May 2006, http://lac.zkm.de/2006/streaming.shtml
  • Washington DC LUG, http://dclug.tux.org/webcast/

A blog for Silvia

…and it’s about time, they say.

Expect comments on the world of digital media, the world of free and open source software, and visions of the future here.

I’m not going to out myself personally here. If you want to know what I’ve been up to over the last weekend, come and visit me, give me a call or grab me on irc. I like personal contact.