Category Archives: analytics

HTML5 video: 25% H.264 reach vs. 95% Ogg Theora reach

Vimeo launched an HTML5 beta test last week. They use the H.264 codec, probably because much of their content is already in this format through the Flash player.

But what really surprised me was their claim that roughly 25% of their users will be able to make use of the HTML5 beta. The statement is that 25% of their users use Safari, Chrome, or IE with Chrome Frame. I wondered how they got to that number and what it means more generally for the level of support for H.264 vs Ogg Theora on the HTML5-based Web.

According to Statcounter’s browser market share statistics, the percentage of browsers that support HTML5 video is roughly 31.1%, summed from Firefox 3.5+ (22.57%), Chrome 3.0+ (5.21%), and Safari 4.0+ (3.32%). (Opera’s recent release is not represented yet.)

Out of those 31.1%:

  • 8.53% of browsers support H.264 (Chrome + Safari)
  • 27.78% of browsers support Ogg Theora (Firefox + Chrome)

Given these numbers, Vimeo must be assuming that roughly 16% of their users have Chrome Frame installed in IE. That would be quite a number, but it may well be that their audience is special.
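The arithmetic behind these figures is simple enough to write down. This is a sketch using the Statcounter shares quoted above, not a precise model; note that Chrome supports both codecs and is therefore counted in both sums:

```ruby
# Statcounter browser shares (percent) as quoted above.
firefox = 22.57  # Firefox 3.5+: Ogg Theora only
chrome  = 5.21   # Chrome 3.0+:  H.264 and Ogg Theora
safari  = 3.32   # Safari 4.0+:  H.264 only

html5_reach  = firefox + chrome + safari  # ~31.1% support HTML5 video
h264_reach   = chrome + safari            # ~8.53% can play H.264
theora_reach = firefox + chrome           # ~27.78% can play Ogg Theora

# Vimeo claims 25% reach with H.264 alone, so the implied share of
# IE-with-Chrome-Frame users in their audience:
chrome_frame_share = 25.0 - h264_reach    # ~16.47%
```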

So, how is Ogg Theora support doing in comparison, if we allow such browser plugins to be counted?

With an installation of XiphQT, Safari can be turned into a browser that supports Ogg Theora. A Chrome Frame installation will likewise turn IE into an Ogg Theora-supporting browser. These could get the browser support for Ogg Theora up to 45%. Compare this to a claimed 48% reach for MS Silverlight.

But we can do even better for Ogg Theora. If we use the Java Cortado player as a fallback inside the video element, we can capture all those users that have Java installed, which could be as high as 90%, taking Ogg Theora support potentially up to 95%, almost up to the claimed 99% of Adobe Flash.
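A sketch of what such a fallback chain could look like. The applet class and `url` parameter follow the Cortado documentation of the time; the file names, paths, and dimensions are placeholders:

```html
<!-- Ogg Theora via native HTML5 video, with the Cortado Java applet
     as fallback for browsers without native Theora support.
     File names and the applet archive location are placeholders. -->
<video src="myvideo.ogv" controls width="480" height="270">
  <applet code="com.fluendo.player.Cortado.class"
          archive="cortado.jar" width="480" height="270">
    <param name="url" value="myvideo.ogv"/>
    <!-- Last resort: a plain download link -->
    <a href="myvideo.ogv">Download the video</a>
  </applet>
</video>
```

Browsers that understand the video element and Theora play natively; browsers that don’t render the fallback content, where Java takes over if installed.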

I’m sure all these numbers are disputable, but it’s an interesting experiment with statistics and tells us that right now, Ogg Theora has better browser support than H.264.

UPDATE: I was told this article sounds aggressive. By no means am I trying to be aggressive – I am stating the numbers as they are right now, because there is a lot of confusion in the market. People believe they reach less audience if they publish in Ogg Theora compared to H.264. I am trying to straighten this view.

View counts on YouTube contradictory

UPDATE (6th February 2010): YouTube have just reacted to my bug and it seems there are some gData links that are more up-to-date than others. You need to go with the “uploads” gData APIs rather than the search or user ones to get accurate data. Glad YouTube told me and it’s documented now!

I am an avid user of YouTube Insight, the metrics tool that YouTube provides freely to everyone who publishes videos through them. YouTube Insight provides graphs on video views, the countries they originate in, demographics of the viewership, how the videos are discovered, engagement metrics, and hotspot analysis. It is a great tool to analyse the success of your videos, determine when to upload the next one, find out what works and what doesn’t.

However, you cannot rely on the accuracy of the numbers that YouTube Insight displays. In fact, YouTube provides three different means to find out what the current views (and other statistics, but let’s focus on the views) are for your videos:

  • the view count displayed on the video’s watch page
  • the view count displayed in YouTube Insight
  • the view count given in the gData API feed

The shocking reality is: for all videos I have looked at that are less than about a month old and keep getting views, all three numbers are different.

Sometimes they are off by just one or two, which is tolerable and understandable: the data must be served from a number of load-balanced servers or even server clusters, and it would be difficult to keep all of these clusters at identical numbers all of the time.

However, for more than 50% of the videos I have looked at, the numbers are off by a substantial amount.

I have undertaken an analysis of random videos, for which I collected the gData views and the watch page views. The Insight data tends to lie between these two numbers, but since I cannot generally access Insight for other people’s videos, I have left it out of this analysis.
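For reference, collecting the gData numbers can be done in a few lines of Ruby. This is a sketch: the feed URL is the gData API endpoint current at the time of writing, the video ID is a placeholder, and the regex-based extraction is a shortcut rather than a proper XML parse:

```ruby
require 'net/http'
require 'uri'

# Build the gData feed URL for a video (gData API as of early 2010;
# the video ID is a placeholder).
def gdata_feed_url(video_id)
  "http://gdata.youtube.com/feeds/api/videos/#{video_id}"
end

# Fetch the feed and pull the view count out of the
# <yt:statistics viewCount="..."/> element.
def gdata_view_count(video_id)
  xml = Net::HTTP.get(URI.parse(gdata_feed_url(video_id)))
  xml[/viewCount="(\d+)"/, 1].to_i
end
```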

Here are the stats for 36 randomly picked videos in the 9 view-count classes defined by TubeMogul and by how much they are off at the time that I looked at them:

Class | Video | Watch page | gData API | Age | Diff | Percentage
>1M | 1 | 7,187,174 | 6,082,419 | 2 weeks | 1,104,755 | 15.37%
>1M | 2 | 3,196,690 | 3,080,415 | 3 weeks | 116,275 | 3.63%
>1M | 3 | 2,247,064 | 1,992,844 | 1 week | 254,220 | 11.31%
>1M | 4 | 1,054,278 | 1,040,591 | 1 month | 13,687 | 1.30%
100K-500K | 5 | 476,838 | 148,681 | 11 days | 328,157 | 68.82%
100K-500K | 6 | 356,561 | 294,309 | 2 weeks | 62,252 | 17.46%
100K-500K | 7 | 225,951 | 195,159 | 2 weeks | 30,792 | 13.63%
100K-500K | 8 | 113,521 | 62,241 | 1 week | 51,280 | 45.17%
10K-100K | 9 | 86,964 | 46 | 4 days | 86,918 | 99.95%
10K-100K | 10 | 52,922 | 43,548 | 3 weeks | 9,374 | 17.71%
10K-100K | 11 | 34,001 | 33,045 | 1 month | 956 | 2.81%
10K-100K | 12 | 15,704 | 13,653 | 2 weeks | 2,051 | 13.06%
5K-10K | 13 | 9,144 | 8,967 | 1 month | 177 | 1.94%
5K-10K | 14 | 7,265 | 5,409 | 1 month | 1,856 | 25.55%
5K-10K | 15 | 6,640 | 5,896 | 2 weeks | 744 | 11.20%
5K-10K | 16 | 5,092 | 3,518 | 6 days | 1,574 | 30.91%
2.5K-5K | 17 | 4,955 | 4,928 | 3 weeks | 27 | 0.54%
2.5K-5K | 18 | 4,341 | 4,044 | 4 days | 297 | 6.84%
2.5K-5K | 19 | 3,377 | 3,306 | 3 weeks | 71 | 2.10%
2.5K-5K | 20 | 2,734 | 2,714 | 1 month | 20 | 0.73%
1K-2.5K | 21 | 2,208 | 2,169 | 3 weeks | 39 | 1.77%
1K-2.5K | 22 | 1,851 | 1,747 | 2 weeks | 104 | 5.62%
1K-2.5K | 23 | 1,281 | 1,244 | 1 week | 37 | 2.89%
1K-2.5K | 24 | 1,034 | 984 | 2 weeks | 50 | 4.84%
500-1K | 25 | 999 | 844 | 6 days | 155 | 15.52%
500-1K | 26 | 891 | 790 | 6 days | 101 | 11.34%
500-1K | 27 | 861 | 600 | 3 days | 261 | 30.31%
500-1K | 28 | 645 | 482 | 4 days | 163 | 25.27%
100-500 | 29 | 460 | 436 | 10 days | 24 | 5.22%
100-500 | 30 | 291 | 285 | 4 days | 6 | 2.06%
100-500 | 31 | 256 | 198 | 3 days | 58 | 22.66%
100-500 | 32 | 196 | 175 | 11 days | 21 | 10.71%
0-100 | 33 | 88 | 74 | 10 days | 14 | 15.91%
0-100 | 34 | 64 | 49 | 12 days | 15 | 23.44%
0-100 | 35 | 46 | 21 | 5 days | 25 | 54.35%
0-100 | 36 | 31 | 25 | 3 days | 6 | 19.35%

The videos were chosen to be older than a couple of days, but no more than a month old. For videos older than about a month, the growth in views had generally stopped and the metrics had caught up, except where the views were still increasing rapidly, which is an unusual case.

Generally, it seems that the watch page has the correct views. In contrast, it seems the gData interface is updated only about once a week. It further seems, from looking at YouTube channels where I have access to Insight, that Insight is updated about every 4 days, at which point it receives corrected data for the days it had not yet caught up on.

Further, it seems that YouTube make no differentiation between channels of partners and general users’ channels – both can have a massive difference between the watch page and gData. Most videos differ by less than 20%, but some have exceptionally high differences above 50% and even up to 99.95%.

The difference is particularly pronounced for videos that show a steep increase in views – the first few days tend to have massive differences. Since these are the days that are particularly interesting to monitor for publishers, having the gData interface lag behind this much is shocking.

Further, videos with a low number of views, in particular less than 100, also show a particularly high percentage in difference – sometimes an increase in view count isn’t reported at all in the gData API for weeks. It seems that YouTube treats the long tail worse than the rest of YouTube. For every video in this class, the absolute difference will be small – obviously less than 100 views. With almost 30% of videos being such videos, it is somewhat understandable that YouTube are not making the effort to update their views regularly. OTOH, these views may be particularly important to their publishers.

It seems to me that YouTube need to change their approach to updating statistics across the watch pages, Insight and gData.

Firstly, it is important to have the watch page, Insight and gData in sync – otherwise, which number would you use in a report? If the gData API for YouTube statistics lags behind the watch page and Insight by even 24 hours, it is useless for indicating trends and for reports, and people have to go back to screen-scraping to find out the actual views of their videos.

Secondly, it would be good to update the statistics daily during the first 3-4 weeks, or as long as the videos are gaining views heavily. This is the important time to track the success of videos and if neither Insight nor gData are up to date in this time, and can even be almost 100% off, the statistics are actually useless.

Lastly, one has to wonder how accurate the success calculations are for YouTube partners, who rely on YouTube reporting to gain payment for advertising. Since the analysis showed that the inaccuracies extend also into partner channels, one has to hope that the data that is eventually reported through Insight is actually accurate, even if intermittently there are large differences.

Finally, I must say that I was rather disappointed with the way this issue has so far been dealt with in the YouTube Forums. The issue of wrongly reported view counts was first raised more than a year ago and has been reported regularly since by various people. Some of the reports were really unfriendly in their demands. Still, I would have expected a serious reply by a YouTube employee about why there are issues and how, or indeed whether, they are going to be fixed. Instead, all I found was a more than 9-month-old mention that YouTube seems to be aware of the issue and working on it – no news since.

Also, I found no other blog posts analysing this issue, so here we are. Please, YouTube, let us know what is going on with Insight, why are the numbers off by this much, and what are you doing to fix it?

NB: I just posted a bug on gData, since we were unable to find any concrete bugs relating to this issue there. I’m actually surprised about this, since so many people reported it in the YouTube Forums!

Top 10 commercials for 2008 on YouTube

I spent the last few days doing some nice research for Vquence, where I was able to watch lots of videos on YouTube. Fun job this is! 🙂 The full article is on the Vquence metrics blog.

One of the key things that I’ve put together is a list of top 10 commercials for 2008:

Rank | Video | Views | Added
1 | Pepsi – SoBe Lifewater Super Bowl 2008 | 3,652,217 | February 02, 2008
2 | Cadbury – Gorilla | 3,338,011 | August 31, 2007
3 | Nike – Take it to the NEXT LEVEL | 3,184,329 | April 28, 2008
4 | Macbook Air | 2,648,717 | January 15, 2008
5 | Centraal Beheer Insurance – Gay Adam | 2,512,425 | May 30, 2008
6 | Vodafone – Beatbox | 2,380,237 | March 17, 2008
7 | E*Trade – Trading Baby | 2,061,818 | February 01, 2008
8 | Guitar Hero – Heidi Klum | 1,068,055 | November 03, 2008
9 | Bridgestone – Scream | 980,406 | January 30, 2008
10 | Bud Light – Will Ferrell | 966,177 | February 04, 2008
Favorable mention | OLPC – John Lennon | 527,953 | December 25, 2008
Favorable mention | Blendtec – iPhone 3G | 2,711,195 | July 11, 2008
Favorable mention | Stride Gum – Where the hell is Matt? | 15,859,204 | June 20, 2008

There are many more details over at the Vquence metrics blog.

Enjoy! And let me know in the comments if you know of any other video ad released in 2008 with a similar ballpark number of views that is an actual TV-style commercial.

NOTE: I just had to change the list, because the SoBe Lifewater Super Bowl ad of 2008 actually came out ahead. It’s difficult to discover an ad that has neither “ad” nor “commercial” in its annotations!

Resurrecting old Maaate code

Have you ever been haunted by an old open source package that you wrote once, published, and then forgot about?

The BSD community has just reminded me of the MPEG audio analysis toolkit Maaate that I wrote at CSIRO when I first came to Australia and that was then published through the CSIRO Mathematical and Information Sciences division.

The BSD guys were going to remove it from their repositories, because since I left CSIRO more than 2 years ago, CSIRO has taken down the project pages and the code, so there were no active project pages available any longer. I’m glad they contacted me before they did so.

Since it is an open source project, I have now resurrected the old pages at Sourceforge: I have re-instated the relevant web pages and documentation and updated all the links. I discovered that we did some cool things back then and that the project may indeed be worth preserving for the future. I expect Sourceforge is up to the task.

Thanks very much, BSD community and welcome back, MPEG Maaate!

to_bool rails plugin

In our Rails application we do a lot of string conversions to other data types, including Boolean. Unfortunately, Ruby does not provide a to_bool conversion method (which, to be honest, I find rather strange).

Based on a blog post by Chris Roos from October 2006, we developed a Rails plugin that enables the “to_bool” conversion.

“to_bool” works on the strings “true” and “false” and any capitalisation of these, and on numbers, as well as on nil. Other strings raise an ArgumentError.

Examples are as follows:

'true'.to_bool #-> true
'TrUe'.to_bool #-> true
true.to_bool #-> true
1.to_bool #-> true
5.to_bool #-> true
-9.to_bool #-> true
nil.to_bool #-> false
'false'.to_bool #-> false
'FaLsE'.to_bool #-> false
false.to_bool #-> false
0.to_bool #-> false

You can find the plugin here as a tarball. To install it, simply decompress the to_bool directory into your vendor/plugins directory.
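For those curious, the core of such a plugin is small enough to sketch here. This is a reconstruction of the behaviour described above, not the plugin’s actual source:

```ruby
# Reconstruction of the to_bool behaviour described above
# (case-insensitive "true"/"false", numbers, nil); not the plugin's
# actual source.
class String
  def to_bool
    return true  if downcase == 'true'
    return false if downcase == 'false'
    raise ArgumentError, "invalid value for Boolean: \"#{self}\""
  end
end

class Integer
  # 0 converts to false, any other number to true.
  def to_bool
    self != 0
  end
end

class TrueClass;  def to_bool; true;  end; end
class FalseClass; def to_bool; false; end; end
class NilClass;   def to_bool; false; end; end
```

Reopening core classes like this is the usual Ruby idiom for such conversions; a plugin would simply load these patches at application start-up.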

“Commercialising Video” conference in Sydney

On Tuesday 24th June I attended the “Commercialising Video” conference held at beautiful Jones Bay Wharf in Sydney’s harbour. It was organised jointly by AIMIA and Claudia Sagripanti from VentureOne.

It was a mixture of case studies and panels. The case studies were talks by successful digital media companies, including Sony, Bebo, Viocorp, Clear Light Digital and Fox Interactive Media (really: mySpaceTV). Each panel comprised a moderator and a small number of industry experts who briefly presented their knowledge on a specific topic and then discussed it, led by questions from the audience.

I thought the format was very successful and the conference covered a broad range of current topics of interest in digital media. Panel topics included:

  • mobile: challenges for getting video onto mobile and making a return on it
  • business models: how to make money from online video
  • sports video: what business models work with sports content
  • metrics: why we need to measure video and what and how
  • innovations: what innovative products are to be expected in the near future in video

I was one of the panellists on the metrics panel – my slides are here. The very last slide provides a very basic preview of the video metrics service that is in development at Vquence right now. Expect the final product to look much more professional, once I’ve included the awesome designs that we have just received from Chiz.

One thing that I took away from the conference is that the online video market is finally maturing and we are seeing business models that work. While they can roughly be classified into ad-supported, sponsored, and user-paid, there are many details to take care of depending on the service you are providing. Ad support can sit inside the video, e.g. as pre-roll, post-roll, mid-roll or overlay ads, or accompany it, e.g. as dynamically loaded roll-outs, banners etc. Sponsorship is mostly used for non-profit sites. User-paid models include subscriptions, pay-per-view, and pay-per-download. General video sites do not work as well for ad support as specialised sites do: there is a lot of money for videos in specialised areas where the community is keen to receive the latest video content fast, e.g. in sports.

In mobile in Australia, the video business is still hard going, because bandwidth costs are high, extra production costs are high, and because of the challenge of getting video into a usable form on such a small screen (e.g. a soccer ball ends up no bigger than a pixel). This also means that the cost for consumers to get video is high, while the quality is still low. This obviously does not make for a very good market. The size of the iPhone screen, combined with the slow realisation by mobile phone providers that they have to drop prices for video transfers, may however totally change this situation.

Finally, I noticed that there was a large call for metrics. Measurement of the use of video and tracking the distribution of videos around the Internet, as well as measurement of advertising that relates to videos are all being requested to get more transparency into the business and mature the market. Initial services are available, in particular from existing Web Analytics and Internet Market Intelligence companies. However, the technology is new and we have a long way to go online and even more on mobile. This is a great opportunity for Vquence!

Thanks very much, Claudia, for organising this event and I hope there will be more to come in this space.

What is a proper “viral video”?

Many companies are intending to undertake viral video marketing campaigns.

This should come as no surprise, since video is undoubtedly the most effective content on the Web: “People are about twice as likely to play a video, or replay one that started automatically, than they are to click through standard JPG or GIF image ads.”

Even Techcrunch has a thing for dodgy viral video advertising approaches.

The definition of a “viral video” is however not quite clear.

Wikipedia defines “viral video” as “video clip content which gains widespread popularity through the process of Internet sharing, typically through email or IM messages, blogs and other media sharing websites.” This describes the process through which viral videos are created rather than what a viral video actually is.

I tried to analyse the types of viral videos around to understand what a viral video really is. I found that there are three different types and would like to provide a list of descriptive features of each (leave a comment if you disagree with the types or want to suggest more).

The reason for this separation of types is that if you are a company and want to create a viral video advertising campaign, you need to decide what type of viral video you want to create and choose the appropriate approach and infrastructure to allow for that type of viral video to be successful.

Here are the three types of viral videos that I could distinguish:

popular video

A video that has a high view count (in the millions) – possibly emerged over a longer time frame – is viral because in order to get such a high view count, many people must have been told about it and been directed to go to it and watch it.

A prime example of such a video is the “Hahaha” video of a baby laughing, which is currently at position 10 of YouTube’s Most Viewed of All Time page. I would also put the “Evolution of Dance” video into this category, which on YouTube alone has seen over 81M page views and therefore holds the top rank among the Most Viewed of All Time videos on YouTube. This video has some aspects that make it a cult, but I don’t think they are strong enough.

The features of videos in this category are as follows:

  • high page view count
  • not subject to fashion or short-term fads
  • interest for many audiences
  • hasn’t spawned an active community

The reason for the last feature is that a popular video is simply a video that is a “must see” for everybody, but it doesn’t instill in people an urge to “become involved”. This is a bit of black-and-white painting of course – see also how many people created copies of the “Evolution of Dance” – but it is a general feature that applies to most of the audience.

cult video

Videos that become “cult” are not necessarily videos that achieve the highest view counts. They will however achieve a high visibility and almost 100% coverage in a certain sub-community. Such videos are regarded as viral since they virally spread within their target community. Sometimes they even create a community – their fan club.

The main aim of these videos is not a high view count on a single video, but an active community that is highly motivated to have the video be part of their culture.

A typical example is the “Diet Coke and Mentos” phenomenon. I would not be able to point to a single video on this phenomenon but there is a whole cult that has emerged around it with people doing their own experiments, posting videos, discussing it on forums, helping each other on IM etc. There are even fan clubs on Facebook.

The features of videos in this category are as follows:

  • many videos have been created on the same topic, in particular UGC
  • often, it is not clear which was the originating video that started the phenomenon
  • there is a substantial view count on the individual videos
  • not subject to fashion or short-term fads
  • interest for a sub-community mostly
  • has spawned an active community, possibly with their own website

I would use the “Ask a Ninja” series of vodcasts as another example of a cult video. It has a central website and a very active community of fans around it.

trendy video

The term “Internet meme” has been coined for the videos in this category. They are essentially videos that create a high amount of activity around the Internet for a short time, but then people lose interest and move on. They are trendy for a limited amount of time.

A typical example in this category is the “Dramatic Chipmunk” with more than 7M views on YouTube on this one video, and further millions of views on the diverse mash-ups that were created. At one point, it was a “must see” and you had to have mashed it up to be “in”. Now it has been replaced by Rick Rolling – the activity of pointing people to a URL of something but then falsely directing them to Rick Astley’s video of “Never Gonna Give You Up” on YouTube with more than 9M page views.

The features of videos in this category are as follows:

  • videos achieve high page view in a short amount of time
  • audience interest vanishes after a limited time
  • often consists of funny, shocking, embarrassing, bizarre, or slanderous content
  • there is a substantial view count on the video(s) related to the phenomenon
  • creates high user activity for a short time e.g. through mash-ups, remixes, or parodies

Now that we have defined the different types of viral videos, here are the lessons for viral video marketing campaigns.

If you want to create a popular video, create a beautiful, timeless video like the Sony Bravia Bunnies ad that everybody just has to have seen. Then make sure to release it on the Internet before you release it on TV by uploading it to YouTube and a set of other social video hosting sites. Feel free to complement that with your own website for the video. Start the viral spread by emailing your employees, friends, social networks, etc. and rely on the coolness of the video to spread.

Typical Australian ads that have achieved popular video status are Carlton Draught’s “Big Ad” and the more recent VB “Stubby Symphony” ad.

If you want to create a cult video, you should create something that will excite a sub-community and provide the opportunities for that community to emerge. Blendtec did this very well with their “Will it Blend?” videos and website. I actually believe they should open that website even further and allow discussion forums to emerge. They could pull all those blender communities on Facebook into their site. OTOH, they could just be involved in the social networks that build elsewhere around their brand to make the most of their fan base.

If your video ad is however just meant to create a high audience activity for a short time, you might consider doing a shocking video like the one Unicef created with the Smurfs. Or something a little less extreme like the funny German Coastguard video created by the Berlitz Language Institute.

Standardisation in video advertising

It’s great to read at ClickZ that the Interactive Advertising Bureau (IAB) is preparing new format guidelines for video advertising. This includes pre-, mid- and post-roll, overlays, product placement, and companion ads (display ads placed alongside video).

The standard is currently in public comment phase, which closes on 2nd May 2008.

It is good to see that the standard also contains recommendations on the ratio of ads to content and on capping the frequency of ads, to save consumers from getting overly swamped with advertising.

The effect this standard will have on the video advertising industry will be enormous. Content publishers will build their websites with these standards in mind and provide generic advertising spaces into which they can then include advertising as required from the appropriate advertisers. Advertisers can create ads that will be re-usable across websites. And video advertising agencies can finally start to emerge that provide the market place for video ads to find their locations.

This is a sign that online video advertising is maturing and more generally that free online video distribution will become more viable for content owners.

For Vquence this is great news since all this new advertising will need to be measured for impact – I expect the need for video analytics will grow enormously. 🙂

Video Metrics: an emerging industry category

Yesterday, YouTube gave video metrics to their users. If you have uploaded videos to YouTube, you can go to your video list and click “About this video” to see a history of view counts. Very simple, but a good move.

It is great to see YouTube provide this service, even if just for your own, personally uploaded videos. It validates the newly emerging industry category of “online video metrics”, which Vquence is also a part of.

Our colleagues from VisibleMeasures expressed a similar feeling in their blog entry saying: “we view anything that companies can do to help showcase the need and improve the landscape for video measurement as a plus for the entire ecosystem”. I couldn’t express it any better.

Judging by the blogging community, there is a large need for online video metrics, both for tracking your own published videos – as YouTube has been providing since yesterday – and for tracking videos published by the market in general, for market analysis and intelligence purposes.

The number of players in the field is still small and AFAIK we are the only Australians to offer these services.

U.S. spending on internet video advertising alone is expected to grow to US$4.3 billion by 2011. The need for online video publications is predicted to grow even stronger in the near future when each and every Website will be expected to use video to communicate their message. The need for video metrics will increase enormously.

Check out our new Website if you want to learn more about how Vquence measures video.

Choosing a Web charting library

At Vquence, until now we have used the Thumbstacks chart library for our graphs. TSChartlib is a simple open source charting library that uses the HTML canvas and an IE hack to create its graphs.

Vquence is now getting really serious about charts and graphs, and we were thus looking for a more visually compelling and more flexible alternative. If you do a Google search for “online charting library”, you get a massive number of links to proprietary systems (admittedly, some of them offer the source code if you pay a premium). I will not be listing them here – go find them for yourself. However, the world of decent open source charting libraries is relatively small, so I want to share the outcome of my search.

There is the Open Flash Chart library, which provides charting functionality for Flash or Flex programmers. The charts look rather nice and have overlays on the data points, which is something I sorely missed in TSChartlib.

There is an open source Flex component called FlexLib which only does line charts, IIUC.

There is PlotKit, a chart and graph plotting library for JavaScript. It supports the HTML canvas, and also SVG via the Adobe SVG Viewer or native browser support, and looks quite sexy.

Then there is JFreeChart, a 100% Java chart library that makes it easy for developers to display professional quality charts in their applications. Another Java charting library is JOpenChart. Incidentally, there’s a whole swag of further Java libraries that do charts and graphing. However, we are not too keen on Java for Web technologies.

Outside our area of interest, there are also open source chart libraries in C#, but C#/.NET is not a platform we intend to support, so these were out of the question.

Our choice came down to the “Open Flash Chart” library vs “PlotKit”. Of the two, the Flash library and technology seems more mature, easier to use, and creates sexier charts. Also, we can sensibly expect all Vquence users to have Flash installed, while we cannot expect the same to be true for SVG. However, I was fascinated by the flexible use of SVG and HTML canvas and will certainly get back to it later, when I expect it to have matured a bit more.

Our choice of Open Flash Chart was further facilitated by a Rails plugin for it. Score!

Of course: I might have totally missed some obvious or better alternatives. So, leave me a reply if you think I did!