All posts by silvia

1999: Scene Determination based on Video and Audio Features

Silvia Pfeiffer and Rainer Lienhart and Wolfgang Effelsberg, “Scene Determination based on Video and Audio Features”, Proceedings of the IEEE Conference on Multimedia Computing & Systems, June 1999, Florence, Italy, pp. 9685-9690.

Abstract

Determining the scenes of a video is a challenging task. When humans are asked to do it, the results are inconsistent, since the term “scene” is not precisely defined: it is left to each person to choose the shared attributes that integrate shots into scenes. However, consistent results can be found for certain basic attributes like dialogs, same settings and continuing sounds. We have therefore developed a scene determination scheme which clusters shots based on detected dialogs, same settings and similar audio. Our experimental results show that automatic determination of these types of scenes can be performed reliably.

Preversion
SpringerLink
Full Text from Springer

1999: Eine neue objektorientierte Analysetechnik fuer die Entwicklung von Audioanalyse-Algorithmen

P. Tomczyk and S. Pfeiffer and W. Effelsberg, “Eine neue objektorientierte Analysetechnik fuer die Entwicklung von Audioanalyse-Algorithmen (A new object-oriented analysis technique for the development of audio analysis algorithms)”, Technical Report TR-99-005, 1999, Mannheim, Germany.

Abstract

The development of audio analysis algorithms is a classical, function-oriented task. Nevertheless, it makes sense to implement such algorithms in an object-oriented programming language like C++, in particular to make it easier for other project members to reuse routines. A classical object-oriented analysis, e.g. following UMT …

1999: The importance of perceptive adaptation of sound features in audio content processing

Silvia Pfeiffer, “The importance of perceptive adaptation of sound features in audio content processing”, SPIE Storage and Retrieval for Image and Video Databases VII, January 1999, San Jose, CA, USA, pp. 328-337.

Abstract

In analyzing audio material for features useful for extracting content, we must consider the value gained by adapting our analysis algorithms to the analysis processes of the human ear. This paper thoroughly examines this aspect with regard to loudness features. The increase in correlation to be gained by such cognitive processing is about 10%.

Preversion

1997: Video Abstracting

Rainer Lienhart and Silvia Pfeiffer and Wolfgang Effelsberg, “Video Abstracting”, Communications of the ACM 40, 12, December 1997, pp. 54-62.

Abstract

This article discusses the usefulness of video abstracts as a sequence of video clips, similar to video trailers, and introduces algorithms to automatically create video abstracts from existing videos.

Preversion


1997: Automatic Movie Abstracting

Rainer Lienhart and Silvia Pfeiffer and Stephan Fischer, “Automatic Movie Abstracting”, Technical Report TR-97-003, 1997, Mannheim, Germany, 14 pp.

Abstract

Presented is an algorithm for the automatic production of a video abstract of a feature film, similar to a movie trailer. It selects clips from the original movie based on the detection of special events like dialogs, shots, explosions and text occurrences, and on general action indicators applied to scenes. These clips are then assembled into a video trailer using a model of editing. Additional clips, audio pieces, images and text, which are also retrieved from the original video for their content, are added to produce a multimedia abstract. The collection of multimedia objects is presented on an HTML page.

Download

1996: Abstracting Digital Movies

Silvia Pfeiffer and Rainer Lienhart and Stephan Fischer and Wolfgang Effelsberg, “Abstracting Digital Movies”, Journal of Visual Communication and Image Representation 7, 4, December 1996, pp. 345-353.

Download: http://www.sciencedirect.com/science?_ob=MImg&_imagekey=B6WMK-45N5033-5-1&_cdi=6937&_orig=browse&_coverDate=12%2F31%2F1996&_sk=999929995&view=c&wchp=dGLbVtz-zSkWW&_acct=C000050221&_version=1&_userid=10&md5=8c74c65f470f9761450df959c9d1a297&ie=f.pdf
Preversion: http://www.informatik.uni-mannheim.de/techberichte/html/TR-96-005.html

Abstract

Large video-on-demand databases consisting of thousands of digital movies are not easy to handle: the user must have an attractive means of retrieving his movie of choice. For analog video, movie trailers are produced to allow a quick preview and perhaps stimulate possible buyers. This paper presents techniques for automatically producing such movie abstracts of digital videos.

1996: Automatic Audio Content Analysis

Silvia Pfeiffer and Stephan Fischer and Wolfgang Effelsberg, “Automatic Audio Content Analysis”, Proceedings of the ACM Multimedia Conference 1996, November 1996, Boston, USA, pp. 21-30.

Download: http://portal.acm.org/ft_gateway.cfm?id=244139&type=pdf&coll=Portal&dl=ACM&CFID=28002136&CFTOKEN=69856697
Preversion: http://www.informatik.uni-mannheim.de/techberichte/html/TR-96-008.html

Abstract

This paper describes the theoretic framework and applications of automatic audio content analysis. After explaining the basic properties of audio analysis, we present a toolbox that forms the basis for the development of audio analysis algorithms. We also describe new applications which can be developed using the toolset, among them music indexing and retrieval as well as violence detection in the sound track of videos.

Download from ACM Digital Library:

ACM DL Author-ize service: Automatic audio content analysis

Silvia Pfeiffer, Stephan Fischer, Wolfgang Effelsberg
MULTIMEDIA ’96 Proceedings of the fourth ACM international conference on Multimedia, 1997


1996: The MoCA Workbench: Support for Creativity in Movie Content Analysis

Rainer Lienhart and Silvia Pfeiffer and Wolfgang Effelsberg, “The MoCA Workbench: Support for Creativity in Movie Content Analysis”, Proceedings of the IEEE Conference on Multimedia Computing & Systems 1996, June 1996, Hiroshima, Japan, pp. 314-321.

Preversion: http://www.informatik.uni-mannheim.de/techberichte/html/TR-95-034.html

Abstract

Semantic access to the content of a video is highly desirable for multimedia content retrieval. Automatic extraction of semantics requires content analysis algorithms. Our MoCA (Movie Content Analysis) project provides an interactive workbench supporting the researcher in the development of new movie content analysis algorithms. The workbench offers data management facilities for large amounts of video/audio data and derived parameters. It also provides an easy-to-use interface for a free combination of basic operators into more sophisticated operators. We can combine results from video track and audio track analysis. The MoCA Workbench shields the researcher from technical details and provides advanced visualization capabilities, allowing attention to focus on the development of new algorithms. The paper presents the design and implementation of the MoCA Workbench and reports practical experience.

Metavidwiki gone public

The revolution is here and now! If you thought you’ve seen it all with video web technology, think again.

Michael Dale and Aphid (Abram Stern) have published a plugin for Mediawiki called Metavidwiki which is simply breathtaking.

It provides all of the following features:

  • wiki-style timed annotations including links to other resources
  • a cool interface for navigating a video via its annotated clips
  • plain text search for keywords in the annotations
  • search result display of video segments related to the keywords with inline video playback
  • semantic search using speaker and other structured information
  • embedding of full video or select clips out of videos into e.g. blogs
  • web authoring of mashups of select clips from diverse videos
  • embedding of these mashups (represented as xspf playlists)
  • works with Miro through providing media RSS feeds

Try it out and be amazed! It should work in any browser – provide feedback to Michael if you discover any issues.

All of Metavidwiki is built using open standards, open APIs, and open source software. This gives us a taste of how far we can take open media technology and how much of a difference it will make to Web Video in comparison to today’s mostly proprietary and non-interoperable Web video applications.

The open source software that Metavidwiki uses is very diverse. It builds on Wikipedia’s Mediawiki, the Xiph Ogg Theora and Vorbis codecs, a standard LAMP stack and AJAX, and the Annodex apache server extension mod_annodex, and it is capable of providing the annotations as CMML, ROE, or RSS. On the client side it uses the capabilities of your specific Web browser: should you run the latest Firefox with Ogg Theora/Vorbis support compiled in, it will make use of that capability. Should you have a VLC browser plugin installed, it will use that to decode Ogg Theora/Vorbis. The fallback is the Java Cortado player for Ogg Theora/Vorbis.
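The client-side fallback chain described above can be sketched as a small piece of decision logic. The capability names and function below are illustrative only, not Metavidwiki’s actual code (the real plugin does its own capability probing in the browser):

```typescript
// Illustrative sketch of a playback fallback chain like Metavidwiki's.
// All names here are made up for this example.
type Capabilities = {
  nativeTheora: boolean; // browser decodes Ogg Theora/Vorbis natively (e.g. a Firefox build with it compiled in)
  vlcPlugin: boolean;    // a VLC browser plugin is installed
  javaEnabled: boolean;  // Java applets can run (needed for the Cortado player)
};

function choosePlayer(caps: Capabilities): "native" | "vlc" | "cortado" | "none" {
  if (caps.nativeTheora) return "native";  // best case: native decoding
  if (caps.vlcPlugin) return "vlc";        // next best: VLC plugin
  if (caps.javaEnabled) return "cortado";  // last resort: Java Cortado applet
  return "none";                           // no way to play Ogg Theora/Vorbis
}
```

So, for example, a browser with neither native Ogg support nor a VLC plugin, but with Java enabled, would fall back to the Cortado applet.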

Now just imagine for a minute the type of applications that we will be able to build with open video APIs and interchangable video annotation formats, as well as direct addressing of temporal and spatial fragments of media across sites. Finally, video and audio will be able to become a key part in the picture of a semantic Web that Tim Berners-Lee is painting – a picture of open and machine-readable information about any and all information on the Web. We certainly live in exciting times!!!

The nature of CMML

Today, for the millionth time I had to listen to a statement that goes along the following lines: “CMML technology is not ideal for media annotations, because the metadata is embedded with the object rather than separate”.

Once and for all, let me shout it out: THIS IS UTTER BULLSHIT!

I am so sick of hearing this statement from people who criticise CMML from a position of complete lack of understanding. So, let me put it straight.

While it is true that CMML has the potential to be multiplexed as a form of timed text inside a media file, the true nature of CMML is that it is versatile and by no means restricted to this representation.

In fact, the specification document for CMML quite clearly specifies an XML document format. In that respect, CMML is more like RSS than a timed text format.

Further, I’ll let you in on a little secret: CMML can be stored in databases. Yes!! In fact, CMMLWiki, one of the first online media applications implemented using Annodex, uses a MySQL database to store CMML data. The format in which it is extracted depends on your needs: you can retrieve the content of a single field, you can put it into an interchangeable XML file (a CMML document), or you can multiplex it with the media data into an Annodex file.
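To make the XML nature concrete, here is a minimal sketch of what a standalone CMML document looks like. The element structure follows the CMML specification (stream, head, and clip elements with annotations), while the title, clip ids, times and URLs are invented for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<cmml>
  <!-- which media resource the annotations refer to -->
  <stream timebase="0">
    <import src="interview.ogg" contenttype="video/ogg"/>
  </stream>
  <!-- metadata about the whole video -->
  <head>
    <title>Example interview</title>
  </head>
  <!-- timed annotations: each clip covers a temporal segment -->
  <clip id="intro" start="npt:0">
    <a href="http://example.com/speakers">Who is speaking</a>
    <desc>Opening remarks.</desc>
  </clip>
  <clip id="question1" start="npt:12.5">
    <desc>First question from the audience.</desc>
  </clip>
</cmml>
```

Exactly the same information can live in database fields or be interleaved with the media stream in an Annodex file; the XML form above is simply the interchange representation.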

The flexibility of CMML is its beauty! It was carefully designed to transform easily between these different representations. It is powerful precisely because it can appear in all of these different formats. By no means is this “not ideal”.