In the last week, I have received many emails replying to my request for feedback on the video accessibility demo. Thanks very much to everyone who took the time.
Interestingly, I got very little feedback on the subtitle and textual audio annotation aspects of the demo, even though those were the focus of my analysis. That is my own fault, though, because I chose a good-looking video player skin over an accessible one.
This is where I need to take a step back and explain the status of HTML5 video and its general accessibility. Some of this repeats an email that I sent to the W3C WAI-XTECH mailing list.
Browser support of HTML5 video
The HTML5 video tag is still rather new and has not been implemented in all browsers yet – and not all browsers that do implement it support the Ogg Theora video codec that my demo uses. Only the latest Firefox 3.5 release supports my demo out of the box. For Chrome and Opera you have to use the latest nightly builds (which I am not even sure are publicly available). IE does not support it at all. For Safari/WebKit you need the latest release plus the XiphQT QuickTime component installed to provide support for the codec.
My recommendation is clearly to use Firefox 3.5 to try this demo.
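If you want to check whether your browser qualifies, the canPlayType() method on the media element gives a quick answer. This is a minimal sketch (the alert message is just for illustration); browsers that don’t implement the video element at all will also fail the check:

    <script type="text/javascript">
      // Rough feature check: does this browser have a native video element
      // and does it claim to decode Ogg Theora/Vorbis?
      var v = document.createElement("video");
      var canPlayOgg = v.canPlayType &&
          v.canPlayType('video/ogg; codecs="theora, vorbis"') !== "";
      if (!canPlayOgg) {
        alert("This demo needs a browser with native Ogg Theora support, e.g. Firefox 3.5.");
      }
    </script>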
Standardisation status of HTML5 video
The standardisation of the HTML5 video tag is still in progress. Some of the attributes have not yet been validated through implementations, some of the use cases have not been turned into specifications, and, most importantly for the topic at hand, there has been very little experimentation with accessibility around the HTML5 video tag.
Accessibility of video controls
Most of the comments that I received on my demo were concerned with the accessibility of the video controls.
In HTML5 video, there is an attribute called @controls. If it is present, the browser is expected to display default controls on top of the video. Here is what the current specification says:
“This user interface should include features to begin playback, pause playback, seek to an arbitrary position in the content (if the content supports arbitrary seeking), change the volume, and show the media content in manners more suitable to the user (e.g. full-screen video or in an independent resizable window).”
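In markup, that is as simple as adding the boolean attribute to the element (the file name here is just a placeholder):

    <video src="demo.ogv" width="480" height="360" controls>
      Your browser does not support the HTML5 video element.
    </video>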
In Firefox 3.5, the controls attribute currently creates the following controls:
- play/pause button (toggles between the two)
- slider for current playback position and seeking (also displays how much of the video has currently been downloaded)
- duration display
- roll-over button for volume on/off and to display slider for volume
- AFAIK fullscreen is not currently implemented
Further, the HTML5 specification states that if the @controls attribute is absent, “user agents may provide controls to affect playback of the media resource (e.g. play, pause, seeking, and volume controls), but such features should not interfere with the page’s normal rendering. For example, such features could be exposed in the media element’s context menu.”
In Firefox 3.5, this has been implemented with a right-click context menu, which contains:
- play/pause toggle
- mute/unmute toggle
- show/hide controls toggle
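To illustrate the scripted alternative that the specification describes, here is a minimal sketch of custom controls built on the media element’s JavaScript API (play(), pause(), and the paused and muted properties); the element IDs and file name are made up for this example:

    <video id="demovideo" src="demo.ogv" width="480" height="360"></video>
    <button type="button" onclick="togglePlay()">Play/Pause</button>
    <button type="button" onclick="toggleMute()">Mute/Unmute</button>
    <script type="text/javascript">
      var video = document.getElementById("demovideo");
      function togglePlay() {
        // paused is true both before playback has started and while paused
        if (video.paused) { video.play(); } else { video.pause(); }
      }
      function toggleMute() {
        video.muted = !video.muted;
      }
    </script>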
When the controls are being displayed, there are keyboard shortcuts to control them:
- space bar toggles between play and pause
- left/right arrow winds the video forward/back by 5 seconds
- CTRL+left/right arrow winds the video forward/back by 60 seconds
- HOME+left/right jumps to the beginning/end of the video
- when focused on the volume button, up/down arrow increases/decreases volume
As for the exposure of these controls to screen readers, Mozilla implemented this in June; see Marco Zehe’s blog post on it. It requires using focus mode for now, so if you haven’t been able to control the video element with the keyboard yet, that may be the reason.
New video accessibility work
My work is meant to take video accessibility a step further and explore how to deal with what I call time-aligned text files for video and audio. For the purposes of accessibility, I am mainly concerned with subtitles, captions and audio descriptions that come in textual form and should be read out by a screen reader or made available to braille devices.
I am exploring both time-aligned text that comes inside a video file and time-aligned text that is available as an external Web resource and is merely associated with the video through HTML. It is this latter use case that my demo explored.
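To give an idea of what this association looks like, the demo’s markup is roughly along the following lines: child elements point to external time-aligned text files, and a JavaScript library interprets them, since browsers do not natively understand such elements yet. The element and attribute names here are illustrative rather than part of any specification:

    <video src="demo.ogv" controls>
      <!-- hypothetical markup, interpreted by script, not by the browser -->
      <itext src="demo_en.srt" lang="en" category="subtitle"></itext>
      <itext src="demo_de.srt" lang="de" category="subtitle"></itext>
      <itext src="demo_en_tad.srt" lang="en" category="textaudiodesc"></itext>
    </video>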
To create a nice-looking demo, I used a skin for the video player that was developed by somebody else. I didn’t pay attention to whether that skin was actually accessible, and this is the source of most of the problems that have been mentioned to me so far.
A new, simpler demo
I have now developed a new demo that uses the default player controls, which should be accessible as described above. I hope that the extra button I implemented for the menu with all the text tracks is now accessible through a screen reader, too.
UPDATE: Note that there is currently a bug in Firefox that prevents tabbing to the video element from working. This will be possible in the future.
I am still unable to see the menu which is supposed to come up when pressing the space bar or clicking the link saying “Access Subtitles, Captions and Audio Descriptions – press space bar”. I can see the section or div holding the list created in the itextMenu function, but I am unable to see the content that is supposed to be inside.
I am testing with Firefox 3.5.1 and NVDA trunk r3094. I am even unable to see this menu using NVDA’s object navigation.
Has anybody checked this with acc explorer, or some other tool capable of inspecting IA2 hierarchy?
Is this a problem on my end, or a problem with one of the tools in the chain, namely the script, Firefox or NVDA?
Oops, I am sorry. I can’t really say what has changed, but now I can see it all fine. I just restarted all the applications involved, reloaded the page, and voilà: it’s now nicely showing all the options for me.
I am getting even more impressed.
Perhaps the menu can be further tweaked using ARIA as explained and shown in the example here: http://www.w3.org/TR/2008/WD-wai-aria-practices-20080204/#Menu . Or do you think this is not appropriate for this kind of content? Or will it be possible to merge this whole navigation with the existing player controls in the future?
Hello again,
Okay, after some more playing around, I am afraid it needs some tweaking, because controlling it via the keyboard does not seem to be reliable. Sometimes it shows the menu and sometimes it does not.
When I am emulating the mouse, the menu always shows fine. When testing keyboard controls I have tried with NVDA set to either focus or browse mode. So hopefully it’s not NVDA blocking the controls.
@pvagner – thanks for all the testing and feedback! Any patches you or anybody would have to improve accessibility would be happily accepted.
Unfortunately, the menu is not built using the menu element, so I cannot use the linked WAI-ARIA recommendations. The menu is instead built using ul, li and CSS, so I might need to think about using YUI3 (see http://www.yuiblog.com/blog/2009/08/03/aria-made-easier-with-yui-3/) or at least the same methods that YUI3 uses.
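If it turns out that the ARIA menu roles can be applied to the existing ul/li markup directly, a retrofit might look roughly like this (an untested sketch; the list entries are placeholders, and arrow-key handling between the items would still have to be scripted, which is where the YUI3 helpers could come in):

    <ul id="itextmenu" role="menu">
      <!-- placeholder entries; the real menu is generated by the itextMenu script -->
      <li role="menuitem" tabindex="0">English subtitles</li>
      <li role="menuitem" tabindex="-1">German subtitles</li>
      <li role="menuitem" tabindex="-1">English audio description (text)</li>
    </ul>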
@Silvia I think ARIA is well suited for that. If I understand correctly, it is good for implementing custom keyboard navigation and programming custom exposure of the controls’ roles and states to assistive technologies.
This is how I understand it, so I apologize if it is not accurate enough. I’ll try to experiment with it more to get a better understanding.
Silvia,
Does any of this accessibility work cover secondary audio languages or descriptive audio tracks? I would be interested if you have looked into that.
John,
Anything that is actually an audio or video track (let’s not forget about sign language videos) should be done as a synchronised track within the binary video file, IMO. Thus, an Ogg video would need to contain multiple audio tracks for all the different languages, multiple descriptive audio tracks in different languages, and also sign language tracks (possibly in different sign languages) to be fully accessible.
The idea that we have been toying with is to have all these different tracks available on the server and deliver a custom Ogg file depending on the browser’s request. For example, if you ask for a particular language, you would get all the audio tracks for that language together with the video; or if you ask for content for a deaf user with their preferred sign language specified, you’d get no audio tracks and just the video with that sign language track.
There are no implementations of this yet, but I would be curious to get your feedback. This is indeed what I want to experiment with for video accessibility in the near future.
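To make the idea a little more concrete, such a request could hypothetically be expressed as parameters on the media URL, with the server composing an Ogg file that contains only the requested tracks. Nothing like this is implemented anywhere yet, and the parameter names are invented purely for illustration:

    <!-- purely hypothetical: the query parameters are invented for illustration -->
    <video src="http://example.com/media/demo.ogv?audio=de&textaudiodesc=de&sign=auslan"
           controls>
    </video>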