Listen Up! Binaural Sound
Lately there has been a bit of a buzz about binaural sound. Early on Saturday morning, Rob da Bank broadcast a 3D Headphone Special on BBC Radio 1 (see the trailer below). In this show he used binaural microphones to capture and share his journey from the Isle of Wight to Broadcasting House. There was also a Maida Vale session with Lucy Rose, recorded using a dummy head microphone (known as Fritz to his friends), which can be seen in these videos. I thought I'd take this opportunity to share what we know about this subject and update you on some of our own work in this area.
So what is binaural sound? Those of you who read this blog regularly may already be familiar with the concept, as in the last couple of years we've discussed some binaural experiments we've done here. Briefly, it is a sound production technique that mimics the natural hearing cues created by our head and ears to give the impression of 3D sound when listening on headphones. This contrasts with listening to conventional stereo sound on headphones, as we currently do, which gives the impression that sounds are all inside your head. This FAQ page from one of our previous experiments gives a bit more detail about binaural.
Binaural sound itself is not a new idea: public performances were transmitted in some form of binaural stereo long before the early 1970s, when dummy head microphones, like Fritz and the one pictured below, became commercially available. These are basically human manikins with microphones placed where the eardrums should be. The BBC made several pioneering radio dramas using these microphones at the time, most notably The Revenge, made in 1978. It was a 20 minute play written and performed by Andrew Sachs without a single word of dialogue. This programme is often played on BBC Radio 4 Extra, so look out for upcoming repeats. Since then there have been occasional experiments with these techniques in BBC programmes. In 2008 the Radio 4 documentary Bravo November, about a Chinook helicopter in the Falklands War, used dummy head recordings from inside the helicopter. The brilliantly imaginative interactive drama The Dark House, broadcast in 2002, used miniature microphones placed in the ears of the actors, so you could listen from each of their perspectives.
So why aren't all BBC programmes available in binaural sound? These recording techniques can create great immersive effects when the right demonstrations are used. Rob da Bank showed some of the classic examples, which can be quite spooky to listen to. But often the quality of binaural recordings is not yet good enough. It is very difficult to create the impression of sounds coming from in front of your head, partly because of conflicting visual information and partly because of a mismatch between the shape of the dummy head and the shape of your own, which means that the auditory cues are not perfect. This mismatch of cues also affects the tonal quality of binaural recordings.
There are also practical issues with dummy head recording. Dummy heads are not the most portable or inconspicuous of microphones. Another issue is that binaural sound cannot be reproduced over loudspeakers without special processing techniques, which are still developing. In 2011 there was a feature about a scientist at Princeton University who is one of the many people working in this area. (That piece inspired an item on Today that year, in which Evan Davis tried to get the nation to put their hands in front of their faces to block out a noise signal, much to our amusement.) Thanks to Internet distribution of programmes, it is now more feasible that different versions of a programme could be delivered, depending on whether headphones or loudspeakers are being used. It makes sense to develop production techniques that are independent of the reproduction method.
Our more recent experiments with binaural sound have used a different technique, which I'll call headphone surround. This takes surround sound material produced for loudspeaker playback and, using binaural techniques, creates virtual loudspeakers around the listener. Instead of playing each programme over loudspeakers and recording this using a dummy head, a set of measurements (known as head-related transfer functions, or HRTFs) can be made once and then used in software to create headphone surround from any 5.1 audio signal.
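To make the virtual loudspeaker idea concrete, here is a minimal Python sketch: each loudspeaker feed is convolved with a left-ear and a right-ear impulse response for that loudspeaker's direction, and the results are summed into two headphone channels. This is an illustration under stated assumptions, not the processing used in our experiments. The `toy_hrir` function is a hypothetical stand-in for measured HRTF data (it models only a crude delay and level difference between the ears), and the loudspeaker angles, the sign convention and the omission of the LFE channel are choices made for this example.

```python
import numpy as np

SAMPLE_RATE = 48000

# Assumed nominal loudspeaker azimuths in degrees (positive = listener's left).
# The LFE channel is left out of this sketch.
SPEAKER_AZIMUTHS = {"L": 30.0, "R": -30.0, "C": 0.0, "Ls": 110.0, "Rs": -110.0}


def toy_hrir(azimuth_deg, length=64):
    """Hypothetical placeholder for a measured impulse response pair.

    Returns (left_ear, right_ear). The ear further from the source gets a
    delayed, quieter impulse. Real HRTFs also carry the spectral detail
    produced by the head and outer ear, which this toy model ignores.
    """
    az = np.radians(azimuth_deg)
    delay = int(round(abs(np.sin(az)) * 0.00066 * SAMPLE_RATE))  # up to ~0.66 ms
    far_gain = 1.0 - 0.5 * abs(np.sin(az))
    near = np.zeros(length)
    near[0] = 1.0
    far = np.zeros(length)
    far[delay] = far_gain
    # A source on the left (positive azimuth) reaches the left ear first.
    return (near, far) if azimuth_deg >= 0 else (far, near)


def render_headphone_surround(channels, hrirs):
    """Sum every loudspeaker feed, filtered for each ear, into two channels."""
    n = max(len(sig) for sig in channels.values())
    taps = max(len(pair[0]) for pair in hrirs.values())
    out = np.zeros((2, n + taps - 1))
    for name, sig in channels.items():
        h_left, h_right = hrirs[name]
        y_left = np.convolve(sig, h_left)
        y_right = np.convolve(sig, h_right)
        out[0, : len(y_left)] += y_left
        out[1, : len(y_right)] += y_right
    return out
```

In use, you would build `hrirs = {name: toy_hrir(az) for name, az in SPEAKER_AZIMUTHS.items()}` once, then call `render_headphone_surround` with a dictionary holding one array of samples per loudspeaker channel. Swapping the toy responses for measured ones is what would make the result sound like loudspeakers in a room.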
Headphone surround has the advantage that existing programmes with 5.1 audio could be listened to with headphones without losing the surround sound impression. We have previously shared a radio drama and a carol concert that were made using this technique. Radio France have just launched a new website, which hosts a collection of programmes that contain innovative sound. Much of the content is offered in a binaural format, also made in this way. But why stop there? The use of HRTFs allows you to create a sound source in any direction, so it is feasible to go beyond established loudspeaker formats and create a rich 3D scene. We have previously discussed the concept of object-based audio, which potentially allows any reproduction method including 3D binaural sound.
There are still many open problems with binaural sound, though. I've touched on the issue of incorrect cues caused by differing head and ear shapes between the dummy head used for recording and the listener. If the HRTFs for the listener are known, then binaural sound can be created specifically for them. This is possible in a research lab now and one day may be available for everyone. Work is underway to develop a standard file format for this kind of data.
Another issue is head movement. For a convincing impression of sounds coming from outside of the head, they should not move with your head but should stay fixed in place. This is only possible if a system can monitor your head movement and compensate for it dynamically. Again this is easily done in a lab these days but we do not yet have widely available technology to do this for BBC audiences.
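The compensation step can be sketched in a few lines of Python. The idea is that a source fixed in the room must be re-rendered from a new head-relative direction every time the tracker reports a new head orientation. This is a simplified illustration that only handles rotation about the vertical axis (yaw); the function names and the sign convention (positive angles to the listener's left) are assumptions made for this example.

```python
def wrap_degrees(angle):
    """Wrap an angle in degrees into the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def head_relative_azimuth(source_azimuth, head_yaw):
    """Direction of a room-fixed source as seen from the rotated head.

    Both angles are in degrees, measured in the room, with positive
    values to the left. Turning the head left by some angle makes the
    source appear to move right by the same angle.
    """
    return wrap_degrees(source_azimuth - head_yaw)


def nearest_measured_direction(azimuth, measured_azimuths):
    """Pick the closest direction for which an HRTF measurement exists."""
    return min(measured_azimuths, key=lambda m: abs(wrap_degrees(azimuth - m)))
```

A renderer would call `head_relative_azimuth` for every source on each tracker update and then filter with the HRTF chosen by `nearest_measured_direction` (or an interpolation between neighbouring measurements), so that the sound stays put while the head turns.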
There have already been a few mobile apps that have used binaural sound. Recently I worked with Neil Cullen, an MSc student at the University of Salford, to build a prototype that creates dynamic headphone surround on a mobile (see photo). It used a relatively cheap Bluetooth head-tracking sensor that attached to the headphones. The quality and affordability of mobile technology for audio processing and head tracking will improve over the next few years, so this may be feasible in the future.
When head tracking is used, a dummy head recording is no longer adequate. In Beck's recent reinterpretation of David Bowie's Sound and Vision, which was released last week with interactive 360 degree video and binaural audio, a nightmarish head was created to give four different perspectives depending on the chosen orientation of the video. Other approaches exist using microphone arrays and applying HRTFs to the captured signals, allowing free head rotation and individual HRTFs to be used. This is a hot area of research at the moment.
Last summer we evaluated a range of headphone surround systems in a listening test. Participants listened to a range of BBC programme clips on headphones, comparing these systems to a stereo down-mix of the 5.1 soundtrack, which is what we currently hear, and graded the sound quality of each. The test included a system that used individual HRTF measurements of the listener and dynamic head tracking. The results, which are due to be presented shortly, showed that there are more improvements to be made before high quality binaural broadcasting is possible.
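For readers unfamiliar with the stereo down-mix used as the reference, the sketch below shows one common way of folding 5.1 into two channels. The gains of about 0.707 (-3 dB) on the centre and surround channels follow the widely used ITU-R BS.775 convention and are an assumption here; the down-mix used in the listening test is not specified in this post. The LFE channel is omitted, as is typical.

```python
import numpy as np


def downmix_51_to_stereo(l, r, c, ls, rs,
                         centre_gain=np.sqrt(0.5),
                         surround_gain=np.sqrt(0.5)):
    """Fold five full-range channels into stereo (LFE left out).

    All inputs are equal-length arrays of samples. The default gains of
    roughly 0.707 (-3 dB) are an assumed, commonly used choice.
    """
    left = l + centre_gain * c + surround_gain * ls
    right = r + centre_gain * c + surround_gain * rs
    return np.stack([left, right])
```

Note that this simple sum carries no directional cues for the surround channels, which is why the rear sound collapses into the ordinary stereo image on headphones.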
With colleagues in the EBU we are organising an event on binaural audio in May, where we hope to discuss the opportunities and challenges with the broadcast community. Meanwhile we will continue to explore the potential of this technology to create immersive headphone sound for BBC programmes.
Comment number 1.
At 4th Mar 2013, Andy Finney wrote: Interesting to see (hear?) this being discussed again. I've found that any spaced mics, especially with a baffle between them, can give a binaural effect although accuracy can suffer of course. Mono compatibility can also be compromised with binaural (once a big concern in radio ... it is less so now?) due to the subtle phase relationships in the sound.
Back in the days of quad, JVC had a system that used frequency-related time delays to 'translate' binaural recordings into something you could play on speakers. The sound field gave good positioning and seemed to extend beyond the speaker width. There was a four-channel demo of this which was the only quad playback that made me actually look behind me.
Comment number 2.
At 17th Mar 2013, dallas simpson wrote: RE: "But often the quality of binaural recordings is not yet good enough. It is very difficult to create the impression of sounds coming from in front of your head, partly because of conflicting visual information and partly because of a mismatch between the shape of the dummy head and the shape of your own, which means that the auditory cues are not perfect. This mismatch of cues also affects the tonal quality of binaural recordings."
Surely a bit of creative thinking is appropriate here. The centre front weak spot in binaural recordings is well known - so place important sound images off centre. (I remember instructions to presenters on 405 line TV about not wearing complex striped patterns on their jackets because of moire patterns that were created due to the limitations of the 405 line system. That didn't stop TV being used and developed though, despite it being technically flawed in a number of ways.) A bigger issue is front / rear ambiguity and reversal, and again this can be minimised with moving sound images, which, having spatial trajectories, are easier to track. Generally centre front images are stronger if the head to source distance is higher, say 1.5 metres or more in front of the binaural microphones, and in a near anechoic environment, i.e. outdoors. There are other improvement techniques that can be used in addition. With regard to tonal qualities, simply EQ as appropriate with variable Q parametrics.
Frankly, as a creator of binaural content for over 15 years, I really see no reason why the BBC can't produce more binaural content as Rob da Bank on Radio 1 has done, and as other binaural programmes have already been produced in the past, why limit their creation now? The Dark House used in-ear mics on the performing artists, with all the problems of varying incompatibility with listeners' heads, lack of speaker and mono compatibility, yet was highly acclaimed. Hey that's binaural, get over it. There was a strong positive listener reaction to the Rob da Bank binaural special despite all the inherent technical issues on the binaural quality and the fact that the live (allegedly binaural) Maida Vale session with Lucy Rose was actually more stereo than binaural (just go and listen to it!).
Why is there such a hang up about mono and speaker compatibility, and other issues, with binaural material? Of course, it's production costs. It's far more expensive to create two audio productions - speaker stereo / surround and headphone binaural surround - and wouldn't it be great if we could just create speaker productions and binauralise them at the touch of a button? This thinking totally misses the point that our binaural hearing is 'with ears', demanding a production process facilitating a recreation of our full hearing potential, spherical periphonic surround sound as we perceive the world around us. NOT a binaural conversion of a virtual listener in someone's living room listening to their 5.1 speaker system!
The BBC research facility seems to be delaying any serious development of binaural content across the network by insisting that this remarkable technique, with all its quirks and inadequacies, cannot be more widely used unless and until BBC Research has nailed all the problems it perceives and has spent years of R&D perfecting the system.
Meanwhile those of us who are not subject to these unnecessary restrictions are getting on with the job of producing and creating highly spatial binaural content for listeners to enjoy. After all, speaker surround sound recording and reproduction is not technically fully perfected and 5.1 presentation is far removed from our own binaural periphonic hearing; even Ambisonics is a work in progress. But it didn't stop these formats from being rolled out for listeners to enjoy. And full credit to Rob da Bank for promoting some binaural content on BBC Radio 1.
So for heaven's sake let's not get paranoically restrictive by throwing out the baby with the bathwater. Let's have more headphone surround sound binaural content across the board for people to enjoy in this headphone age, and address the technical details as we go along.
Comment number 3.
At 18th Mar 2013, Michelle Summers wrote: Chris, thanks for a fascinating article... had not heard about binaural sound before. Have come across quite a bit of content on technology but mostly stuff that relates to form factor of devices and interactivity. Getting something like this right and integrated into regular broadcasting I believe would make for an incredible user experience. Will definitely listen out for The Revenge when it's next on... I'm curious to hear what it's like!
Comment number 4.
At 18th Mar 2013, Meryl Coach wrote: @Michelle Summers - have heard the play a couple of times already and it's definitely worth it for the experience... Andrew Sachs is brilliant and to be honest it's tough to believe that the show was produced in 1978.
Comment number 5.
At 18th Mar 2013, Toonman wrote: Chris, are there any BBC samples that we could download or listen to online... I'm in South Africa at the moment so don't have access to the broadcast content and am quite interested to hear how it sounds.
Comment number 6.
At 22nd Mar 2013, Chris Pike wrote: @Toonman: You can still listen to several different versions of the Nine Lessons and Carols concert from 2011 on Rupert Brun's blog post: /blogs/radio3/2011/12/the-festival-of-nine-lessons-and-carols-in-surround-sound.shtml
These versions were made using virtual surround sound techniques from an original mix made for loudspeakers, rather than a dummy head recording.
@dallas simpson: Our aim is to create high quality content for BBC audiences. We are working to improve sound quality using these techniques, as we also see the great potential of binaural sound. I agree that virtual 5.1 processing does not make best use of these techniques; I hope that my article made that clear.
@Micelle Summers: Thanks. I'm glad that you enjoyed the article.
Comment number 7.
At 22nd Mar 2013, Chris Pike wrote: Apologies for the typo there Michelle.
Comment number 8.
At 13th Apr 2013, DWareing wrote: I was disappointed to find that the Maida Vale session with Lucy Rose, touted as "binaural", sounded to me as if the vocals were recorded in mono.
Comment number 9.
At 13th Apr 2013, DWareing wrote: At 14:53 17th Mar 2013, dallas simpson wrote:
"
Surely a bit of creative thinking is appropriate here. The centre front weak spot in binaural recordings is well known - so place important sound images off centre."
I think there is probably too much creative thinking employed to explain why binaural tends to fall short of expectations. The simple explanation is that we are not reproducing what is required accurately enough. IMO subtle (broadband) cues around 3.5kHz are involved in front/back discrimination. This is in the region where the ear is most sensitive, and also where ear canal and pinna resonances tend to obscure more subtle spectral differences with varying interaural correlation. Luckily the colouration involved is low compared to the more obvious (and somewhat useless) peaks and troughs above 8kHz that vary between individuals.
Comment number 10.
At 3rd Jun 2013, bokajj wrote: We have been working on a new headset with head tracker for binaural sound playback www.intelligentheadset.com.
I think there are two interesting formats for binaural audio. One is playing back a binaural recording, which you can do on almost any stereo headset. Here you miss the orientation and are often subject to front/back confusion, leaving you in doubt whether the sound is in front of you or behind you. The other is binaural audio created for playback using a head tracker. Here you define sound sources by their position, direction and other features in relation to you, the listener.
I had my doubts in the beginning, but I must say it is working surprisingly well, thanks to the supplier of our binaural audio engine, and you are able to pinpoint the direction of the sound sources very precisely.
Comment number 11.
At 26th Jun 2013, DWareing wrote: Here is an example of binaural that is stable wrt the ground (not head) frame of reference with head movement. It does rotate with head 'yaw' but has 'inertia' (for me at least..)