We’re limited to an ear on each side of our head, yet we’re able to detect sounds from all around us. The fact that this is possible really is an amazing anatomical feat. After learning more about HRTFs, your ability to localize sound in three dimensions will make a lot more sense.
HRTF stands for Head-Related Transfer Function, and it describes the impact that the listener's anatomy has on the sound from any given location. Everyone’s ears are different, and hence everyone has their own unique set of HRTFs – you can think of your HRTF as your acoustic fingerprint.
The cues an HRTF provides for locating the direction of a sound can be broken down into three main components:
Arrival time differences of a sound at each ear (Interaural Time Differences)
Level differences of a sound at each ear (Interaural Level Differences)
Spectral cues from interactions with one’s anatomy
HRTF and Interaural Time Differences (ITD):
When a sound approaches a listener, it must travel a path to each of the listener’s ears. Unless the source is equally far from both ears, the sound has to travel a different distance to each ear, and as a result, it reaches each ear at a slightly different time. This difference in a sound’s arrival time at each ear is an important cue in determining the direction of a sound source.
Imagine a dog 45 degrees to your right barking at you (we’ll name the dog Biscuit). The sound waves propagating out of Biscuit's mouth will eventually reach your left and right ear and make their way to your eardrums. In this scenario, it will take slightly longer for Biscuit’s bark to make it to your left ear since it’s further away from Biscuit’s mouth (the sound source). You won’t consciously perceive this delay between your right and left ear, but it’s an important factor that your brain uses to localize sounds.
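To put a rough number on Biscuit’s delay, here is a minimal sketch that uses Woodworth’s spherical-head approximation, ITD = (r / c) * (θ + sin θ). This formula is not taken from this post; it is a common textbook simplification that treats the head as a sphere and the source as far away. The head radius and speed of sound below are assumed typical values, not measurements of any particular listener.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 degrees Celsius
HEAD_RADIUS_M = 0.0875      # assumption: a commonly quoted average head radius


def woodworth_itd_seconds(azimuth_deg,
                          head_radius_m=HEAD_RADIUS_M,
                          speed_of_sound=SPEED_OF_SOUND_M_S):
    """Approximate ITD for a distant source.

    azimuth_deg is measured from straight ahead (0) toward one side (90).
    The spherical-head formula used here is only valid in that range.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth_deg must be between 0 and 90")
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))


# Biscuit barking 45 degrees to the right.
itd_microseconds = woodworth_itd_seconds(45.0) * 1e6
print(f"ITD at 45 degrees: about {itd_microseconds:.0f} microseconds")
```

Under these assumptions the delay comes out at a few hundred microseconds, which is far too short to notice consciously but well within what the brain can use.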
HRTF and Interaural Level Differences (ILD):
Our brains are great at picking up differences in sound levels between each ear. For the most part, these differences are caused by that thing sitting right in between both ears – your head. Your head partially blocks sound traveling towards whichever ear is farther away from a sound source, and it blocks some frequencies more than others. This phenomenon is known as acoustic shadowing (or head shadowing in this context).
Let’s imagine that Biscuit is barking at us at a 45 degree angle to our right again.
Sound will travel directly through the air to your right ear, but to reach your left ear it mostly has to bend around your head. With your head in its path, the sound arriving at your left ear is shadowed and therefore quieter. This acoustic shadowing has more effect on higher frequencies than lower frequencies.
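One way to see why high frequencies are shadowed more is to compare each frequency’s wavelength with the size of the head: long waves bend around the head easily, while short waves are blocked. The sketch below applies that rule of thumb. The head diameter and the hard "shorter than the head" threshold are assumptions for illustration; real head shadowing increases gradually with frequency and depends on direction.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 degrees Celsius
HEAD_DIAMETER_M = 0.175     # assumption: a typical adult head width


def wavelength_m(frequency_hz):
    """Wavelength of a sound wave in air at the given frequency."""
    return SPEED_OF_SOUND_M_S / frequency_hz


def is_noticeably_shadowed(frequency_hz):
    """Rule of thumb only: treat a frequency as shadowed by the head
    when its wavelength is shorter than the head is wide."""
    return wavelength_m(frequency_hz) < HEAD_DIAMETER_M


for frequency_hz in (200, 1000, 4000, 8000):
    label = "shadowed" if is_noticeably_shadowed(frequency_hz) else "bends around head"
    print(f"{frequency_hz:>5} Hz  wavelength {wavelength_m(frequency_hz):.3f} m  {label}")
```

With these assumed values the crossover sits at roughly 2 kHz, so the deep part of a bark reaches both ears at similar levels while its higher frequencies are noticeably quieter at the far ear.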
The Problem with ITD and ILD
Arrival time and level differences are a key part of how you localize sounds, but they aren’t enough to tell where every sound is coming from. Imagine Biscuit barking 10 feet away, in front of you and to your right, compared with her barking 10 feet away at the mirrored position behind you and to your right. In both scenarios she is the same distance from each ear, so her bark would take the same amount of time to reach each ear and would have the same Interaural Time Difference. Her bark would also have the same level difference, as the sound has to get past a very similar amount of your head to reach the far ear. These two locations lie on what scientists call the ‘cone of confusion’, a cone-shaped zone extending outward from each ear on which the ITD and ILD are nearly the same. This is where the spectral cues from your anatomy assist in localization.
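The front/back ambiguity can be checked with simple geometry. The sketch below is a deliberately crude model: it treats the two ears as points in free space and ignores the head entirely, so it only captures the arrival-time cue. The ear spacing, the speed of sound, and the two example positions (45 degrees front-right and its mirror image behind) are assumptions chosen for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 degrees Celsius
EAR_OFFSET_M = 0.0875       # assumption: each ear sits this far from the head's center
FEET_TO_M = 0.3048


def itd_seconds(source_x_m, source_y_m):
    """Left-ear arrival time minus right-ear arrival time.

    Coordinates: x points to the listener's right, y points straight ahead.
    The ears are modeled as two points on the x axis; the head is ignored.
    """
    dist_left = math.hypot(source_x_m + EAR_OFFSET_M, source_y_m)
    dist_right = math.hypot(source_x_m - EAR_OFFSET_M, source_y_m)
    return (dist_left - dist_right) / SPEED_OF_SOUND_M_S


distance_m = 10 * FEET_TO_M
angle = math.radians(45.0)
x = distance_m * math.sin(angle)
y = distance_m * math.cos(angle)

itd_front = itd_seconds(x, y)    # Biscuit in front and to the right
itd_back = itd_seconds(x, -y)    # Biscuit mirrored behind and to the right

print(f"front-right ITD: {itd_front * 1e6:.1f} microseconds")
print(f"back-right  ITD: {itd_back * 1e6:.1f} microseconds")
```

Both positions produce the same time difference in this model, so a listener relying on timing alone could not tell them apart.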
HRTF and Spectral cues from your anatomy
When a sound wave approaches you, it interacts with your body before entering your ear canal. Most notably, it interacts with your pinna (outer ear, whose impact we have already discussed), your head (which contributes to your ITD/ILD), and your torso. All these elements shape the frequency distribution of sounds entering each ear depending on the position of a sound source (we typically call these spectral cues). These spectral cues are important for localizing sounds, especially when a sound is coming from a position where time and level differences alone don’t provide enough information about its location.
The Importance of Personalization
Every individual has their own HRTF and hears the world uniquely. Incorporating individualized HRTFs into audio playback hardware is crucial for a realistic and immersive experience. The closer that the HRTF used in any recording or processing is to your individual HRTF, the better your localization ability, and the more accurate your sense of space will be. This requires a customization step, and technology that can use customized data.
Most software processing can use only generic HRTF processing, leaving a lot to be desired in the resulting experience of the listener. As an example, we’ve selected a video that plays game audio in the Unreal Engine in regular stereo sound first, and then generic binaural audio as well.
In the demonstration, the listener is able to localize sound more accurately when using generic binaural audio than with conventional stereo output. However, you may have noticed that even in the more accurate generic binaural mode, sound in front of and behind the head was almost indistinguishable, while sound to the left and right was fairly accurate. This is because generic binaural audio does not account for the way your own ears receive sound. Generic binaural audio is somewhat like wearing someone else’s ears. To experience fully immersive and accurate audio, personal HRTF calibration must be accounted for; otherwise, audio cannot be perfectly localized by the listener.
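For readers curious what "HRTF processing" looks like in practice, binaural rendering is commonly done by filtering a mono sound with a pair of head-related impulse responses (HRIRs, the time-domain form of an HRTF), one per ear. The sketch below is not any particular engine’s or product’s implementation. The two impulse responses are made-up toy values that only mimic a delay and a level drop at the far ear; real HRIRs are measured and contain the spectral detail discussed above.

```python
def convolve(signal, impulse_response):
    """Direct-form discrete convolution of two lists of samples."""
    output = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, sample in enumerate(signal):
        for j, coefficient in enumerate(impulse_response):
            output[i + j] += sample * coefficient
    return output


def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono signal with a left and a right HRIR.

    Returns (left, right) channels padded with zeros to equal length.
    """
    left = convolve(mono, hrir_left)
    right = convolve(mono, hrir_right)
    length = max(len(left), len(right))
    left += [0.0] * (length - len(left))
    right += [0.0] * (length - len(right))
    return left, right


# Toy, made-up HRIRs for a source on the listener's right:
# the right ear hears the sound immediately at full level,
# the left ear hears it two samples later and quieter.
toy_hrir_right = [1.0]
toy_hrir_left = [0.0, 0.0, 0.6]

mono_bark = [1.0, 0.5]
left_channel, right_channel = render_binaural(mono_bark, toy_hrir_left, toy_hrir_right)
print("left: ", left_channel)
print("right:", right_channel)
```

Swapping the toy responses for a generic measured set gives generic binaural audio; swapping in responses matched to the listener is what personalization refers to.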
By exploring what HRTF means, we’ve learned how the human brain works out where Biscuit is barking, for all possible locations she could be in. Interaural time differences, interaural level differences, and spectral cues are the keys to localizing sounds, and together they make up your unique HRTF. The way you hear the world is defined by your HRTF, and at OSSIC, we are dedicated to bringing personalized HRTF calibration to our headphones, so that the end user is able to experience sound just as they do in the real world. Stay tuned for our upcoming posts where we’ll dive a little deeper into other HRTF concepts.