What I Discovered While Writing a Book on Audio for the New Realities

Interactive Audio / Spatial Audio / VR Experiences

Writing a book on any subject becomes something of a journey. If you were not completely an expert on the subject before you started, you will probably be one by the time you finish. Choosing to be an expert on a topic such as spatial audio is a challenge at the best of times. It is a field that is complex and constantly evolving.

During my time working with Magic Leap, the Facebook Spatial Audio team, and Oculus, it became apparent to me that there are enough ways for spatial audio to contribute to an effective experience that those of us producing content are going to be discovering new possibilities for quite a few years. I love that we still don’t know exactly how everything works, as that uncertainty is what will lead us down interesting paths to creating things we may never have dreamed possible.


In writing a book on this topic, I undertook the task of creating a resource that would not only introduce people to the various concepts and ideas behind new reality audio, but also provide some practical processes for producing interesting content. For spatial audio, these concepts are incredibly important. There are some basic ideas that are very different from traditional media production, and without a strong foundation, spatial audio content can just collapse into an unconvincing mess. While researching information for the book and arranging my thoughts, it became apparent that there was a common theme concerning the current state of spatial audio, which had a significant impact on my entire process. Above everything else within the book, this theme became central to the overall narrative of how to approach spatial audio design.

This idea did not exist in any form when work on this book started, and it was not even something I had considered; but, by the time I finished writing, it was a central idea in my book’s overall direction. In 2018, while the new realities are still in their early stages of evolution, this factor affects the development and production of audio content, and it is a massive conflict: the conflict between technology and creative processes. Typically, a conflict indicates that there is a problem or an obstacle that needs to be overcome; and I have realised that, right now, this conflict between technology and creative processes is critical when it comes to how spatial audio needs to evolve.

Our creative processes rely completely on the ever-evolving technology that allows us to simulate the real-world behaviour of sound. But this technology also relies on an endless number of creative ideas, approaches, and interpretations, which can compensate for or hide both the weaknesses of human hearing and the limits of the technology.

Audio in the real world is complex, and although significant research has been done into many aspects of audio behaviour, we still do not fully understand everything about how humans hear, and specifically how they localise sounds in the space around them. What we do know is that a human’s ability to detect exactly where a sound is emanating from is quite flawed, and not nearly as accurate as that of many other animals. In many ways, our hearing is pretty primitive.

Within the book, I explore many of the basic approaches to how we interpret spatial audio. We need to understand exactly how a human processes information in relation to the content we create. We are essentially manipulating our perceptions when we create art, music, or film. So, understanding how a person may react to light, or colour, or sound lets us control and guide that manipulation.

From a technology point of view this creates some serious challenges when we want to create systems to accurately simulate sound behaviour. The technology around the areas we do understand is improving, almost on a daily basis. This is great, but applying that information is expensive in terms of computer power. So, just because our tech is capable of doing something, it does not mean that it is practical to use it in our applications and experiences. We kind of know what we should be doing, but we cannot always afford to implement that technology.

Psychoacoustics also plays a major part in creating spatial audio, as does how we have been conditioned by society. We have become accustomed to hearing certain things in certain ways, thanks to film, television, and traditional video game technologies. Because of this, regardless of how accurate a representation of reality might be possible, we have to account for how humans perceive what they hear.

And this is where the creative process comes in. With almost all new technological advances, there is the concern that they will replace the influence of artistic creators, who are responsible for much of the world’s entertainment media. Music sample libraries were going to put musicians out of work, film and TV would remove the need for stage actors, and photorealism in film and games would remove the need for artists to create assets... except, they didn’t, they haven’t, and they won’t. The great and wonderful conflict, I realised, is that technology is absolutely critical, essential, and brilliant at allowing us to create virtual worlds in which our audiences can experience truly mind-blowing narratives and environments, but the creative process of designing and bringing life to this content is probably more important than it has ever been.

This is not a simple desire to see artists as still relevant; it is a real need to combine the best aspects of both technology and creation. The art cannot be created without the technology, but the technology is not capable of creating accurate perceptions in humans on its own. The more I looked into the incredible new technologies being developed for new reality production, the more I realised that the old techniques of stagecraft and live performance are a critical aspect in the production of the next generation of media.

The layout of a symphony orchestra is something that has evolved over decades to account for the number and types of instruments within the ensemble, but also the position of those instruments relative to the assembled audience. This is something that has changed over time as new instruments were added to a typical orchestra and as the technology to build good acoustic spaces improved. Technology played a part in this process, but so did the instincts of the artists involved. When dealing with any aspect of sound or music production, your ears should always have the final say in what works best.

The New Realities will be no different. I have already started to experiment with how to arrange music around a central listening position in a spherical manner. HRTF technology allows me to position the listener as though standing in the very middle of an orchestra. We have seldom done this before because there was either no reason or no way to do so. But if we can place our audience completely inside a sphere of orchestral instruments, then how do we arrange those instruments around the listener? It is very unlikely to be the same arrangement as a stage layout, but how we position musicians within an ensemble on stage will likely influence this concept. I am sure a computer could design placement based on the optimal acoustics of each instrument, but I am also pretty sure that a human with the right set of ears could design something that other humans would always find preferable to a computer-only model.
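To make the idea of a sphere of instruments a little more concrete, here is a minimal Python sketch of how such a layout might be described before handing positions to a spatialiser. Everything in it is an assumption for illustration: the axis convention (listener at the origin, x to the right, y up, z straight ahead), the function name `spherical_to_cartesian`, and the instrument angles in `layout` are all hypothetical and are not taken from the book or from any particular audio engine.

```python
import math

def spherical_to_cartesian(azimuth_deg, elevation_deg, distance=1.0):
    """Convert a direction around the listener into (x, y, z).

    Assumed convention (hypothetical, check your own engine):
    listener at the origin, x = right, y = up, z = straight ahead.
    Azimuth 0 is straight ahead and increases to the right;
    elevation 0 is ear height and increases upwards.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = distance * math.cos(el) * math.sin(az)
    y = distance * math.sin(el)
    z = distance * math.cos(el) * math.cos(az)
    return (x, y, z)

# Hypothetical layout: (azimuth, elevation) in degrees for each section.
# These angles are placeholders to be tuned by ear, not recommendations.
layout = {
    "violins": (-45, 0),
    "cellos": (45, 0),
    "horns": (180, 20),
    "percussion": (0, 60),
}

# Place every section 3 metres from the listener.
positions = {
    name: spherical_to_cartesian(az, el, distance=3.0)
    for name, (az, el) in layout.items()
}

for name, (x, y, z) in positions.items():
    print(f"{name}: x={x:.2f}, y={y:.2f}, z={z:.2f}")
```

Describing the layout as angles rather than raw coordinates keeps the artistic decision (who sits where around the listener) separate from the maths, so the arrangement can be adjusted by ear and the positions regenerated.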

The same applies to sound design and general audio creation and implementation. What we learnt from using coconut shells on the side of a stage to simulate a horse approaching from the distance may be directly relevant to creating an immersive, engaging virtual experience for new reality audiences. So, within the book, I address how many of our oldest skills in sound design, theatre craft, and conveying ideas to an audience are not only still relevant, but perhaps as relevant as they have ever been.

I am incredibly excited at the potential of new reality content. We will be confronted with challenges and opportunities that have never existed before, and it is not often that an entirely new media format is created. So, right now, both technologists and artists have the potential to influence an entirely new format for communication that by all indications is going to become a significant part of how humans spend their time in the 21st century.


New Realities in Audio

A Practical Guide for VR, AR, MR & 360 Video

By: Stephan Schütze, Anna Irwin-Schütze



Read an extract from the book! A contribution by Martin Dufour, CTO Audiokinetic: 

The Interactive Audio Renaissance: Bringing Sound Back to Life After a Century of Baking it on Film 

Stephan Schütze

Spatial Audio Producer & Consultant

Sound Librarian

Stephan Schütze has worked within game audio production for close to twenty years. He is a composer, sound designer, location recorder and spatial audio practitioner. The broad and varied list of audio production skills Stephan has developed over his career, and his experience working with some of the leading companies in New Reality technology, provided him with the perfect opportunity to create the first book on audio production techniques for this newly evolving technology.



Anders Hjemdahl

June 03, 2018 at 06:41 pm

Stephan, I love what you wrote about the book. We're currently working on a series of sound-focused VR experiences, and it's truly a journey of discovery. Even though everybody in the team has been doing VR and spatial audio for close to four years, it's very much a creative process, where we learn as we go. I'll keep you posted, and look forward to reading your book! All the best, Anders

Jonas Foged Kristensen

June 30, 2019 at 01:59 pm

Looking forward to reading this! Sounds absolutely brilliant with the focus on an inclusive approach to audio in the new realities, not just throwing away a hundred years of practice in the old ones. Exciting!
