
Yonder: The Cloud Catcher Chronicles - An Episodic Audio Journal - Episode Two: 40 Species of Noise

Game Audio / Sound Design

I love open world games, and I find myself observing the environments in detail as I travel around exploring. But I often find myself thinking about how they compare to the real world. I spent three years living in Japan, one in Tokyo and two in Aomori prefecture, and one of the main things I noticed there was how the environments changed with both the season and the location.

Spring and summer times are vibrant seasons in Japan; birds are very active, but what is really noticeable from a sound point of view during these seasons is insects. Millions of cicadas during the day and crickets during the night saturate the Japanese landscape with their song. What many travellers to Japan do not realize is that the various species have very specific song types as well as very specific locations in which they live. This means if you travel from southern Japan up to Tokyo and then north to Tohoku, you will be able to hear different insect sounds. And the insect sounds will also change from season to season. Once I realized this, I was able to watch a Japanese movie and identify where it was filmed and what time of year it was, if there were insect sounds.

We often ask our audience to spend hours inside our game worlds, and we design complex terrain with various biome types. Yet, sometimes we craft the audio for these environments with much less detail than the visuals. I wanted to not only add more life to the world of Gemea, but to create a living world that would highlight the changes of day and night, summer and winter, sunshine and storm.


Yonder: The Cloud Catcher Chronicles includes sounds of over 40 species of birds, insects, and frogs. Did I go overboard with this? Perhaps. But the result is a world that has a dynamic ecology which reflects various states of change. Each choice and its implementation is also designed to support the overall narrative of exploring the land of Gemea.

As a starting point, I wanted to select as broad a range of content as possible. I have a massive SFX collection as well as my own personal recordings. I have also recently been working on restoring some older recordings and making them suitable for game implementation purposes. Working through sound recordings is like selecting which instruments you want to compose for. Each sound has character, texture, and a quality that suits certain emotional states. Selecting the most suitable cicada sound is no less important than choosing which wind instrument to write a solo for.

I had created a first pass of the dynamic environment system months ago, and from a technical standpoint it worked and sounded effective in the game. It used a series of basic bird and insect sounds that functioned appropriately for the game world. I will go into the details of the system in a minute, but the selection of source content is critical, so I want to focus on that first. Putting in placeholder sounds meant the system could be tested. As the world grew and the artwork became more finalized, I was able to spend many hours in the world. Most of this time was spent implementing other sounds or performing constant ongoing mix balancing, but sometimes I would just play and explore; I would listen and I would “feel”.

When it came to selecting the second pass content, I knew the biomes very well and I knew what I wanted. I would go through my raw SFX material, often not looking at the names, and I would listen with an open mind. I would allow the sound to take me wherever it would. Some sounds were lush and full of life, others seemed sparse, and some would invoke a feeling of hot, dry, arid landscapes. In most instances this is because the sounds were from a creature that lived in the appropriate environment in the real world, but instead of working with names and descriptions, I worked with feelings, and I think this allowed for some nice choices of content.


We have been using the State system in Wwise to control various aspects of the sound and music. We already had a day and night State, but I wanted to expand on this. So we added dawn and dusk States to create transition points between the main day and night States. This is used for both music and SFX. But, for the species of birds and insects, I crafted each one individually so that I could align them with how they made me feel. To me, certain bird sounds just didn’t sound like how I would hear them at dawn. They had more of a feeling of circling overhead at noon. So these sounds only triggered once full day State was active. Some insects I implemented to trigger right at dusk and others not until full night had fallen.



In this example the insect sounds will be active at dusk and during the night, but fade when dawn arrives. And this particular species is only active during the Spring and Summer months.
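The gating described here can be sketched outside of Wwise. The following Python is only an illustrative model, not the Wwise API or the actual project setup: the class name, the species name, and the State names are assumptions chosen to mirror the day/night and seasonal States mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeciesRule:
    """Hypothetical stand-in for one species object and its State assignments."""
    name: str
    active_times: frozenset    # subset of {"dawn", "day", "dusk", "night"}
    active_seasons: frozenset  # subset of {"spring", "summer", "autumn", "winter"}

    def is_active(self, time_of_day: str, season: str) -> bool:
        # The species sounds only when BOTH States match its assignments.
        return time_of_day in self.active_times and season in self.active_seasons


# Assumed example: an insect heard at dusk and night, in spring and summer only.
dusk_insect = SpeciesRule(
    name="dusk_insect",
    active_times=frozenset({"dusk", "night"}),
    active_seasons=frozenset({"spring", "summer"}),
)

print(dusk_insect.is_active("dusk", "spring"))   # active
print(dusk_insect.is_active("dawn", "summer"))   # faded out at dawn
print(dusk_insect.is_active("night", "winter"))  # wrong season
```

In the real project the fades between States are handled by Wwise; this sketch only captures the on/off logic.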

The dynamic structure for the birds used individual bird calls, triggered at various rates, depending on exactly which species I was working on. Smaller birds tend to twitter more often than larger birds. This system also allowed me to alter the trigger rate dynamically. So at dawn and dusk I could create the “massed chorus” effect of many birds, all singing more often. Over the course of the day, the trigger rate would drop off so that at noon when the day was hottest the birds were barely present.
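As a rough sketch of that idea, the trigger rate can be thought of as a per-species base interval scaled by an activity curve over the day. Everything below is assumed for illustration: the dawn and dusk hours, the linear shape of the curve, the noon floor of 0.1, and the base intervals are hypothetical values, not the ones used in the game.

```python
def chorus_activity(hour: float, dawn: float = 6.0, dusk: float = 18.0,
                    floor: float = 0.1) -> float:
    """Return 1.0 at dawn and dusk, falling linearly to `floor` at midday.

    Outside daylight hours the birds are silent (0.0).
    """
    if hour < dawn or hour > dusk:
        return 0.0
    noon = (dawn + dusk) / 2
    distance = abs(hour - noon) / (noon - dawn)  # 0 at noon, 1 at dawn/dusk
    return floor + (1.0 - floor) * distance


def mean_interval(base_interval_s: float, activity: float) -> float:
    """Average seconds between calls; infinite when the species is silent."""
    if activity <= 0.0:
        return float("inf")
    return base_interval_s / activity


# Assumed base intervals: small birds twitter more often than large ones.
SMALL_BIRD_BASE_S = 4.0
LARGE_BIRD_BASE_S = 15.0

for hour in (6.0, 9.0, 12.0, 18.0):
    activity = chorus_activity(hour)
    print(hour, round(mean_interval(SMALL_BIRD_BASE_S, activity), 1))
```

With these assumed numbers a small bird calls about every 4 seconds at dawn and only about every 40 seconds at noon, which is the "massed chorus" falling away to near silence.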

From a seasonal point of view, I set many birds to be most active in springtime, when many creatures are mating, then slightly less active in summer and in some biomes completely replaced with noisy summer insects. In some regions, the same bird species are active through spring, summer, and autumn, and then they fall quiet during the cold winter. In other biomes, a specific bird may only be active in spring, and then through other months different birds or insects become more vocal.
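One way to picture this is as a small table of activity levels per species and season. The table below is a hypothetical example for a single biome; the species names and the numbers are invented for illustration and are not taken from the game.

```python
# Hypothetical activity levels: 0.0 = silent, 1.0 = most vocal.
# A season missing from a species' entry means the species is silent then.
SEASONAL_ACTIVITY = {
    "songbird": {"spring": 1.0, "summer": 0.7, "autumn": 0.7, "winter": 0.0},
    "spring_only_bird": {"spring": 1.0},
    "summer_cicada": {"summer": 1.0},
}


def vocal_species(season: str, table: dict = SEASONAL_ACTIVITY) -> list:
    """Return the species that are vocal this season, most active first."""
    levels = {name: by_season.get(season, 0.0) for name, by_season in table.items()}
    vocal = [name for name, level in levels.items() if level > 0.0]
    return sorted(vocal, key=lambda name: -levels[name])


for season in ("spring", "summer", "autumn", "winter"):
    print(season, vocal_species(season))
```

In this made-up biome the spring-only bird hands over to the cicada in summer, while the songbird carries on through autumn and falls quiet in winter.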



The Grasslands_Birds consists of four species and in this example the BellsVireo has been expanded to show it has 16 sound files that make up the full pool for that species. Each bird has a varied number of sound files depending on the type of bird and what sound files I had available for that species.
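To illustrate how a pool like this behaves, here is a minimal Python sketch of random selection that avoids repeating the most recent calls. It is a model of the general technique, not Wwise's own implementation; the class name, the file names, and the choice to avoid the last two calls are assumptions.

```python
import random
from collections import deque


class CallPool:
    """Pick random calls from a pool while avoiding the most recent picks."""

    def __init__(self, files, avoid_last=2, rng=None):
        if avoid_last >= len(files):
            raise ValueError("avoid_last must be smaller than the pool size")
        self.files = list(files)
        self.recent = deque(maxlen=avoid_last)  # remembers the last few picks
        self.rng = rng or random.Random()

    def next_call(self):
        # Only calls that were not played recently are candidates.
        candidates = [f for f in self.files if f not in self.recent]
        choice = self.rng.choice(candidates)
        self.recent.append(choice)
        return choice


# Hypothetical file names for a 16-file species pool.
bells_vireo = CallPool([f"BellsVireo_{i:02d}.wav" for i in range(1, 17)])
print([bells_vireo.next_call() for _ in range(5)])
```

The larger the pool, the less often a listener hears the same call twice in a short space of time, which is why the species with more source files tend to sound more natural.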

The advantage of the Wwise State system is that I could easily assign and tune each species object to be unique, while still making sure they would all blend and dovetail nicely. The other useful implementation technique was to add all my biome specific sound objects into a single Event. The main game objects I used for implementation into Unity were the biome specific trees. Each biome had its own tree species. As birds and insects generally gather in trees, this was logical from a narrative point of view and provided objects spread throughout the game world, as emitters. This also meant I had to sync only one Event to a tree prefab and it would be instanced across the whole world. (This was important because the player can collect plant seeds and place them anywhere else in the world.)

Initially, I had both a bird Event and an insect Event attached to each tree prefab. Then I realized I could simplify this. A Wwise Event can contain multiple sound objects. So I could place each of the different bird and insect species which I wanted to inhabit a biome into a single Event, and attach that to a tree prefab. The State system meant that even though there might have been 4-6 sound objects in the one Event, each would only play at the specific day/night and seasonal State defined for it. Each of these objects could have unique effect and attenuation behaviour. So again, the drop-off range of species could all be tuned to present unique behaviour within the world.
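The per-object drop-off can be pictured with a simple model. In Wwise the attenuation curves are editable shapes; the sketch below assumes a plain linear roll-off, and the object names and distances are hypothetical, so treat it as an illustration of the idea only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoundObject:
    """Hypothetical sound object with its own attenuation range."""
    name: str
    max_distance: float  # distance at which the object becomes inaudible


def linear_gain(distance: float, max_distance: float) -> float:
    """Assumed linear roll-off: 1.0 at the emitter, 0.0 at max_distance."""
    if distance >= max_distance:
        return 0.0
    return 1.0 - max(distance, 0.0) / max_distance


# One Event attached to a tree prefab, holding objects with different ranges.
TREE_EVENT = (
    SoundObject("small_bird", 20.0),
    SoundObject("cicada", 60.0),
)


def gains_at(event, distance: float) -> dict:
    """Gain of every object in the Event for a listener at `distance`."""
    return {obj.name: linear_gain(distance, obj.max_distance) for obj in event}


print(gains_at(TREE_EVENT, 30.0))
```

With these assumed ranges, a listener 30 units from the tree hears the cicada at half level while the small bird has already dropped out, even though both belong to the same Event.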



A single Event can contain all the sound objects I need to produce the spatialized environment sound for a biome. Each of the objects in this Event will only trigger when their appropriate States are met. So, even though there are 6 sound objects here, only one usually plays at any one time. This makes implementing into the game much simpler once the system is defined.

As you walk through the world, there is a true spatial environment around you. Trees may include two or three different species of birds within them, and each species has a range of bird calls. So, the entire system generates a spatial dynamic environment. If you chop down a harvestable tree, it stops emitting its related sounds. If you deforest an entire biome, its environmental audio will reflect this.
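The link between trees and ambience can be modelled very simply: a tree is an emitter only while it is standing. The Python below is a hypothetical sketch of that relationship, not the Unity or Wwise code from the game; the class, the biome names, and the tree counts are invented for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Tree:
    """Hypothetical tree that emits biome ambience only while standing."""
    biome: str
    standing: bool = True

    def chop(self) -> None:
        # A felled tree no longer acts as a sound emitter.
        self.standing = False


def emitters_per_biome(trees) -> Counter:
    """Count the trees that are still emitting sound in each biome."""
    return Counter(tree.biome for tree in trees if tree.standing)


world = [Tree("grasslands") for _ in range(3)] + [Tree("forest") for _ in range(5)]
print(emitters_per_biome(world))

# Deforest the grasslands: their bird and insect ambience disappears with them.
for tree in world:
    if tree.biome == "grasslands":
        tree.chop()
print(emitters_per_biome(world))
```

Because the ambience lives on the trees and not on a global loop, the soundscape thins out on its own as trees are removed, and it returns wherever the player plants new ones.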

Each biome also has birds in flight. These are very basic animated shapes. But they also have sounds attached to them, so they sporadically emit a bird call as they fly by. This final element really helped to sell the feeling of a dynamic and living world.



The weather system for each biome is also unique. While there is a general wind sound through all biomes, the forest also has a wind-through-leaves sound that is attached to the trees and emits from that location. Rain in forest areas is the sound of rain on leaves, while in the grassland areas it's a lighter rain-on-ground sound. Alpine and desert areas have very different wind sounds compared to other environment types. To help support the overall narrative, all of these choices were made using the same approach: the “how does it make me feel” approach.

Making the audience feel cold in the winter months and in snow biomes, and hot and dry in the desert, can be achieved more successfully when the audio fully supports the visual effects. In fact, the audio can often be more evocative than visual changes, and can trigger stronger emotional responses from the audience. Keep in mind that apart from the basic flying bird shapes, none of these birds or insects exist in the game as objects; they only exist as sounds. So the world is vastly populated with a great and diverse selection of lifeforms that exist only because of the audio. In this regard I got to decide and create much of the ecosystem of the world of Gemea, and this helped create a lush environment without having to create dozens of models and lots of complex code.

All of the techniques that I applied to creating the environmental audio for Yonder: The Cloud Catcher Chronicles were taken from my experience doing research for VR/AR/MR implementation. For the New Realities, we are striving for more detailed and precise surround spatial environments; we want to immerse the audience into these worlds and make their experience more engaging. But, I realized that many of these techniques were just as valid for a "traditional screen" format game world. So movement of the camera produces a similar “world rotating” effect in the audio just like head tracking does in VR. This is because all the weather sounds, such as wind and rain, are set at four compass points; and all of the environmental birds and insects are localized throughout the world inside the trees. So Gemea is a dynamic virtual environment in many ways, and the player’s experience should be far more engaging and enjoyable because of it.
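The "world rotating" effect follows from simple geometry: if a sound is fixed at a compass point, its direction relative to the listener changes whenever the camera turns. The sketch below assumes bearings measured clockwise from north in degrees and a listener that only rotates around the vertical axis; the function and constant names are hypothetical and this is not the game's actual code.

```python
# Assumed world bearings of the four weather emitters, clockwise from north.
COMPASS_BEARINGS = {"north": 0.0, "east": 90.0, "south": 180.0, "west": 270.0}


def relative_bearing(source_deg: float, listener_yaw_deg: float) -> float:
    """Angle of a source relative to the listener's forward direction.

    The result is in [-180, 180): 0 is straight ahead, positive is to the
    right, negative is to the left.
    """
    return (source_deg - listener_yaw_deg + 180.0) % 360.0 - 180.0


def weather_directions(listener_yaw_deg: float) -> dict:
    """Where each compass emitter appears to be for the current camera yaw."""
    return {
        name: relative_bearing(bearing, listener_yaw_deg)
        for name, bearing in COMPASS_BEARINGS.items()
    }


print(weather_directions(0.0))   # camera facing north
print(weather_directions(90.0))  # camera facing east
```

When the camera faces north, the northern emitter is straight ahead. Turn the camera to face east and the same emitter is heard 90 degrees to the left, which is the same behaviour head tracking produces in VR.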


This article was originally published on Gamasutra

Stephan Schütze

Spatial Audio Producer & Consultant

Sound Librarian


Stephan Schütze has worked within game audio production for close to twenty years. He is a composer, sound designer, location recorder and spatial audio practitioner. The broad and varied list of audio production skills Stephan has developed over his career, and his experience working with some of the leading companies in New Reality technology, provided him with the perfect opportunity to create the first book on audio production techniques for this newly evolving technology.


