In this two-part blog series, we'll be going over the way we designed the sound system for Clocker. Hello, my name is Yu Tan from Salt Sound Studio. I was responsible for integrating Wwise and Unity, creating sound effects and designing the interaction mechanism.
Clocker is an indie adventure puzzle game developed by Wild Kid Games. It tells the story of a father and his daughter, with time as the bond that connects them. The game features a unique time mechanism, compelling storylines, challenging puzzles, and a hand-painted art style. It uses a two-character narrative technique that lets you control either character separately as you solve puzzles during your adventure.
The gameplay consists of using a clock to control NPC actions and logic lines. Logic lines are alternative storylines that develop as a result of changing the order of events with the clock. As the father, you are trapped in a time-frozen space. You can, however, use the clock to go back and forth through time, in order to change the order of events. As the daughter, you live in a normal world and will get to see how everything pans out, eventually.
Initially, because Clocker is a small game, we thought there wouldn't be many sounds involved. But after running some tests, we realized it required far more sounds than we expected.
The image shown above is a representation of two of the game's logic lines. Each logic line represents a set of NPC actions.
As players enter Scene 6:
- Thief Logic A is triggered.
One of two things follows:
- The thief moves forward, then opens the door. Or,
- Thief Logic A stops before completing (i.e. the thief doesn’t open the door), then no other logic events are triggered. Note that many NPCs have a logic line with multiple action phases.
Then, if Thief Logic A completes:
- Homeowner Logic B is triggered. And as a result, the homeowner runs to the thief.
Now, as you control the homeowner, you have two options:
- You let the homeowner run towards the thief and hit him. Or,
- You make the homeowner go back to Homeowner Logic A, then turn to control the thief.
From there, either Thief Logic E or D is triggered, depending on what action phase the homeowner is in.
We could go on and on, but it’s clear the game logic is complex. In the image shown below, you can see how the character States change in the game. The clock has two timelines to represent the different character States.
With the above description, we've outlined the gameplay and the game logic. The game includes 70 NPCs and 682 logic lines, so it made sense to use cascade structures to control how the storylines proceed.
With cascade structures, you have to keep track of the playback position of NPC sounds. That is, you need to record the audio playback's time information as the NPCs are deployed. This way, we can make sure that NPC sounds always stay in sync with the game visuals as their logic lines stop and resume, and that these sounds can play back both forward and backward.
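The bookkeeping described above can be sketched as follows. This is an illustrative Python model, not the project's actual Unity/Wwise code; the class name and fields are hypothetical.

```python
class LogicLineAudio:
    """Tracks elapsed playback time for one NPC logic line's sound,
    so the sound can resume in sync when the logic line stops and restarts."""

    def __init__(self, duration_s):
        self.duration_s = duration_s   # total length of the logic line's sound
        self.elapsed_s = 0.0           # playback position when last stopped

    def advance(self, dt, direction=1):
        """Advance (direction=1) or rewind (direction=-1) by dt seconds,
        clamped to the clip's bounds."""
        self.elapsed_s = min(max(self.elapsed_s + direction * dt, 0.0),
                             self.duration_s)

    def resume_percent(self):
        """Percentage of elapsed playback time, the value a seek call
        would need when the logic line is triggered again."""
        return 100.0 * self.elapsed_s / self.duration_s

thief_a = LogicLineAudio(duration_s=20.0)
thief_a.advance(6.0)             # logic line interrupted 6 seconds in
print(thief_a.resume_percent())  # 30.0 -> resume the sound at 30%
```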
With the above conditions met, we grouped the in-game sounds into 5 parts:
1. Father’s Time
Using the clock to travel back and forth through time.
Recording the exact playback position of NPC sounds.
2. Daughter’s Time
Reflecting the gradual change in time.
Extending static animation clips.
Solving possible voiceover issues.
3. Game Functionality
Controlling the playback speed relative to controller pressure.
4. Interactive Music
Guiding the player’s actions with in-game music.
Designing the interactive music for Clocker.
5. Performance Optimization
Managing SoundBanks in the project.
Compressing the audio for different platforms.
The Unity engine's built-in audio capabilities were not enough to pull this off, so we naturally turned to the Wwise sound engine for more advanced features.
Implementation in the game
Controlling the time to go back and forth
We looked at different solutions to match the audio to the game dynamics, but individual sound effects were just not immersive enough to convey the time States of each character. And had we used these sound effects to sync with the animation clips in-game, it would have been difficult to switch to precise time points in the current logic line. In the end, we decided to create new sound effects for each character's entire logic line. This way, not only could the sound effects for each character be maintained efficiently, we could also overcome the challenge of recording the current playback position as players move back and forth through time.
In the images above, you can see the Events we created for each character’s logic line, as well as the Switch Groups that were used when switching between logic lines. Using these assets, we were able to quickly find the required Events, based on the scene names, current NPCs and their logic lines. This saved us a lot of time and effort.
Creating sound effects for individual actions or objects was not a good idea. As previously mentioned, they were not immersive enough to convey how players go back in time. To get better results, we decided to reverse the sound effects for the entire logic line, rather than simply reversing the individual sounds, and added a layer of low-frequency sounds to match with and support the game's time mechanism. By playing forward/backward sounds and switching between logic lines in real time, the change in playback State from Logic A to Logic B was reflected perfectly. This was a lot of work, but it worked out well. In the image below, you can see the sound waves for an entire logic line.
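The offline asset step described above can be sketched in a few lines. This is a minimal pure-Python illustration on sample lists, with made-up parameter values; a real pipeline would do this in a DAW or with an audio library.

```python
import math

def make_backward_variant(samples, sample_rate=48000,
                          lfo_hz=40.0, lfo_gain=0.2):
    """Reverse the rendered sound of an entire logic line (rather than
    reversing each individual sound) and layer a quiet low-frequency
    sine on top to support the time mechanism."""
    reversed_samples = samples[::-1]
    return [
        s + lfo_gain * math.sin(2 * math.pi * lfo_hz * i / sample_rate)
        for i, s in enumerate(reversed_samples)
    ]

clip = [0.0, 0.5, 1.0, 0.5]   # stand-in for a rendered logic-line sound
backward = make_backward_variant(clip)
```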
To determine a character's current timeline, we defined two Game Parameter values (1 and -1). 1 means to play forward sounds, -1 means to play backward sounds. They are triggered when the player moves forward or backward, respectively. This way, we were able to switch between logic lines and reflect the change in playback state properly. In the image below, you can see how we set the Game Parameter values.
To play forward/backward sounds while making sure they do not overlap, we used Wwise's Blend Containers to control the playback position. This way, we avoided having both sounds play at the same time. In the image below, you can see how we set up the Blend Container.
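The selection logic can be modeled in a few lines. This is only an illustration of the intent; in the actual project, the crossfade lives in the Blend Container and is driven by the Game Parameter described above.

```python
def blend_weights(direction):
    """Return (forward_volume, backward_volume) for the Game Parameter
    value: 1 plays the forward sound, -1 plays the backward sound.
    Exactly one of the two is audible at any time."""
    if direction not in (1, -1):
        raise ValueError("direction must be 1 (forward) or -1 (backward)")
    return (1.0, 0.0) if direction == 1 else (0.0, 1.0)

print(blend_weights(1))   # (1.0, 0.0) -> only the forward sound is audible
print(blend_weights(-1))  # (0.0, 1.0) -> only the backward sound is audible
```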
Recording the exact playback position of NPC sounds
As previously mentioned, to record the exact playback position, we created both forward and backward sounds. These sounds are played against the in-game timeline, which lets us quantify the percentage of elapsed playback time and use the SeekOnEvent API function to go back and forth through time. In the image below, you can see how we used this function.
In order to return to the last playback position, we only need to record how much playback time has elapsed for the current animation clip and sound, then read that information prior to the NPC logic lines being triggered again.
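A sketch of that record-then-read flow, with hypothetical function and variable names (the real version would post the Wwise Event and then call SeekOnEvent with this percentage):

```python
# npc_id -> elapsed playback percentage, recorded when a logic line stops
playback_positions = {}

def on_logic_line_interrupted(npc_id, elapsed_s, clip_length_s):
    """Store how much playback time has elapsed for this NPC's
    current animation clip and sound."""
    playback_positions[npc_id] = 100.0 * elapsed_s / clip_length_s

def on_logic_line_retriggered(npc_id):
    """Read the stored percentage before the logic line is triggered
    again; this is the value a SeekOnEvent-style call would receive."""
    return playback_positions.get(npc_id, 0.0)

on_logic_line_interrupted("police_officer", elapsed_s=3.0, clip_length_s=10.0)
print(on_logic_line_retriggered("police_officer"))  # 30.0
```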
Reflecting the gradual change of time
When the daughter enters, time and space are frozen, but gradually return to normal. In the image below, you can see how this works.
In the father's world, everything is black and white. There are no ambient sounds in the father's scene, so as to heighten the sense of frozen time and space. When the daughter enters, everything gradually returns to normal, including color. To sync up with the game visuals, the ambient sounds gradually evolve in the same manner. To get even better results, we split these ambient sounds into layers, including wind, small objects, and critters. Their volumes increase gradually.
We also defined a Game Parameter ranging from 0 to 1, expressed as a percentage of the scene's size. This way, the ambient sounds could be faded in properly, relative to the daughter's current position.
In the Blend Container, the layered ambient sounds fade according to the following curves, and all of them blend together. In the image below, you can see how we set the Voice Volume and Fade parameters.
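The position-to-parameter mapping can be sketched as below. The onset values and linear fades are illustrative assumptions; in the project, the actual shapes come from the Blend Container's curves.

```python
def ambient_parameter(daughter_pos, scene_size):
    """Normalize the daughter's progress through the scene to the
    0..1 Game Parameter range."""
    return min(max(daughter_pos / scene_size, 0.0), 1.0)

def layer_volume(param, onset):
    """Linear fade-in for one ambient layer once the parameter passes
    its onset point."""
    if param <= onset:
        return 0.0
    return min((param - onset) / (1.0 - onset), 1.0)

p = ambient_parameter(daughter_pos=30.0, scene_size=60.0)  # 0.5
wind     = layer_volume(p, onset=0.0)    # 0.5  -> first layer to come in
objects  = layer_volume(p, onset=0.25)   # ~0.33
critters = layer_volume(p, onset=0.5)    # 0.0  -> not yet audible
```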
Extending static animation clips
In the game, the daughter has her own independent logic lines. They are not triggered by the father.
When the daughter enters, the father's interrupted logic line resumes. Let's say an NPC logic line stops after completing only 30% of its triggered events. When the daughter gets close to this NPC, the sound resumes from that playback position (i.e. 30%) until the father's logic line completes. Her own logic line is then triggered based on her father's current actions. That's how we extended the cascade structures mentioned earlier.
In the image below, you can see that when the daughter enters the scene, the police officer resumes his logic line, based on the father’s logic line until it completes. He then decides whether or not to arrest the thief. The daughter’s animation clips are looped, so we have to stop the playback of the current sounds to avoid auditory fatigue.
Solving possible voiceover issues
In the game, we used modal particles in most of the characters’ dialogue, to reflect their feelings. These modal particles help players identify NPCs’ emotional states in game. At first, we actually considered using voice overs (VO) to that end. However, the producer refused for two reasons:
- The game has multiple finales, and VOs may expose the hidden finales.
- This was an indie game with a limited budget.
So in the end, we used modal particles instead of VOs to bring the NPCs to life. For example, there is a middle-aged man in the game who is not that easy to get along with, so we used modal particles to reflect his caustic tone of voice.
Controlling the playback speed relative to controller pressure
In the game, the logic's speed is controlled by the controller pressure, to get the sounds to sync with the game visuals. In the image above, you can see how this works.
We used the built-in Time Stretch effect in Wwise to get the global sounds to reflect the acceleration and deceleration. In the image below, you can see that we defined two Game Parameter values (0 and 1). These Game Parameter values translate controller pressure values to determine the Time Stretch rate.
Deceleration for game visuals occurs at a factor of 10, while the Time Stretch effect occurs at a factor of 15, so the transition is very smooth during playback. A Time Stretch value of 100 means no stretch, 200 means double the duration, and so on. The maximum controller pressure is specified as 100, and the slowest playback as a Time Stretch value of 1100. This ensures that the Time Stretch rate matches the animation clip. In the image below, you can see how we set the Time Stretch effect.
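Assuming a linear relationship between controller pressure and the Time Stretch value (the project's exact curve may differ), the mapping can be sketched as:

```python
def time_stretch_value(pressure, max_pressure=100.0,
                       min_stretch=100.0, max_stretch=1100.0):
    """Map controller pressure (0..max_pressure) to a Time Stretch value:
    full pressure -> 100 (no stretch, normal speed),
    no pressure   -> 1100 (the slowest playback)."""
    pressure = min(max(pressure, 0.0), max_pressure)
    t = pressure / max_pressure            # 0.0 (no pressure) .. 1.0 (full)
    return max_stretch - t * (max_stretch - min_stretch)

print(time_stretch_value(100))  # 100.0  -> full pressure, normal speed
print(time_stretch_value(0))    # 1100.0 -> no pressure, slowest playback
print(time_stretch_value(50))   # 600.0
```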
In Part 2 of this blog, we will look at how we designed the interactive music for Clocker. Stay tuned!