Loudness Processing Best Practices, Chapter 1 : Loudness Measurement (PART 2)

In his previous blog, Loudness Processing Best Practices, Chapter 1: Loudness Measurement (part 1), Jie Yang (Digimonk) explored challenges, considerations, and compromises surrounding audio standards for various platforms and content types. He also covered a few examples and comparative readings of different measurement units and tools. In this blog, Jie Yang (Digimonk) covers how to make sense of the various options and standards, to be able to make better decisions for our game audio projects and effectively create better and more enjoyable user experiences.

Translator's Note:

This series highlights audio best practices from China. China topped the world gaming revenue charts in 2016. And the Wwise Tour 2016 China stop not only featured popular (50 million daily active users) Chinese games using Wwise, but also boasted over 200 attendees from the local game audio community. Therefore, we were certainly intrigued to take a deeper look into game audio practices in China. By translating blog articles by one of the most influential audio designers within the Chinese gaming industry, we aim to help better understand the audio community and the culture of this vast territory. To the best of our knowledge, this will be the first-ever effort in translating Chinese audio tech blogs into English.

Translation from Chinese to English by: BEINAN LI, Product Expert, Developer Relations - Greater China at Audiokinetic


To make loudness more intuitive to you, I found a nice graph that will help you understand the standard loudness ranges of different media platforms:

In recent years, more and more loudness meter plug-ins have become available. Veteran companies in this area, like TC Electronic, continue to make reliable products that are easy to use. (They look good too!) A fast-moving digital pointer lets you clearly observe how loudness changes over a period of time and helps you identify the current peaks and means.

Note: LC2n is not only a loudness meter, but also a loudness processor (it can work alone or as an offline plug-in). Its sibling LM2n is a meter only. The measurement of the original recording is on the left of the LC2n UI, and the processed result is on the right.

LC2n provides some common loudness presets, such as ones for iPhone and iTunes, which can be selected according to your needs and can take effect directly. But, currently, this tool can still be annoying. For instance, it always renames the sound files by adding the dB values to them, and there is no 'undo' for this.

Here is an excerpt of Dave Weckl's recording published by GRP in 1990 (44/16 WAV):

The result of a standard loudness test (the "radar" is on the left):

When we switch to the Mobile mode to optimize loudness for cellphones, the left radar displays the result for the actual sound recording played from a cellphone, while the right radar indicates what the recommended loudness for the standard would look like:

I chose a GRP recording because GRP, in my opinion, represents the highest quality from the golden years of the music recording industry. It combines analogue recording with digital mixing and is, therefore, a classic example of well-controlled frequency response and loudness. The left radar indicates the loudness when playing the sample from a cellphone. In the processed result for mobile platforms, the average loudness (LKFS) increased by about 3 dB, but there is no change in LRA. Let's take a look at the filename:

In the above picture, the first file is the renamed file, indicating the target loudness result. The following screenshot shows the waveform after the loudness processing:

Compare that with the original waveform:

In the above example, there is a 3 dB difference between the original and the processed versions, yet to our ears the difference may seem larger than that. The RMS of the processed sample is at -19.3 dB, below the usual level of -16 dB, but it sounds very loud. Suppose that after the loudness processing we cut off the content above 16 kHz (using an API 560 EQ):

From the look of the waveforms, the loud parts do not differ much from one another. The measured LKFS loudness is -16.9 dB (-16.1 before the spectral cut-off).

Yet, with the perceived loudness we hear a big difference:

This example tells us that frequency response has a big impact on our loudness perception. It is our major leverage for loudness processing and, also, for mixing.

According to A-B comparisons and in-game comparisons, most sound samples and music perform well on iOS or Android phones at a loudness of -16 dB LKFS. When processing a sound sample, LC2n will adjust its loudness in a non-linear way instead of simply moving its peaks or compressing it. I tested LC2n with both sounds that are too loud and those that are too quiet. Both results were great. There was largely no audible distortion. The relative levels between the instruments in the music are well preserved. The overall frequency response can retain its aural integrity after being amplified or attenuated by 12 dB. In comparison, the traditional peer plug-ins often cause discernible distortions when scaling the volume up or down.
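For comparison with the non-linear processing described above, here is a minimal sketch of the simplest alternative: one static gain that moves a measured loudness onto the -16 LKFS mobile target. This is not how LC2n works. The function names and the example readings are hypothetical and are not taken from the screenshots.

```python
MOBILE_TARGET_LKFS = -16.0  # mobile target quoted in the article


def gain_to_target(measured_lkfs, target_lkfs=MOBILE_TARGET_LKFS):
    """Static gain in dB that moves the measured loudness onto the target."""
    return target_lkfs - measured_lkfs


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def peak_after_gain(peak_dbfs, gain_db):
    """Peak level after the gain; anything above 0 dBFS would clip."""
    return peak_dbfs + gain_db


# Hypothetical readings for a single asset.
measured = -19.0  # integrated loudness, LKFS
peak = -2.0       # peak level, dBFS

gain = gain_to_target(measured)
print(f"gain needed: {gain:+.1f} dB (x{db_to_linear(gain):.3f})")
print(f"peak after gain: {peak_after_gain(peak, gain):+.1f} dBFS")
```

With these made-up numbers, a plain 3 dB boost would push the peak above full scale. That is the kind of case where simply moving the level is not enough and a limiter or a non-linear processor is needed.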

When you observe loudness changes:

Don't be nervous about the occasional cases where short-term loudness goes above the standard, as long as there is no clipping. It's acceptable for short-term signals (400–3,000 ms) to overshoot the range. If you truly can't get past this, then only money can appease you; the entry-level tool will cost you about the same as an average car. Of course, much can be discussed on how to catch clipping. For example, a red meter does not necessarily mean clipping is really happening, whereas no red showing does not guarantee clipping-free either. We'll talk about this at another time.
You must always keep this in mind: we want a reasonable and appropriate dynamic range. It would be inappropriate if the range were too small. Ultimately, we would like to keep the dynamic frequency response within a proper range. For video games or soundtracks, sounds don't come out alone. The things to watch out for with a meter are usually relative changes.
Meters just give you references. The ultimate judges are the work at hand and your own aesthetics. Your brain will easily misjudge loudness, especially when your listening environment is suboptimal, your DAC and headphones are subpar, or you are in a really bad (or really good) mood. Prolonged headphone usage is also problematic. Headphones create illusions about the relative depths and dynamics that demand careful examination, because the sound is too close to you. Indeed, effective long-term ear training is a must for professionals. For instance, you must be able to tell (on your first listen) the true culprit of a sound being too loud, such as a certain overly loud frequency band or the overall loudness being too high.
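The first point above, that brief overshoots are acceptable as long as the average stays in range, can be sketched with sliding windows. The 400 ms and 3 s lengths are the usual momentary and short-term meter windows. Everything else is an assumption for illustration: the test signal is synthetic, the sample rate is deliberately low, and there is no K-weighting, so the numbers are plain mean-square levels rather than LKFS.

```python
import math


def windowed_levels_db(samples, rate, window_s, hop_s=0.1):
    """Mean-square level in dB for each sliding window (no K-weighting)."""
    win = int(round(window_s * rate))
    hop = int(round(hop_s * rate))
    levels = []
    for start in range(0, len(samples) - win + 1, hop):
        block = samples[start:start + win]
        mean_square = sum(x * x for x in block) / win
        levels.append(10.0 * math.log10(max(mean_square, 1e-12)))
    return levels


# Hypothetical signal: 10 s of a quiet 100 Hz tone with a loud 0.5 s burst.
RATE = 1000  # low rate keeps the pure-Python loops fast; illustration only
signal = []
for n in range(10 * RATE):
    amplitude = 0.8 if 5 * RATE <= n < 5.5 * RATE else 0.1
    signal.append(amplitude * math.sin(2.0 * math.pi * 100.0 * n / RATE))

momentary = windowed_levels_db(signal, RATE, 0.4)     # 400 ms windows
short_term = windowed_levels_db(signal, RATE, 3.0)    # 3 s windows
overall = windowed_levels_db(signal, RATE, 10.0)[0]   # the whole signal

print(f"loudest 400 ms window: {max(momentary):.1f} dB")
print(f"loudest 3 s window:    {max(short_term):.1f} dB")
print(f"whole signal:          {overall:.1f} dB")
```

The shorter the window, the higher the burst reads, while the level over the whole signal barely moves. That is why a brief excursion on the fast meter does not, by itself, mean the asset is off target.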

TC's PPM true-peak meter is one of the better ones on the market. It retrieves amplitude data from digital samples directly and can display the delta level between two samples with a very small error. Many other PPM meters don't actually have such precision. Some even present a 3 dB error. In other words, those PPM meters are almost only as useful as VU meters, which can give you approximate relative peaks only. Of course, sometimes we do need meters that show relative peaks, such as during mixing. There is another issue regarding precision PPM true-peak meters: the precision of amplitude offsets on each digital sample during amplification and attenuation. Don't underestimate the difficulty of making this right. Many manufacturers provide problematic features in this area; they can be so off. That's why such plug-ins by Flux, TC, Sonnox, and McDSP are so expensive.
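One way to see how a peak meter that only looks at raw sample values can be about 3 dB off: a full-scale sine at a quarter of the sample rate, shifted by 45 degrees, never has a sample on its crest. The sketch below compares the sample peak with an estimate from 4x oversampling. The windowed-sinc interpolator is a simple stand-in chosen for this illustration, not the filter used by TC or any other commercial meter.

```python
import math


def sinc(x):
    """Normalised sinc: sin(pi x) / (pi x)."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def interpolate(samples, t, half_width=16):
    """Hann-windowed sinc estimate of the waveform at fractional index t."""
    centre = int(math.floor(t))
    total = 0.0
    for k in range(centre - half_width + 1, centre + half_width + 1):
        if 0 <= k < len(samples):
            d = t - k
            window = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
            total += samples[k] * sinc(d) * window
    return total


def sample_peak(samples):
    """Largest magnitude among the stored samples."""
    return max(abs(x) for x in samples)


def true_peak_estimate(samples, oversample=4, margin=32):
    """Largest interpolated magnitude, skipping the edges of the buffer."""
    peak = 0.0
    for n in range(margin * oversample, (len(samples) - margin) * oversample):
        peak = max(peak, abs(interpolate(samples, n / oversample)))
    return peak


def to_db(value):
    return 20.0 * math.log10(value)


# Full-scale sine at fs/4 with a 45 degree phase offset:
# every stored sample lands at +/-0.707, the crests fall between samples.
tone = [math.sin(math.pi * n / 2.0 + math.pi / 4.0) for n in range(256)]

print(f"sample peak:        {to_db(sample_peak(tone)):+.2f} dBFS")
print(f"oversampled peak:   {to_db(true_peak_estimate(tone)):+.2f} dBFS")
```

The stored samples suggest about 3 dB of headroom that the reconstructed waveform does not actually have.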

LoL ... Some plug-ins are a big surprise, such as WLM from Waves. Some Western engineers and I tested it and found that it is not that accurate. If you test the same recording multiple times with it, you can actually end up with different measurements. Try testing an arbitrary WAV music recording with WLM, and you will probably find that even when the final LKFS values match, the scanning process differs considerably from one pass to the next. This made me wonder how it was coming up with its final results. Furthermore, WLM's UI shows no continuous loudness evolution over a period of time; instead, it only displays the current status and the mean values. You could indeed export the measurement as CSV data and make a pretty graph, but most meters like this do not let you observe the continuous changes in loudness in real time. Such continuous changes and the relative loudness are what we care about most. Otherwise, there would be no point in using such complicated meters; a blinking light would suffice!

Another cheap yet reliable method is to open the recording with Sound Forge, use its built-in Normalize plug-in, and switch to RMS mode. This gives you the average RMS loudness. Note that it has an option: 45 dB equal-loudness weighting (signals lower than -45 dB are excluded from its analysis).

I recommend not using this weighting because you will never know whether the results are correct or not. I compared Sound Forge Pro 11 and Sound Forge for Mac 2, and found that even their measurement results are sometimes different!

Still using the Dave Weckl recording, below is the RMS reading before loudness processing in Sound Forge:

Here is the RMS reading after processing the recording using TC LC2n:

We can easily notice the differences in RMS and LKFS, before and after.

If you are experienced with RMS, you might face a problem when working with the LKFS system: What is the relationship between RMS and LKFS? Here is a tech note offering an accurate yet quite confusing map:

This map focuses on numbers, and is useful as a reference only. In fact, there is no simple linear relationship between RMS and LKFS, and there shouldn't be! That's because LKFS contains "perceptual weighting" while RMS is just the electric power. Therefore, a comparison like the above is not very useful.

Please note, RMS only represents electric power, different from the loudness you hear! When measuring and adjusting loudness using RMS, you often need to resort to your hearing experience to determine whether or not the frequency response and presence of two sounds are the same, then compare with the numerical measurement. Especially with low-frequency content, your monitor headphones usually cannot function precisely below 50 Hz, but that portion can make your RMS high! This is why Sound Forge introduces this equal-loudness weighting for you to select. To easily test my point, you only need a -20 dB, 30 Hz low-frequency signal to blend into an ambience. The RMS and LKFS will present very different numbers. You probably won't hear this low frequency at all and the waveform probably won't look much different before and after cutting off that frequency band. This will cause discrepancies between your meter reading and your hearing. But trust me, this problem is so common that at least 80% of the sound assets used in a game will face it. Therefore, a good sound card, good monitoring equipment, good ear training, and good monitoring practice are very important!
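The 30 Hz experiment above can be approximated in a few lines. The sketch generates two tones with the same peak level, one at 30 Hz and one at 1 kHz, and compares plain RMS with a K-weighted reading. Assumptions: "-20 dB" is taken to mean -20 dBFS peak, the signal is mono at 48 kHz, and the loudness function is ungated, so it is a simplification of a full BS.1770 integrated measurement. The filter coefficients are the 48 kHz values published in ITU-R BS.1770; the function names are made up for this example.

```python
import math

RATE = 48000  # the coefficients below are only valid at 48 kHz

# K-weighting biquads as (b0, b1, b2, a1, a2), from ITU-R BS.1770.
SHELF = (1.53512485958697, -2.69169618940638, 1.19839281085285,
         -1.69065929318241, 0.73248077421585)
HIGHPASS = (1.0, -2.0, 1.0, -1.99004745483398, 0.99007225036621)


def biquad(samples, coeffs):
    """Direct-form I second-order filter."""
    b0, b1, b2, a1, a2 = coeffs
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def mean_square(samples):
    return sum(x * x for x in samples) / len(samples)


def rms_dbfs(samples):
    """Unweighted RMS level in dB relative to full scale."""
    return 10.0 * math.log10(mean_square(samples))


def k_weighted_loudness(samples):
    """Ungated, single-channel K-weighted loudness in LKFS."""
    weighted = biquad(biquad(samples, SHELF), HIGHPASS)
    return -0.691 + 10.0 * math.log10(mean_square(weighted))


def tone(freq_hz, peak_dbfs, seconds=2.0):
    """Sine tone with the given peak level."""
    amp = 10.0 ** (peak_dbfs / 20.0)
    return [amp * math.sin(2.0 * math.pi * freq_hz * n / RATE)
            for n in range(int(seconds * RATE))]


low = tone(30.0, -20.0)
mid = tone(1000.0, -20.0)

for name, signal in (("30 Hz", low), ("1 kHz", mid)):
    print(f"{name}: RMS {rms_dbfs(signal):.1f} dBFS, "
          f"K-weighted {k_weighted_loudness(signal):.1f} LKFS")
```

Both tones produce the same RMS, but the K-weighted reading of the 30 Hz tone comes out several dB lower. The weighting discounts energy that the ear, and most playback hardware, barely register.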

I usually observe and verify loudness with a reliable spectral analyzer plug-in. Because the final part of the production is done with Sound Forge, it is better to train yourself to understand the relationship between RMS and perceived loudness. The most common case is that an asset's RMS is above -16 dB, and cutting off the content below 30 Hz (at 24 dB/octave) brings it close to -16 dB without you feeling that anything is lost at the low end. It could be that your monitoring equipment couldn't respond to that frequency band, or that the band is too quiet to hear but still strong enough to raise the RMS. It is also possible (and more likely) that other frequencies masked 30 Hz and below. You can only figure out the actual reason by using a good spectral analyzer. Please note that Waves' spectral analyzer is not good enough, especially its low-frequency analysis! It can't even beat Sound Forge's built-in analyzer. The low-frequency analysis of the latter can sometimes display what other tools fail to: low-end intensity. Consider that a superpower.
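To get a feel for how much a low cut can move the RMS reading, here is a minimal sketch of a 24 dB/octave high-pass at 30 Hz applied to a synthetic signal with strong sub-bass. The filter is a 4th-order Butterworth built from two cascaded second-order sections (Audio EQ Cookbook formulas). That choice, the 20 Hz and 1 kHz test tones, and the function names are assumptions for illustration, not the filter inside Sound Forge or any particular plug-in.

```python
import math

RATE = 48000


def highpass_coeffs(cutoff_hz, q, rate=RATE):
    """Second-order high-pass biquad (b0, b1, b2, a1, a2), normalised to a0 = 1."""
    w0 = 2.0 * math.pi * cutoff_hz / rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b0 = (1.0 + cos_w0) / 2.0 / a0
    b1 = -(1.0 + cos_w0) / a0
    return (b0, b1, b0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0)


def biquad(samples, coeffs):
    """Direct-form I second-order filter."""
    b0, b1, b2, a1, a2 = coeffs
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def low_cut_24db(samples, cutoff_hz=30.0):
    """4th-order Butterworth high-pass: two biquads with the matching Q values."""
    for q in (0.54119610, 1.30656296):
        samples = biquad(samples, highpass_coeffs(cutoff_hz, q))
    return samples


def rms_db(samples):
    return 10.0 * math.log10(sum(x * x for x in samples) / len(samples))


def tone(freq_hz, amplitude, seconds=2.0):
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * n / RATE)
            for n in range(int(seconds * RATE))]


# Hypothetical asset: audible 1 kHz content plus equally strong 20 Hz rumble.
audible = tone(1000.0, 0.2)
asset = [a + b for a, b in zip(audible, tone(20.0, 0.2))]

before = rms_db(asset)
after = rms_db(low_cut_24db(asset))
print(f"RMS before low cut: {before:.1f} dB")
print(f"RMS after low cut:  {after:.1f} dB")
```

In this constructed case the reading drops by a few dB while the 1 kHz content passes through essentially untouched, which mirrors the situation described above.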

Note: You can find some research results regarding the relationship between RMS and LKFS here. Look for the article Momentary Loudness RMS Filter Options.

Actually, loudness measurement is not as much a mystery as people may think. For John Does like us, we can just learn to read the meters. Beyond that, the hardest part is how to confine sounds within the same standard or expected loudness range, and do so consistently for thousands of assets in a project. How do we make sure that the final output loudness is in the expected range?

How is this possible?
Is it even required?
And maybe a third question: How do we make this happen?

If you truly want to follow the loudness standards, yet find all that alien jargon confusing, then the simplest method would be to memorize the most frequently used RMS or LKFS loudness standards, and do your best to work your assets towards those standards.
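Once a target is memorized, checking a large batch of assets against it is mostly bookkeeping. A minimal sketch, assuming you already have one loudness reading per asset from whichever meter you trust; the asset names, the readings, and the 1 dB tolerance are all hypothetical.

```python
TARGET_LKFS = -16.0  # mobile target used earlier in this article
TOLERANCE_DB = 1.0   # hypothetical acceptance window around the target


def triage(readings, target=TARGET_LKFS, tolerance=TOLERANCE_DB):
    """Return (name, deviation_db) for assets outside the window, worst first."""
    flagged = [(name, lkfs - target)
               for name, lkfs in readings.items()
               if abs(lkfs - target) > tolerance]
    return sorted(flagged, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical meter readings in LKFS, keyed by asset name.
readings = {
    "ui_click.wav": -15.6,
    "amb_forest_loop.wav": -22.4,
    "music_battle.wav": -13.1,
    "vo_greeting.wav": -16.8,
}

for name, deviation in triage(readings):
    print(f"{name}: {deviation:+.1f} dB from target")
```

A list like this only tells you where to listen first. Whether a flagged asset needs EQ, level, or nothing at all is still a decision for your ears.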

We will talk about how to use those plug-ins some other time. Here, I would like to introduce my own routines for using loudness meters:

Most of the time I can rely on my hearing to figure out the loudness of an asset (within about a 3 dB error range against the software that I use). But for some sounds containing special frequency portions and fast-changing elements, I usually verify with loudness meters. Such cases often involve unusually strong frequency content below 60 Hz or above 4.5 kHz; other cases include soft sounds that are meant to be loud like the foggy visual effects in Moonlight Blade.
For complex cases such as music and soundtracks, I would combine hearing and loudness meters.

Some personal best practices:

My hearing, over volume meters or loudness meters, has the final say on loudness, no matter what. I will strive for achieving the target loudness numbers, but will put more weight on the overall frequency response. The EQ of individual assets drives my decision making and revisions. It's only when I'm lazy that I touch the volume levels directly. I will consider using compression or reverb to tweak the result only when the basic operations fail. Regarding the loudness standards, I will do my best to obey the rules when I can, but it's OK for transients to occasionally go above the standards. The loudness standards mainly regulate average loudness over a period of time instead of transient loudness. My practice was achieved through deliberate training over a long time, and includes my own aesthetics.
Whenever seeing the peak volume or loudness above or below your expectations, don't rush into adjusting the volume or compression. Instead, you should analyze the situation: Is it caused by some frequency bands, or an overall problem? Act differently according to your answer.
Maintaining a good monitoring routine is the best way to determine loudness. Stable, loud enough but not ear-aching monitoring volume, and the ability to tell frequencies are skills that require constant training. When I'm under the weather or going through a mood swing, I will listen to my favorite records for 5-10 minutes to confirm where my ears set the bar regarding sound pressures. Of course, my monitoring volume (headphones and speakers), usually stays the same. But, when it occasionally has to be adjusted, I reset it to my default level when I am feeling well again. My monitoring volume configuration is based upon this rule: Keep the volume as high as possible without ear fatigue after listening to a 40-50 minute long CD. An overly-low monitoring volume would filter out lots of details!
No monitor speakers or headphones are perfectly accurate. Every time I start using a new pair of monitor speakers or headphones, I spend time finding the difference between my listening habits and the new equipment, and then memorize that difference. For example, the Genelec 8060 rolls off seriously below 100 Hz, and may leave content at 40 Hz and below inaudible. Therefore, if 40 Hz sounds full to me on the 8060, it would be over-saturated on other speakers.
As for monitor speakers, the audio cables plugged into sound cards, and other configuration details: the analog I/O terminals of common professional sound cards, mixing consoles, and monitoring controllers offer two nominal operating levels, -10 dBV or +4 dBu. +4 dBu is recommended for studio environments, as it lets you hear more details at the same monitoring volume. You should certainly use professional cables and plugs to match the setup. Such attention to detail will make your work sound more professional and your job more enjoyable.

Some pragmatic people would argue that OEM cellphones in China constantly compete to be louder. The streets of small towns are full of people playing songs out loud from their phone speakers, almost as loud as a megaphone. (Why bother about loudness when living in a world like this?) We as professionals know that hearing requires cherishing. So, we are responsible for cultivating better hearing habits and aesthetics. Who else would care to do it, right? Hearing habits and aesthetics can be learned.

This blog series aims to discuss such issues. A well-trained sound designer or post-production engineer can achieve good loudness results by ear under proper professional monitoring conditions. More often than not, reading the meters is only for reference or further precision. In other words, when hearing something too loud, you will still have to know exactly where the problem is, by how much it is too loud, and how it got there. After all, it is not very efficient to read meters when you have tens of thousands of assets in a game, and your hearing should be the more efficient measurement tool. Besides Waves and TC, other loudness meter manufacturers worth mentioning include:

All the above is, at your discretion, up for discussion. Please correct me if I'm wrong; it will be greatly appreciated!

Stay tuned for my next chapter: Chapter 2: Loudness, Dynamics, and their Processing Techniques.

Jie Yang (Digimonk) | June 13, 2017

Jie Yang (Digimonk)

Sound Designer, Audio Director

Tencent Aurora Studio

Also from this series

Loudness Processing Best Practices, Chapter 1 : Loudness Measurement (PART 1)

Loudness Processing Best Practices, Chapter 2 : Loudness, dynamics and how to process them.

Loudness Processing Best Practice, Chapter 3: Scalable Loudness Processing for Games
