Loudness Perception in Music

Understanding how humans perceive loudness is not just academic — it directly affects how you mix, master, and evaluate your music. Many mixing mistakes come from fighting against (or being unaware of) the quirks of human hearing.

How we perceive loudness

Human hearing is not flat. We do not perceive all frequencies at the same loudness, even when they are playing at the same physical amplitude. This is described by the equal-loudness contours (historically called Fletcher-Munson curves).

Key facts from the equal-loudness contours:

We are most sensitive to 2-5 kHz. Sounds in this frequency range are perceived as louder than sounds of equal amplitude at other frequencies. This is partly because the human ear canal resonates at roughly 2.5 to 3 kHz, physically boosting these frequencies before they reach the eardrum.

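We are least sensitive to very low and very high frequencies. The gap is largest at quiet levels: near the threshold of hearing, a 40 Hz tone needs roughly 40 to 50 dB more sound pressure level than a 1 kHz tone to be perceived as equally loud, and the difference shrinks as the overall level rises.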

The curve flattens at higher volumes. At loud listening levels, we perceive the frequency response as more "flat." At quiet levels, the bass and treble drop off perceptually. This is why music sounds "better" (fuller, more balanced) when played louder.
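To put rough numbers on the sensitivity differences above, the sketch below evaluates the standard A-weighting curve. This is a stand-in, not the equal-loudness contours themselves: A-weighting is a single fixed curve loosely based on hearing sensitivity at quiet levels, so it cannot show the flattening at higher volumes. The function name `a_weighting_db` is chosen here for illustration.

```python
import math


def a_weighting_db(freq_hz: float) -> float:
    """Approximate A-weighting in dB (about 0 dB at 1 kHz).

    Negative values mean the ear is treated as less sensitive at that
    frequency than at 1 kHz. This is a fixed curve, so it ignores the
    level dependence described by the equal-loudness contours.
    """
    f2 = freq_hz ** 2
    numerator = (12194.0 ** 2) * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The +2.00 dB offset normalises the curve to 0 dB at 1 kHz.
    return 20.0 * math.log10(numerator / denominator) + 2.00


if __name__ == "__main__":
    for f in (40, 100, 1000, 3000, 10000):
        print(f"{f:>6} Hz: {a_weighting_db(f):+.1f} dB")
```

The curve dips by several tens of dB in the sub-bass and peaks slightly above 0 dB in the low kHz range, which matches the general shape described above.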

Why "louder is better" — and why it is a trap

The third point above is critical for producers. When you turn up the volume, the perceived bass response increases, the mids feel more present, and the overall balance improves. This is a psychoacoustic effect, not a quality difference.

This creates a systematic bias in production:

When A/B comparing two versions of your mix, the louder one will tend to sound "better" at first, even if the quieter one is the better mix.
After mastering with heavy limiting, the master sounds "better" because it is louder. But when you level-match it to the original, you might discover you have destroyed dynamics for no real gain.
The classic "it sounded great in the studio, awful in the car" problem is often a monitoring-level issue: you mixed at high volume, where the frequency balance sounded flat, but at the moderate listening volumes of a car, the bass and treble drop off.

The practical implications

Level-match everything

Before comparing anything — two mix versions, pre/post mastering, your track vs a reference — match the perceived loudness levels. Use a LUFS meter and adjust the gain until both signals read the same integrated LUFS. Now you are comparing quality, not volume.
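The gain adjustment itself is simple arithmetic. The sketch below assumes you have already read the integrated LUFS of both signals from a meter; the function names and the example readings are hypothetical. It relies on the fact that a static gain change of N dB shifts the integrated loudness reading by, to a good approximation, the same N.

```python
def matching_gain_db(reference_lufs: float, candidate_lufs: float) -> float:
    """Gain in dB to apply to the candidate so it reads the same as the reference."""
    return reference_lufs - candidate_lufs


def db_to_linear(gain_db: float) -> float:
    """Convert a dB gain to a linear amplitude multiplier."""
    return 10 ** (gain_db / 20)


if __name__ == "__main__":
    # Hypothetical meter readings: unmastered mix vs. limited master.
    mix_lufs = -14.0
    master_lufs = -8.0

    gain_db = matching_gain_db(mix_lufs, master_lufs)
    print(f"Trim the master by {gain_db:+.1f} dB "
          f"(linear factor {db_to_linear(gain_db):.3f}) before comparing.")
```

Apply the trim to the louder signal rather than boosting the quieter one, so the comparison cannot clip.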

Mix at moderate volume

Mixing at loud levels causes two problems: your hearing operates on the flatter, high-level equal-loudness contours (giving you a false sense that the mix has enough bass and treble), and your ears fatigue faster. Both lead to bad decisions.

Mix at a level where you can comfortably have a conversation. If you cannot hear the bass clearly at this level, the bass needs to be louder in the mix — do not solve this by turning up the monitors.

The "phone speaker" check

When you check your mix on phone speakers, you are hearing it at low volume with almost no bass reproduction. If your kick disappears entirely, it lacks mid-frequency content. The equal-loudness contours tell you that at low volumes, only the midrange is truly audible. For hard dance, this means your kick needs harmonics in the 1-5 kHz range to be perceptible on small systems.

Frequency masking

Masking is when one sound makes another sound inaudible or less audible, even though both are playing. Two sounds at similar frequencies mask each other more than sounds at different frequencies.

Masking is strongest when:

Two sounds are close in frequency
One sound is significantly louder than the other
Both sounds are playing simultaneously

In hard dance mixing, the most common masking problem is the kick masking the bass and low-mid elements. The kick is so loud and so wide in frequency content that it temporarily makes everything else inaudible during the hit. This is partially solved by sidechain compression (ducking the masked elements while the kick plays) and partially by arrangement (not layering too many elements during kick sections).
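The ducking idea can be sketched as a gain curve that drops when the kick hits and recovers afterwards. This is a deliberately simplified model, not how any particular compressor works: real sidechain compressors follow the level of the kick signal with attack and release stages. The function name, the 6 dB depth, and the 120 ms release are all illustrative assumptions.

```python
import math


def duck_gain_db(ms_since_kick: float,
                 depth_db: float = 6.0,
                 release_ms: float = 120.0) -> float:
    """Gain (in dB, zero or negative) applied to the ducked element.

    Full reduction at the moment of the kick, then an exponential
    recovery toward 0 dB. Times before the kick get no reduction.
    """
    if ms_since_kick < 0:
        return 0.0
    return -depth_db * math.exp(-ms_since_kick / release_ms)


if __name__ == "__main__":
    for t in (0, 50, 100, 200, 400):
        print(f"{t:>4} ms after kick: {duck_gain_db(t):+.2f} dB")
```

A shorter release brings the bass back sooner but risks overlap with the kick's tail; a longer one clears more space at the cost of audible pumping.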

Temporal masking

Masking does not only happen simultaneously. Pre-masking and post-masking affect sounds that occur near a loud event in time.

Pre-masking: A quiet sound occurring up to 20 ms before a loud sound is harder to hear.
Post-masking: A quiet sound occurring up to 100-200 ms after a loud sound is harder to hear.

In hard dance at typical tempos of roughly 150-170 BPM, a kick on every beat hits every 350-400 ms. Post-masking from each kick can extend up to about halfway to the next kick hit. This means subtle elements (quiet atmospheric details, reverb tails, low-level percussion) are only clearly audible during the narrow window between the post-masking of one kick and the pre-masking of the next.
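The timing arithmetic is easy to check for your own tempo. The sketch below assumes one kick per beat and uses the masking durations quoted above as defaults (150 ms is simply the midpoint of the 100-200 ms post-masking range). The function names are illustrative, and the model ignores the length of the kick itself.

```python
def kick_interval_ms(bpm: float) -> float:
    """Time between kicks in milliseconds, assuming one kick per beat."""
    return 60_000 / bpm


def clear_window_ms(bpm: float,
                    post_mask_ms: float = 150.0,
                    pre_mask_ms: float = 20.0) -> float:
    """Rough time per beat that is free of temporal masking from the kicks."""
    return max(0.0, kick_interval_ms(bpm) - post_mask_ms - pre_mask_ms)


if __name__ == "__main__":
    for bpm in (150, 160, 170):
        print(f"{bpm} BPM: kick every {kick_interval_ms(bpm):.0f} ms, "
              f"about {clear_window_ms(bpm):.0f} ms unmasked")
```

Because the kick's own body and tail also occupy part of each beat, the usable window in practice is narrower than this estimate.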

This is another reason hard dance mixes benefit from simplicity. Subtle details get masked anyway. Focus your energy on the elements that are loud enough to punch through.

Ear fatigue

Your hearing sensitivity decreases the longer you listen to loud music. After an hour of mixing at high volume, your perception of the 2-5 kHz range (where your ears are most sensitive) drops. This causes you to unconsciously boost those frequencies to compensate, leading to harsh, brittle mixes.

Combat ear fatigue:

Take breaks. 10 minutes away from speakers every 45-60 minutes.
Mix at lower volumes. Less fatigue, more accurate decisions.
Reference frequently. Play a commercial track you trust every 20-30 minutes to recalibrate your perception.
Check the next day. The best quality check is fresh ears. Listen to your mix the following morning before making final decisions.

Dynamic range and perceived loudness

A track with 10 dB of dynamic range (difference between the quietest and loudest moments) feels more impactful than a track with 3 dB of dynamic range, even if the peak levels are identical. The contrast between quiet and loud is what creates impact.

In hard dance, this manifests as the contrast between breakdown and drop. A heavily compressed mix where the breakdown is almost as loud as the drop feels lifeless. The kick section should be significantly louder than the breakdown. That contrast is what makes the listener feel the drop.

This is counterintuitive — to make the drop feel louder, make the breakdown quieter. Do not compress the entire track to the same level. Preserve the dynamics.
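One rough way to check the breakdown-to-drop contrast is to compare the average level of the two sections. The sketch below uses plain RMS level on lists of sample values in the range -1.0 to 1.0. That is an assumption made for simplicity: RMS is not the same as perceived loudness, and a LUFS meter is the better tool for real decisions. The function names and the tiny sample lists are illustrative only.

```python
import math


def rms_db(samples: list[float]) -> float:
    """RMS level of a block of samples in dB relative to full scale."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10 * math.log10(mean_square)


def section_contrast_db(drop: list[float], breakdown: list[float]) -> float:
    """How many dB louder (by RMS) the drop is than the breakdown."""
    return rms_db(drop) - rms_db(breakdown)


if __name__ == "__main__":
    # Toy data standing in for real audio sections.
    breakdown = [0.25, -0.25, 0.25, -0.25]
    drop = [0.5, -0.5, 0.5, -0.5]
    print(f"Drop is {section_contrast_db(drop, breakdown):.1f} dB above the breakdown")
```

If the contrast comes out near zero, the track has been compressed to the point where the drop has nothing to stand out against.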