Jim Austin  |  Dec 18, 2019

Subjectivist audiophiles have long maintained that long-term listening is necessary to assess the quality and character of an audio component. Scientific testing methodologies such as ABX, which require quick and conscious evaluation of a change in the sound, have long struck many of us as insufficient, seeming to miss much that affects our enjoyment of music. A pair of Genelec researchers—Thomas Lund, an audio professional with a medical background, and Aki Mäkivirta, a research and development manager and a Fellow of the Audio Engineering Society—have published two articles (footnote 1) on the science of hearing and perception, and their findings appear to support such views. Among their observations:

  • The "perceptual bandwidth" of humans—the maximum size of the [sound] datastream into our brains—is remarkably small. Experts put it in the range of 40–50 bits/s: compare that to the worst kind of MP3, which has a data rate of 10s of kilobytes/s, more than 1000 times larger (footnote 2). (No wonder we're easily fooled by lossy encoding.) Perceptual bandwidth "should generally be considered a scarce resource," the authors state.

  • We are not passive recipients of sensory information—how could we be with such a low perceptual bandwidth? Sensing is an activity; attention is a tool we deploy to select which information we take in.

  • Only a fraction of the perceptual information we take in is available to our conscious awareness; much of it goes unnoticed. Yet it still affects us. We (or at least our brains) even solve problems unconsciously, using information we are not aware of.

Together, these ideas imply a perceptual landscape far different from the one that people long assumed: Through long experience, we build an internal model of reality and then reach out to test it against scarce, select sensory input. Our ability to function well and make accurate judgments depends on the accuracy of our internal model and the precision with which we correct it with scarce, carefully selected external stimuli (footnote 3).

Models of reality are contextual: An outfielder's long experience allows him to get a fast start in the right direction when a ball leaves the bat. Our own long experience allows us to detect small changes in what we hear that others cannot (footnote 4). A major theme of the articles is that listening takes time. Indeed, the titles of both feature the phrase "Slow Listening"—surely an allusion to the slow food movement that started in the late 1990s (footnote 5). Time matters in subjective listening tests, the authors conclude, in various ways:

  • Music and language training over years affect our ability to detect short, transient sounds.

  • Becoming familiar with a reference audio system takes time. "Based on a limited perceptual bandwidth and 8 hours of dedicated listening per day, getting to know a room and equipment in any detail would take at least a week, but assuming years would be safer," the authors write. "Subjective tests, even producing repeatable results, may have little relevance if confined in time"—a challenge for audio reviewers for sure.

  • Much that matters in perception goes on below the surface, without our explicit awareness. How do we know when our unconscious rumination has matured to the point that it's time to render judgment? We can't, so we need to factor in extra time.

What, then, are we to make of ABX tests in which test subjects compare a version of a few seconds of music, jangling keys, or whatever against a reference? Such tests are good for many things—see this month's Industry Update—but far from the last word on sonic and musical significance. When we rely on them too heavily, we miss things.

Here's one more idea from the article that many audiophiles can relate to. The authors cite an article published in Nature Neuroscience in 2014 that establishes that noise well below the threshold for physical damage can cause auditory stress and impact our hearing. Nontraumatic sounds, including those with "excessive high frequency energy, lack of 'quiet transients'" (which are removed in lossy encoding schemes) and "interaural strangeness or unnaturalness," can cause undesirable changes in the "auditory brain," even if they do not damage hair cells. Aural stress, listening fatigue, and the resulting impairment in our ability to discriminate small differences are familiar to audio reviewers. Scientists have now corroborated our subjective experience.

Which raises another possible objective of audio systems. Accuracy—fidelity—is, for most serious listeners, the benchmark we measure our systems against, whether we measure fidelity by objective or subjective criteria. But other valid criteria exist. Maybe some listeners just want sound that minimizes, or even alleviates, stress, whether through second-harmonic distortion, suppressed response in the presence region (aka BBC dip), natural interaural relationships, or whatever. Maybe some people just want their sound system to sound good.

Speaking of sounding good: This month's issue contains the first installment of a new Stereophile column. In Revinylization, Art Dudley reviews the most important of the latest reissues on vinyl. You'll find the new column on p.125 and here.

Footnote 1: The company's managing director, Siamäk Naghian, is a coauthor of one of the articles. See here and here.

Footnote 2: I find this number implausibly small, but it's what is claimed.

Footnote 3: John Atkinson discussed this subject in his 2011 Richard C. Heyser Memorial Lecture to the Audio Engineering Society.

Footnote 4: It's not only reviewers of course; anyone with good hearing and the right mindset can acquire such skill.

Footnote 5: Later, there was also a brief "slow listening" movement, a response to MP3 and earbuds.
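The bandwidth comparison in the first bullet (and the doubt raised in footnote 2) can be checked with a few lines of arithmetic. The sketch below is only a rough illustration: it assumes the low end of "tens of kilobytes/s" is 10 kilobytes/s and that a kilobyte is 1000 bytes, neither of which the articles specify, and the function names are made up for this example.

```python
# Back-of-the-envelope check of the MP3 vs. perceptual-bandwidth comparison.
PERCEPTUAL_BPS = (40, 50)   # bits/s, the range quoted in the article
MP3_KILOBYTES_PER_S = 10    # assumption: low end of "tens of kilobytes/s"


def kilobytes_to_bits_per_second(kilobytes_per_s):
    """Convert kilobytes/s to bits/s (assuming 1 kilobyte = 1000 bytes)."""
    return kilobytes_per_s * 1000 * 8


def ratio_range(stream_bps, perceptual_bps=PERCEPTUAL_BPS):
    """Return (smallest, largest) ratio of a stream's rate to the perceptual range."""
    low, high = perceptual_bps
    return stream_bps / high, stream_bps / low


mp3_bps = kilobytes_to_bits_per_second(MP3_KILOBYTES_PER_S)
smallest, largest = ratio_range(mp3_bps)
print(f"MP3 stream: {mp3_bps:,.0f} bits/s")
print(f"Ratio to perceptual bandwidth: {smallest:,.0f}x to {largest:,.0f}x")
```

Under these assumptions the ratio comes out between 1600 and 2000, which is consistent with "more than 1000 times larger."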

jimtavegia
December 18, 2019 - 11:05am

I think our ability to concentrate is what causes all the problems. We do it in audio and in normal, everyday life.

How often do we fail to hear key ideas in a conversation, only to have to ask someone to repeat themselves? We really didn't start truly listening at the beginning of the conversation. How easily distracted we are. Did we make a prejudgement about the speaker's value?

When reviewing loudspeakers, we talk about how "revealing" one model is compared with another. Mastering engineers must truly be able to concentrate for long periods of time, and then be able to discern whether that tweak to a music file is better or worse and quickly hit "undo".

I think that true audiophiles, on the occasions when they are really making an effort to listen "deep" into the music, just listen and do not do anything else. I also think that one has to make a commitment not only to listen "fully", but also to try to make a value judgement, often impossible to do, on whether this recording is better than some other one.

Now at 72 I do not trust my own hearing and leave the mastering to others to judge. My HF is gone, and for me to think that I can make an honest value judgement would be wrong. I am just glad that I am honest with myself. If someone says that speaker system or pair of headphones is bright sounding, it probably would not seem so to me.

What I find interesting in the writers at Stereophile is that all of their systems are so totally different, and yet they all have made choices on what they think is "best." Or is it just what they prefer? This is truly an odd hobby, and often a frustrating one.

John Atkinson
December 18, 2019 - 12:02pm
Jim Austin wrote:
I find this number [the range of 40–50 bits/s] implausibly small, but it's what is claimed.

The ear-brain dramatically reduces the rate of information that reaches it during the act of perception. Consider a monophonic 10kHz tone encoded with 24/192 PCM. The original bit rate is 24x192,000 = 4.608 million bits/s. However, when that 10kHz signal is "heard," a single group of hair cells in the inner ear fires - the effective rate of transmission is reduced to just 1 bit of information as the cells go from off to on.

John Atkinson
Technical Editor, Stereophile
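The PCM figure in the comment above, and how far it sits from the 40–50 bits/s perceptual range quoted in the article, can be worked through as follows. This is a rough sketch of the arithmetic only, not a model of hearing; the function name and the mono default are assumptions made for this example.

```python
# Raw bit rate of uncompressed PCM, compared with the quoted perceptual range.
def pcm_bit_rate(bit_depth, sample_rate_hz, channels=1):
    """Bits per second of uncompressed PCM audio."""
    return bit_depth * sample_rate_hz * channels


rate = pcm_bit_rate(24, 192_000)   # monophonic 24/192 PCM
print(f"24/192 mono PCM: {rate:,} bits/s")

# Reduction factor if only 40-50 bits/s reach conscious perception.
for perceptual_bps in (50, 40):
    factor = rate / perceptual_bps
    print(f"at {perceptual_bps} bits/s: about {factor:,.0f} times less")
```

The first line reproduces the 4.608 million bits/s in the comment; the reduction factors come out at roughly 92,000 and 115,000.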

Kal Rubinson
December 18, 2019 - 3:16pm

That on/off bit carries more information because it's just one channel of a highly paralleled input array. So, when "on" (1 bit of information?), it is signalling the presence of a 10kHz signal. When the adjacent group is "on," it signals and is identified as another frequency.

Only at low frequencies does the hair cell's signalling rate follow the actual external signal frequency.

dial
December 18, 2019 - 12:31pm

ABX is used in every science whenever possible. And it's only in high-end audio that it doesn't prove anything?

Glotz
December 18, 2019 - 12:51pm

The complexity and the unfamiliarity (not only something new, but organic vs electronic) of a musical performance also has a great influence on the time it takes to absorb the various depths of its nature.

The issue was a huge one! The PS Audio Stellar Phono, The HW-40, the RAAL-Req SR1a, the Gryphon Ethos and of course, the Benchmark LA-4.

I imagine many readers are waiting to 'sing the joys' of this landmark preamp (or rather fawn over the measurements)! Tubes be damned; the next preamp I buy is probably this one. 'Straight wire with gain', indeed! Lemme guess, you're going to drop the review on Christmas? I bet JA1 has been Santa on Christmas at least once!

RH
December 18, 2019 - 12:53pm

On a purely anecdotal level, I personally have not found the need for long term listening to get the gist of an audio system.

I'm reminded of the four summers I spent as a caricature artist at a local theme park in my university years. We did profile caricatures, and I was often doing 80 or more caricatures a day.

With experience, you begin to recognize certain facial templates, and then whatever deviation from that template sticks out. The speed at which I could process the nature of someone's face increased rapidly to the point where I could literally take a quick glance - seconds - (even someone's face passing in the crowd) and I could produce a pretty accurate drawing.

Though this seems to go against many audiophile assumptions, I've found a similar experience in evaluating hi-fi systems. I seem to get a pretty accurate take on the sound of, say, a pair of speakers in a very short time, often a first impression, that remains constant when I'm able to spend much more time with a speaker in various setups.

In my case when I audition a speaker I will experiment with speaker position, seating positions (far, medium, near-field), listening while standing, while below the mid/tweeters, off axis, even behind.
Once I'm done it seems I have a pretty reliable sense of what those speakers sound like and I can't remember ever being truly "surprised" by the sound of the same speakers when I either take them home, or when I encounter them in other set ups. They are either slightly better or slightly worse versions of the "voice" that was identified quite quickly.

Once I own the speaker, though, it's not like there are many revelations that occur listening over time. Rather, my own reaction to the sound can change, depending on mood, or whatever criteria I may be focusing on. If I listen to a friend's system I may come back and either more deeply appreciate some character in the sound of my speakers, or may notice the speaker seems a bit deficient compared to what I'd just heard. The speaker's voice doesn't change, but my attitude towards it may change over time.

Anyway...that's been my experience. I think that, yeah, living with a piece of gear for a long time, and experimenting with it, can in principle increase one's familiarity. But it also seems to me the significance of this can be somewhat exaggerated in the audiophile community, insofar as one can also get a largely accurate impression in shorter periods of time.

Bogolu Haranath
December 18, 2019 - 1:28pm

Are those bell-bottom pants and platform shoes JA1 was wearing in that picture? (see, footnote 3, 'nothing is real') :-) .........

John Atkinson
December 18, 2019 - 1:36pm
Bogolu Haranath wrote:
Are those bell-bottom pants and platform shoes JA1 was wearing in that picture?

Platform shoes, yes, but baggies with French pleats rather than bell-bottoms. (I am on the left in the photo at www.stereophile.com/content/2011-richard-c-heyser-memorial-lecture-where-did-negative-frequencies-go-nothing-real.)

John Atkinson
Technical Editor, Stereophile

