The Blind Leading the Blind?

The first epiphany I experienced in blind audio testing took place in the Dunfey San Mateo Hotel, in Northern California. We were stuffed into a largish, well-lit room in which dozens of listeners sat in chairs, and others stood around the back or sat on the floor. Up front were two large B&W Matrix 801 speakers on tall stands spaced far apart; behind them, opaque curtains hid a small pile of audio equipment. John Atkinson and Will Hammond stood at stage left.

The occasion was the 1989 High End Hi-Fi Show (many audiophiles still call it the "Stereophile Show," even though it has since officially morphed into the Home Entertainment Show), and JA was conducting a series of public blind listening tests in which an Adcom GFA-555 stereo amplifier ($750) was pitted against a pair of VTL MB300 tube monoblocks ($4900/pair).

You can read all about the testing procedure and results online (footnote 1), including a letter to the editor I wrote at the time. As JA discussed the results immediately following our listening test that day 16 years ago, what struck me was this: It wasn't the amps that were actually under test; it was us. The implication was obvious: The only thing a blind audio test can really do is ascertain the listening acuity of those taking the test.

Advocates of blind listening tests who say that ABX or similar tests prove anything about audio components or can be useful in Stereophile product reviews have it upside down and backward. They're living in negativeland. The fact that any two of us in the same room taking the same test end up scoring different results is all you need to know about the nature of blind audio testing.

Human beings are not pieces of objective test equipment. Some listeners consistently score above average, others bounce around the middle, and still others never get it right. My own blind-test results are not consistent. And all the while, the differences that do or do not exist among products remain fixed—and likely undiscovered.

Perhaps, then, under ideal conditions (whatever those might be) one could suggest that blind tests might—just might—help sort the rabbit ears from the cloth ears. (By the way, here's what I think disgruntled advocates of blind testing secretly hope to achieve by demanding that these tricky tests be part of our reviews: Not to reveal differences among audio components, but to humiliate as many unsuspecting audio reviewers as possible.)

Also consider that someone with an axe to grind can always use a room full of cloth ears or a poorly designed test to "prove" that no meaningful differences exist between two components. Another series of tests, with a single properly trained listener who consistently beats the odds, can prove the opposite. This problem has been repeatedly documented. The question to ask of a blind test is not "Are there really any differences between these two components?" but "Can anyone here detect anything under these particular conditions?"

Looking on the bright side of blind testing, how about using the procedure for its entertainment value? Using blind tests, can the world's most sensitive, most critical listener be found, or the ultimate listening panel created? A blind test could be developed in which a known distortion is added to one of two components; then we can see who can hear it. We could then repeat the test with less and less distortion, until no one scores. Wouldn't it be fun to sort out contestants' listening abilities at the next Home Entertainment Show, with big prizes and audiophile fame for the winner? Audio legends deposed, relative unknowns rising from the ranks—and the first Stereophile reality show is born.
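The descending-distortion contest imagined above can be sketched as a simple procedure. Everything here is a hypothetical illustration, not any standard test protocol: the function name, trial count, and pass mark are my assumptions. A listener is tested at successively lower distortion levels until performance collapses to chance.

```python
def run_descending_test(listener, levels, trials_per_level=10, pass_mark=8):
    """Test a listener at successively lower distortion levels.

    `listener(level)` is any callable returning True when the listener
    correctly identifies which of two presentations carries the added
    distortion at that level. Returns the lowest level at which the
    listener still met the pass mark, or None if they never did.
    """
    threshold = None
    for level in sorted(levels, reverse=True):  # loudest artifact first
        correct = sum(listener(level) for _ in range(trials_per_level))
        if correct >= pass_mark:
            threshold = level  # still audible here; keep descending
        else:
            break  # performance fell to chance; stop
    return threshold

# A toy listener who reliably hears the distortion down to 1% and no further:
golden_ear = lambda level: level >= 0.01
print(run_descending_test(golden_ear, [0.1, 0.03, 0.01, 0.003]))  # 0.01
```

The contest winner is simply whoever's returned threshold is lowest.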

In the end, the only combination of ears and brain I trust for determining, for better or worse, what sounds different and what doesn't is my own. You, of course, should trust and continue to train your own ears and brain. And there may be a few trained reviewers on whom you've come to depend—as JA himself has often written, "JGH (in whose ears we trust)" (footnote 2). Keep in mind that one person's crucial difference is another's trivial and/or hard-to-hear detail; over time, you'll come to have a pretty good idea of just how much overlap there is between what you hear and what a given reviewer hears.

Just as some of us are inherently better runners, balalaika players, or tattoo artists, some of us have better listening abilities or potential. Such folks are often called (sometimes ironically) "Golden Ears." But I also believe that critical listening is a learned skill that, with practice and careful study of the factors contributing to sonic characteristics, can be improved.

Here's an example that made a difference for me years ago. I was in a recording studio, using digital sampling keyboards that allowed me to adjust the sample rate to manage my hard-disk space and resolution needs. Certain sounds were obviously being vandalized as I moved the sampling rate below 20kHz—a sputtery, zippering noise appeared that receded only when I pushed the sampling rate back up to 40–50kHz.

I spent hours, then days obsessing over this single variable and how it affected everything from Zildjian ride cymbals to the human voice to an Indonesian angklung (a tuned bamboo sliding rattle). Once I was able to home in on how that sampling artifact made various instruments sound, I began to hear it on commercial CDs I owned. Before, I hadn't heard it at all; after, I could reliably hear differences between CD players by listening for subtle artifacts that my brain was now trained to detect.
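As a side note, that "sputtery, zippering noise" is consistent with aliasing: absent anti-alias filtering, any partial above half the sampling rate folds back down to an unrelated, inharmonic frequency. Here is a minimal sketch of the folding arithmetic, assuming an ideal sampler (the function name is mine):

```python
def alias_frequency(f_signal, f_sample):
    """Apparent frequency of a pure tone after sampling: components
    above the Nyquist limit (f_sample / 2) fold back down."""
    f = f_signal % f_sample
    return min(f, f_sample - f)

# A 12 kHz cymbal partial sampled at 20 kHz folds down to 8 kHz,
# while at 44.1 kHz it is reproduced where it belongs:
print(alias_frequency(12_000, 20_000))   # 8000
print(alias_frequency(12_000, 44_100))   # 12000
```

A ride cymbal's dense cluster of high partials folds into a spray of unrelated tones, which is why it makes such a revealing test signal.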

Give someone enough time to learn what a testing environment's aural variables sound like, and I'll bet they can beat the unprepared novice listener in a proper blind test every time. Regarding that amplifier test at the Dunfey San Mateo in 1989, it so happened that I then owned and was a dealer for Adcom's GFA-555 amp, and was familiar with its strengths and weaknesses. That day, another person and I tied for the highest score in the room: out of seven trials, six correct. The other high scorer was also an industry veteran.
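For perspective on that score (not in the original, and assuming each trial is an independent fifty-fifty guess under the null hypothesis), the chance of getting six or more of seven right by guessing alone works out to 8/128, or about 6%:

```python
from math import comb

def guessing_p_value(correct, trials, chance=0.5):
    """One-tailed binomial probability of scoring at least `correct`
    out of `trials` by guessing alone."""
    return sum(comb(trials, k) * chance**k * (1 - chance)**(trials - k)
               for k in range(correct, trials + 1))

print(round(guessing_p_value(6, 7), 4))  # 0.0625
```

By the conventional 5% criterion, a single 6-of-7 score doesn't quite rule out lucky guessing, which only underlines the point: with so few trials, such a test says more about the test than about the amplifiers.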

As a result of this and dozens of listening tests since, whenever I come across a group of guys (it's always guys) going on about ABX or blind listening tests and how they prove/disprove that there are/aren't differences between components, cables, etc., I picture them all standing around a pile of apples, discussing whether or not their tests can determine which oranges are the good ones. They're arguing about the wrong thing, and it gets boring pretty quick.

Thanks, guys, for coming up with a wonderful way to figure out who has the best trained ears in a particular situation. Let's leave it at that.

Footnote 1: You'll want to check out this link, if for no other reason than to see a picture of John Atkinson as a cherubic, clean-shaven youngster with a big grin on his face, hunched over a table of test gear.

Footnote 2: JGH = J. Gordon Holt. Examples of JA's invocation can be found at and