Strawberry Fields: Auditory Objects & Bad Science

Getting a name check from the mainstream press can be a good thing. But as Wes Phillips wrote in his blog on February 5, "to paraphrase Mason Williams on winning an Emmy Award, 'It's like being kissed by a girl with bad breath—you appreciate the honor, but...'"

In his February 4 "Bad Science" online column for Guardian Unlimited, Ben Goldacre had ventured into the subject of blind testing as it applies to audio. He mused about "the widespread notion in the hi-fi community that blinded trials—where you ask listeners to identify a cable without knowing if it's cheap or expensive—are somehow intrinsically flawed."

In support of his thesis, he continued: "I give you the editor of Stereophile, a respected hi-fi magazine of [44] years standing. He's talking about blinded tests on amplifiers: 'It seems,' [John Atkinson] says, 'that with such blind listening tests, all perceived subjective differences...fall away...when you have taken part in a number of these blind tests and experienced how two amplifiers you know from personal experience to sound extremely different can still fail to be identified under blind conditions....' Now I'm getting worried. Here comes the money shot. '...then perhaps an alternative hypothesis is called for: that the very procedure of a blind listening test can conceal small but real subjective differences.'

"Ouch...What voodoo is this?" Goldacre asked. "If there is a difference to be heard, then you will hear it."

You will hear it? I guess Mr. Goldacre hadn't read the rest of the 1989 essay from which he had quoted. I was writing about the listening tests I had organized at that year's Stereophile Show. I had taken two highly regarded amplifiers that were widely felt to sound different in normal listening, a solid-state Adcom GFA-555 and a pair of tubed VTL 300W monoblocks, and was trying to determine if they also sounded different in a blind test. The results were inconclusive, though a subsequent series of blind listening tests performed under optimum circumstances did result in statistically significant identification of the amplifiers.
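For readers wondering what separates "inconclusive" from "statistically significant" in a test like this, here is a minimal sketch in Python. It assumes a simple forced-choice design in which a listener who hears no difference would be right half the time by guessing. The function name and the trial counts are hypothetical illustrations; they are not the scores from the tests described above.

```python
from math import comb


def binomial_p_value(correct: int, trials: int, chance: float = 0.5) -> float:
    """One-sided probability of scoring `correct` or more out of `trials`
    by guessing alone, when each guess succeeds with probability `chance`."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must lie between 0 and trials")
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


if __name__ == "__main__":
    # Hypothetical scores, chosen only to illustrate the arithmetic.
    for correct, trials in [(9, 16), (12, 16)]:
        p = binomial_p_value(correct, trials)
        print(f"{correct}/{trials} correct: p = {p:.3f}")
```

With these made-up numbers, 9 correct out of 16 is a result guessing would match or beat about 40% of the time, while 12 out of 16 would happen by guessing less than 4% of the time. Note what the arithmetic cannot do: a high p-value says only that the test failed to show a difference, not that no difference exists.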

So yes, Mr. Goldacre, if there is a difference to be heard, you will indeed hear it. But had you read the rest of my essay, you would have learned that designing a blind test that allows that difference to be detected is not as easy as you might think. As I have written before, for an experiment to be deemed "scientific," the variables under test need to be limited to just those which the experimenter is investigating. I don't regard it as "voodoo" to conjecture that the condition of a test's being blind might itself be an interfering variable.

This subject was debated in the fall of 2005 in our online forum. "DTH" had asked, on September 6, "Why is blind testing not such a hot button with: Drug testing in the pharmaceutical field? Wine testing? Perfume testing? Food testing? Where lies the difference(s) between these (and many other) areas where blind testing is common and noncontroversial, and the audio field?"

I had responded that "it comes down to the fact that in all those fields, what is being tested is the direct effect of the stimulus. With audio, you must test the stimulus indirectly, through its effect on music, which itself has a varying effect on the listener. (I think this is why blind tests of audio components are much more sensitive using test tones than music. But then, the fact that test tones are not music removes the test one step from reality.)"

With blind tests of wine, I explained, "the chemicals in the wine interact directly with receptors in the mouth and nose, and it is the effects of those interactions that are being tested for. With audio, the changes under test affect the behavior of the playback equipment on the reproduction of audio signals, and for stereo playback, we don't experience those signals directly, but only how they affect our interpretation of the music carried by those signals, which itself is an illusion."

This triggered much argument, and rereading the thread, I can see that I failed to get my point across. It's worth trying again: When you arrange for a blind test of, say, amplifiers, the listeners are not responding directly to the performance of the devices under test. There are two layers of abstraction involved between what the amplifiers are doing to the signal and what is perceived by the listeners. The first layer of abstraction involves the brain's interpretation of what the soundwaves reaching the two ears are telling the listener about the world outside, a process that psychoacousticians call the creation of auditory objects. The second layer of abstraction is the brain's interpolation of meaning into the sum of those auditory objects; ie, are they random (noise) or ordered (music)?

The late audio philosopher Richard Heyser examined the efficiency of this process in a talk he gave in 1986 to the Audio Engineering Society. Any music student can churn out a piece of music that, when performed on a piano, recorded in stereo, and then played back, a human listener will recognize as being similar to something Chopin might have written. This character is preserved by the recording and by its playback. Yet, as familiar as the concept of "Chopin-ness" is, there is nothing in the measured performance of the individual hi-fi components in which "Chopin-ness" can be detected. "Chopin-ness" is a mental construct.

Moving back one layer of abstraction, someone listening to the recording of the Chopin-esque work perceives an image of the original piano between and behind the loudspeakers. But again, there is (almost) nothing in the measured performance of the individual hi-fi components in which this piano image can be found. Another mental construct.

The reality is that all the playback system is doing is producing two continually varying sound-pressure waves from two loudspeakers. Everything else is the result of massive signal processing taking place between the ears. Nothing is real, nothing to get hung about. It's all an illusion constructed on a foundation of illusions.

In normal listening, this happens unconsciously. When we listen to a stereo Chopin recording on our systems, we perceive an image of a piano on which a work of Chopin's is being played. But in the blind test, we have to do consciously what would otherwise be automatic. We have to start examining the character, the quality of individual auditory objects. We have to start consciously determining whether the sum of those objects has crossed the line between noise and music. In other words, we are no longer listening as we normally do. And if we are not listening normally, then the test itself becomes an interfering variable.

Sighted listening has its own pitfalls, of course, and no one has said otherwise. But Mr. Goldacre appears to be making the naïve assumption that the mere fact that a test is blind inherently—his word was intrinsically—confers legitimacy on the test and its results. That assumption, I suggest, is "bad science"—even voodoo.