The Highs & Lows of Double-Blind Testing

If an editor, in response to the above, tells us that steps have been taken to eliminate this prejudice or that bias from the reviewers, the editor will have missed the point. The point is that there are many commonalities among people in general and underground equipment reviewers in particular, some known and probably some unknown, that may produce similar errors in seemingly independent reviews. At best, an editor can take steps to eliminate or counteract the effects of only the known commonalities. However, the strength of the double-blind (or single-blind) method is that it eliminates the effects of not only known but unknown commonalities as well. For example, we may not know it, but most reviewers (1) may believe, incorrectly, that amplifiers with independent power supplies produce better imaging, and (2) may mistakenly "hear" better imaging in such amps. I am not saying that this is the case, only that it may be the case without our knowing it. If one does not know about this expectation, then one cannot eliminate its effects from nonblind reviews. But the double- or single-blind method will prevent the expectation from affecting the review, along with preventing a vast number of other possible problems about which we know nothing.

The double-blind method may have its own problems, however, ranging from listener impatience with large-trial studies to difficulties in obtaining adequate switching equipment. I would guess that the double-blind method will require a good deal of tinkering and refinement before it flowers into the blessing many believe it to be. And, perhaps, practical problems with the method may be too great to overcome. But I believe the method is worth a concerted, open-minded try.—Les Leventhal

The blind-testing establishment responded in the form of a letter from one of the inventors of the ABX box, David Clark, which appeared in Stereophile, Vol.9 No.5, August 1986, accompanied by responses from Les Leventhal and John Atkinson.

The Double-Blind & the Not-So-Blind
Ladies and Gentlemen! Step right up! See the Academic take on the might of the Engineering Establishment and jump through hoops of real fire! See the Engineering Establishment use "real science" to support its entrenched position!

In Round One, "The Highs & Lows of Double-Blind Testing" (Stereophile, Vol.9 No.2), Les Leventhal, of the University of Manitoba's Psychology Department, Winnipeg, Canada, put forward a hypothesis concerning the double-blind listening test widely used by "objective" equipment reviewers. He indicated that the statistics commonly used to analyze data from these tests had a built-in bias toward null results (ie, real differences between amplifiers, say, would not be detected) unless the number of trials was increased significantly above that normally used.
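For readers who want to see the arithmetic behind that claim, here is a minimal sketch in Python. The figures are illustrative assumptions and are not taken from the text above: a 16-trial test, a 0.05 significance level, and a listener who really does hear a difference but is right only 60% of the time. The function names are hypothetical. The sketch computes how often such a listener would fail to reach significance in a one-sided binomial test, which is the "null result" risk Leventhal described.

```python
from math import comb


def tail_prob(n, k, p):
    """Probability of k or more correct answers in n trials
    when each trial is answered correctly with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_value(n, alpha=0.05):
    """Smallest score whose probability under pure guessing (p = 0.5)
    is at or below alpha. Returns n + 1 if no score qualifies."""
    for k in range(n + 1):
        if tail_prob(n, k, 0.5) <= alpha:
            return k
    return n + 1


def type2_error(n, p_true, alpha=0.05):
    """Chance that a listener with true hit rate p_true scores below
    the critical value, so a real difference goes undetected."""
    k = critical_value(n, alpha)
    return 1.0 - tail_prob(n, k, p_true)


if __name__ == "__main__":
    # Assumed, illustrative settings: 16 and 100 trials, 60% true hit rate.
    for trials in (16, 100):
        print(trials, critical_value(trials), round(type2_error(trials, 0.6), 3))
```

Under these assumed figures, a 16-trial test requires 12 correct answers for significance, and the 60% listener falls short of that score in roughly five tests out of six. Raising the number of trials to 100 lowers that miss rate considerably, which is the direction of the remedy described above.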

As a reader of hi-fi magazines, you will undoubtedly be aware that such null results are widely used to promote the idea that "subjective" reviewers, who have found that amplifiers, for example, sound different to one another, are, at best, incompetent or, at worst, dishonest in their judgments. Who can forget the Stereo Review speaker cable test? Or the tests in England that showed that one Quad amplifier sounds pretty much like any other?

Well, Ladies and Gentlemen, here we present Round Two, and the first combatant to enter the arena is David Clark, of DLC Design, a widely respected engineer who played a major role in the development of the ABX comparator, a most useful piece of equipment for those investigating the subjective/objective frontiers.
John Atkinson

David Clark opens with an attack on Leventhal's ideas
Les Leventhal's critique of the statistical analysis commonly used in blind subjective testing is misleading, erroneous, and borders on the incompetent. His letter is written in a style that prompts the casual reader to think "Someone has finally figured out what's wrong with all those blind tests where they don't hear anything." Not only has Leventhal failed to prove his case; he has demonstrated his own lack of understanding of how the audiophile benefits from double-blind testing.

Leventhal's first ploy is to state that we blind and double-blind testers are attempting to prove something which we, in fact, are not. He says we may erroneously conclude that no difference is audible in a particular test. He is mistaken. We never make that error (he calls it "Type 2 error"), because we never formally conclude that any difference is inaudible. We may make some informal statements of our opinions or we may issue a challenge to others to prove that they can hear the difference, but this is a far cry from making Leventhal's "Type 2 error."
