Features

The Highs & Lows of Double-Blind Testing

Larry Archibald May 05, 1985

Editor's Note: In 1985 and 1986, an argumentative thread ran through Stereophile's pages, discussing the benefits, or lack thereof, of double-blind testing methods in audio component reviewing, triggered by J. Gordon Holt's review of the ABX Comparator. As this debate is still raging nearly 15 years later, we present here the entire discussion that bounced back and forth between the magazine's "Letters" section and feature articles. It was kicked off by a letter from C.J. Huss that appeared in Vol.8 No.5.—John Atkinson

ABX...
Editor: First off, thanks for the enjoyable reading your magazine provides. While I do not agree with everything printed therein, as your product reviews seem too often free of the kinds of proper controls that avoid psychological prejudice, the liveliness of the writing makes up for this. Your willingness to entertain differing points of view is refreshing, and contrasts with the editorial rigidity of some other magazines, both mainstream and underground. But I have a question about your testing approach.

In the April 1985 issue of Audio, Laurence Greenhill and David Clark reported the results of some tests performed on a McIntosh MC2002 power amp and some Sansui top-of-the-line separates. In addition to the technical evaluations, a subjective listening test was performed for each of these products, then compared to the results when an ABX comparator unit was used to achieve double-blind testing conditions. Mr. Greenhill was able to correctly identify the unit under test 10 out of 16 times when the Mac was compared with his (I presume) reference units, 16 out of 16 times when testing the Sansui preamp, and 9 out of 16 times for the Sansui power amp. Only one of these tests is statistically significant, but the purely subjective evaluations of these components were all very different when the unit under test was known.
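The claim that only one of the three scores is statistically significant can be checked with a short calculation. What follows is a minimal sketch, not the method used in the Audio report: it assumes each of the 16 trials is an independent 50/50 guess when no difference is audible, uses a one-sided binomial test, and applies the conventional 0.05 threshold. The function name `p_value_at_least` and the constant `ALPHA` are hypothetical names chosen for illustration.

```python
from math import comb


def p_value_at_least(correct: int, trials: int) -> float:
    """Chance of scoring `correct` or more out of `trials` by pure guessing.

    Assumes every trial is an independent coin flip (probability 0.5),
    which is the null hypothesis that no difference is audible.
    """
    favorable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favorable / 2 ** trials


# Scores quoted in the letter, each out of 16 trials.
scores = {
    "McIntosh power amp": 10,
    "Sansui preamp": 16,
    "Sansui power amp": 9,
}

ALPHA = 0.05  # assumed significance threshold; the letter does not state one

for name, correct in scores.items():
    p = p_value_at_least(correct, 16)
    verdict = "significant" if p < ALPHA else "not significant"
    print(f"{name}: {correct}/16, p = {p:.5f} ({verdict})")
```

Under these assumptions, guessing alone would produce 10 or more correct answers out of 16 about 23% of the time, and 9 or more about 40% of the time, so neither power-amp result stands out from chance. A perfect 16 out of 16 would occur by guessing only once in 65,536 attempts, which is why the Sansui preamp result is the one that qualifies.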

Some issues back [in Vol.5 No.5], Mr. Holt recommended the use of the ABX comparator, stating that it would help take the fraud out of subjective listening tests. I agree, and applaud him for his stand. My question, then, is this: does Mr. Holt, or any other contributor to Stereophile, make use of this device during component testing? If not, why not?

I simply cannot accept the views of some that comparators such as this mask the small details that subjective reviewers claim to identify so readily. There are ways to prevent the problems of switch contacts causing audible problems due to resistance or contact rectification, and it seems, from descriptions I have read of the ABX unit, that it handles these issues with aplomb.

If the trouble was a lack of cash flow back in your scuffling days, I can understand, but surely, judging from Larry Archibald's pats-on-the-back in the "As We See It" column, the magazine is back on its feet by now. If it is still in the planning stage, let's get a move on before someone else gets the glory. What say ye?—C.J. Huss, Lancaster, PA

Point by point: We never purchased an ABX comparator for several reasons. First, we have never felt the need for it. Second, we are finding that, regardless of "controls," an A/B test doesn't reveal small differences between components as well as does prolonged listening. (Yes, one can listen for prolonged periods to A, B, or X, but when one can obtain the same results without a comparator, who needs it? We find enough consistency in the independently-prepared reports of our various reviewers to pretty much rule out "prejudices" or self-deception.)

We take our component testing very seriously, so much so that we frequently go back and relisten to products reviewed months previously. We rarely find any reason to change or qualify the original report. (And when we do, it gets published as a "Followup.")—J. Gordon Holt

I think JGH misses some of the points raised. The problem of the ABX comparator is not a trivial one. To date, subtle differences between products widely acknowledged to sound different have not been corroborated in a double-blind A/B situation. This is true even with experienced listeners who, with the ABX tester hooked up and part of the system, can readily identify the different components as long as their identities are known. This ability to discriminate goes away, however, as soon as the components' identities are disguised. Why is this true?

As of now, we don't know. The two obvious possibilities: there is no difference between the components, and differences noted when identities are known are imaginary; or the A/B test somehow fails to reveal differences that do exist. Laurence Greenhill [a member of Stereophile's reviewing team since the mid '80s—Ed.], E. Brad Meyer, and Peter Mitchell are all in the thick of the ABX controversy, and we hope to hear from them on this matter.—Larry Archibald
