An Amplifier Listening Test
Although what we have presented may seem a thoroughgoing pan of the Stereophile study, from the way it was designed and executed to the analysis of its data, we applaud it as an honest attempt to answer a vexing question.
Audiophiles and high-end journals have claimed to find significant audible differences between amplifiers that have no major measurable differences in electrical characteristics, differences that short-term listening tests with blind or double-blind controls often cannot discriminate. The blind controls are intended to ensure that only the sound of an amplifier, not its appearance, price, or reputation, influences the listener's judgment. So when non-blind reviews purport to find differences between amplifiers with equivalent electrical measurements, the more hardheaded among us are prone to attribute those differences to the influence of non-audible status factors that a blind test withholds.
However, there are too many nagging suggestions of real amplifier differences to allow the hardheaded but open-minded observer to rest with this judgment. One of these is nicely put in John Atkinson's comments in the preface to the Stereophile report. Having been unable to distinguish an expensive amplifier from a moderately priced one in a blind test, he sold his own expensive amp and replaced it with a cheaper Quad 405. He later came to regret this decision as one of the worst he had made in audio.
It seems reasonable to conclude from this and many similar accounts that long-term listening to an amplifier will reveal characteristics and annoyances that are extremely difficult to detect in a brief session. With this conclusion we would venture the further suggestion that one attribute of a "golden ears" listener is the ability to be annoyed in a few minutes by amplifier traits that take weeks to annoy most people.
How, then, is one to show differences among amplifiers without giving listeners a month for each comparison? We consider the Stereophile test an honest attempt to find a reasonable way to do this. That study apparently assumed that the trick is to use a large sample of listeners, a common experimental approach to measuring small effects. Our study showed that a large sample is not necessary, only a clean design.
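The logic behind the large-sample approach can be illustrated with a simple power calculation. The sketch below uses hypothetical numbers, not figures from either study: it assumes a small true effect (listeners pick the correct amplifier 60% of the time against a 50% chance level) and asks how often a one-sided binomial test at the 5% level would detect it for various numbers of trials.

```python
# Illustrative power calculation for a forced-choice listening test.
# All numbers here are hypothetical assumptions, not data from the study:
# p_true = 0.6 models a small real audible difference; p_null = 0.5 is chance.
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def power(n, p_true=0.6, p_null=0.5, alpha=0.05):
    # Smallest k whose tail probability under the null is <= alpha:
    # the rejection threshold for a one-sided test.
    k_crit = next(k for k in range(n + 1) if binom_tail(n, k, p_null) <= alpha)
    # Probability the test rejects when the small effect is real.
    return binom_tail(n, k_crit, p_true)

for n in (10, 50, 500):
    print(f"n = {n:4d}  power = {power(n):.3f}")
```

With only 10 trials the test almost never detects so small an effect, while with hundreds of trials detection becomes nearly certain; hence the intuition that many listeners (or many trials) are needed. A "clean design", in contrast, works by enlarging the effect each trial can reveal rather than by multiplying trials.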
Footnote 2: This article was originally submitted as a letter to the editor. Because of its length, however, and the fact that the authors carried out a considerable amount of additional work, I decided that it would best appear in its current form.—John Atkinson