The Highs & Lows of Double-Blind Testing
A-B Testing

As LA commented, I and others have done many experiments in which component differences were clearly audible despite any problems in experiment, system, room, or listeners, as long as we pushed only buttons A or B, effectively duplicating the subjective reviewer's non-blind procedure (meaning you always know what you're listening to). When we start pushing the button labeled X (footnote 1), which connects either component A or B (but only the comparator's microprocessor knows which), the choice suddenly seems more difficult. Lo and behold, our guesses prove inaccurate from three to seven times out of ten, indicating that we couldn't tell the difference after all.

I don't mean to suggest that the ABX test isn't sensitive. Under many circumstances, it does what was originally hoped; that is, it demonstrates quickly and easily that a difference is audible. In an AES workshop last October we tested for the audibility of the Carver CD-fixing box, the so-called Digital Time Lens, using pink noise as a source, and 124 out of 124 responses were correct. But subtler characteristics may be harder to identify with the comparator, especially given the habitual rapid switching that the device seems to encourage. While it's true that it can be used for long-term blind testing, no one seems to have the patience.

Yet another interpretation of the first story is that the anxiety produced by listening to the unknown decreases the sensitivity of the listeners. That anxiety can raise sensory thresholds is well proven. These or other mechanisms may, at any time, give a false negative result in a test for audibility. I can never disprove the existence of sonic characteristics that for some reason don't show up in a double-blind test. But some differences, including many that seem quite subtle, do show up in such trials.
The distinction between the two kinds of characteristics is a useful one: I think those that do show up in double-blinds are more important, and more worth spending money on, than those that don't. Many people disagree; that's what keeps high-end audio alive.—E. Brad Meyer, Lincoln, MA

Then, in Vol.9 No.2, Les Leventhal, of the University of Manitoba's Psychology Department, dropped a bomb into the pro-ABX waters by contributing an article based on his Audio Engineering Society paper, "How Conventional Statistical Analyses Can Prevent Finding Audible Differences In Listening Tests," Preprint 2275 (C-9), which had been presented at the 79th AES Convention in New York, October 1985:

The Highs & Lows of Double-Blind Testing

I do not know whether these "subtle differences" are real or imaginary. But I do know that many listening tests using the ABX comparator, including many published tests such as those in Audio cited by reader Huss, are conducted and analyzed in such a way that subtle differences actually heard by the listener will likely go unidentified by the experimenter when the data is analyzed.

The problem with these listening studies is that the experimenters conducted too few trials (for example, 16), and used the .05 level of significance when subjecting the data to a statistical test of significance. Only in a large-trial listening study can the results be tested at a significance level as small as .05 without the risk of overlooking small differences becoming unacceptably high. To see why this is so, a little background in statistics (having nothing to do with audio) is necessary.

Footnote 1: Readers who would like to know, in depth, what the ABX comparator does are referred to J. Gordon Holt's review and editorial in Vol.5 No.5. Briefly, the subject has the opportunity to choose either of two components to listen to, labeled A and B. There is a third button on the comparator, however, labeled X, which chooses A or B without the subject knowing which. The comparator keeps track of what was chosen. The subject, after choosing X a predetermined number of times (say, 10) and attempting to identify what X was, can then check the comparator's memory to see how he or she did. It is still possible to choose A or B to verify one's memory after commencing with X.—Larry Archibald
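
Leventhal's warning about 16-trial tests at the .05 level can be made concrete with a short binomial calculation. The Python sketch below is an illustration only and is not taken from his paper: the function names are invented for this example, and the "true" hit rates of 0.6 and 0.7 are assumed values standing in for a listener who hears a subtle difference somewhat better than chance. It finds how many correct identifications a test needs to reach significance, then estimates how often such a listener would fall short of that score.

```python
from math import comb


def tail_probability(n, r, p):
    """Probability of r or more correct answers in n trials,
    when each trial is answered correctly with probability p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(r, n + 1))


def critical_correct(n, alpha):
    """Smallest number of correct answers that a pure guesser (p = 0.5)
    would reach with probability no greater than alpha.
    Returns None if even a perfect score is not significant."""
    for r in range(n + 1):
        if tail_probability(n, r, 0.5) <= alpha:
            return r
    return None


def miss_probability(n, alpha, true_p):
    """Chance that a listener with hit rate true_p scores below the
    significance threshold, so a real ability goes undetected."""
    r = critical_correct(n, alpha)
    if r is None:
        return 1.0
    return 1.0 - tail_probability(n, r, true_p)


if __name__ == "__main__":
    alpha = 0.05
    # Hit rates of 0.6 and 0.7 are assumptions chosen for illustration.
    for n in (16, 50, 100):
        r = critical_correct(n, alpha)
        for true_p in (0.6, 0.7):
            miss = miss_probability(n, alpha, true_p)
            print(f"{n} trials: need {r} correct; "
                  f"hit rate {true_p} missed with probability {miss:.3f}")
```

With 16 trials and the .05 level, the pass mark works out to 12 correct. Changing the trial counts in the loop shows how the chance of overlooking a modest but real ability shrinks as the test gets longer.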


