I take that back: It's beyond merely stupid; it's preposterous, it's irrelevant to the field you purport to study, and it's insulting to both the participants in the test and those who wish to take from it any useful information. Distinguishing between an original work of art and a fake (evaluating differences that appear to be small yet that prove, over time, to be of considerable importance) is not a simple task. Notably, those are the same challenges confronted by the listener who wishes to distinguish the sound of one piece of playback gear from that of another: He, too, must examine two samples of the same work of art in an effort to determine which seems more real.
To judge the authenticity, the effectiveness, the legitimacy of any work of art requires three things: a participant who is familiar with the work(s) at hand; a test setting that's appropriate to the appreciation of art; and a generous amount of time for the participant to arrive, without stress, at his or her conclusions. The last requirement is the one with which blind testing, a setting in which the participant enjoys the level of calm of a person being interrogated on suspicion of tax evasion or the distribution of child pornography, is least compatible.
In domestic audio, blind testing reduces the art of music to a thumbnail-sized snapshot that one is allowed to glimpse only briefly, and puts the listener on the spot: a state of heightened agitation in which it's made clear to the subject that his or her job is to please the test administrator (footnote 1). Blind testing of playback gear does little more than test the listener, and for the wrong thing, while the perceptible qualities of the product under evaluation go unremarked: We learn not about the audibility of subtle differences but rather about the ability of the listener to resist being tripped up by the person in charge. When substituted for the simple efficacy of human perception, such a test is useful only to assuage the ungifted and the insecure, and to validate shoddy or cynical design and production practices among manufacturers.
Time and again, the use of blind testing in an attempt to quantify human perception has been discredited, most recently in the best-selling book Blink, by journalist Malcolm Gladwell (footnote 2). Far from a Luddite screed, Blink studies the making of snap decisions, which come about when the subconscious mind identifies patterns within brief experiential settings; the author makes the case that such cognitive "thin-slicing" can often be useful. Yet Gladwell warns against the making of decisions based on incomplete or irrelevant snippets, citing as an example the infamous Pepsi Challenge of the mid-1980s. In those blind tests, consumers were given small samples of two different colas and asked which they preferred; the majority of the time, the cola that tested better was Pepsi, the other being the nation's No.1 seller, Coca-Cola.
The makers of Pepsi rushed to the airwaves a new advertising campaign to exploit those test results, which so unnerved their archrival that the makers of Coca-Cola decided to change their product's formula at once, and to push it in the direction of Pepsi. Because the major distinction between the two colas was Pepsi's greater sweetness, Coca-Cola was itself made sweeter. After a few rounds of pre-release blind testing and a bit more tinkering, the Coca-Cola Company brought New Coke to market.
It remains one of the most notable flops in marketing history. Within weeks, sales of New Coke, at first encouragingly strong, fell like a stone, and the Coca-Cola Company was forced to bring back their old formula, thus correcting their mistake and reestablishing their primacy. (Interestingly, at no time during this debacle did sales of Pepsi surge ahead.)
What happened? The Pepsi Challenge was a "sip test." And, as former Pepsi executive Carol Dollard explains in Blink, "If you only test in a sip test, consumers will like the sweeter product. But when they have to drink a whole bottle or can, that sweetness can get really overpowering or cloying." As Gladwell puts it, "a sip is very different from sitting and drinking a whole beverage on your own. We have one reaction after taking a sip, and we have another reaction after drinking a whole can. Sometimes a sip tastes good and a whole bottle doesn't. That's why home-use tests give you the best information. The user is not in an artificial setting."
The Pepsi Challenge was based on sip tests. Blind comparisons of audio products are snippet tests. Those who design such tests assure us of their objectivity, as applied to something as abstract and inadequate and bugshit-crazy as the drinking of a soft drink or the auditioning of a hi-fi component. I would suggest that there is nothing more subjective than the belief in the relevance, to the human perception of a complex sensory event, of a one-minute laboratory test.
Gladwell: "Coke's problem is that the guys in white lab coats took over." Took over and got it wrong, I would add. Wrong on an epic scale.
Yes Sir, No Sir
We confidently assume that, in due time, the beverage industry ceased to require the services of those particular people in those particular white lab coats. That's how things work in the real world.
Unfortunately, audio is not the real world. In audio, even after the men in the white lab coats have been proven wrong (as they were wrong about the lack of any need for sampling rates higher than 44.1kHz, footnote 3; as they were wrong about the lack of audible differences between competently designed amplifiers; as they were wrong about the obsolescence of vacuum tubes; and on and on, ad nauseam), we keep them on the payroll. We continue to buy recordings and playback gear from companies that pay greater heed to engineers than to people who actually listen: "Here's your reward for getting it wrong." Perhaps it's our warm-and-fuzzy emotionalism that keeps those blinkered objectivists coming back again and again: We foolish, insecure record-lovers wish, in our hearts, for Daddy-with-a-clipboard to tell us what we ought to and ought not to buy, even though, in our brains, we know how thoroughly, obstructively mistaken they can be.
Indeed, the objectivists' closed-mindedness (the boasts that their testing "proves" that a given material, manufacturing technique, or design refinement makes no difference) diminishes us all. Are we disappointed when our favorite analog recordings are remastered from 44.1kHz files rather than from the original master tapes, because someone convinced the company that "that doesn't make any difference"? Are we disappointed when an otherwise good electronics manufacturer lowers its manufacturing costs by switching from hand-wired circuits to PCB construction, because the company was persuaded that "that doesn't make any difference"? Are we disappointed when a manufacturer of classic loudspeakers begins making cabinets out of MDF instead of plywood, because an engineer convinced the company that "that doesn't make any difference"? Yes, of course; and, in every case, we have the most single-minded, hardheaded objectivists to thank for lowering quality across the board.
Footnote 1: In a famous test, conducted in 1963 by Yale psychology professor Dr. Stanley Milgram and described in his 1974 book Obedience to Authority: An Experimental View, the vast majority of participants were only too happy to inflict pain on others when ordered to do so by someone in a white lab coat.
Footnote 2: Back Bay Books/Little, Brown and Co., 2005.
Footnote 3: If you want a good laugh, go on the Internet and dig up Vol.26 No.4 (April 1978) of the Journal of the Audio Engineering Society, in which various engineers weigh in on the topic of sampling-rate standardization. Two things emerge: the righteous insistence that the world will never require a sampling rate higher than 44.1kHz, and the complete and utter lack of reference to actual listening.