Credibility Gap

To audiophiles who are aware that their household line voltage changes under varying loads, and have observed the absolutely fantastic differences in the sound of their system when the next-door neighbor turns on Junior's night light, it may come as a surprise to learn that there are folks out there who think you're full of crap. That's right, Virginia, they don't think you can really hear all those things you pretend to hear. (You are only pretending, aren't you?) They can't hear all those things, so how can you? Well, sometimes they can. They'll even admit that. But those tiny little differences are so trivial that they don't matter no more than a fruitfly's fart. That's the word in scientific circles these days. Or haven't you been following the "establishment" audio press lately?

Of course, we can hear all those little differences that none of those big measuring instruments can detect, can't we? Of course we can! And to us, anything we can hear is significant, not—we have to admit—because those things are all that conspicuously audible, but because we care about reproduced sound, and when you care, everything is significant. But perfectionist audio is in the doghouse these days. Not because we are hearing things that can't be measured, but because we can't prove it. That doesn't much bother people who already embrace astrology or numerology or any of the world's bewildering selection of theisms, but it does bother scientists. And while critical listening to reproduced sound may be dismissed as an appreciation, like the finer points of art connoisseurship, subjectivity is of no value to the person who designs amplifiers or who relies on test equipment for evaluating them.

Unfortunately, subjective testing as a means of evaluation has thus far refused to lie down and shut up. There are differences between the sounds of different competing audio products, those differences are consistent, and they are not all reassuringly explainable by any known measurements. The fact that some people whose hearing seems normal in all respects are unable to hear those differences casts doubts on the veracity of those who claim they can. And audiophiles, for their part, make matters worse by imbuing their personal observations with those qualities of Ultimate Truth usually reserved for the most fundamental and time-tested laws of physical science.

One reason for this is that most audiophiles are insecure (and why not, considering how tenuous are some of the differences we hear?), and thus tend to hold on for dear life to anything that looks like a hitching post. Just one example of this, out of many we can think of without even trying, is the very compelling temptation to assume that every component in one's system except the one under scrutiny at a given moment is worthy of our confidence. This leads inevitably, and with no conscious effort whatsoever, to the conclusion that a new preamplifier, inserted in the program chain in place of the previous one, is better if it sounds better and worse if it sounds worse.

The fact that the new preamp may in fact have a hot-as-hell high end that better compensates for high-end dullness in one's cartridge, power amp, or loudspeakers than did the more-neutral preamp that was being used before does not discourage your average audiophile (nor indeed your average "subjective equipment tester") from making Olympian pronouncements about the relative merits and demerits of that new preamp with a confidence one might expect from a Bureau of Standards assessment of the accuracy of a butcher's scale.

The audiophile's observations may be valid; it is just that he lacks the justification for assuming that what he observes has enough significance to be of any value to any other living person on this planet. Unfortunately—and probably because critical listeners still have little in the way of measurements to back up their observations—a great deal of that kind of completely meaningless observation has been adopted by the audiophile community as unquestioned doctrine, to be trotted out and belabored every time someone has the audacity to suggest that our precious audio field could benefit by a little more science and a little less subjectivism. But like it or not, those critics of "subjective testing" have a point, and it is a point we have to face up to sooner or later. Subjective testing is unscientific, because no one has proven to any scientist's satisfaction that the observations it is based on are related to the real world of actuality.

Mind you, the question here is not whether it is possible to assess the performance quality of a component by listening to it. It goes much deeper than that. The point in question is whether we are actually hearing differences that exist, or whether we are in fact psyching ourselves into hearing differences where there are none. To the scientific investigator, this means not only proving that differences are being heard, but also that there are objective reasons for those differences—that is, reasons which show up as measured differences. We have not yet made it to Square 1!

The "establishment" audio press—High Fidelity, Stereo Review, and their peripheral publications such as Popular Electronics—has recently abandoned its former stance of open-mindedness on such matters and has mounted a campaign to discredit all "observations" which cannot be correlated with universally accepted measurements. I can only speculate on their motives for this—whether they see it as an aid and comfort to most of their advertisers, whether it is a means of bolstering reader confidence in their equipment reports (a confidence which has been getting shakier in recent years) or whether they are convinced that no one can hear anything they can't hear is open to question, but no answers can be forthcoming. Like the policy decisions of the White House, big-business policies are made and executed without incurring any feeling for the need of explanation or justification (footnote 1).

It seems quite clear, though, given the generally high level of intelligence among the people who edit those magazines, that the reasons for their recent editorial thrust must have been business-type reasons rather than scientific-type reasons. For when large numbers of people experience certain subjective reactions to something—whether those reactions are based on legitimate external stimulations of the sensory system or are purely imaginary—it does not serve science any better to dismiss those observations out of hand than to question them merely because they were made without "proper scientific controls."

When these audio writers tell us that they can measure anything we can hear, what they are really saying (although none has yet had the temerity to admit this) is that all the meaningful measurements of audio equipment have been devised, therefore any measurements which have not as yet been devised are meaningless. This sounds familiar. Several hundred years ago, the world's scientific community—such as it was—declared that everything that man could learn about the universe had already been learned. Not surprisingly, this statement was followed by a couple of hundred years of intellectual stagnation which we now, disparagingly, call "The Dark Ages." It would appear that one of the things those wise scientists had not learned is that people never learn.

Science
The term "scientific controls" is in fact one which crops up with ever-increasing frequency in criticisms of "subjective testing." For those not versed in the weird and wonderful ways of scientific method, this means simply that you can't measure a breeze in a hurricane. Another way of putting it is that, in order to determine that A affects B in a certain way, you must rule out the possibility that something else rather than A is affecting B. One does this by "controlling" the other factors so that the only thing that can affect B is factor A. If B then responds, A is the cause.

For example: Suppose a certain chemical changes from clear to blue when mixed with sodium chloride (table salt, to you) and vigorously shaken in a copper container as the shaker gets struck by lightning. To what do we attribute the change in color? Was it the combining of that chemical with salt, their mutual reaction to the copper of the container, the vigorous shaking, the heat or electrical energy of the lightning strike, or a combination of two or more of these things?

The audiophile approach to this question would be to meditate for a while on the possibilities, latch onto one of them, and decide that it must indeed be the answer. The scientific approach would be to repeat the experiment with "controls." This simply means repeating the experiment, as often as necessary, with all but one of the suspected contributing factors eliminated. To start, we would use a chemically inert container, mix the chemical and the salt without shaking them, and put the lightning on Hold. If there were no color change, we would then shake up the chemicals. If nothing happened, we would then add a small amount of copper to the mixture. In other words, the original factors would be tried one at a time, then two at a time, then three or more at a time, in every combination and permutation, until the color change occurred. The assertion, then, that such-and-such will cause so-and-so to happen would be considered a scientifically valid conclusion to draw from the experiment.
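Holt's trial-and-error procedure is, in effect, a search over combinations of suspected factors, smallest combinations first. Here is a minimal sketch of that idea; the chemistry is invented, and `color_changes` arbitrarily pretends that salt plus copper is the true cause:

```python
from itertools import combinations

# Suspected contributing factors from the thought experiment.
factors = ["salt", "copper", "shaking", "lightning"]

def color_changes(active):
    # Stand-in for actually running one trial of the experiment;
    # for illustration we pretend salt + copper is the real cause.
    return {"salt", "copper"} <= set(active)

def find_minimal_cause(candidates):
    # Try factors one at a time, then two at a time, and so on,
    # returning the first (smallest) combination that produces the effect.
    for k in range(1, len(candidates) + 1):
        for combo in combinations(candidates, k):
            if color_changes(combo):
                return set(combo)
    return None  # no combination reproduces the effect

print(find_minimal_cause(factors))
```

A real controlled experiment works the same way: each run varies exactly the factors under test and holds everything else fixed, so that when the effect finally appears, only the varied factors can be responsible.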

In audio, the difficulty of conducting such experiments arises from the fact that—despite the establishment view—there are no ways of measuring many of the things we hear, and subjective observations alone have no scientific respectability unless backed up by numbers—that is, by statistics. To say "I can hear such-and-such" will elicit from a scientific mind no more than a noncommittal grunt. You might just as well say you have some unidentifiable tune going through your head. But say that 247 out of 300 listeners were able to tell, without knowing A from B, that they were hearing B 78% of the time, and that scientific mind will begin to take you seriously. Now you can cite figures from a controlled experiment. Even Julian Hirsch would have to pay attention to that, even if he also had to report that he was one of the 17.7% who could only tell B from A 77% of the time. But no one has ever tried to conduct such an experiment—at least, not to the satisfaction of those who make the most claims for their incredible hearing acuity.
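What makes a score like 247 out of 300 persuasive to a scientific mind is that it is wildly unlikely under pure guessing. A minimal sketch of the arithmetic, using Holt's hypothetical panel numbers (the comparison score of 160/300 is my own illustration of a chance-level result):

```python
from math import comb

def tail_probability(successes, trials, p=0.5):
    # One-sided probability of at least `successes` correct identifications
    # if every listener were merely guessing (chance of success = p).
    return sum(comb(trials, k) * p**k * (1 - p)**(trials - k)
               for k in range(successes, trials + 1))

# 247 correct out of 300 trials, as in the hypothetical panel.
p_value = tail_probability(247, 300)
print(f"p = {p_value:.2e}")  # vanishingly small: guessing cannot explain it

# By contrast, 160/300 is the kind of score chance produces fairly often,
# so it would prove nothing.
print(f"p = {tail_probability(160, 300):.3f}")
```

This is exactly why the statistics matter: the same claim ("I heard a difference") carries no weight as an anecdote, but becomes hard to dismiss once the odds of getting that score by luck can be computed.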

Such experiments have been attempted, but somehow the experimenters have always managed to use the wrong associated components ("Those speakers wouldn't tell them anything!") or the wrong scientific controls ("Sure, it was double-blind, but the switch contacts change the sound more than the preamps do") or the wrong listeners ("Why didn't they get Harry Pearson or Gordon Holt or Peter Moncrieff on the panel?"). I can't speak for HP or PM, but I would be glad to participate in such an experiment. I haven't been asked. Neither, I suspect, have they.

Layers of Egg?
Frankly, I'm not at all sure how I'd come through on such a scientifically incontrovertible listening test. Most such tests rely on A/B comparisons, and my own experience with those has been that I am one of the first to get hopelessly confused and start hearing 12kHz ringing from the speaker that has no response beyond 9kHz. All of us true golden ears maintain that it takes prolonged listening, not just a series of A/B switchings, to hear small qualitative differences with any degree of consistency, but I'm not sure how a test might be devised that would meet that criterion. Meanwhile, there is the real possibility of my making a complete ass of myself, by scoring lower on the Pick-Preamplifier-B test than Joe Blow who is deaf in one ear but likes to play the fiddle in his spare time. Imagine what the establishment audio press would do with that!

Perhaps Messrs. Pearson and Moncrieff have had the same awful thought. Maybe one of us would have a bad day. Perhaps a touch of sinus trouble, with ringing in the ears. Imagine Stereo Review reporting that JGH or HP or PM was the only underground-magazine publisher who scored lower on Preamp B than the "norm" of tin-eared, untutored listeners. What would that do for our much-touted credibility? Could it be that that's why none of us who rely on our ears for equipment evaluations has suggested such a conclusive (?) listening-panel test? Could be.

It is very possible that a person's ability to hear small differences is related not just to the quality of the listening system, but to his familiarity with it. If this were the case, any listener, no matter how golden-eared, is at a disadvantage listening to any system but his own. But if he agrees in advance that the system used for listening tests meets his standards, and then racks up a perception score he's ashamed of, no amount of protestation will save the red face under those layers of egg.

Credibility
Yet the question must be resolved if the perfectionist-audio field is to advance much beyond its present semi-stagnant state. We all know that there are indeed audible differences between components that appear to measure identically. We also know that there are people to whom those differences are, truly, insignificant. But there are many audiophiles who honestly can't hear what we hear, as well as multitudes of music-lovers who can, and whose record-players could be bringing them far more enjoyment if the magazines they read could admit that the differences they hear are real and not imagined.

Initially, high fidelity held out great promise to lovers of music, as a means for bringing the sound of great orchestras and opera companies into their homes. That promise has never been fulfilled, partly because of the outrageous prices that "perfectionist" components have been aspiring to, and partly (more recently) because of the rotten image we audiophiles have built for ourselves in the eyes of the musical community—as a bunch of wild-eyed fanatics who drool over technically virtuosic recordings of music and performances that are not worth the vinyl they're pressed on. Now the establishment hi-fi press is pressing home its advantage by assuring the people who could best appreciate good sound that mediocrity is good enough, true high fidelity is a waste of money, and perfection is here at last whether you like the sound of it or not.

The time has come when we self-styled golden ears must put up or shut up. What is needed, right now, is a listening test that will prove (or disprove, if it works out that way) to the scientific community at large that trained ears can hear things we cannot as yet measure, or whose measurements we are not as yet interpreting correctly.

Whether or not those things are significant is irrelevant at this point. Without hard evidence of their audibility, not even their significance has any significance. Stereophile has neither the funds nor the time (nor, for that matter, the inclination) to organize such an experiment. But I have a lot of ideas to contribute on the subject of setting it up, choosing the listeners and so on, and—at the dire risk of my credibility—I'll be happy to serve as a listening panelist.

This could be the most significant thing in perfectionist audio since the invention of listening fatigue. Who wants to earn the credit for doing it? Gentlemen (and ladies), the gauntlet is flung. Do I see someone coming forward to pick it up?—J. Gordon Holt



Footnote 1: I am reminded of a discussion about editorial restrictions that I had a few years ago with the technical director of one of the above-mentioned establishment magazines, during which he said "Nobody tells me what to write. I write what I want to write." We were interrupted by more-pressing business before I had a chance to point out to him that he would never have been appointed technical director if he had shown any inclination to write things the publisher didn't want to see in his magazine. In business circles, that is called "Placing the right man in the right job."—J. Gordon Holt

COMMENTS
remlab's picture

JGH was the man! Just outstanding!

Allen Fant's picture

Yes, JGH is the Man!

je00143's picture

Yeah, well, call me a skeptic. How many readers out there still take a green Magic Marker and rub it along the edge of a CD? How can a ten-foot extension cord connected to hundreds of miles of electrical wiring make a change in the sound of an amp? How can a unidirectional connector cord make a difference with alternating current? Might be time to reread The Emperor's New Clothes.....or P.T. Barnum.

remlab's picture

Ha!

Catch22's picture

While I have no aversion to exploring the whole topic, especially when somebody else is going to all the trouble, I've never really cared whether anyone else believes me or not when it comes to what I'm hearing. As Jefferson might have said, "It neither picks my pocket nor breaks my leg."

Most audiophiles could care less about how others experience sound or what they do or don't think makes a difference.

barw41's picture

One doesn't hear differences simply because their ears have not been trained to hear them. I have had experiences trying to impress people to hear a definite 'wide' soundstage, and they look at me as if I'm nuts! They hear nothing.

fluffy's picture

Congrats - It's rare these days to see anyone still refer to embracing the scientific method as ending a stance of "open-mindedness". Such people are actually ending close-mindedness, but that's really bad for high end audio. We need to circle the wagons to defend being close-minded, without actually stating it so clearly. We know how to do that well - we've been doing it for decades.

The only thing worse than selling snake oil and denying science is trying to bolster snake oil with a campaign of embracing the scientific method. It won't work, because it'll reveal uncomfortable truths.

There is no shame in selling snake oil - many have made fortunes doing so, which justifies it as a valid approach. And if anyone claims this or any industry isn't all about the money, then I'll show you an industry soon to end.

The current high end market is not robust, but we find enough people vulnerable to our sales pitches to muddle through. But if we embrace science, we risk what little we have. Even if we cleverly set up a rigged test that fools a bunch of hand-picked "scientists", next thing you'll know is that someone knowledgeable will demand "repeatability". And that's a hurdle too high for the dwindling high end market to bear. We aren't going to "out-science", "out-test", or "out-debate" the scientific community or those that believe and understand science. And sadly, the size of those communities grows by the day.

I firmly support sticking to the tried-and-true luxury item and mystery approach of high end audio, while throwing in the random pretty graph to convince those wavering that we're all "sciency" when convenient. We thrive on an air of exclusivity, snobbery and magic, and leaving those core competencies behind will end us.

saladina's picture

Claims by humans are usually subjective and must be validated with scientific testing. Double blind testing is easy and inexpensive.
Until/unless claims about audible effects of, in this example, power line fluctuations can be distinguished in a double blind test, they will be seen as unreal by most people.

ChrisS's picture

If "Double blind testing is easy and inexpensive", then who does them?

saladina's picture

Testing audiophile claims and myths

http://www.head-fi.org/t/486598/testing-audiophile-claims-and-myths

fluffy's picture

Who does them? People and firms that profit from truth and scientific validity, or high end audio firms trying to end themselves.

Did Houdini rely on double blind tests or the scientific method to validate his methods? Of course not. High end audio is forever tied to maintaining its mystery and using all forms of misdirection and deception to achieve that. All completely legal if we use enough vacuous words in marketing as has been done for decades and comically rewarded.

So if legal, then morally right, especially if it makes money. So let's stick with the script and end all calls for a truly scientific approach to high end audio (while of course claiming to be all about science in ambiguous ways when advertising).

The key is ambiguity - high end audio pioneers (along with other industries wth similar credibility issues) invested heavily in lobbyists and government officials to get a pass on the laws of the land if marketing was sufficiently vague. The moment we start getting all scientific, then those protections evaporate.

I'm currently listening to very expensive audio equipment that I claim is "seductively foxy". Prove me wrong.

corrective_unconscious's picture

"Double blind testing is easy and inexpensive."

You must be very competent and have a munificent concept of what constitutes inexpensive, because I regard envisioning and carrying out a valid, double blind test to be a most challenging effort, particularly when discussing rather minute, perceptual differences.

Revel makes a big deal of their (single or double?) blind loudspeaker testing. It involves bringing in a listening panel and an industrial, motorized rig to fling speakers about in the dark quickly. Think of the logistical and personnel costs to get just one trial, never mind the capital outlay for the design and hardware of the switcher.

(A speaker selector switch with the speakers sitting next to one another does not seem likely to be "valid" for me, and speaker differences are far less subtle than other equipment differences you could imagine trying to blind test for.)

John Atkinson's picture
saladina wrote:
Double blind testing is easy and inexpensive.

Not if you want to produce results that are reliable and repeatable. You can find a set of guidelines on how to design valid blind tests, ITU-R BS.1116-2, at http://www.itu.int/rec/R-REC-BS.1116-2-201406-I/en.

If you compare these ITU-recommended testing guidelines with the methodology of the blind tests you list in another posting - www.head-fi.org/t/486598/testing-audiophile-claims-and-myths - the results of which have been proclaimed as "proving" that cables/digital formats/etc etc sound the same, you will note that none of them conform to the recommended guidelines.

I have taken part in or organized more than 100 blind tests in the past 37 years. I think it fair to ask you how many blind tests you have been involved with, to be so sure of your position?

John Atkinson
Editor, Stereophile

ChrisS's picture

Mostly bad science with inconclusive results. Too many uncontrolled variables, no randomization, sample sizes too small for any statistically significant results, are the usual flaws.

John Atkinson has explained his views and experience with DBT-

http://www.stereophile.com/content/listening-143#comments-link

remlab's picture

If you can't hear what someone else hears, good! Save yourself some money and buy an old boom box at the Goodwill. It's all good.

QSYSOPR's picture

The truth is a very subjective thing as long as it is not supported by rigorous, science-based test criteria. One of those is abx-testing, which is denied by too many people in the audio industry and hifi market - unfortunately.

John Atkinson's picture
QSYSOPR wrote:
abx-testing . . . is denied by too many people in the audio industry and hifi market - unfortunately.

Do you use ABX testing when deciding what component to purchase?

John Atkinson
Editor, Stereophile

QSYSOPR's picture

... of course I do. And when the dealer does not want me to do it my way, then I'm in the wrong store. This is the way I do it. I do not want to persuade others to do it the same way. But please believe me, I saved a lot of money the way I did it. To be honest, I first look at the readings of the component, because I think good readings do not contradict a good reproduction. I barely listen to components whose readings are inferior. To give you an example: the last component that I bought was the Oppo BDP-105D. I needed a good Blu-ray player and went to a shop to buy one. I compared it to Marantz, Electrocompaniet, Denon and Arcam. In every aspect of video reproduction the Oppo was superior. I found the sound of the Oppo also very good. I compared it to some high-priced CD players through my DAC. Although there was some difference between these players, I was not able to say which one was the best. They were so close together. So I was able to beat two birds with one clap and bought the Oppo. All testing was done in my home because it is the environment that I know.

corrective_unconscious's picture

"And when the dealer does not want me to do it my way, then I'm in the wrong store. This is the way I do it....To give you an example: the last component that I bought was the Oppo BDP-105D. I needed a good blu-ray player and went to a shop to buy one. I compared it to Marantz, Electrocompaniet, Denon and Arcam."

I'd love to know the store which sells all those brands of disc players and uses an abx panel to switch amongst them. That doesn't make sense, even for Britain. What dealer is this you say you are referring to?

I can only imagine what a sound room with that many blu ray players on display would look like...given the apparent number of product lines and all the other stereo components needed to make a system.

Also, any garden variety abx panel is going to have some effect itself on any differences in the gear. They're not high end. They're adequate to switch signals in and out of the path.

Also, what is a component's "readings." Is that uk for "specs"? Or is it your own private nomenclature?

QSYSOPR's picture

... I'm German. Of course you are right, what I am doing is not an abx-test in the strict meaning of the word. I compare two components without knowing which one is playing. This can be easily done with any TV for Blu-ray playback and with an appropriate DAC for music playback, just by switching the input. So no abx-panel is necessary for this kind of test. And please do not tell me that two different HDMI inputs on the TV set or two different coax inputs of the DAC make any difference. If any abx-panel is having an effect on the gear connected to it, then all other components have the same effect, be it a TV set or DAC or amp or whatever you want. This really does not bother me. If the effect is dramatic, something has gone wrong, and if the effect is so small that one could hardly notice it - so what. In the end you have to decide if this "effect" comes close to your taste or not; that is to say, in the end you decide which component plays best, which one suits your taste. Your presumption is correct: "readings" in the meaning of measurements or specs.

corrective_unconscious's picture

I was suggesting skepticism that any dealer would have the array of components you describe on display, and that in any such environment claimed listening tests - via abx switcher or via five different HDMI inputs on a TV - would be meaningful.

QSYSOPR's picture

Well, I cannot speak for every city in Germany. My hometown has at least two dealers with a variety of different high-end gear. Maybe I'm a lucky one ;-)

ChrisS's picture

You do what most of us do.... but that's not ABX testing. That's just "Comparison Shopping".

David Harper's picture

The Audio Engineering Society of America recently conducted extensive year-long double-blind testing of SACD and concluded:
"there is no audible difference between CD and SACD".
The reason there is so much silly subjectivity in Hi-Fi is precisely because these judgements are, ultimately, impossible to prove or disprove. It's like people who claim God is in control of their lives. Nothing you're going to say will change their mind. Facts have no meaning to people like this. I recently read a review of a CD player in Stereophile. The reviewer went on and on about how this player "opened up a heretofore unrevealed soundstage." Huh???? I think all CDs have the same technical specs. How does a binary numberstream feature a mysterious sonic characteristic unrevealed by previous players?

John Atkinson's picture
David Harper wrote:
The Audio Engineering society of America recently conducted extensive year-long double-blind testing of SACD and concluded; "there is no audible difference between CD and SACD".

Not that I am aware of. (I am an AES Member.) There was a deeply flawed paper published in the AES Journal in 2007, authored by David Moran and E. Brad Meyer: http://www.aes.org/e-lib/browse.cfm?elib=14195. A more rigorous double-blind test, with optimal protocols, discussed in a paper presented to the AES this past October, gave the opposite result, with respect to hi-rez PCM: http://www.aes.org/e-lib/browse.cfm?elib=17497.

John Atkinson
Editor, Stereophile

David Harper's picture

John, I'm sure you're right. I forgot where I read that. I have compared SACD with CD in my listening room, and I don't hear a significant difference.(admittedly, this is a pretty lame "test"). I have also listened to Blu-ray pure audio, and I have heard a significant improvement over CD. It seems to me that this is the superior format. What do you think?

AJ's picture
Quote:

There was a deeply flawed paper published in the AES Journal in 2007, authored by David Moran and E. Brad Meyer: http://www.aes.org/e-lib/browse.cfm?elib=14195.

"Deeply flawed" according to whom?
I'm an AES member as well, don't recall any such verdict :-).

Quote:

A more rigorous double-blind test, with optimal protocols, discussed in a paper presented to the AES this past October, gave the opposite result, with respect to hi-rez PCM: http://www.aes.org/e-lib/browse.cfm?elib=17497.

One man's "rigorous" is another man's "rigged".

But then again, these are both honesty controls/blind tests, hopefully you/Stereophile dismiss them equally.

cheers,

AJ

augmaticdisport's picture

"Not because we are hearing thing that can't be measured, but because we can't prove it"

Subjective testing results can be validated (proved) by a correlation with measurements.

Subjective evaluation of loudspeakers has been shown to strongly correlate with almost every measurable parameter of loudspeaker performance.

On the other hand, subjective evaluation shows no correlation with digital audio resolution above 16/44. However, this may change as more studies are performed.

"But say that 247 out of 300 listeners were able to tell, without knowing A from B, that they were hearing B 78% per cent of the time, and that scientific mind will begin to take you seriously."

Because without a statistically significant result, there is no way that a hypothesis can be shown to be true.
I could complain about a lack of high frequency detail, but this is no fault of any component in the system, but my own hearing damage. It would be ridiculous to take my perception as a genuine criticism of the system.

The entirety of modern civilisation is built on the principles of science (proving hypotheses through measurement and repeatability of results, including subjective testing). Your car, your computer, the internet, medicine, the entire modern WORLD relies on this.

High end audio is part of the same reality and is no exception.

John Atkinson's picture
augmaticdisport wrote:
On the other hand, subjective evaluation shows no correlation with digital audio resolution above 16/44. However, this may change as more studies are performed.

It has changed. A paper presented at the AES Convention in Los Angeles last October concerned the results of rigorous blind testing showing that the reduction in bandwidth of a hi-rez recording to 44.1kHz was audible. See www.aes.org/e-lib/browse.cfm?elib=17497.

John Atkinson
Editor, Stereophile
