To the Simple, Everything Appears Simple

My spirits sank as I read the comments on Stereophile's Facebook page. In the November issue, we had published reviews of UpTone Audio's USB Regen device by Kalman Rubinson, Michael Lavorgna, and myself. Michael and Kal had enthused about the positive effect the USB Regen had made, but I could detect no measurable difference. On Facebook, Dan Madden had written, "I think a device like this would need a blind listening test to verify that a listener could hear the difference in a statistically measurable way, in a very high percentage of times."

I have no argument with that statement. But then, Madden went on to say, "Have someone hook up this gizmo on YOUR system, and then have you listen to it with the same song 10 times with and without it connected randomly, and if you get the 'better sound with it' right 9 times out of 10 then I would be convinced that it makes a difference to the sound."

Sounds like a simple test, but designing a blind test that can be used to confirm or deny that a real but small audible difference exists is far from simple. In the formal statistical analysis of the test results, you can't prove a negative; you can conclude only that, under the circumstances of the test, no difference could be detected. By contrast, a statistically significant positive identification can be regarded as universal proof that a difference is detectable. But that analysis depends on the test examining just one variable—the difference being examined—and, as I have repeatedly discussed in this magazine, the blind-testing methodology itself can be an interfering variable in the test. The fact that the listener is in a different state of mind in a blind test than he or she would be when listening to music becomes a factor.
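As a back-of-the-envelope illustration (a minimal Python sketch of the standard binomial arithmetic, not anything from the article), here is what a pass/fail criterion like Madden's "9 times out of 10" implies statistically:

```python
from math import comb

def p_value(correct, trials, p_chance=0.5):
    """Probability of scoring at least `correct` out of `trials`
    by guessing alone (one-sided binomial test)."""
    return sum(comb(trials, k) * p_chance**k * (1 - p_chance)**(trials - k)
               for k in range(correct, trials + 1))

# 9 of 10 correct is unlikely to happen by luck; 7 of 10 is not.
print(f"9/10: p = {p_value(9, 10):.4f}")   # 0.0107
print(f"7/10: p = {p_value(7, 10):.4f}")   # 0.1719
```

A score of 9/10 would indeed be statistically significant. But a listener who genuinely hears a small difference may still score only 6 or 7 out of 10, which a single short test cannot distinguish from guessing; hence the need for many trials and for trained, comfortable listeners.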

Rigorous blind testing, if it is to produce valid results, thus becomes a lengthy and time-consuming affair using listeners who are experienced and comfortable with the test procedure. Otherwise, the results of the test become randomized, hence meaningless.

In the words of famed mastering engineer Bob Katz: "There is no such thing as a 'casual' blind test. Blind tests are a serious business. Experimenters need training how to perform blind tests well. Blind tests can fail (produce statistically invalid results) if the experimenter neglected one critical detail. Weeks of intensive study are required to learn how to perform blind tests. Then weeks of preparation to create the test. Then weeks of testing to follow."

Some probably think it paradoxical for the editor of a magazine based primarily on the concept of judging audio components by listening to them under sighted conditions to be commenting on blind-testing methodology. However, since the very first blind listening test I took part in, in 1977, organized by the late James Moir for Hi-Fi News magazine, I have been involved in well over 100 such tests, as listener, proctor, or organizer. My opinion on their efficacy and how difficult it is to get valid results and not false negatives—ie, reporting that no difference could be heard when a small but real audible difference exists—has been formed as the result of that experience.

There is, in fact, a formal discipline devoted to the design of blind tests, based on recommendations formulated by the International Telecommunication Union in its document ITU-R BS.1116-3 (footnote 1). Katz was summarizing the ITU guidelines and their consequences; the context for his comments was a workshop at the 139th Audio Engineering Society Convention (footnote 2), held last October in New York, on the audibility of possible improvements in sound quality made by recording and playing back audio with bit depths greater than the CD's 16 and sample rates higher than the CD's 44.1kHz.

This is a contentious subject. On the Stereophile website forum last summer, reader David Harper wrote, "Humans do not hear any difference between 16-bit/44.1kHz and any higher bit/sampling rate. This is established fact."

Harper was referring to a 2007 paper by E. Brad Meyer and David R. Moran that "proved" that there was no sonic advantage to high-resolution audio formats (footnote 3). Their conclusion ran counter to the experience of many recording engineers, academics, and audiophiles; but despite doubts over their methodology and the unknown provenance of their source material, Meyer and Moran's paper seemed to be the final formal word on the matter.

Until now. The AES workshop in which Bob Katz was taking part also featured presentations by legendary recording engineer George Massenburg (now a Professor at McGill University, in Montreal) and binaural recording specialist Bob Schulein. But it was the first presentation—by Joshua Reiss, of Queen Mary University, in London, and a member of the AES Board of Governors—that caught my attention.

Some 80 papers have now been published on high-resolution audio, about half of which included blind tests. The results of those tests have been mixed, which would seem to confirm Meyer and Moran's findings. However, around 20 of the published tests included sufficient experimental detail and data to allow Dr. Reiss to perform a meta-analysis—literally, an analysis of the analyses (footnote 4). Reiss showed that, although the individual tests had mixed results, the overall result was that trained listeners could distinguish between hi-rez recordings and their CD equivalents under blind conditions, and to a high degree of statistical significance.



Footnote 1: See "Methods for the Subjective Assessment of Small Impairments in Audio Systems."

Footnote 2: See "Perceptual Evaluation of High Resolution Audio."

Footnote 3: Meyer, E. Brad, and Moran, David R., "Audibility of a CD-Standard A/D/A Loop Inserted into High-Resolution Audio Playback," JAES, Vol.55 No.9, September 2007, pp.775–779.

Footnote 4: Dr. Reiss reported that it didn't prove possible to include Meyer and Moran's data in the meta-analysis, because there were statistical inconsistencies. In particular, the number of listeners who scored positively was just 3 out of 55: far fewer than would be expected if the experimental data were truly random.

COMMENTS
Charles Hansen:

Well put. I have been conducting blind testing for over 30 years (not double blind). I have been baffled by the positive results from things that I "knew" couldn't possibly make any difference. I did everything in my power to deliberately trip up specific subjects - to no avail.

I think that some of the best writing on this subject recently came from Tyll Hertsens on the InnerFidelity.com website:

http://www.innerfidelity.com/content/big-sound-2015-roy-romaz-nails-it shows that inexplicable things can happen.

http://www.innerfidelity.com/content/testing-audibility-break-effects

and many more entries in his "Big Sound 2015" section. The links to the papers you noted are also excellent. Keep up the good work.

JUNO-106:

I've been enjoying high-resolution audio since 2004 via the SACD format, and to my ears it is (almost) always better than the CD counterpart (the much-clichéd "veil" has been lifted). However, there have been quite a few times when I thought that the subtle improvements of a particular SACD recording didn't justify the extra cost.

It's been a mixed bag for sure!

It's nothing like the difference between standard definition TV and high definition TV. That is like night and day.

CD vs High resolution audio is more like early-evening and mid-afternoon.

I can kind of see how non-audiophile persons might have a hard time distinguishing between them.

But I am hooked because when all the stars align for a particular high resolution recording it can sound absolutely thrilling.

Music_Guy:

It's complicated.

Whether or not commenters are completely accurate about what constitutes a "blind" or "double-blind" or "whatever" test, what they are saying is: "Does that test convince me A is better than B, to the extent that A is worth my investment?" If the results are not obvious/convincing/tangible, who cares whether the test was correctly constructed? It is not worth it.

We don't want differences we can't reliably/consistently perceive, we want differences we can hear every time.

As close to accurate as the State of the Art is today in the areas of recording and reproduction of music, there is still room for better sound. Those of us who read articles, listen critically to reproduced music, and drop money on components non-audio people consider ridiculous want to know whether this new component will sound better than what we have now.

Components/systems that ARE different MUST sound different. Yes. But at some point, which varies by individual, the difference between two really good but sufficiently closely performing components will not be audible. Sorry. THAT is why those test results are not conclusive. With bigger differences, even "simple" tests demonstrate those differences. If the differences are so small that the tests reveal only "statistically significant" results, that is not good enough for me. I want simple results.

Keep both the editorial-style and measurement-style content coming. And call me "simple" if you'd like.

John Atkinson:
Music_Guy wrote:
We don't want differences we can't reliably/consistently perceive, we want differences we can hear every time.

The problem is that the former turns into the latter with experience. What initially goes unnoticed becomes an irritant once the listener has learned what it sounds like.

An example: decades ago I visited the then Acoustic Research factory in Boston. There was a woman on the line who listened for excessive drive-unit distortion by ear, sweeping each unit with a sinewave tone. The problem was that you couldn't keep the same person in this job, because over time she would begin to reject drive-units that had just 0.1% distortion at some frequencies, less than the accepted level of low-order harmonics that was supposed to be audible.

John Atkinson
Editor, Stereophile

24bitbob:

I concur with a lot that Music_Guy says, and in particular note that the differences to which he refers are often associated with hefty differences in cost. But I am also aligned with the explanation in John's article, where he emphasises that listening critically requires training and a high level of competence. That level of competence, I suspect, is beyond most HiFi hobbyists.

Probably, for most people, most of the time, the subtle differences reported by professional reviewers will simply be beyond their ability to detect.

Caveat emptor.

dalethorn:

I don't think training is necessary unless you're reviewing gear or designing gear etc. If the user has good ears and tends to be picky about good sound, the irritant factors will kick in eventually. Hopefully when that happens, the user will do some research before committing more cash to upgrades.

Ajani:

I'm of the opinion that any difference you have to be trained to hear is subtle. So the problem really comes down to these subtle differences being described by reviewers as night and day, blatant, like a veil being lifted, etc. An average person reading the review will expect sonic nirvana when they use the product, and will instead be thoroughly disappointed when they can't hear a difference.

I think the best way to address the issue of whether differences are in fact real and bring credibility to the hobby would be to have several DBT studies done using only trained listeners (reviewers and experienced audiophiles). Even better would be to repeat those tests with untrained listeners and compare the results.

lo fi:

Sean Olive is the Director of Acoustic Research for Harman International and has been conducting controlled blind testing there for more than 20 years. He is also a past President of the Audio Engineering Society. It would be interesting to have the perspective of a scientist who researches in this field.

dalethorn:

I assume Mr. Olive approved the sonic design of the AKG K812 - originally $1500. A really terrible headphone with huge peaks in the treble. Innerfidelity has the data.

lo fi:

The AKG K812 was designed and built by AKG in Austria. Although AKG is owned by Harman International, it is highly unlikely that Dr Olive would have had any involvement in the R&D for that product.

dalethorn:

That's a nice disclaimer, but doesn't wash. It's a $1500 headphone - high profile, and the top dog is preaching his version of sound Nirvana far and wide, while his highly visible top product is anything but. Whether he worked on the design is irrelevant - he shoulda had a listen, or listened to the critics.

lo fi:

Don't be ridiculous. AKG may continue to operate entirely independently of Harman International which acquired it - I really don't know. What I do know is that Dr Olive does not work for AKG; nor is he a headphone designer. He is the Director of Acoustic Research at Harman International.

dalethorn:

Olive preaches the message of the "Harman Response Curve". He is also responsible for the K812, being that it's their #1 headphone product. Saying that he's unaware of it and its bad reputation is denying the obvious.

lo fi:

The Harman Response Curve was developed by Dr Olive and his acoustic research team. The K812 headphone was designed and manufactured by AKG in Austria. If you wish to conflate these two unrelated facts into misinformation for dissemination across the interwebs, well, that's entirely up to you. Further, I did not say that Dr Olive is unaware of the K812 and its reputation, whatever that is; I said that it is highly unlikely that he would have been involved with its development. And to hold him personally responsible for it is just bizarre.

dalethorn:

You have a terrible problem connecting the dots and holding people responsible for what they do. It's called denial, simply enough. Mr. Olive isn't merely an audiophile, he's one of the kings of audiophilia. To suggest that he simply wasn't involved or aware of the K812 - Harman's flagship headphone, is absurd.

And oh, yes - throwing out the 'conspiracy' term is classic disinformation. Caught ya!!

lo fi:

You keep on connecting those dots and holding people responsible for what they do, while I impose a form of self-denial which entails ending my participation in this slightly disturbing exchange.

mvs4000:

Yes, blind testing is difficult and not without error. However, the adjectives that often get invoked by the audiophile press to describe some perceived difference imply that whatever it is the reviewer is hearing should rise, dramatically, above the "noise floor" of blind testing. It is supremely incongruous to claim that a cable switch out "tightened the bass and opened up the soundstage" and then claim that a double blind test is too insensitive to detect the difference. The rhetoric doesn't square with reality.

Ajani:

Yep. That's exactly the point I was trying to make earlier. A lot of the negativity towards this hobby and reviewers stems from them overemphasizing subtle differences. If reviewers claimed that X ultra expensive cable made a subtle improvement in the sound of their system, odds are that people would hardly care. The problem occurs when the reviewers claim it opened a door to a 3 dimensional world outside our space time continuum, where music and life merged into the symphony of the angels.

mrvco:

Mellifluous, hyperbolic blathering sells, apparently. I'd love to see a speaker-cable or interconnect comparison where the reviewer did not have retail prices or any of the marketing BS from the manufacturer.

Ktracho:

The point is that even if 99.9% cannot hear any difference, it does NOT follow that the product should not be sold for X dollars. Each person should have the freedom to purchase the products he/she believes will enhance the listening experience without you or me telling him/her it is not possible to hear any difference in the sound.

Steve C:

In the medical industry there is the placebo effect. If you are using the original tapes and are recording both Red Book and hi-rez, there should be no difference in the playback. If it was recorded in hi-rez and played back in hi-rez, then you take the hi-rez and record it on Red Book, you might hear a difference. I don't know, I haven't tried it, but it sounds like a test that needs to be done. Take a Blu-ray audio disc and record it to CD. Worth a try. I just listen to the artist and hope the recording is worth the money I paid for it.

dalethorn:

How would you like to be suffering from a possibly fatal disease and receive a placebo instead of a life-saving drug? Not a pretty thought. In hi-fi, how would you like it if the reviewer decided that "Oh, I won't mention XYZ because it's just a subtle thing", and then XYZ ruins your listening experience? Where do you draw a clear and unambiguous line?

Steve C:

Placebos are used in strict testing procedures, not in actual treatment. As a physicist in radiation oncology, I can tell you we do not use placebos in administering radiation. No designer power cords or DACs on my equipment, just klystrons and linear accelerators.

dalethorn:

That doesn't address what I said. If a patient agrees to participate in a medication study, it's certainly a patient with a problem who's hoping for a cure or positive treatment, and getting a placebo instead of the real deal is getting stung - seriously.

dpudvay:

In medical studies of new drugs, patients who are sick would not receive placebos, as new-drug studies are run against Standard of Care drugs. One set of patients receives the new drug; the other receives the standard treatment.

Medical researchers are not out to hurt patients and no drug study would be approved that puts patients at risk by giving them placebos.

Early on in the testing protocol they might test the drug versus placebos in healthy patients, but that is done to determine adverse effects of the drug in question.

Ajani:

But why wouldn't the reviewer mention it? It's fine to say that you heard a subtle difference. The problem only really arises when you exaggerate the magnitude of the difference / assume that all your readers have your years of listening experience and will also think the difference is huge.

dalethorn:

How do we measure that exaggeration? Really tough call sometimes.

Ajani:

Don't measure it. Just stay in touch with the real world AKA talk to non-audiophiles. I severely doubt there's a reviewer alive who doesn't know the intense debates about whether cables make a difference. So if you're writing a cable review, you simply avoid getting over the top about the sonic differences it made, since you know that the average listener is not going to hear what you do. Or of course reviewers could in addition to their usual sighted subjective reviews engage in some blind testing as well. Or again, ask a few non-audiophile family members/friends to take a listen to the item you're reviewing. Any of those ways would help to keep reviews more grounded. Personally I believe that some of these minor differences are blown up because the reviewer knows the prices of the products and so sighted bias kicks in. I don't believe it's all in their imagination as some persons do, but I think it's easy to subconsciously exaggerate when you know you're comparing a $1K amp to a $25K one.

dalethorn:

How many people review high-tech audio cables? How many of those reviews say anything worth reading? I still don't see why anyone would review a $200 cable like I bought for my Beyerdynamic DT-1770, and simply say "Well, it's nicely made but doesn't sound any better than a $35 cable". Now granted there are plenty of thrifty audiophiles who search hard for the $35 cable that's "just as good as" the $200 cable, and while their systems probably sound pretty good, I wouldn't want to own those systems unless I were on a severely restricted budget. That's not to say that I endorse $2000 cables because they "must be better" than $200 cables, it just means that I avoid being overly thrifty when it's not necessary to be so.

But in the end, this discussion probably won't go anywhere until you can interview a good sampling of audiophiles who use pricy cables, and at the very least see if they have a conviction that their cables sound better, and how they know that. I experimented with numerous speaker cables back around 1980, connected to Crown power amps or Yamaha integrated amps, and those cables were amazingly different from each other. The notion that an untrained listener couldn't hear those differences never entered my mind, because the differences were profound. Interconnect cables and headphone cables don't have differences of that magnitude in my experience, but sometimes they do have easily audible differences. For me, easily audible means playing tracks that I've heard 1000 times (in testing), so even the smallest difference pops out. Maybe that's one of the big problems with blind testing - unfamiliar music just doesn't work, probably because the brain is so busy digesting the new music and the overall experience that it can't focus on very small differences.

Ajani:

I don't think a non-audiophile would consider only being able to readily hear differences on tracks you've played 1000 times to be "easily audible". And that's where the problem lies. If you have to be intimately familiar with the music and/or have hours or days to be able to readily tell the difference, then the difference is really subtle. So writing poetry about its impact on your system just causes non-audiophiles to question your integrity/sanity. No doubt subtle differences can be important to a trained listener, but they probably won't be to the average Joe.

dalethorn:

I think you mixed two messages. The non-audiophiles could easily hear the speaker cable differences. They might hear some of the interconnect differences, but that would require time and patience in some cases, and in other cases require training, if they could hear those differences at all.

Steve C:

I am a non-audiophile who happens to be a physicist. I got into audio back in the early '70s, when I heard my first Klipschorn and McIntosh gear. I've also played piano and trumpet since '67. I believe I know sound when I hear it. You can hear the difference between Yamaha, Getzen, Bach, Olds, and other manufacturers of trumpets. Same with pianos, flutes, trombones, etc. Subtle differences at best. To me the speaker is the most revealing part of a sound system. I've had AR, Technics time-aligned, KLH, Klipsch, yes, even Bose 901s, and have settled on La Scalas that I have modified with upgraded parts, Dynamat on the horns, and even rewiring. Did it make a big difference? Eh. What made the most difference was building a horn sub that added the bottom octave. I believe that would help most budding audiophiles get a start in this hobby.

Ajani:

OK. When you experimented with speaker cable, were they all the same gauge?

dalethorn:

Some were, by different makers, and some were different gauges of generic electric cord, zipcord, various such names. But there were some premium cables too - Monster was one, and there was one in the mix that had way too much capacitance. So that was a learning process on several fronts. If I were comparing premium or semi-premium cables today, I'd be reading their specs I suppose, to see how close they sound to each other and whether the specs were proportional to those differences etc.

But what I definitely would not be interested in doing is comparing several flavors of one manufacturer's cables (red, blue, green, ....) to each other, since I have a problem with that.

Ajani:

Cool. Sounds reasonable. The fact that the wires you compared had different specs and build quality certainly could affect the sound. The debate around cables, SS amps, etc. really focuses on whether differences exist between items built to the same specifications: would a $30 set of cables sound any different from a $3000 set if they were built to the same basic specs? So it's not that all cables sound the same. A poor-quality cable can audibly degrade the sound; the question is whether the magic pixie dust sprinkled on expensive cables really makes any difference.

My view is that any difference that can't easily be proven in a traditional DBT is at best a subtle difference. So reviewers shouldn't get all lyrical over these differences/or should at least openly acknowledge that it's likely that the untrained won't hear the difference. Whether that difference is worth your money is really up to you.

dalethorn:

Gordon Holt built a dictionary of technical terms he used to describe sound in his reviews. Maybe it's time someone created an extension to that dictionary that gives weighting factors to the terms reviewers use when evaluating cables, tweaks, USB filters, and other passive devices. The better such a dictionary is, the less need there would be for an urban dictionary equivalent to reinterpret it.

John Atkinson:
Ajani wrote:
No doubt subtle differences can be important to a trained listener, but they probably won't be to the average Joe.

I think it fair to point out that neither Stereophile nor the high-performance audio industry is interested in the "average Joe" any more than Wine Spectator is interested in what you can find at a 7/11. Both presuppose a concern about improved quality and if someone is not interested in improved quality, they will not be a reader/customer.

A difference may be subtle but that doesn't mean it is unimportant to someone who cares.

Years ago I visited the NRC in Ottawa. There on the wall of their standardized listening room was Floyd Toole's impairment scale for sound quality, with "10" being live sound and, IIRC, "3" a telephone. It struck me back then that everything we discuss in this magazine rates between a "7" and an "8": small differences, yes, but not unimportant.

John Atkinson
Editor, Stereophile

Ajani:

That's a very good point. The industry really isn't aimed at the average Joe.

However, things get more complicated when you consider that many of the differences described in HiFi mags are things that would take years of experience and knowing what to listen for to notice. So even persons genuinely interested in HiFi get turned off when they can't hear the differences. Especially when they just read an article that was over the top about those differences.

Plus there are countless articles where reviewers question why the general public doesn't care about music or can't hear "obvious" differences in cables, amps, high res etc... Yet those articles generally fail to acknowledge that the differences are really only "obvious" to someone with years of training and not to either the average Joe/beginner/even intermediate audiophiles. So it's not simply that the masses don't care about the differences. Often times they really can't hear them.

dalethorn:

Sometimes the differences aren't exactly subtle. Sometimes it's just something an audiophile has previously tuned out as not relevant, or as an insoluble fact of life.

John Atkinson:
Ajani wrote:
things get more complicated when you consider that many of the differences described in HiFi mags are things that would take years of experience and knowing what to listen for to notice. So even persons genuinely interested in HiFi get turned off when they can't hear the differences.

But people are not passive listeners. They learn through experience. And an important part of that experience can be provided by a mentor. The problem is that younger people often don't know what to listen for. As Stephen Mejias wrote back in the day, "It takes only a few moments for a more experienced listener to direct someone's attention to the subtler nuances of reproduced sound. We may not hear these details at first, but once we do, we will never forget them. You cannot unlearn the shortcomings of MP3, for instance."

Jon Iverson wrote about the importance of mentoring in his June 2002 "As We See It" essay: www.stereophile.com/asweseeit/588/index.html. He referred to a poll we had conducted on this site where it turned out that most of our readers had gotten their starts at the hands of audiophiles who generously shared their enthusiasm and advice.

Ajani wrote:
it's not simply that the masses don't care about the differences. Often times they really can't hear them.

Of course. But once they find a mentor...

John Atkinson
Editor, Stereophile

music or sound:

I have tried some online blind tests, and I test some variations of my system not quite blind: my wife listens without knowing what has been changed.
I found that if we hate the music, it is very difficult to listen repeatedly to such tracks and evaluate sound differences.
Short music sections (which are often used in blind testing) are not useful for finding subtle differences.
If I listen repeatedly to the same track, I have a more positive impression the second time, and then I get bored with that track (there is a bias in the order of samples).
Ideal would be well-recorded music that does not change much over time, with the test changes occurring within that piece. I know a lot of piano music that would work, but I can't think of more complex music.

dalethorn:

I do different kinds of tests. One is with David Chesky and Wonjung Kim's "Girl From Guatemala": there is an intense treble section starting at 3:00 that runs about 15 seconds. It can reveal tonal differences very well. Another track is the HDTracks copy of Cat Stevens's "Morning Has Broken." That one I test differently: I listen to one segment ("Morning has broken") and then switch to component B without stopping playback, and listen to the next segment ("like the first morning"). There are many ways to rearrange that test, but the principle is to not stop playback, but rather to switch components between each segment or phrase of the song. It really works, and you can hear many differences that way. It's important that the segments, as I call them, have a short pause or silence between them, but not a silence that allows you to "forget" what you just heard. It's much more difficult to hear differences when the music is continuous.

jimtavegia:

I have been enjoying listening to and making high-rez recordings for years now. It no longer takes much of an investment to hear the difference, with all the great headphones that have become available in the last few years and the number of great, very affordable high-rez USB DACs. To be dug in against high-rez in 2015 means one is missing out on hearing some of the best music ever made, presented in the best possible way. When someone hears music through my Focal Spirit Pros or my AKG K271s or K701s, they are shocked at how good it sounds, and I don't consider those SOTA in cans these days, nor much more than a pair of Beats that I hope I've convinced them not to buy. I never have to say anything, as they are the first to respond to what they have been missing. Often saying nothing is the best thing.

monetschemist:

Your Footnote 4 says: Dr. Reiss reported that it didn't prove possible to include Meyer and Moran's data in the meta-analysis, because there were statistical inconsistencies. In particular, the number of listeners who scored positively was just 3 out of 55: far fewer than would be expected if the experimental data were truly random.

This is interesting, can you elaborate on this a bit? From the comment, it sounds like one might have expected ~ 27 to score positively and ~ 27 to score negatively. In any case, if it appears that the results are inconsistent, I'm surprised no one else has noticed that to date. How come?

Thanks in advance!

John Atkinson's picture
monetschemist wrote:
John Atkinson wrote:
In particular, the number of listeners who scored positively was just 3 out of 55: far fewer than would be expected if the experimental data were truly random.
This is interesting, can you elaborate on this a bit?

I asked Dr. Reiss about this discrepancy after the AES workshop, but there was not enough data published for him to explain it. He offered some possible reasons but these could only be conjectures.

John Atkinson
Editor, Stereophile

monetschemist's picture

I wonder if we can conclude (provisionally at least) that, if the statistics are sufficiently anomalous to exclude the study from a meta-analysis, then we interested listeners should also not give much weight to those same results?

What do you think?

I hope Dr. Reiss publishes his presentation...

And thanks very much for reporting on this.

John Atkinson's picture
monetschemist wrote:
I wonder if we can conclude (provisionally at least) that, if the statistics are sufficiently anomalous to exclude the study from a meta-analysis, then we interested listeners should also not give much weight to those same results?

I agree. But as I said in this essay, the Meyer-Moran results have been cited many times as "proof" that 16 bits and 44.1kHz sampling are good enough. :-(

Not until now, when Josh Reiss, an expert in statistical analysis, examined the Meyer-Moran data in detail, did it appear that something was amiss.

monetschemist wrote:
I hope Dr. Reiss publishes his presentation...

I believe that an audio recording will be made available. You can find more information on Dr. Reiss at www.eecs.qmul.ac.uk/~josh/.

monetschemist wrote:
And thanks very much for reporting on this.

You're welcome.

John Atkinson
Editor, Stereophile

joshreiss's picture

Just to clarify:

In trials with at least 55 subjects, only one subject got 8 out of 10 correct and two got 7 out of 10 correct. The probability of no more than 3 people getting at least 7 out of 10 correct by chance is 0.97%.

There are question marks over this and other analyses, since not all participants did exactly 10 trials. But it's clear that the reported data do not agree with the idea that the results were truly random all the time and for all participants.

joshreiss's picture

The biggest reason that I couldn't include Meyer and Moran's data in the main analysis is that it no longer exists! That is, to include it I would need each participant's average number of correct answers. But this wasn't published and can't be estimated from the paper. I contacted Meyer and Moran, but they said that they didn't keep the data. So I still did statistical analysis with Meyer and Moran wherever possible, but not in the main meta-analysis. However, I also did various checks on the meta-analysis, adding or removing questionable studies, checking the significance of results. The Meyer and Moran study would not have affected the main conclusions.

In regard to statistical inconsistencies, as one example: in trials with at least 55 subjects, only one subject got 8 out of 10 correct and two subjects achieved 7 out of 10 correct. The probability of no more than 3 people getting at least 7 out of 10 correct by chance is 0.97%. Other reported summary data also had less than a 1% chance of occurring if people just guessed on the test.
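That 0.97% figure can be reproduced from the quoted numbers alone. Here is a minimal Python sketch (standard library only), assuming 55 independent listeners who each did 10 trials, guessing at p = 0.5 per trial:

```python
from math import comb

# Probability that one guessing listener (p = 0.5 per trial)
# gets at least 7 of 10 trials correct: 176/1024, about 0.172.
p_hit = sum(comb(10, k) for k in range(7, 11)) / 2**10

# Probability that, among 55 independent listeners, no more than
# 3 reach that 7-of-10 threshold (binomial with n = 55).
p_at_most_3 = sum(
    comb(55, k) * p_hit**k * (1 - p_hit)**(55 - k)
    for k in range(4)
)

print(f"per-listener: {p_hit:.4f}, at most 3 of 55: {p_at_most_3:.2%}")
```

Run as written, this prints a probability of about 0.97%, matching the figure quoted above; the caveat that not every participant did exactly 10 trials would change the numbers somewhat.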

I plan to submit the meta-analysis as a publication for the AES journal soon.

monetschemist's picture

To me there is quite a bit of irony in this situation: the study considered definitive by many has misplaced its data, so no one can review it further or incorporate its results fully into a meta-analysis. Oh well.

Leaving that aside, thank you very much for investigating this matter! I hope to see your paper one day.

hbarnum's picture

Just had a look at the Meyer-Moran study. Your calculations above, Josh Reiss, make sense to me, although as you say M&M are not very clear about what they did, in particular how many subjects they had and how many trials each subject did. Still, assuming 55 distinct subjects each did 10 trials, my ballpark calculations are close to yours: starting with the probability of 7 or more correct in 10 trials with 0.5 success probability, which is about 0.17, and then using the binomial distribution with n=55 independent 10-trial sequences, one gets a probability of about 0.0106 that the number of 10-trial sequences yielding 7 or more correct out of 10 would be 3 or fewer. There were actually 554 trials, not 550, so this is obviously a little off; also, if some subject did two 10-trial sequences, the independence assumption in that calculation is wrong. I suspect your calculation is sophisticated enough to deal with that possibility, and the number you get, 0.0097, seems in the right ballpark. John Atkinson's characterization of getting 7/10 or more as a "positive result" in the article clearly confused one of the commenters, but this seems to me to be the essence of the suspicious character of the data. Their statement that "the 'best' listener score [...] was 8 for 10, still short of the desired 95% confidence level" is pretty funny in this context.

Just reading the M&M article, the description of the methodology seems sketchy, and the lack of data on how many subjects did how many trials is astounding to me. Maybe the lumped analysis is OK without knowing this, but it would seem to complicate things. What do they mean by "we took as much time as we felt necessary to establish the transparency of the CD standard"? Was there not a predetermined number of trials?
Maybe a non-preset stopping time is not the biggest problem with this study, but still, who publishes an article that effectively states "we took as much time as we felt necessary to establish the conclusion we got"? They mention "repeatedly sorting" the data to look for correlations with age, experience, gender, and high-frequency acuity; it makes one uncomfortable to see on-the-fly data analysis going on in a protocol without a predefined stopping point. Anyway, I'm not a statistician or a heavy user of statistics, but to me your point looks spot-on. Something is fishy about the data as described, and without having the data it is hard to know why. It is hard to believe that this study is widely cited as authoritative... but then again, Google Scholar gives only 17 citations (as of today) for the paper, not much if it had really been considered authoritative on a controversial issue by serious (publishing) scientists and engineers. (I notice they have no other publications, and one of them has an unspecified BA from Harvard, along with work experience in audio, it's true, and the other a BA and an MA, both in literature.) I don't assume that a lack of credentials automatically means you can't do good work in any given area, but when a study obviously has issues like this one, maybe a lack of understanding of some potential pitfalls led them to unwittingly engage in some dubious practices. And not keeping the data, especially when you claim to have "shifted the burden of proof" on a controversial issue... that says a lot.
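The small gap between the two ballpark figures (0.0106 vs 0.0097) appears to come simply from rounding the per-sequence probability. A quick sketch, under the same assumption of 55 independent 10-trial sequences:

```python
from math import comb

def p_at_most(m, n, p):
    """Binomial P(X <= m) for n trials with success probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(m + 1))

# Exact chance a guesser gets 7 or more of 10 correct: 176/1024.
exact = 176 / 1024

print(p_at_most(3, 55, 0.17))   # rounded p: about 0.0106
print(p_at_most(3, 55, exact))  # exact p:   about 0.0097
```

So both calculations agree once the per-sequence probability is carried at full precision.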

Incidentally, I have a fairly high prior probability, though not based on expertise like yours, for the statement that well-executed conversion to 44.1/16 CD standard does not perceptibly affect the sound... I am very much looking forward to the appearance of your paper, and wondering what my posterior probability will be after reading it.

Howard Barnum

joshreiss's picture

Hi Howard,
Apologies for the late response. I'm really slack at reading and responding to forums and the like.

I am willing to give Meyer and Moran the benefit of the doubt on a lot of things. They never intended to do a controlled test under laboratory conditions. Theirs is an attempt to look at real-world content in a typical high-end home environment, and that's valid. They don't need a scientific background to make an important contribution, and lots of important studies omit details.
But even after giving them the benefit of the doubt on everything, there remains, as you say, "something fishy about the data."

Citations for their paper are low because hi-rez formats are a very hot topic for practicing audio engineers and the wider audio industry, but not really a big topic for scientific researchers. If mentions on audiophile blogs and forums counted, its Google Scholar citation count would be huge.

My paper has been accepted by the Journal of the AES, by the way, so hopefully it will appear in print and online soon.

fetuso's picture

I'm of the opinion that the only test that really matters is to install the variable (component, cable, music format, etc) in your system and listen for yourself. I read reviews because they're fun and informative, but they are not Gospel.

jastrup's picture

One problem with simple DBTs is that you don’t start every trial with a clean slate. If (with some piece of better equipment) you notice something that you hadn’t noticed before (with poorer equipment), you are now aware of it and may indeed be able to pick it up, even with the lesser piece of equipment.

That’s one issue. A much more serious issue is that I think many audiophiles don’t hear differences right away. I’m certainly one of those. Often I need to listen for at least a day or two before I become aware of whether a piece of equipment has improved the sound or the opposite. And it takes even longer to get a thorough grip on its character.

That doesn’t rule out DBTs. It just means that you have to design your DBT accordingly. It means that you can’t perform them in an afternoon. Each trial must last at least several days, and thus they become very time-consuming/expensive.

I do agree that this is a result of the differences being quite small, but for a discriminating listener they can seem very important, and I think that’s why audiophiles attach rather strong adjectives even to small differences.

Ajani's picture

I'm not sure you really benefit from using DBT, if you accept that differences are at best subtle. Most of the DBT obsession comes from the need to disprove seemingly outrageous claims (like when a reviewer claims some $30K RCA cable made a night and day difference in his system).

jastrup's picture

I agree. It's more a matter of principle, ie, for DBTs to deliver proof that, for example, cables make a difference, they need to be properly designed. Or in other words, you can't expect DBTs with multiple trials per day to yield anything other than negative results.

Ajani's picture

In addition to your point, DBTs would need to be done exclusively on trained listeners, since the average Joe is unlikely to be able to hear differences anyway. IMO, if someone really wants to prove whether these expensive cables etc make a sonic difference, then they need to round up a selection of reviewers from Stereophile, TAS, etc and use them for the test. If you can't get good results among the experts, then there's really no point testing any further.

Bill Leebens's picture

If there's a topic that makes me want to blow my brains out more than DBT in general, it could only be DBT of cables.

Or maybe another Republican "debate". Can't wait for opening day.

Allen Fant's picture

I would like to read Dr. Reiss's publication (if it goes into print). As a fan of SACD, I find that any subtle improvements are revealed, and I love this aspect of the music.

andy_c's picture

Only intelligent people can see the clothes. This is nothing new: Hans Christian Andersen nailed it nearly two centuries ago.

davidrmoran's picture

>> ... the number of listeners who scored positively was just 3 out of 55: far fewer than would be expected if the experimental data were truly random.

It would be good to find out what Reiss means by this, maybe even before publishing it. What does "scored positively" signify? The scores varied around ~50%, just as one would expect them to.

The number of times the following needs to be said is evidently very large:

(from an academic statistician reviewer) “Given that your test was designed to allow participants every opportunity to demonstrate their ability to discriminate between A and B, you were more concerned that you not get a false positive conclusion than that there not be a possibility of reaching a false negative conclusion.”

and

(from us) "We again invite academic statisticians, and all others who are certain that high-resolution audio is audibly superior to Red Book CD audio for domestic playback, to devise and carry out their own, more robust tests, perhaps with screened listeners, forced choices, identical selections of DSD material, and identical numbers of trials, and so on, and see what those results are."

You might think by now, all these years later, that someone, anyone in the nonsimple audio community would have done the latter and readily reached conclusive, solid results. Wouldn't you?

ednazarko's picture

The whole area of subjective measurement and testing is absurdly sensitive to suggestion, attitude, and all kinds of other issues. (For a great tour of how easily influenced human opinion can be, read Daniel Kahneman's book "Thinking, Fast and Slow.") If I tell you that we're going to do a double-blind test to see whether humans really can hear a difference between 16/44 and higher resolutions, my choice of words (whether...really can) will damp down the subjects' perception of differences. If they do hear one, they'll cast doubt on it because the well was poisoned, but in a way so casual that they're not aware of it enough to resist. If you put a pencil in your mouth horizontally, pushing your lips back, you'll have a more positive take on information presented to you, just as much as if you were smiling; that's a massively replicated experiment with consistent results. When I present at conferences on decision biases and influences on objectivity, I can get the entire auditorium nodding or shaking their heads in unison with me, unknowingly until I point it out.

Because of humans' susceptibility to social cues and inherent perception biases, anyone going in to test something that they don't believe in (or do believe in) will significantly color the results. Odds are good that if I tell you we're going to show whether knowledgeable listeners can detect differences between lower res and higher res files, and I play only one resolution, that subjects will find differences that they'll explain at length, since no one wants to be NOT knowledgeable. "Experts" - particularly those extending their expertise from one subject area to another where they have a lot less real experience - tend to be the most opinionated on a subjective topic.

Testing done by a third party with no going-in opinion, and little knowledge of the area being tested, could produce reasonably reliable results. That's what meta-analyses do - by summing across research done with varying degrees of bias, influence, and skill, you can wash out the influences of the researchers. It's used for that in medical research, and has been very successful in doing so. Vitamin C looked great for preventing colds, Vitamin E for preventing heart attacks, gingko for memory, in many individual studies, and looked awful in others. When meta-analyses were done, it became pretty clear that they had zero effect, which meant that the variations in outcomes represented the researchers' biases, not efficacy or lack thereof. (Many long-recommended medical treatments and protocols have fallen to meta-analyses.)

I'm off to read the paper and evaluate the methods before forming a strong opinion of the results, but I would not be surprised at the conclusion stated here. I've seen some "objective testing" done using music files that were the worst possible choices: music that's been pushed up in the loudness wars. I CAN hear differences, quite clearly, on my system, between Red Book and higher resolutions, but not on all music. Music tuned for FM radio and low-res streaming won't sound any different at 24/192 than at 16/44, something I've learned over the last couple of years, and it's caused me to be a much more careful buyer of high-res music. Sometimes Apple's compressed files are more than adequate for some of the music I like. (Yeah, I'm looking at you, Alabama Shakes.)

Besides my own ears, I've had other proof points. People who visit us and don't know what I'm playing but remark on how they've never heard some song or another sound like THIS, and my wife, who swore she really couldn't tell the difference but has noticed many times when I bought a high-res version of something we'd listened to for years as Red Book, have convinced me that my ears aren't fooling me. Recently, a new high-res version of something we'd listened to for years in Red Book set my dogs into a warning-bark frenzy as they raced into the family room looking for whoever it was who was speaking (between-song chatter on a live album). In my experience, the "average Joe" (or above-average dog) does notice the difference, although without being able to articulate WHAT is different.

ChrisS's picture

...Is the one you do with your own system in your own listening environment with your own ears.

AJ's picture

So Meyer & Moran, with seemingly no pecuniary interests and no "skin in the game," so to speak, went on to produce an extensive but highly blasphemous test that eight years later continues to draw the wrath of the believers.
I still wonder why those *with* strong pecuniary interests in "Hi-Rez," who are evidently self-trained and could settle this once and for all, have yet to produce a shred of scientifically acceptable (ie, controlled-test) evidence in support.
Hmmm...

p.s. Kunchur, umm, no..and Stuarts more recent AES paper, well, no..

John Atkinson's picture
AJ wrote:
I wonder why still, those *with* strong pecuniary interests in "Hi Rez", who are evidently self trained and could settle this once and for all, have yet to produce a shred of scientifically acceptable (aka controlled test) evidence in support?

Did you not read my essay? That is exactly what Dr. Reiss's meta-analysis of a large proportion of the published blind tests showed: that the difference between CD-quality audio and higher-resolution versions could be identified under blind test conditions. And you call others "believers"!

John Atkinson
Editor, Stereophile

AJ's picture

John,

We have no info on the 80 papers or how Dr Reiss cherry picked the 20, other than "sufficient detail/data"...[flame deleted by John Atkinson]
Let's not pop the champagne quite yet.

And when can we expect to see an actual HiRez vs. human-rez test like M&M's, with commercially available music, using self-trained-over-time audiophiles who claim to hear the "improvement" of higher-"resolution" 2-channel stereo constructs on sufficiently "revealing" systems?

John Atkinson's picture
AJ wrote:
We have no info on the 80 papers or how Dr Reiss cherry picked the 20, other than "sufficient detail/data"...

No cherries were picked. Dr. Reiss presented full details at the Audio Engineering Society workshop on the tests he was discussing and why he discarded more than half of them before performing the meta-analysis. Some failed to restrict the variables to just one: one published paper, for example, failed to account for level differences between the files, so its results were meaningless. Others, like Meyer-Moran, had statistical anomalies that suggested the data were not sufficiently complete to be included.

An audio recording of the workshop is available for $18. Go to www.mobiltape.com/conference/Audio-Engineering-Society-139th-Convention and scroll down the page to 15AES-W20 - "Perceptual Evaluation of High Resolution Audio."

And when you write in another posting:

AJ wrote:
I do find it fascinating how audiophiles utterly reject any form of controlled/blind tests...

Note that my own rejection of blind testing as commonly practiced is due to major procedural problems. As I write in the essay above that you don't appear to have read thoroughly, "I have been involved in well over 100 such tests, as listener, proctor, or organizer. My opinion on their efficacy and how difficult it is to get valid results and not false negatives—ie, reporting that no difference could be heard when a small but real audible difference exists—has been formed as the result of that experience."

The title of the essay also reflects that experience: that the demands for blind tests are very often made by those who have no experience of such tests. Such people thus have no idea of how difficult it is to design and perform a valid test that produces meaningful results.

John Atkinson
Editor, Stereophile

AJ's picture

...but I've skipped the middleman and contacted Dr. Reiss directly.
Not sure where you're getting that from, but it appears M&M's test *will* be included. Looking forward to the paper.

I'm still confused over the official position of Stereophile regarding the relevance of blind audio tests. They are no longer to be dismissed if "valid" and "meaningful results" are obtained?
That sounds like progress, similar to what almost every major orchestra now uses.
Even if 99% of the readership and Lavorgna et al might disagree vehemently. ;-)
Happy New Year

John Atkinson's picture
AJ wrote:
John Atkinson wrote:
Others, like Meyer-Moran, had statistical anomalies that suggested that the data were not sufficiently complete to be included.

Not sure where you're getting that from...

This was what I was told in my discussion of his presentation with Dr. Reiss after the workshop last October.

AJ wrote:
I'm still confused over the official position of Stereophile regarding the relevance of blind audio tests. They are no longer to be dismissed if "valid" and "meaningful results" are obtained?

My position has not changed over the almost 40 years I have been taking part in blind testing. I am not the one confused here.

John Atkinson
Editor, Stereophile

AJ's picture

flame deleted by John Atkinson

krabapple's picture

I eagerly await Dr. Reiss's peer-reviewed publication of his meta-analysis.

Meanwhile I remain amazed and amused whenever this sort of testimony is trotted out by someone who swears by *sighted* 'testing':
"Note that my own rejection of blind testing as commonly practiced is due to major procedural problems. "

As if sighted evaluation of audio doesn't have 'major procedural problems' of its own...so major, in fact, that its use would merit certain rejection, by any reputable scientific journal, of any paper whose authors were incompetent enough to use it as a 'method'.

ChrisS's picture

Then, please, cite any "audio review journal" that uses blind testing to review audio equipment.

krabapple's picture

Congratulations, you've identified precisely why the 'reviews' published in audio magazines should be taken at best with a boulder of salt, and at worst disregarded completely.

ChrisS's picture

...Audio Industry seems to have gotten along quite well all these years without anyone having to do blind testing.

Who do you know who shops for any consumer product, let alone audio equipment, by doing blind testing?

A listing of...
"John heard a difference between power cord #3 and power cord #15.
John did not hear a difference between interconnect #122 and interconnect #576, etc..."
is of limited use and of limited interest to a very few people.

So...no one does blind testing.

krabapple's picture

Odd, then, that it's the only sort of data of interest to science, eh? You know, when people want to know the truth.

You are mistaken if you think reports of preference -- #2 sounds better than #3 -- are excluded by blind testing. There is blind testing for difference, and there is blind testing for preference. The actual methods are somewhat different, but in both cases, the key is that the listener is basing his responses *only* on the sound he hears, not the appearance, price, brand, etc.

You find that a radical idea? You can bet high end 'power cord' makers would find it scarily so.

ChrisS's picture

...is totally subjective and the test results are of value only to the people being tested.

Anyone being tested, other than myself, will have totally different ears, brain, test/sound equipment, listening environment, music preference, etc. and so the test results will have no relevance to me.

Coke or Pepsi? Doesn't matter what anyone else prefers, I prefer Coca-Cola anytime!

Reread this article and John's other articles about the rigors and proper use of blind testing.

Science and Truth?

Were people told the truth about tobacco products and cancer? How about Oxycodone?

"Odd, then, that it's the only sort of data of interest to science, eh? You know, when people want to know the truth."

Odd? Really? The REAL truth is often tragic, and in these cases, criminal.

I don't think many people have died after buying an "expensive" power cord.

AJ's picture
Quote:

Then, please, cite any "audio review journal" that uses blind testing to review audio equipment.

http://www.livescience.com/44651-new-violins-beat-stradivarius.html

https://www.princeton.edu/pr/pwb/01/0212/7b.shtml

etc, etc, etc.

John Atkinson's picture
AJ wrote:
ChrisS wrote:
Then, please, cite any "audio review journal" that uses blind testing to review audio equipment.

http://www.livescience.com/44651-new-violins-beat-stradivarius.html

https://www.princeton.edu/pr/pwb/01/0212/7b.shtml

Neither of these links refer to "audio review journals" and reviews of audio equipment.

John Atkinson
Editor, Stereophile

AJ's picture

Audio listening tests that yielded totally different results when expectation, delusion, chauvinistic bigotry, etc. were removed.
Some crave otherwise of course.

ChrisS's picture

...there's gender bias.

We already know there are master crafts"men" out there, who make superb instruments.

We don't need blind tests to tell us (again...).

AJ's picture
Quote:

We don't need blind tests

Many audiophiles don't, but orchestras, etc. who want choices based on sound, free from delusion, bias, sexism, bigotry, etc. do.

That's precisely why the audio test links above have completely different outcomes blind. Those features are to be avoided...by some. Not all of course.

ChrisS's picture

...is a social problem that needs to be dealt with on a much broader scale and at a deeper level. Blind auditioning is only one way of circumventing biased hiring practices.

At issue here is blind testing of audio equipment. Basically, no one in the industry does it.

John Atkinson's picture
I have to ask why you subscribe to Stereophile if you place no trust in what we write?

John Atkinson
Editor, Stereophile

AJ's picture

There is nothing to "trust" with subjectivity.

ChrisS's picture

Most people get along by listening for themselves.

If you can't trust yourself, then there's not much left to do...is there?

If you don't trust your own hearing, then there's not much that others, like the writers of Stereophile or even audiologists, can do for you.

John Atkinson's picture
AJ wrote:
There is nothing to "trust" with subjectivity.

So again I ask the question you are avoiding: if that's how you feel, why do you subscribe to Stereophile? Why do you hang out on this website?

John Atkinson
Editor, Stereophile

AJ's picture
Quote:

So again I ask the question you are avoiding: if that's how you feel, why do you subscribe to Stereophile? Why do you hang out on this website?

Where is it stated all must "trust" the subjective opinions of writers, as a rule of posting?
That no discussion of articles is allowed with the "simple," to use your description of those you despise?
I certainly had no reason to trust your claim the M&M test would be excluded from the paper, given what Dr Reiss wrote me directly.
You prefer only uncritical, zero questions posts?

John Atkinson's picture
AJ wrote:
John Atkinson wrote:
So again I ask the question you are avoiding: if that's how you feel, why do you subscribe to Stereophile? Why do you hang out on this website?

Where is it stated all must "trust" the subjective opinions of writers, as a rule of posting?

It isn't, of course. But given your very public rejection of what we publish, why are you here at all?

AJ wrote:
That no discussion of articles are allowed with the "simple", to use you your description of those you despise?

Please do not put words in my mouth. I neither despise nor applaud you. I am simply wondering why you read Stereophile and visit this website when you have so little interest in what we have to say.

AJ wrote:
I certainly had no reason to trust your claim the M&M test would be excluded from the paper, given what Dr Reiss wrote me directly.

You didn't have to trust me. You simply had to purchase the recording of the session or even look at the slide of Dr. Reiss's that he presented at the workshop on the second page of this reprint, explaining why he felt there was an issue with the Meyer-Moran data. Note that Dr. Reiss writes "even if we believe their data, it doesn't appear random."

But if you are calling me dishonest, have a care: you are a guest on this site; please behave like one.

So for the third time, please answer my question if you wish your postings to be seen as something other than mere trolling.

John Atkinson
Editor, Stereophile

AJ's picture
Quote:

It isn't, of course. But given your very public rejection of what we publish, why are you here at all?

[Flame deleted by John Atkinson]
I'm simply here discussing *This article*, as it pertains to a particular AES presentation, of which I am a member and also have interest in. I've posted on AES comment sections as well, pertaining to "Hi Rez".

Quote:

Please do not put words in my mouth. I neither despise nor applaud you.

You weren't referring to "me", but you used "simple" as a term of endearment to describe those who hold opposite views?

Quote:

I am simply wondering why you read Stereophile and visit this website when you have so little interest in what we have to say.

John, my post count here outside this thread is near non-existent, no statistical mining required. That should reveal my actual "level of interest".

Quote:

So for the third time, please answer my question if you wish your postings to be seen as something other than mere trolling.

I have answered. My interest is in *the subject*, [flame deleted by John Atkinson]

John Atkinson's picture
AJ, I have deleted text of yours that appeared to be nothing more than an attempt to pick a fight. As I warned you earlier, you are a guest on this site and are expected to behave like one. Please address the subject, not individual posters or myself.

Regarding the title of this essay, I was referring to the fact that many of the strongest advocates of blind testing have no experience of such testing. Their lack of direct knowledge leads them to believe that rigorous blind testing is much easier than it really is.

John Atkinson
Editor, Stereophile

ChrisS's picture

Every single instrument and every single musician will sound different from another to any and all discerning ears.

What these two citations show can be summarized by answering one question: "Do you like what you hear?"

Totally subjective and really doesn't matter whether the "auditioning" in these two cases is done sighted or not.

It's the same Coke/Pepsi challenge.

John Atkinson's picture
krabapple wrote:
I eagerly await Dr. Reiss's peer-reviewed publication of his meta-analysis.

On a separate page, I have added the image of one of Dr. Reiss's slides he presented at the AES workshop showing why he found the Meyer-Moran results statistically anomalous. See www.stereophile.com/content/simple-everything-appears-simple-meyer-moran-test-statistics.

John Atkinson
Editor, Stereophile

krabapple's picture

Neat. I eagerly await Dr. Reiss's peer-reviewed publication of his meta-analysis.

joshreiss's picture

I'm trying to include every study possible. As examples of what couldn't be included:
* Some didn't actually have a new experiment, just discussed ideas for one.
* A few simply reported that no one could detect a difference but gave no data.
* A few had such unusual methodologies that they couldn't be compared against others, like one which found that the precision in timing estimation for transient signals is improved when the experiment is done with 96kHz audio compared to 48kHz.

In short, I've used every experiment that I can, even those that I thought were poorly designed. And then discussed and analysed the effect of bias or poor design in each questionable experiment.

And now I'd better get back to writing the paper. :)

AJ's picture

I do find it fascinating how audiophiles utterly reject any form of controlled/blind tests...except in the rare instances where they might tell them what they want to believe :-)

ChrisS's picture

"...controlled/blind tests...where they might tell them what they want to believe..."

Appears to happen in the pharmaceutical industry. Please cite where this is the case in audio reviews.

Jazzlistener's picture

So those who hold a different opinion in relation to blind testing are soft in the head according to Mr. Atkinson. And as far as average Joe music enthusiasts go, they can get bent because Stereophile isn't interested in them anyway. No, Stereophile is like Wine Spectator we're told, not for your garden variety hobbyist. Well, so much for all the talk about courting the younger generation of audio enthusiasts. Way to set the hobby back 50 years. I have been an on-again off-again reader of Stereophile for more than a decade. Looks like an extended hiatus is in order for me. Thoroughly disgusted, and frankly, bored.

John Atkinson's picture
Jazzlistener wrote:
So those who hold a different opinion in relation to blind testing are soft in the head according to Mr. Atkinson.

You seem to have misunderstood what I wrote. Which was that those who call for blind testing very often have no experience of blind testing, thus are ignorant of the issues involved in designing such a test when the sound quality differences may be real but are small.

John Atkinson
Editor, Stereophile

David Harper's picture

It occurs to me that my statement that humans cannot hear any difference between CD and hi-res was simpleminded. I suspect that with the most highly resolving state-of-the-art equipment, with electrostatic speakers, many people might perceive the difference. I guess my point then would be that since 99.99% of us will never own that system, the difference is irrelevant.

davidrmoran's picture

I have been wondering if Reiss, and the good Atkinson, know what Reiss's charged terms actually mean. (I am also not sure I, and maybe Meyer, have the energy to respond to that paper. Its anecdote and postings inclusion made me think that maybe Lancet could consult Reiss and start including anti-vax online screeds as part of their own meta-analyses going forward ....) But for now, if a fair coin comes up heads 8x in a row, is that truly an anomaly --- 'deviating from what is standard, normal, or expected'? Really? Seriously? (Hint: no.)
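[Editorial aside: the arithmetic behind the coin-flip point above is straightforward. The probability of a fair coin landing heads eight times in a row is (1/2)^8 = 1/256, roughly 0.4%, so across many independent runs such streaks are expected to turn up. A minimal Python sketch, where the number of runs is an illustrative assumption:]

```python
# Chance of 8 consecutive heads from a fair coin, and the expected
# number of such streaks across many independent 8-flip runs.
p_streak = 0.5 ** 8        # 1/256, about 0.39%

runs = 1000                # hypothetical number of independent 8-flip runs
expected = runs * p_streak # roughly 3.9 streaks expected in 1000 runs

print(f"P(8 heads) = {p_streak:.5f}, expected streaks in {runs} runs = {expected:.2f}")
```

The point being that a low-probability outcome in a single short run is not, by itself, "deviating from what is expected" once enough runs are considered.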

Our data are our data. Repeatedly saying something is 'wrong' with them, as Reiss, and before him Dranove, charged, is effectively to say we lied, also 'I don't believe your results.' And that would be a different discussion, wouldn't it?

All our data enumerate what actually happened.

Moreover, some posters here and elsewhere still have not actually studied the paper and the elaboration and ask fool things like how many subjects and assert the gear was not resolving enough. Yawn.

You might think instead that someone would publish their own test showing the dramatic difference hi-rez makes. But in almost a decade now, much less longer, nooooo.

So come on, Atkinson, the commercially interested Stuart, Reiss, all others, do the test, train the listeners, force the choice, don't use crap filters as in the Stuart preprint, and just do it. It ain't that hard. We did it, perfectly or imperfectly. Have at it. Conclusively blind-show the superiority of hi-rez.

Ah, I thought so. Bwok bwok bwok.
