CD Tweaks & Listening Tests
An analog recording and playback process will preserve the natural, pseudo-fractal character of the sound at the expense of an increase in the level of the already present noise, which is itself fractal. By contrast, a digital recording and playback process is fundamentally unnatural in that it fails to preserve the pseudo-fractal nature of real sounds below a certain threshold. Instead, that fine detail in the original signal fails to be recorded at all. In addition, the noise that the quantizing process (if undithered) adds is also unnatural in that it is directly related to the input waveform rather than having a random nature; it is therefore a distortion.
There is also the effect of datastream jitter in the player or decoder to be considered. Jitter introduces the equivalent of quantizing distortion, but is said to be rendered less harmful by the various CD tweaks that are being promoted. In my postscript to Robert Harley's article on CD tweaks last May (Vol.13 No.5), I described the results of experiments I had done to examine the nature of the noise floor of the final analog signal when a CD had been treated by having its edge coated with CD Stoplight.
Although proponents of the "Bits is bits" school of audio engineering insist that nothing can be done to improve the quality of the data signal retrieved from the CD (remember that though this signal represents digital data, it is in fact an analog signal), I found that, as well as repeatable changes in the shape of the noise floor adjacent to a recorded tone, CD Stoplight appeared to lower the overall noise floor slightly. As these effects are at the limit of hearing, perhaps they help the recovered analog signal to be reproduced with more of the original, pseudo-fractal crinkliness. Remember, too, that those who have reported on the subjective improvement wrought by these tweaks say that the sound becomes more realistic, more analog-like.
It seemed appropriate, therefore, to take advantage of the April 1990 High End Hi-Fi Show in New York to investigate the overall audibility of these CD tweaks in a traditional double-blind test, recruiting showgoers as listeners.
The Test Procedure
Stereophile's Robert Harley argued most persuasively in the July 1990 issue, in "As We See It," that the nature of listening under double-blind conditions is sufficiently different from the natural state of listening to music that test results gained under those conditions are, at worst, meaningless, and, at best, of limited transportability (footnote 1). Nevertheless, the differences that I had heard in my own system due to the use of CD Stoplight, and to the use of a better CD transport in particular, were large enough that I felt that it might be possible for them still to be identified under what Stereophile's critics are pleased to call "rigorous" conditions. I also felt it a worthwhile opportunity for a few hundred of the magazine's readers to have the experience of taking part in a well-organized listening test.
The facilities to conduct these tests offered by Manhattan's Penta Hotel on Seventh Avenue were promising. We would be able to carry out the listening tests in the foyer of the hotel's Gold Ballroom, a large, well-proportioned rectangular room well away from the rest of the show, thus minimizing the leak-through of noise from competing demonstrations. This was something that had been a major problem at the Dunfey San Mateo Hotel, the site of the amplifier listening tests reported in the July 1989 issue of Stereophile. There would be room for approximately 50 seated listeners to take part in each of the eight test sessions (though, as it turned out, several sessions also had a number of standing listeners, giving a total of 461 listeners and therefore 3227 separate tests).
My first task was to decide which CD tweaks should be auditioned for the tests. Ultimately, I decided that the listeners would be given the opportunity to compare the sound of a plain-Jane, untreated CD played in a good but relatively inexpensive transport, the Philips CD880, with that of a CD coated with Armor All, the edges of which had been painted with CD Stoplight, played in the $4000 Esoteric P-2 transport. The digital outputs of the two transports would feed two of the data inputs of a Proceed PDP D/A processor; the input switching of the Proceed would allow the sound produced by its DACs when fed one digital datastream to be compared with the sound output by its DACs when fed a supposedly different datastream. The analog circuitry and playback level would therefore be identical for both test conditions; the only changes, if any, would be in the digital domain. In addition, the digital output of the Philips was connected to the Proceed with a 1m length of standard 75-ohm RF coaxial cable; that of the Esoteric with the highly recommended Music And Sound interconnect.
It could be (and I am sure will be) fairly pointed out that even if differences were heard, we would have no idea of the exact cause. Audible differences could be ascribed to the effect of the Armor All, the CD Stoplight, the Esoteric transport, the different digital interconnects, or to any combination of these factors. It would therefore not be a truly scientific experiment, in which just one variable at a time would be changed. This didn't worry me, however. I was not interested in determining which, if any, of these treatments would lead to an audible difference; only in seeing whether the application of these tweaks could do so. As all of them have been ridiculed in the hi-fi slicks as not possibly having any audible effect, any difference heard would support the statements made in Stereophile and other high-end publications: that such tweaks as these can change, even improve, the sound of CDs.
As in the 1989 amplifier tests, the loudspeakers used would be the Stereophile-owned pair of B&W Matrix 801s, sitting on Arcici wooden stands. These were driven in bi-wired fashion via two 25' sets of AudioQuest Clear cable by a Krell KSA-250 power amplifier. The amplifier's measured maximum output level during the tests was around 20V RMS, well within its capabilities. Plugged into the Krell's inputs was a pair of Electronic Visionary Systems Ultimate Attenuators, these comprising high-precision switched-resistor networks to control level. The Krell in turn was driven by the MaughamBox from Denver dealer Listen-Up, this supplying the needed low-frequency equalization to render the B&Ws flat to 20Hz. The equalizer box was connected directly to the balanced outputs of the Proceed. All the analog interconnects, both balanced and unbalanced, were 1m or 2m lengths of AudioQuest Lapis.
Table 1: Music Program (in order of presentation)

| # | Work | Composer | Performer(s) | CD |
|---|------|----------|--------------|----|
| 1 | Symphony 5 (Movement 1) | Mahler | VPO/Bernstein | DG 423 608-2 |
| 2 | "It was a Lover" | Morley | Julianne Baird | Dorian DOR-90126 |
| 3 | Karelia Suite (Mvt 3) | Sibelius | Ashkenazy | London 414534-2 |
| 4 | "Diamonds on the Soles..." | Paul Simon | Paul Simon | Warner 9 25447-2 |
| 5 | Concerto in d (BWV596) | JS Bach | James Johnson | Stereophile 002-2 |
| 6 | Scherzo in b-flat (Op.31) | Chopin | Anna-Maria Stanczyk | Stereophile 002-2 |
| 7 | Praeludium | Järnefelt | Delaware SO | Stereophile 002-2 |
Choosing the music program took some thought. Not only would it have to be capable of revealing any differences, it would also have to be sufficiently appealing to retain its musical value after being played some 32 times during the weekend's testing. Table 1 shows the seven pieces I finally decided upon. Large-scale orchestral music was represented by the opening of Mahler's Symphony 5 and the final movement of Sibelius's Karelia Suite; smaller-scale orchestral music, captured in a purist manner, by J. Gordon Holt's recording of the Järnefelt Praeludium. Voice was represented by the Julianne Baird and Paul Simon recordings, the latter featuring the distinctive unaccompanied sounds of Ladysmith Black Mambazo, while for solo instrumental, there was my piano recording and Peter Mitchell's organ recording, both from the Stereophile Test CD. As each pair of discs had been obtained at the same time, it was to be hoped that any intrinsic sonic differences were minimal. I treated the playing surface of one of each pair with Armor All (footnote 2) and its outside edge with CD Stoplight.
To avoid any suggestion that I would or could bias either the tests or the results, my collaborator from the 1989 amplifier listening tests, Will Hammond (footnote 3), would operate all the controls and switching so that I would not know what the listening panel was listening to. Will would also analyze all the results. Each test therefore took the following form:
The seven pieces of music, each lasting between 60 and 90 seconds, would be played four times. The first two times would be for learning purposes, and were identified to the listeners; ie, Will would tell them what they were listening to, either the untreated CD in the Philips transport or the tweaked CD in the Esoteric transport. The next two presentations of the music would be the actual blind test. I would place the CDs in both transports and press Play simultaneously. According to a random sequence worked out earlier, Will would select either tweaked or non-tweaked to be the first presentation, then either to be the second, giving four possible combinations. (He actually switched the Proceed's input selector switch a number of times between every presentation so that ultra-keen-eared participants would not be able to score correctly by "counting the clicks.") All the listeners had to do was to decide on the basis of the sounds they heard whether the fourth presentation was the same as the third or different. (The actual question was "Was A the same as B, Yes or No?") They were also asked to mark on the scoresheet where they were sitting, according to a printed grid, and to indicate their sex and age, Will wanting to examine the data on these bases as well as in aggregate.
Note that the listeners weren't asked to identify either presentation as being tweaked, only whether they heard a difference or not. By thus reducing the intellectual work to be done to a minimum, I hoped that listeners would be more receptive to the aural differences. Needless to say, many listeners did attempt identification, which I have always found to be a confusing factor in blind tests.
A note on the random sequence of presentations is in order. A computer using a random-number generator program decided on whether each of the seven presentations for each of the eight sessions would be Same or Different, whether the Same presentations would be both tweaked (B-B) or both non-tweaked (A-A), and whether the Different presentations would start with the tweaked first (B-A) or the non-tweaked (A-B). Unlike last year's amplifier tests, where we had inadvertently presented more Different than Same presentations, the 56 presentations included 14 each of A-A, B-B, A-B, and B-A. I decided on one slight departure from a truly random order: There were no sessions where there were more than three Sames or Differents in a row.
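The constrained shuffle described above is easy to sketch in code. The following Python fragment is my own illustration, not the program actually used in 1990; the function names and the rejection-sampling approach are assumptions, but it satisfies the two stated constraints: 14 occurrences of each pairing across the 56 presentations, and no session containing a run of more than three Same or Different verdicts.

```python
import random

def max_run(session):
    """Longest run of consecutive Same (A-A/B-B) or Different (A-B/B-A) verdicts."""
    verdicts = ["Same" if p in ("A-A", "B-B") else "Different" for p in session]
    longest = run = 1
    for prev, cur in zip(verdicts, verdicts[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest

def generate_schedule(seed=None):
    """Build 8 sessions of 7 trials each: 14 each of A-A, B-B, A-B, B-A,
    reshuffling until no session has more than three Sames or Differents
    in a row (rejection sampling)."""
    rng = random.Random(seed)
    pairs = ["A-A", "B-B", "A-B", "B-A"] * 14  # 56 presentations in all
    while True:
        rng.shuffle(pairs)
        sessions = [pairs[i * 7:(i + 1) * 7] for i in range(8)]
        if all(max_run(s) <= 3 for s in sessions):
            return sessions

sessions = generate_schedule(seed=1990)
```

Rejection sampling is crude but adequate here: a balanced random deal only rarely produces a run longer than three within a seven-trial session, so few reshuffles are needed.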
Although four or more identical presentations in a row is statistically quite likely out of a total of 56 (try tossing coins 56 times and note how often you get four or more heads or tails in a row), my experience of listening tests indicates that listeners get disturbed very easily when presented with more than three such occurrences. They disbelieve the evidence of their own ears and start to guess, which is counterproductive, to say the least. If all that I was interested in was the results of guessing, I could run the test from the comfort of my home in Santa Fe, all the participants phoning in their scores from the comfort of their homes.
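The coin-tossing point above can be checked with a short simulation. This is my own illustrative sketch (the function names are mine): a Monte Carlo estimate of how often 56 fair coin tosses contain at least one run of four or more identical results. The estimate comes out very close to certainty, supporting the claim that such runs are statistically quite likely.

```python
import random

def has_run_of(n, flips):
    """True if the flip sequence contains a run of n or more identical results."""
    run = 1
    for prev, cur in zip(flips, flips[1:]):
        run = run + 1 if cur == prev else 1
        if run >= n:
            return True
    return n <= 1

def estimate_run_probability(trials=20_000, tosses=56, run_len=4, seed=42):
    """Monte Carlo estimate of the chance that `tosses` fair coin flips
    contain at least one run of `run_len` or more heads or tails."""
    rng = random.Random(seed)
    hits = sum(
        has_run_of(run_len, [rng.random() < 0.5 for _ in range(tosses)])
        for _ in range(trials)
    )
    return hits / trials

p = estimate_run_probability()
```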
The test procedure is obviously single-blind in that, although the listeners (and I) couldn't know what they were listening to, Will did. This doesn't affect the validity of the results, however, Will taking great care not to give any visual clues.
It is also important to note that the listeners were not told that they would be subjected to seven tests. In my experience, there is a "last test" syndrome: whether due to relief or whatever, I have found people's scoring to randomize on the last test in a series. Though the scoresheet had eight tests marked on it, we therefore stopped the testing after the seven pieces of music. This was much to everyone's relief. To maintain the concentration required for this kind of test for the 45 minutes that it took is a terrific strain, as I am sure all those who took part will testify.
I would therefore like to express my belated thanks to all of you, as well as to Madrigal, AudioQuest, Krell, Music And Sound, Philips, Esoteric, Electronic Visionary Systems, and Listen Up, who loaned Stereophile the equipment for the tests, and to Richard Lehnert, Robert Harley, Allen St. John, Beth Jacques, and all the other Stereophile staff who distributed tickets and helped move people in and out of the listening room.
Footnote 1: There is also an important difference between double-blind testing as applied to wine and medical research (two areas examined in this month's "Letters" column) and audio. In the first two, the test directly examines the response of the test subjects to the stimulus; with audio, the test can only examine the responses of the subjects in an indirect manner, via their reactions to something which is itself multidimensional, unmeasurable, and by definition intended to elicit a complex subjective reaction: music. I do not find it coincidence that the most successful double-blind tests which I have organized or in which I have participated have used a non-musical, artificial test signal that does not vary with time. Note also Ken Pohlmann's CD player listening tests in the October 1990 issue of Stereo Review, where he found that the listeners were much better at identifying differences blind with a test tone than with music. I suggest that it is not the masking effect of music that obscures differences, as is commonly held, but the fact that music itself changes sufficiently with time (isn't that its definition?) to render quick ABX tests much less sensitive than with unmusical tones.—John Atkinson
Footnote 2: At the time of writing, it is almost four months since I applied the Armor All to these discs. They so far don't appear to have a higher incidence of errors than their untreated siblings. I'll keep you posted if I notice any changes.—John Atkinson
Footnote 3: For many years the co-producer and co-presenter of Peter Sutheim's "In-Fidelity" program in Los Angeles (Radio KPFK, 90.7 FM, Sundays at noon). Will's professional career is in biomedical research, with a heavy involvement in controlled clinical trials.—John Atkinson