CD Tweaks & Listening Tests

In "Music and Fractals" in the November 1990 issue, I discuss how digital audio's quantization of amplitude information in what was originally a continuous waveform represents a fundamental difference between analog and digital representations of music. In a letter published in the English magazine Hi-Fi Review in January 1990, John Lambshead conjectured that naturally originating sounds were pseudo-fractal in character; that is, their waveforms have a wealth of fine detail, and that detail itself has an even finer-structured wealth of fine detail, and so on, until the crinkliness of the waveform is finally enveloped in the analog noise that accompanies every sound we hear.

An analog recording and playback process will preserve the natural, pseudo-fractal character of the sound at the expense of an increase in the level of the already present noise, which is itself fractal. By contrast, a digital recording and playback process is fundamentally unnatural in that it fails to preserve the pseudo-fractal nature of real sounds below a certain threshold. Instead, that fine detail in the original signal fails to be recorded at all. In addition, the noise that the quantizing process (if undithered) adds is also unnatural in that it is directly related to the input waveform rather than having a random nature; it is therefore a distortion.
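The claim that undithered quantization simply discards detail below a threshold can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not something taken from the tests described here: the quantizer step is 1.0 (one LSB), the test signal is a sine peaking at 0.4 LSB, and the dither is triangular-PDF noise formed from two uniform random values.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer step (1.0 = one LSB)."""
    return math.floor(x + 0.5)

def tpdf_dither(rng):
    """Triangular-PDF dither spanning +/-1 LSB: sum of two uniform values."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

def undithered(samples):
    """Quantize each sample with no dither added."""
    return [quantize(s) for s in samples]

def dithered_mean(sample, rng, trials=20000):
    """Average many dithered quantizations of one input value."""
    total = sum(quantize(sample + tpdf_dither(rng)) for _ in range(trials))
    return total / trials

# Hypothetical test signal: one cycle of a sine peaking at 0.4 LSB.
signal = [0.4 * math.sin(2 * math.pi * n / 32) for n in range(32)]

rng = random.Random(1990)
plain = undithered(signal)
peak_estimate = dithered_mean(signal[8], rng)  # signal[8] is the +0.4 LSB peak

print(set(plain))               # every undithered sample quantizes to 0
print(round(peak_estimate, 2))  # dithered average lands near the input value
```

Without dither, the whole waveform quantizes to zero and is lost. With dither, each individual sample is noisier, but the average of the quantized values tracks the sub-LSB input, so the detail survives as signal plus random noise rather than vanishing.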

There is also the effect of datastream jitter in the player or decoder to be considered. Jitter introduces the equivalent of quantizing distortion, but is said to be rendered less harmful by the various CD tweaks that are being promoted. In my postscript to Robert Harley's article on CD tweaks last May (Vol.13 No.5), I described the results of experiments I had done to examine the nature of the noise floor of the final analog signal when a CD had been treated by having its edge coated with CD Stoplight.

Although proponents of the "Bits is bits" school of audio engineering insist that nothing can be done to improve the quality of the data signal retrieved from the CD—remember that though this signal represents digital data, it is in fact an analog signal—I found that as well as repeatable changes in the shape of the noise floor adjacent to a recorded tone, CD Stoplight appeared to lower the overall noise floor slightly. As these effects are at the limit of hearing, perhaps they help the recovered analog signal to be reproduced with more of the original, pseudo-fractal crinkliness. Remember, too, that those who have reported on the subjective improvement wrought by these tweaks say that the sound becomes more realistic, more analog-like.

It seemed appropriate, therefore, to take advantage of the April 1990 High End Hi-Fi Show in New York to investigate the overall audibility of these CD tweaks in a traditional double-blind test, recruiting showgoers as listeners.

The Test Procedure
Stereophile's Robert Harley argued most persuasively in the July 1990 issue, in "As We See It," that the nature of listening under double-blind conditions is sufficiently different from the natural state of listening to music that test results gained under those conditions are, at worst, meaningless, and, at best, of limited transportability (footnote 1). Nevertheless, the differences that I had heard in my own system due to the use of CD Stoplight, and to the use of a better CD transport in particular, were large enough that I felt that it might be possible for them still to be identified under what Stereophile's critics are pleased to call "rigorous" conditions. I also felt it a worthwhile opportunity for a few hundred of the magazine's readers to have the experience of taking part in a well-organized listening test.

The facilities to conduct these tests offered by Manhattan's Penta Hotel on Seventh Avenue were promising. We would be able to carry out the listening tests in the foyer of the hotel's Gold Ballroom, a large, well-proportioned rectangular room well away from the rest of the show, thus minimizing the leakthrough of noise from competing demonstrations. This was something that had been a major problem at the Dunfey San Mateo Hotel, the site of the amplifier listening tests reported in the July 1989 issue of Stereophile. There would be room for approximately 50 seated listeners to take part in each of the eight test sessions (though, as it turned out, several sessions also had a number of standing listeners, giving a total of 461 listeners and therefore 3227 separate tests).

My first task was to decide which CD tweaks should be auditioned in the tests. Ultimately, I decided that the listeners would be given the opportunity to compare the sound of a plain-Jane, untreated CD played in a good but relatively inexpensive transport, the Philips CD880, with that of a CD coated with Armor All, the edges of which had been painted with CD Stoplight, played in the $4000 Esoteric P-2 transport. The digital outputs of the two transports would feed two of the data inputs of a Proceed PDP D/A processor; the input switching of the Proceed would allow the sound produced by its DACs when fed one digital datastream to be compared with the sound output by its DACs when fed a supposedly different datastream. The analog circuitry and playback level would therefore be identical for both test conditions; the only changes—if any—would be in the digital domain. In addition, the digital output of the Philips was connected to the Proceed with a 1m length of standard 75-ohm RF coaxial cable; that of the Esoteric with the highly recommended Music And Sound interconnect.

It could be (and I am sure will be) fairly pointed out that even if differences were heard, we would have no idea of the exact cause. Audible differences could be ascribed to the effect of the Armor All, the CD Stoplight, the Esoteric transport, the different digital interconnects, or to any combination of these factors. It would therefore not be a truly scientific experiment, in which just one variable at a time would be changed. This didn't worry me, however. I was not interested in determining which, if any, of these treatments would lead to an audible difference; only to see if the application of these tweaks could do so. As all of them have been ridiculed in the hi-fi slicks as not possibly having any audible effect, any difference heard would support the statements made in Stereophile and other high-end publications: that such tweaks as these can change, even improve, the sound of CDs.

As in the 1989 amplifier tests, the loudspeakers used would be the Stereophile-owned pair of B&W Matrix 801s, sitting on Arcici wooden stands. These were driven in bi-wired fashion via two 25' sets of AudioQuest Clear cable by a Krell KSA-250 power amplifier. The amplifier's measured maximum output level during the tests was around 20V RMS, well within its capabilities. Plugged into the Krell's inputs was a pair of Electronic Visionary Systems Ultimate Attenuators, these comprising high-precision switched-resistor networks to control level. The Krell in turn was driven by the MaughamBox from Denver dealer Listen-Up, this supplying the needed low-frequency equalization to render the B&Ws flat to 20Hz. The equalizer box was connected directly to the balanced outputs of the Proceed. All the analog interconnects, both balanced and unbalanced, were 1m or 2m lengths of AudioQuest Lapis.

Table 1: Music Program (in order of presentation)
No.  Song/Work                    Composer     Artist(s)             CD number
1    Symphony 5 (Movement 1)      Mahler       VPO/Bernstein         DG 423 608-2
2    "It was a Lover"             Morley       Julianne Baird        Dorian DOR-90126
3    Karelia Suite (Mvt 3)        Sibelius     Ashkenazy             London 414534-2
4    "Diamonds on the Soles..."   Paul Simon   Paul Simon            Warner 9 25447-2
5    Concerto in d (BWV596)       JS Bach      James Johnson         Stereophile 002-2
6    Scherzo in b-flat (Op.31)    Chopin       Anna-Maria Stanczyk   Stereophile 002-2
7    Praeludium                   Järnefelt    Delaware SO           Stereophile 002-2

Choosing the music program took some thought. Not only would it have to be capable of revealing any differences, it would also have to be sufficiently appealing to retain its musical value after being played some 32 times during the weekend's testing. Table 1 shows the seven pieces I finally decided upon. Large-scale orchestral music was represented by the opening of Mahler's Symphony 5 and the final movement of Sibelius's Karelia Suite; smaller-scale orchestral music, captured in a purist manner, by J. Gordon Holt's recording of the Järnefelt Praeludium. Voice was represented by the Julianne Baird and Paul Simon recordings, the latter featuring the distinctive unaccompanied sounds of Ladysmith Black Mambazo, while for solo instrumental, there was my piano recording and Peter Mitchell's organ recording, both from the Stereophile Test CD. As each pair of discs had been obtained at the same time, it was to be hoped that any intrinsic sonic differences were minimal. I treated the playing surface of one of each pair with Armor All (footnote 2) and its outside edge with CD Stoplight.

To avoid any suggestion that I would or could bias either the tests or the results, my collaborator from the 1989 amplifier listening tests, Will Hammond (footnote 3), would operate all the controls and switching so that I would not know what the listening panel was listening to. Will would also analyze all the results. Each test therefore took the following form:

The seven pieces of music, each lasting between 60 and 90 seconds, would be played four times. The first two times would be for learning purposes, and were identified to the listeners; ie, Will would tell them what they were listening to, either the untreated CD in the Philips transport or the tweaked CD in the Esoteric transport. The next two presentations of the music would be the actual blind test. I would place the CDs in both transports and press Play simultaneously. According to a random sequence worked out earlier, Will would select either tweaked or non-tweaked to be the first presentation, then either to be the second, giving four possible combinations. (He actually switched the Proceed's input selector switch a number of times between every presentation so that ultra-keen-eared participants would not be able to score correctly by "counting the clicks.") All the listeners had to do was to decide on the basis of the sounds they heard whether the fourth presentation was the same as the third or different. (The actual question was "Was A the same as B, Yes or No?") They were also asked to mark on the scoresheet where they were sitting, according to a printed grid, and to indicate their sex and age, Will wanting to examine the data on these bases as well as overall.

Note that the listeners weren't asked to identify either presentation as being tweaked, only whether they heard a difference or not. By thus reducing the intellectual work to be done to a minimum, I hoped that listeners would be more receptive to the aural differences. Needless to say, many listeners did attempt identification, which I have always found to be a confusing factor in blind tests.

A note on the random sequence of presentations is in order. A computer using a random-number generator program decided on whether each of the seven presentations for each of the eight sessions would be Same or Different, whether the Same presentations would be both tweaked (B-B) or both non-tweaked (A-A), and whether the Different presentations would start with the tweaked first (B-A) or the non-tweaked (A-B). Unlike last year's amplifier tests, where we had inadvertently presented more Different than Same presentations, the 56 presentations included 14 each of A-A, B-B, A-B, and B-A. I decided on one slight departure from a truly random order: There were no sessions where there were more than three Sames or Differents in a row.
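A schedule with these properties can be sketched in Python. The article does not say how the computer program enforced the constraints, so the approach here (shuffle a balanced deck, then reshuffle until every session passes) is an assumption, and all names and the seed are hypothetical. "A" stands for the untreated CD and "B" for the tweaked one, as in the text.

```python
import random

KINDS = ["A-A", "B-B", "A-B", "B-A"]  # A = untreated, B = tweaked
SESSIONS, PER_SESSION, MAX_RUN = 8, 7, 3

def is_same(pair):
    """True for a Same presentation (A-A or B-B)."""
    return pair in ("A-A", "B-B")

def longest_run(session):
    """Length of the longest streak of consecutive Sames or Differents."""
    best = run = 1
    for prev, cur in zip(session, session[1:]):
        run = run + 1 if is_same(prev) == is_same(cur) else 1
        best = max(best, run)
    return best

def make_schedule(rng):
    """Shuffle a balanced deck until every session respects MAX_RUN."""
    deck = KINDS * (SESSIONS * PER_SESSION // len(KINDS))  # 14 of each kind
    while True:
        rng.shuffle(deck)
        sessions = [deck[i:i + PER_SESSION]
                    for i in range(0, len(deck), PER_SESSION)]
        if all(longest_run(s) <= MAX_RUN for s in sessions):
            return sessions

schedule = make_schedule(random.Random(56))
for number, session in enumerate(schedule, start=1):
    print(number, " ".join(session))
```

Starting from a deck with exactly 14 of each pairing guarantees the balance; the reshuffle loop handles the streak limit. Rejecting whole shuffles is crude but keeps every acceptable schedule equally likely, which a hand-patched sequence would not.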

Although four or more identical presentations is statistically quite likely out of a total of 56—try tossing coins 56 times and note how often you get four or more heads or tails in a row—my experience of listening tests indicates that the listeners get disturbed very easily when presented with more than three such occurrences. They disbelieve the evidence of their own ears and start to guess, which is counterproductive, to say the least. If all that I was interested in was the results of guessing, I could run the test from the comfort of my home in Santa Fe, all the participants phoning in their scores from the comfort of their homes.
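For readers who would rather not toss coins, the chance of such a streak can be computed exactly. The following is a minimal Python sketch, assuming fair and independent tosses; the function name is hypothetical. It counts the sequences that never reach the given streak length and subtracts their share from 1.

```python
from fractions import Fraction

def prob_run_at_least(n_tosses, run_length):
    """Exact chance that n fair tosses contain a run of >= run_length."""
    # counts[k] = number of sequences so far whose current streak is k+1 long
    # and that have never reached run_length in a row.
    counts = [0] * (run_length - 1)
    counts[0] = 2  # first toss: heads or tails, streak of 1
    for _ in range(n_tosses - 1):
        nxt = [0] * (run_length - 1)
        nxt[0] = sum(counts)          # next toss differs: streak resets to 1
        for k in range(run_length - 2):
            nxt[k + 1] = counts[k]    # next toss matches: streak grows by 1
        counts = nxt
    safe = sum(counts)
    return 1 - Fraction(safe, 2 ** n_tosses)

print(float(prob_run_at_least(56, 4)))  # whole weekend of 56 presentations
print(prob_run_at_least(7, 4))          # a single seven-presentation session
```

For a single seven-presentation session the result is exactly 5/16, so even one session would show a streak of four or more nearly a third of the time if left purely to chance. Over 56 tosses the probability is far higher.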

The test procedure is obviously single-blind in that, although the listeners (and I) couldn't know what they were listening to, Will did. This doesn't affect the validity of the results, however, Will taking great care not to give any visual clues.

It is also important to note that the listeners were not told that they would be subjected to seven tests. In my experience, there is a "last test" syndrome: whether due to relief or whatever, I have found people's scoring to randomize on the last test in a series. Though the scoresheet had eight tests marked on it, we therefore stopped the testing after the seven pieces of music. This was much to everyone's relief. To maintain the concentration required for this kind of test for the 45 minutes that it took is a terrific strain, as I am sure all those who took part will testify.

I would therefore like to express my belated thanks to all of you, as well as to Madrigal, AudioQuest, Krell, Music And Sound, Philips, Esoteric, Electronic Visionary Systems, and Listen Up, who loaned Stereophile the equipment for the tests, and to Richard Lehnert, Robert Harley, Allen St. John, Beth Jacques, and all the other Stereophile staff who distributed tickets and helped move people in and out of the listening room.

Footnote 1: There is also an important difference between double-blind testing as applied to wine and medical research—two areas examined in this month's "Letters" column—and audio. In the first two, the test directly examines the response of the test subjects to the stimulus; with audio, the test can only examine the responses of the subjects in an indirect manner, via their reactions to something which is itself multidimensional, unmeasurable, and by definition intended to elicit a complex subjective reaction—music. I do not find it coincidence that the most successful double-blind tests which I have organized or in which I have participated have used a non-musical, artificial test signal that does not vary with time. Note also Ken Pohlmann's CD player listening tests in the October 1990 issue of Stereo Review, where he found that the listeners were much better at identifying differences blind with a test tone than with music. I suggest that it is not the masking effect of music that obscures differences, as is commonly held, but the fact that music itself changes sufficiently with time—isn't that its definition?—to render quick ABX tests much less sensitive than with unmusical tones.—John Atkinson

Footnote 2: At the time of writing, it is almost four months since I applied the Armor All to these discs. They so far don't appear to have a higher incidence of errors than their untreated siblings. I'll keep you posted if I notice any changes.—John Atkinson

Footnote 3: For many years the co-producer and co-presenter of Peter Sutheim's "In-Fidelity" program in Los Angeles (Radio KPFK, 90.7 FM, Sundays at noon). Will's professional career is in biomedical research, with a heavy involvement in controlled clinical trials.—John Atkinson

smargo's picture

my head around articles that were printed so long ago - especially when it comes to cd players and digital - I'd rather see articles that pertain to the hear and now or some record reviews that aren't in the magazine

Anton's picture

Time for a 'where are they now' follow up? (I know they still make it, I meant it in terms of current utility.)

Has improved technology negated the usefulness of this tweak?

Can newer measurement techniques find out what they did?

Smargo, this is still a current product. They are 'hear and now!'

la musique's picture

Silly me when 20 or so years ago I bought the pen and did most of my CDs
They now look so ugly and to be very honest, the stuff never made any difference to the sound.
I tried black markers, different green shade, and no difference.
I do have a good CD player (Audiomeca Mephisto M2), and to be honest the big difference is in the recording of the CD and not what snake oil you can put on the plastic.

volvic's picture

Tried it at a store, did not buy it. Made no difference to sound and now that CD looks hideous. Thankfully was very skeptical with this tweak and didn't go full on with the few CD's I had back then.

dalethorn's picture

I'd imagine that CD players are better today, with better memory buffers that can make corrections in real time. In other words, I'd expect a good CD player to be able to match a bit-perfect CD rip with full error correction. I don't know if that was possible circa 1990.

jmsent's picture

had to make corrections in real time and were perfectly capable of doing so. It was by design. Look up "interleaving" and "Reed Solomon error correction" which were integral parts of the Redbook specification. Without those, the system couldn't work. The biggest advancement was the ability for the transports to track the disc without skipping, even in the presence of large scratches. By 1990, most decent CD transports were extremely capable in this regard.

dalethorn's picture

I guess my PC-based DVD/CD players weren't, even with error correction checked. I had a copy of Bowie's Diamond Dogs that sounded partly garbled on a couple of different players. So I ripped it with error correction enabled, and while it took a while to complete, the CD rip was perfect. That experience told me that real-time error correction would not handle every case, even if it worked 98 percent of the time. That's not to say that ripping could correct every case, but I think it gets closer to 100 percent.

Robin Landseadel's picture

This is pretty much in the rear window and without re-testing with the modern, post 24/192 and SACD DACs built into common audio gear, this article has zero contextual value. I don't think the "fractal" theory holds up. Below a certain audio level, those effects are masked by anything louder, and "louder" music happens to be the most popular flavor, audiophiles be damned.

What does count is that many sonic qualities that people like and desire in analog gear are obvious distortions. Compression is a distortion, a necessary one in audio production for domestic consumption. Many flavors of analog compression are preferred over digital iterations by audiophiles and nostalgists, but not by others. And the issues of low-level resolution, addressed initially by various tweaks, are now addressed with oversampling and better jitter performance. To these old ears, modern digital playback has gone through a quantum leap in audio quality. When I used green markers on CDs, digital sound was uniformly awful. As of now, I have every good reason to prefer digital reproduction over analog reproduction, and I do.

downunderman's picture

One funny thing for me is that CDs with a black label have on average tended to sound better than ones that are predominantly silver on the label side.

All anecdotal I know and maybe I am subconsciously pining for vinyl, but there you go.