Sony SCD-1 Super Audio CD/CD player Direct Stream Digital

Sidebar 1: Direct Stream Digital

As described in Stereophile some years back (footnote 1), Direct Stream Digital is a new way of digitally encoding an analog signal, and was developed by Sony's engineers in order to archive Sony Music's priceless catalog of recordings. While DSD is fundamentally different from the conventional linear PCM encoding used on CD and DVD-Audio, there is a connection at the A/D level, in that both involve oversampling delta-sigma converters (footnote 2).

The most straightforward way of encoding an analog signal as a Pulse-Code-Modulated (PCM) digital datastream is to use an A/D converter operating at the sampling frequency that puts out digital words of the desired length. If you are talking about the CD's 16-bit/44.1kHz data, an ADC samples the analog signal 44,100 times every second, each time describing the instantaneous signal amplitude to the nearest one of 65,536 voltage levels (2 to the 16th power). This is, in fact, how digital audio recordings were made up to the mid-'80s. But the complexity of the ADC increases almost exponentially with the number of bits required, and the critical demands made on the analog antialiasing filter needed to eliminate every trace of signal above half the sample rate are extreme. A different A/D paradigm was required to achieve resolution greater than 16 bits and to achieve more accurate 16-bit resolution at lower cost and circuit complexity.
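To make the arithmetic concrete, here is a minimal Python sketch of that straightforward approach: sample a tone 44,100 times a second and round each sample to the nearest 16-bit code. The function names, the 1kHz test tone, and the half-scale amplitude are illustrative assumptions, not a description of any real converter.

```python
import math

SAMPLE_RATE = 44_100      # CD sampling frequency in Hz
BITS = 16                 # CD word length
LEVELS = 2 ** BITS        # 65,536 quantization levels

def quantize(x, bits=BITS):
    """Map an amplitude in [-1.0, 1.0] to the nearest signed integer code."""
    full_scale = 2 ** (bits - 1)          # 32,768 for 16 bits
    code = round(x * full_scale)
    # Clip to the range a signed word of this length can represent.
    return max(-full_scale, min(full_scale - 1, code))

def sample_sine(freq_hz, n_samples, amplitude=0.5):
    """Sample and quantize a sine wave, one code per sampling instant."""
    return [quantize(amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE))
            for n in range(n_samples)]

# Hypothetical test tone: the first eight samples of a 1 kHz sine.
codes = sample_sine(1000.0, 8)
```

Each entry in `codes` is one 16-bit word; a real ADC must also band-limit the input before sampling, which this sketch leaves out.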

Surprisingly, the optimal paradigm was first conceived of in 1946, but was not commercially implemented for audio until the 1980s, with the pioneering dbx 700 (footnote 3). Instead of trying to attain higher resolution by increasing the number of bits, it was thought: why not increase the sample rate instead? In the limit, if you increase the sample rate to a sufficiently high frequency, you can use a 1-bit quantizer: a simple voltage comparator that outputs a "1" if the analog signal level is higher than it was at the previous sampling moment, or a "0" if it is lower. Because this "delta modulation" technique uses a sample rate very much higher than the baseband audio signal, the requirements for a "brickwall" analog antialiasing filter on the ADC's input can be relaxed. You can then either feed the high-rate pulse stream to a simple low-pass filter to reconstruct the analog original, or you can use a low-pass digital filter to "decimate" the low-resolution, high-sample-rate data to derive the desired multi-bit, low-sample-rate data.
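The idea can be sketched in a few lines of Python. Note one assumption: the sketch compares each input sample with a running staircase estimate rather than literally with the previous input sample, which is the form delta modulators usually take. The step size, the test tone, the 64 x 44.1kHz rate, and the function names are all illustrative choices.

```python
import math

def delta_modulate(signal, step):
    """1-bit encode: emit 1 if the input is above the running estimate, else 0."""
    bits, estimate = [], 0.0
    for x in signal:
        bit = 1 if x > estimate else 0
        estimate += step if bit else -step   # the staircase tracks the input
        bits.append(bit)
    return bits

def delta_demodulate(bits, step):
    """Rebuild the staircase approximation by accumulating the +/- steps."""
    out, estimate = [], 0.0
    for bit in bits:
        estimate += step if bit else -step
        out.append(estimate)
    return out

# Hypothetical test signal: a 1 kHz sine sampled at 64 x 44.1 kHz.
fs = 64 * 44_100
signal = [0.5 * math.sin(2 * math.pi * 1000 * n / fs) for n in range(4096)]
step = 0.002   # chosen larger than the tone's maximum change per sample
bits = delta_modulate(signal, step)
recon = delta_demodulate(bits, step)
max_error = max(abs(a - b) for a, b in zip(signal, recon))
```

As long as the step is larger than the most the signal can change between samples, the staircase stays within roughly one step of the input. Make the step too small and the staircase cannot keep up with the signal, which is one way of seeing why plain delta modulation needs such a high sample rate.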

In practice, the increase in sample rate needed to give sufficient resolution with a 1-bit quantizer is impractical. But it was pointed out in 1954 that if the quantizer were embedded in a feedback loop (in effect, the large quantization error from one sample would be subtracted from the next sample), the result would be a "shaping" of the signal's noisefloor, with increased baseband resolution being gained at the expense of much higher noise levels at very high frequencies. As this noise could then be low-pass filtered, the improvement in resolution bestowed by "noise shaping" is virtually free from compromise and can be achieved at relatively low cost. This topology is called "oversampling delta-sigma modulation," though it's often incorrectly called "sigma-delta."
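The feedback idea can be illustrated with a first-order error-feedback loop, sketched here in Python. This is a deliberate simplification: real converters use higher-order loops, and the moving average stands in for a proper low-pass filter. The function names, test tone, and 64-sample averaging window are assumptions made for illustration.

```python
import math

def noise_shaped_1bit(signal):
    """1-bit quantizer with error feedback; input assumed within [-1.0, 1.0]."""
    bits, error = [], 0.0
    for x in signal:
        v = x - error                     # subtract the previous sample's error
        y = 1.0 if v >= 0.0 else -1.0     # 1-bit quantizer: only two output levels
        error = y - v                     # quantization error made on this sample
        bits.append(y)
    return bits

def moving_average(values, length):
    """Crude low-pass filter standing in for a real reconstruction filter."""
    return [sum(values[i:i + length]) / length
            for i in range(len(values) - length + 1)]

# A constant input of 0.3: the average of the +1/-1 stream settles close to 0.3.
dc_bits = noise_shaped_1bit([0.3] * 1000)
dc_mean = sum(dc_bits) / len(dc_bits)

# A 1 kHz tone at an assumed 64 x 44.1 kHz rate, recovered by averaging 64 bits.
fs = 64 * 44_100
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / fs) for n in range(8192)]
recovered = moving_average(noise_shaped_1bit(tone), 64)
# Compare each averaged value with the input at the middle of its window.
worst = max(abs(r - tone[i + 32]) for i, r in enumerate(recovered))
```

Because each sample's error is subtracted from the next, the errors largely cancel over any run of samples. The noise has been pushed up to high frequencies, where even this crude averaging filter removes most of it.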

UltraAnalog introduced an 18-bit PCM ADC based on an oversampling delta-sigma module in 1988, to be followed by many other manufacturers, including Burr-Brown, Analog Devices, Crystal, and dCS. Progress since then has involved overcoming the stability problems associated with the use of higher-order filters in the feedback loop. This gives more aggressive noise shaping, with the benefit being greater baseband resolution, these days up to the equivalent of 24 bits! I would venture to say that 99.99% of commercial digital audio recordings are now made with delta-sigma ADCs. If you want an example closer to home, the soundcard in your PC uses one!

All of these ADCs use a digital decimation filter to convert the low-resolution, high-rate data to linear PCM. The elegance of the idea behind DSD is that this decimation filter can be eliminated. Why not, Sony's engineers thought, just store the output of a 7th-order noise-shaped delta-sigma modulator running at a very high frequency (in DSD's case, 2.8224MHz, or 64 x 44.1kHz) on an appropriate medium? For playback, this datastream could be fed, in theory at least, to a D/A converter consisting of just a simple low-pass filter.
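The numbers are easy to check, and a short Python sketch also shows the kind of step DSD leaves out. The boxcar averager below is far cruder than the decimation filter in any real ADC and is offered only as an assumed stand-in; the names are illustrative.

```python
CD_RATE_HZ = 44_100
OVERSAMPLING = 64
DSD_RATE_HZ = OVERSAMPLING * CD_RATE_HZ     # 2,822,400 Hz, i.e. 2.8224 MHz

dsd_bits_per_second = DSD_RATE_HZ * 1       # one bit per sample, per channel
cd_bits_per_second = CD_RATE_HZ * 16        # 16-bit words, per channel

def decimate_boxcar(bits, factor=OVERSAMPLING):
    """Average each block of `factor` 1-bit samples (+1/-1) into one multi-bit value."""
    return [sum(bits[i:i + factor]) / factor
            for i in range(0, len(bits) - factor + 1, factor)]

# 64 ones followed by 64 minus-ones collapse to two PCM-style samples.
pcm_like = decimate_boxcar([1] * 64 + [-1] * 64)
```

Per channel, the raw DSD stream carries four times as many bits per second as CD's 16-bit/44.1kHz data. A conventional delta-sigma ADC applies something like `decimate_boxcar` (in a far more sophisticated form) before storing the data; DSD stores the 1-bit stream as it comes.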

The use of such a high sampling frequency would mean the ADC's analog antialiasing filter needn't be a brickwall type but could instead be a sonically benign low-order type; linearity would inherently be excellent; there would be no digital decimation filter, whose necessary mathematical approximations on either the A/D or D/A conversion reintroduce PCM quantization noise or time-domain dispersion problems; and there would be no multi-bit DAC, with its possible performance compromises. This would be the closest thing to a digital topology with analoglike properties.

And that, in a nutshell, is Direct Stream Digital.

In practice, of course, things are somewhat more complicated. For playback in particular, aggressive noise shaping would, without additional filtering, result in unacceptable levels of RF noise being fed to amplifiers and speakers. In the SACD "Scarlet Book," Sony and Philips mandate use of a 100kHz low-pass filter in SACD mastering so that when the playback volume at standard DSD level is equivalent to 100W, the noise component "outside the audible sound spectrum" is 1W or less. In addition, SACD players must low-pass-filter their analog output above 50kHz. And it has been argued that the combination of a 2.8224MHz sample rate and 7th-order noise shaping is not sufficient to give the equivalent information density of 24-bit/192kHz linear PCM data. See fig.1, however, which demonstrates that for the equivalent of 44.1kHz sampling, DSD offers close to 24-bit performance.

Fig.1 DSD encoding, spectral analysis, DC-200kHz, 1kHz tone at 0dBFS (log. frequency scale, FFT bin width 10Hz). After Sony.

But the proof of any audio pudding is in the hearing, and in that respect DSD-encoding would seem to be beyond reproach. Every Stereophile writer who has auditioned DSD under critical conditions—Robert Harley, Peter van Willenswaard, Jonathan Scull, and me—has found it both very much better than 16/44k1 CD and much closer to the analog experience.—John Atkinson

Footnote 1: See "Industry Update," Vol.19 No.1 (p.37) and No.5 (p.34), as well as Vol.20 No.9 (p.35).

Footnote 2: See Oversampling Delta-Sigma Data Converters, Candy & Temes (eds.), IEEE Press, 1992.

Footnote 3: See Stereophile, Vol.10 No.5, August 1987.