PDM, PWM, Delta-Sigma, 1-Bit DACs: Peter W. Mitchell

Peter W. Mitchell wrote about MASH DACs in January 1990 (Vol.13 No.1):

In October 1989, Technics flew a dozen North American hi-fi writers, including myself, to Japan for a busy week including seminars about MASH 1-bit digital decoding. The "1-bit" digital decoder is suddenly appearing everywhere. In recent years, competition among makers of CD players has taken the form of "bit wars," the use of ever-higher numbers of bits to decode the CD. Linear 16-bit decoders led to pseudo–18-bit decoding, then to real 18-bit decoders, and now several companies claim to be providing 20-bit decoding. If you don't read brochures carefully you may also come away with confused impressions about 24-, 32-, and even 45-bit processing (in digital filters).

The assumption, of course, is that more must be better. Re-sampling digital filters follow the same rule: if 2x re-sampling is good, 4x is better, and many of this year's best players use 8x. Decoder chips can be multiplied as well: early CD players used a single decoder, switched between channels. Now most players use two decoders, one per channel, while the newest high-performance models often use four D/A chips, a back-to-back pair in each channel.

It is possible to find engineering logic behind each of these design choices. The best reason for using 18- or 20-bit decoding, or back-to-back pairs of DACs, is that it can reduce the effect of decoder nonlinearity, providing more accurate decoding of the 16-bit data on the CD. Furthermore, the interpolations involved in "oversampling" digital filters have the effect of turning the original 16-bit data samples into 18-bit or longer digital words; using an 18- or 20-bit decoder reduces the distortion and noise that would be caused by rounding off the longer words or decoding only the topmost 16 bits.
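
To see why interpolation lengthens the data words, consider the simplest possible case. The sketch below is only an illustration, not any manufacturer's filter: it assumes a hypothetical 2x interpolator that inserts the plain average of two neighboring samples, whereas real oversampling filters combine many samples with many coefficients. The function name and sample values are invented for the example.

```python
# Illustrative sketch only: a hypothetical 2x interpolator that inserts the
# average of two neighboring samples. Real oversampling filters are far longer.

def midpoint_in_half_lsb(a: int, b: int) -> int:
    """Exact midpoint of two integer samples, expressed in half-LSB units."""
    return a + b  # (a + b) / 2 LSB is the same quantity as (a + b) half-LSBs

a, b = 1000, 1001                               # two adjacent sample values
exact_lsb = midpoint_in_half_lsb(a, b) / 2      # 1000.5: needs one extra bit
kept_16_bits = (a + b) // 2                     # dropping the extra bit gives 1000
rounding_error_lsb = exact_lsb - kept_16_bits   # half an LSB is thrown away

print(exact_lsb, kept_16_bits, rounding_error_lsb)
```

Even this crude interpolator produces a value that no longer fits the original word length; a decoder that keeps only the original 16 bits has to discard the remainder.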

Such improvements actually are realized in some high-priced players. But in midprice players the bit wars are just a marketing contest, a way to gain a competitive advantage by making specifications look better. In some factories the use of 18-bit or back-to-back DACs has become another excuse for avoiding the costly individual MSB fine-tuning that is required to obtain truly linear low-level decoding. The result, 18 months after this "CD cancer" became widely known, is that midprice CD players continue to vary greatly in linearity from sample to sample, and a 20-bit 4-DAC model of one brand may perform less well than another maker's 16-bit 2-DAC player. In this environment, the "bit" rating is little more than fraud.

"1-bit" processing is a fundamentally different approach from decoding the digital signal—a method that promises both finer performance in the very best CD players and more consistent performance in low-cost models. But at first it is sure to add confusion. If 18 bits is allegedly better than 16, how can a 1-bit decoder be considered hi-fi at all?

Two players with 1-bit decoding, the Technics SLP-555 and SLP-222, have been on the market since last spring, but the inclusion of the new decoder was kept a secret because the company wasn't ready to deal with this question. The brochures for those players incorrectly described them as having normal decoders in back-to-back pairs. This deception was intended not only to avoid causing confusion among consumers but also to prevent a rebellion among retail salespeople, who like to have a simple, persuasive description of each product they're trying to sell. In a "more bits is better" environment, 1-bit decoding would be a hard sell. Technics chose to postpone publicity about 1-bit decoding until the new year, and inviting hi-fi writers to a factory seminar was part of the plan.

The name, "1-bit" D/A conversion, is part of the problem because it engenders confusion without explaining anything. Philips's preferred name, "Bit-stream" decoding, is less confusing but still doesn't tell you very much. Fundamentally, the operation of a bit-stream decoder is not difficult to understand.

To appreciate why it's a better idea, let's begin at the beginning. Digital signal processing is inherently precise because it involves only simple on-off switching. Switches are either on or off; the accuracy of the result is not affected by the precision of the electrical parts involved, nor by the temperature, or other factors. If you have a sufficiently large number of electronic switches, operated rapidly, any desired result can be obtained. This is how computers work. And if you have too few switches for exact computation, the errors are predictable; known errors can be compensated (canceled) or can be averaged out by switching much more rapidly. (The latter is the basis of "dithering" to remove quantizing distortion in low-level signals.)
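
The averaging idea behind dithering can be tried in a few lines. This is a rough sketch under stated assumptions, not a description of any product: the quantizer rounds to whole steps of one LSB, the dither is a triangular-distribution noise spanning plus or minus one LSB (a common textbook choice, assumed here), and the input is a constant only a quarter of an LSB in size. All names are invented for the example.

```python
import math
import random

def quantize(x: float) -> int:
    """Round to the nearest whole step (1 step = 1 LSB)."""
    return math.floor(x + 0.5)

def mean_output(level: float, n: int, dither: bool, rng: random.Random) -> float:
    """Average of n quantized samples of a constant input, with or without dither."""
    total = 0
    for _ in range(n):
        # The difference of two uniform numbers has a triangular distribution
        # spanning -1 to +1 LSB (an assumed dither shape for this sketch).
        noise = (rng.random() - rng.random()) if dither else 0.0
        total += quantize(level + noise)
    return total / n

rng = random.Random(0)      # fixed seed so the run is repeatable
level = 0.25                # a signal one quarter of an LSB in size
undithered = mean_output(level, 100_000, False, rng)
dithered = mean_output(level, 100_000, True, rng)
print(undithered, dithered)
```

Without dither the quarter-LSB signal vanishes entirely, since every sample rounds to zero. With dither, each individual sample is still a whole number, but the average over many samples settles close to 0.25.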

Analog processing is inherently approximate and variable, because the result depends on the physical properties of the parts used. For example, every digital device (recorder, CD player, et al) requires an output filter to reconstruct a smooth waveform and remove the ultrasonic byproducts of the digital switching process. In the early days of digital audio, those filters were complex analog circuits containing a dozen or more capacitors, inductors, and resistors. An analog filter is basically a frequency-dependent voltage divider: the signal is attenuated at each frequency according to the ratio of impedances in the circuit. Since impedances of electronic parts are specified only approximately and often vary with temperature, the response of an analog filter can be predicted only approximately. Even with selected high-precision parts it is impractical to achieve exact response, and a few years ago every digital product had a slightly different response—a built-in, nonadjustable tone control. Analog filters also exhibited a potentially audible group delay (phase shift) at high frequencies.

Then designers adopted digital filtering. A digital filter operates by combining signals after many brief time-delays (typically a few millionths of a second); in this process, unwanted signals simply cancel out. The response is controlled by the mathematical design of the filter, and by the delay times (which are precisely regulated by a crystal oscillator). Consequently manufacturers can mass-produce digital filters at very low cost, all with exactly the same response, accurate to a few thousandths of a dB. As a bonus, since the internal delays are the same for every frequency, digital filters are phase-linear.
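
The cancellation at work in a digital filter can be shown with the smallest possible example. The sketch below assumes a hypothetical two-tap filter that averages each sample with the one before it; real CD-player filters use many more taps, and the names here are invented for illustration.

```python
def fir(signal, taps):
    """Weighted sum of the current input sample and delayed copies of it."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for delay, weight in enumerate(taps):
            if n - delay >= 0:              # nothing exists before the first sample
                acc += weight * signal[n - delay]
        out.append(acc)
    return out

taps = [0.5, 0.5]                   # assumed two-tap averaging filter
fastest_tone = [1.0, -1.0] * 8      # alternates every sample (half the sampling rate)
steady_level = [1.0] * 16           # a constant (zero-frequency) input

filtered_tone = fir(fastest_tone, taps)
filtered_steady = fir(steady_level, taps)
print(filtered_tone[1:4], filtered_steady[1:4])
```

After the first output sample, the alternating tone cancels to zero while the steady level passes unchanged. Nothing in the result depends on component tolerances, only on the tap weights and the one-sample delay.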

Virtually all new CD players use digital filters, not because they contain more accurate parts, but because accurate response is inherent in their design (regardless of parts quality). Initially digital filters are more costly to design, but in mass-production they are less costly to use because they are all identical; there's no need to measure each one, grade them for accuracy, or match response in pairs.

The same reasoning underlies the development of bit-stream decoders. The problem with a conventional digital/analog converter (DAC) is that its operation involves mainly analog processes and is therefore approximate. A 16-bit DAC contains a precision current source and an array of 16 switches. Each switch is connected to a resistor, and the resistors are supposed to be scaled in exact 2:1 ratios so that each switch, when opened, will contribute exactly twice as much current to the output as the switch below it. The switches are controlled by the 16-bit codes from the CD; thus by opening and closing in various combinations, a total of 65,536 different output values can be generated.

But the topmost switch (the most-significant bit, or MSB) contributes 32,768 times as much current as the least-significant bit (LSB). If the MSB current is in error by as little as one part in 32,768, the effect of the LSB is swamped. In most CD players it is; few 16-bit DACs operate to better than 15-bit accuracy. The practical result is that most CD players are non-linear at very low signal levels, reproducing small signals at the wrong levels and with added distortion. Keep in mind that this problem arises not from the digital code itself but from small errors in an analog quantity—the current produced by the DAC for the several most-significant bits.
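
The arithmetic can be checked with a toy model. This is a simplified sketch, not a model of any real converter: it assumes plain unsigned binary codes (CD data is actually signed), perfect lower bits, and an MSB current that is low by exactly one part in 32,768, that is, by one LSB. The function and parameter names are hypothetical.

```python
def dac_output(code: int, msb_error_lsb: float = 0.0, bits: int = 16) -> float:
    """Sum the binary-weighted currents (in LSB units) for every bit that is set."""
    total = 0.0
    for i in range(bits):
        if (code >> i) & 1:
            weight = float(1 << i)          # ideal weight: 1, 2, 4, ... 32768
            if i == bits - 1:
                weight += msb_error_lsb     # only the MSB is assumed to be off
            total += weight
    return total

# Step across the point where the MSB switches on (code 32767 -> 32768).
ideal_step = dac_output(32768) - dac_output(32767)
faulty_step = dac_output(32768, -1.0) - dac_output(32767, -1.0)
print(ideal_step, faulty_step)
```

With ideal weights the output rises by one LSB across the transition. With the MSB low by one part in 32,768 the step disappears altogether: two different codes produce the same output, which is the sense in which the LSB is swamped.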

For comparison, imagine that you were assigned to fill a bucket with a known amount of water, using measuring cups varying in size from one ounce to 64 ounces. Even if you use care in filling the largest cup, it might contain 63.7 or 64.5 ounces instead of 64; you can't be sure that it contains exactly 64 times as much water as the smallest cup. But there is a way to obtain an exact result: use only the one-ounce cup, and transfer its contents to the bucket 64 times. The capacity of the cup may not be exactly one ounce, but as long as you fill it the same way each time, the total amount transferred will be proportional to the number of refills—an exactly linear relationship. This is the idea behind 1-bit decoding. In place of a method whose result depended on slightly uncertain analog quantities (the currents in the DAC), we have adopted a simple counting scheme—a purely digital process.

Of course with a small cup you'll have to work fast, but in modern digital electronics that's not an obstacle. In the Philips bitstream decoder, the output stage generates around ten million pulses per second, the exact rate being determined by the digital code. (This is called "pulse density modulation," or PDM.) A simple analog filter averages out the pulses to form the final analog output signal.
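
A pulse-density stream is easy to imitate in software. The following is a minimal sketch of the general idea, assuming a simple first-order accumulator; it is not the Philips circuit, whose internal design is not described here, and the names are invented for the example.

```python
def pdm_encode(code: int, n_pulses: int, full_scale: int = 65536) -> list:
    """Turn a code in the range 0..full_scale into a stream of 0/1 pulses.

    An accumulator adds the code on every clock tick and emits a pulse
    each time it overflows, so the pulse density tracks the code.
    """
    acc = 0
    pulses = []
    for _ in range(n_pulses):
        acc += code
        if acc >= full_scale:
            pulses.append(1)
            acc -= full_scale
        else:
            pulses.append(0)
    return pulses

code = 12345
stream = pdm_encode(code, 65536)
pulse_count = sum(stream)               # what an ideal averaging filter would see
print(pulse_count, pulse_count / len(stream))
```

Over 65,536 ticks the number of pulses equals the code exactly, so the averaged output is proportional to the code by simple counting, just as with the one-ounce cup.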

In all of the Japanese 1-bit decoders announced to date, the output stage is a pulse-width modulation (PWM) circuit of some type. In a PWM system the output signal is an on/off waveform in which the analog voltage is represented by the duration of the pulses, ie, the percentage of time the waveform remains in the "on" state. This is analogous to filling the bucket, not with a cup, but with a hose whose high-precision valve allows the water to flow in precisely timed bursts. When we want a larger amount of water, we use wider pulses (longer bursts).
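
The same exercise works for pulse-width modulation. This sketch assumes a hypothetical frame of eight clock ticks, chosen only for illustration, and treats the output filter as a plain average; neither detail is taken from any actual decoder.

```python
def pwm_frame(on_ticks: int, frame_ticks: int) -> list:
    """One PWM frame: 'on' for on_ticks clock ticks, then 'off' for the rest."""
    if not 0 <= on_ticks <= frame_ticks:
        raise ValueError("pulse width must fit inside the frame")
    return [1] * on_ticks + [0] * (frame_ticks - on_ticks)

def filtered_level(wave: list) -> float:
    """Stand-in for the analog output filter: the fraction of time spent 'on'."""
    return sum(wave) / len(wave)

frame_ticks = 8     # assumed frame length, for illustration only
levels = [filtered_level(pwm_frame(w, frame_ticks)) for w in range(frame_ticks + 1)]
print(levels)
```

Each extra clock tick of pulse width raises the averaged output by the same amount, so the accuracy depends on timing rather than on matched resistor values.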

The Technics MASH (multistage) decoder uses pulses of 11 different durations to form the output signal. The timing circuit that controls the pulses operates at a frequency of 33.9MHz, or 768 times higher than the 44.1kHz sampling rate of the digital codes in the CD. The transformation of the CD's original PCM signal into the final PWM waveform is determined mathematically and is accomplished entirely in the digital domain. In principle this can be done to any desired degree of accuracy, preserving all of the information in the original 16-bit code.
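
The clock figure follows directly from the two numbers given above, as this short check shows. The duration of one clock tick is derived here for illustration and is not a quoted specification.

```python
SAMPLE_RATE_HZ = 44_100     # CD sampling rate
CLOCK_MULTIPLE = 768        # timing clock runs at 768 times the sampling rate

clock_hz = SAMPLE_RATE_HZ * CLOCK_MULTIPLE
clock_mhz = clock_hz / 1e6
tick_ns = 1e9 / clock_hz    # length of one clock tick in nanoseconds

print(f"{clock_hz} Hz = {clock_mhz:.1f} MHz, one tick = {tick_ns:.1f} ns")
```

The exact product is 33,868,800Hz, which rounds to the 33.9MHz quoted, and each tick lasts a little under 30 nanoseconds.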

Summing up: to obtain exact frequency and phase response, manufacturers abandoned analog filters whose performance depended on inexact circuit impedances, and adopted digital filters whose response is controlled by mathematical operations and precisely timed delays. Now, to obtain consistently exact decoding of low-level signals, they intend to abandon conventional DACs whose accuracy is affected by uncertain analog quantities (currents flowing through resistors of slightly inexact value), and replace them with bitstream decoders whose accuracy, again, is determined by mathematics and timing (the number and duration of pulses).

The essential point is that the performance of a bitstream decoder, like that of a digital filter, depends on its design and is not expected to vary from sample to sample. Unlike PCM decoders, there is no need to quality-grade the chips for accuracy, nor to fine-tune the performance on the production line. Thus the bitstream decoder brings closer the day when CD players, too, can be assembled by robots with no need for individual adjustment or testing.

Conventional current-summing DACs also require a current/voltage conversion stage, which can be a source of slewing-induced distortion, plus a deglitching circuit to suppress the "glitch" (the high-current spike) that occurs when several bits change in imperfect synchrony. A bitstream decoder needs neither.

Stereophile readers have already seen an example of how good 1-bit decoding can be, in Larry Greenhill's review of Sansui's AU-X911DG integrated amplifier (November 1989, pp.144–150). The amplifier's integral D/A converter, called "LDCS" by Sansui, is actually a third-generation Technics MASH chip. LG loved its sound, while Robert Harley measured its linearity as "exceptionally accurate, among the best I have measured...nearly a perfect straight line."

You might reasonably suppose that, while introducing a significant technological advance, manufacturers would present a united front in communicating the benefits of the new approach to consumers. No such luck. A forthright presentation of the advantages of 1-bit decoding would require admitting how variable the performance of previous and current players has been. Besides, manufacturers like to promote the alleged uniqueness of their designs: they are launching 1-bit technology with a dizzying array of jargon aimed at making each version seem unique.

Philips, the first to go public with the new system, calls its version a Bitstream decoder process and uses a pulse density modulation (PDM) output circuit. Technics, which claims to have been working on 1-bit decoding since 1986 but is only going public with it now, calls its process MASH and uses a pulse-width modulation (PWM) output circuit. Harman/Kardon is using the Technics MASH decoder in two new CD players but confused many observers by calling it a "bitstream" decoder and comparing its performance to the Philips circuit. Sansui, as noted earlier, uses the Technics MASH chip in its Vintage series CD player and integrated amplifier, but calls it "LDCS." Sony appears to be using the Philips PDM circuit in several CD players marketed overseas (but not yet in the US), calling it a "High Density Linear Converter."

All of the new 1-bit decoders contain a "noise-shaping" digital filter that suppresses hiss, enhancing the S/N ratio, hence the resolution. Technics' trade name for its decoder is a quasi-acronym for this filter: MultistAge noise SHaping (MASH). The MASH chip that has been available since last spring is a third-generation design with a claimed S/N ratio of 108dB. Sony recently announced a new decoder using Sony Extended Noise Shaping (SENS) to achieve a claimed S/N ratio of 118dB. Not to be outdone, JVC announced a chip that uses PEM (pulse-edge modulation, a sort of one-sided PWM) and VANS (Victor Advanced Noise Shaping) to achieve 120dB. At its seminar for North American hi-fi writers, Technics capped this game of corporate one-upmanship by announcing that its third-generation chip will be used only in midprice players; the company's best players will contain a new fourth-generation MASH chip rated at 123dB.
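
The principle of noise shaping can be sketched without reference to any of these chips. The example below assumes the simplest first-order scheme, in which each sample's rounding error is subtracted from the next sample before it is rounded; the MASH, SENS, and VANS designs are more elaborate, and nothing here is taken from them. The step size and input value are arbitrary choices for the demonstration.

```python
def round_to_step(value: int, step: int) -> int:
    """Round an integer to the nearest multiple of step."""
    return ((value + step // 2) // step) * step

def noise_shape(samples, step):
    """Coarsely requantize samples, feeding each rounding error into the next sample."""
    out = []
    err = 0
    for x in samples:
        v = x - err                 # correct for the previous rounding error
        q = round_to_step(v, step)
        err = q - v                 # error just made, to be corrected next time
        out.append(q)
    return out

step = 256                          # coarse output step (assumed)
samples = [77] * 1024               # a steady input well below one coarse step
plain = [round_to_step(x, step) for x in samples]
shaped = noise_shape(samples, step)
print(sum(samples), sum(plain), sum(shaped))
```

Plain rounding loses the small signal completely. With error feedback the running total of the output never strays from the input total by more than half a step, so the error is confined to rapid sample-to-sample fluctuations that a later filter can remove.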

Note that these specifications apply only to noise generated in the playback process; since virtually no CD has been recorded with an S/N ratio better than 90dB, these claims won't be realized with real recordings. (The measurement is made using a special test CD recorded with an all-zeroes code, with no dithering.)

But to demonstrate the superb linearity of the fourth-generation MASH decoder, Technics conducted a play-off comparing its newest player with current Denon and Sony models using 18- and 20-bit DACs. It was no contest; in the dithered glide tone from –60 to –120dB on the CBS test disc, the Sony produced audible distortion and the Denon generated obvious noise modulation due to nonlinearities in the DACs. (To be fair, these may have been worse-than-average samples off the production line.) The playback of this track by the Technics was the best I've ever heard, with no audible imperfection.

What appeals most to my Yankee soul is that this performance came from a decoder that is actually less costly to produce than a conventional DAC. MASH chips, or the equivalent from other manufacturers, can be used in CD players at virtually every price level. (A low-power version for portables hasn't been developed yet, but will be.) Within a couple of years, 1-bit decoders could be in every new CD player; then the cancer of nonlinear decoding will have been banished.

I don't want to leave the impression that all 1-bit decoders are alike in their performance or sound. There have been many rumors that the original Philips Bitstream decoder was not designed to leapfrog ahead of the best conventional DAC performance, but is just a way of obtaining consistent linearity in low-cost players. Further rumors suggest that Philips is working on a high-performance Bitstream decoder for introduction next year.

But the picture became confused at the British Penta hi-fi show in September, where an A/B comparison carried out by reviewer Paul Miller apparently persuaded many listeners that the present Philips Bitstream decoder sounds better than the best 18- and 20-bit conventional DACs. A friend of mine who heard the Penta demonstration examined the demonstration setup afterward; evidently the CD players were not accurately matched in level, and the comparison may have been invalid. Martin Colloms, writing in HFN/RR, added that in his own listening tests the present Philips circuit is a good mid-level performer but not equal to the best linear DACs.

Two weeks after my visit to Japan, the potential of 1-bit decoding was confirmed in a paper written by British mathematician Michael Gerzon for the New York convention of the Audio Engineering Society. In Gerzon's absence it was introduced and summarized by Stanley Lipshitz, who called it a very important paper (footnote 11). It is a mathematical analysis of the noise-shaping that is a central part of MASH and other 1-bit decoders, showing that with appropriate selection of the noise-shaping filter function, the effective dynamic range of CD playback can be increased by about 11dB, or nearly two bits' worth.
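
The conversion from decibels to bits rests on the fact that each added bit doubles the number of levels, which is worth about 6dB. A quick check of the figure quoted above:

```python
import math

DB_PER_BIT = 20 * math.log10(2)     # doubling the resolution is about 6.02 dB

gain_db = 11                        # improvement cited from the paper
bits_gained = gain_db / DB_PER_BIT

print(f"{DB_PER_BIT:.2f} dB per bit, {bits_gained:.2f} bits gained")
```

Eleven decibels works out to a little over 1.8 bits, consistent with "nearly two bits' worth."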

The actual limitation now lies at the recording end of the signal chain, with the nonlinearities and quantizing distortion in the A/D converters used in professional digital recorders. Gerzon's paper shows, and the Technics demonstration confirms, that if the recorded signal is correctly dithered to eliminate quantizing distortion, it is possible to record—and accurately resolve in playback—signals much smaller than the least-significant bit. (In theory this is also true with a conventional DAC, but only if it is precisely adjusted for good linearity, which real DACs usually aren't.) So while the CD is only a 16-bit storage medium, it is capable of 18-bit effective resolution and dynamic range. At the AES convention a designer of high-performance oversampling A/D converters told me that Sony will soon introduce a successor to its PCM-1630 CD mastering recorder, employing those A/D converters. Then the recent improvements in player design will really pay off.—Peter W. Mitchell



Footnote 11: "Optimal Noise Shaping and Dither of Digital Signals," Michael Gerzon and Peter G. Craven, AES Preprint 2822. Preprints are available from the Audio Engineering Society, 60 East 42nd Street, New York, NY 10165. Web: www.aes.org.

COMMENTS
Allen Fant

Excellent time-trip down memory lane. I was hooked on CD in 1988.
I became hooked on SACD in 2001.

Dr.Kamiya

Thank you for posting this! Love reading about the old stuff.

Out of curiosity, has Stereophile ever done a review of an Elcaset deck? Saw an article about it on another site and now I'm curious if you guys ever measured one.

hollowman

... is a much-modified Philips CD-60 (built early 1990).

it uses a TDA1541A (double crown) / SAA7220/B combo noted in this article.

No way have I heard the best digital playback gear avail today (mid. 2017).
I'm sure the latest Meridian, PS Audio or Chord DACs will provide v. good sound.

I have heard (and/or own) some Musical Fidelity DACs from a few years ago.
Also have the DiyParadise TDA1545A-based non-oversampling Monica DAC.
My modded CD-60 easily beats those. And, in some ways, also surpasses my VPI-HW19/Rega RB300/Sumiko BP Special analog rig.

I think Philips SENSIBLY used their R/D funds and facilities when they came up with those now-classic digital ICs.

IMPLEMENTATION is very important. So even the poor-sounding Bitstream/MASH DACs could sound okay if the PSU was decent and the output section was discrete.
