PASC & Philips' DCC

More on PASC from Peter W. Mitchell

The digital input filter in the DCC's PASC encoder divides the audio spectrum into 32 sub-bands of equal width; but the initial announcement did not specify whether they are equal on a linear or logarithmic scale. I naturally assumed a logarithmic scale, which would make each sub-band about a third of an octave wide, as in an audio spectrum analyzer. But my Boston colleague E. Brad Meyer learned that they are equal on a linear scale.

To clarify the point I called Gerry Wirz, project manager for DCC, at his office in Holland. He confirmed Brad's information. Dividing a total bandwidth of 22kHz by 32 yields a bandwidth of about 700Hz for each sub-band. All frequencies from 20Hz to about 700Hz are in the first band, which means that the PASC encoder is not making any use of masking thresholds at low frequencies. It also fails to take advantage of the steep rise in the threshold of hearing below 100Hz; but that may not matter, since coding low frequencies doesn't use up many bits anyway. Philips may have chosen this approach to preserve the phase integrity of the signal. The ear is known to be very sensitive to phase below about 700Hz and relatively insensitive to phase above 1500Hz.
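To spell out that arithmetic, here is a minimal sketch of the linear division just described (the 22kHz total bandwidth and the 32-band count are the article's figures; this is purely illustrative, not Philips's filter design):

```python
# A minimal, illustrative sketch of the linear sub-band split described above;
# the 22kHz total and the 32-band count are the article's figures.
TOTAL_BANDWIDTH_HZ = 22_000
NUM_SUBBANDS = 32
SUBBAND_WIDTH_HZ = TOTAL_BANDWIDTH_HZ / NUM_SUBBANDS  # 687.5 Hz, "about 700Hz"

for band in range(NUM_SUBBANDS):
    lo = band * SUBBAND_WIDTH_HZ
    hi = lo + SUBBAND_WIDTH_HZ
    print(f"sub-band {band:2d}: {lo:7.1f} to {hi:7.1f} Hz")
# Sub-band 0 runs from 0 to 687.5 Hz, so everything from 20Hz to about 700Hz
# lands in a single band, as noted above.
```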

At any rate, most of the sub-bands are located at high frequencies, spaced only 700Hz apart. Thus in the top octave there would be a band from 10.0 to 10.7kHz, one from 10.7 to 11.4kHz, and so on. With so many bands in the high-frequency range it is easy to understand why at any moment there might be strong musical overtones in only a few of them, leaving most of the others empty. And a great advantage in efficiency would be gained from the moment-to-moment reallocation of unused bits—borrowing bits from sub-bands that contain no energy above the dynamic masking threshold, and assigning those bits to sub-bands that contain strong overtones.
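The bit-borrowing idea can be illustrated with a toy allocation rule. The rule below is hypothetical (the actual PASC algorithm has not been published), but it captures the principle: a fixed pool of bits is shared out at each sampling instant, and bands with nothing above the masking threshold give up their share to the bands that still carry audible energy.

```python
# A hypothetical illustration of bit borrowing, not the actual PASC allocator:
# empty sub-bands contribute nothing, so the shared bit pool is divided
# among only the bands carrying energy above the masking threshold.
def allocate_bits(band_is_audible, bit_pool):
    """Toy rule: split the bit pool evenly among the audible bands."""
    active = [i for i, audible in enumerate(band_is_audible) if audible]
    if not active:
        return {}
    return {i: bit_pool // len(active) for i in active}

# Example: only 8 of the 32 bands hold strong overtones at this instant.
audible = [i in (1, 7, 14, 15, 20, 24, 28, 30) for i in range(32)]
print(allocate_bits(audible, bit_pool=137))  # each active band gets 17 bits
```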

While I was thinking about how the system might work, it occurred to me that a dramatic reduction in bit rate could be obtained by taking advantage of the Nyquist Theorem, a principle that was enunciated back in 1928. Essentially it says that any signal can be accurately characterized by a sampling system whose frequency is equal to twice the bandwidth of the signal. In a single-band system like the CD or R-DAT, the bandwidth is the same as the highest frequency, and a 44kHz sampling rate is used to record a maximum frequency of about 22kHz.

Since I was assuming that the sub-bands were a third of an octave wide, I figured that the maximum sampling rate would be used for high-frequency bands, while the bands below 100Hz might be re-sampled at only 200 samples per second. As it turned out, I was right about the principle but wrong about the details.

In the DCC, each sub-band is only about 700Hz wide; so a sampling rate of only 1.4kHz is sufficient for accurate reproduction of the signal in every sub-band. Wirz didn't tell me what the actual re-sampling rate is; perhaps this is still a trade secret. But he did say that the system takes advantage of the Nyquist Theorem to re-sample the data in each sub-band at a relatively low rate.

At first glance this seems troubling, perhaps downright impossible. It's easy to understand how a digital system can record a waveform by sampling it at several points during each cycle. But can you record a 10kHz tone using a sampling rate of only 1400Hz? After you capture one sample, seven complete cycles of the waveform will pass before you take the next sample. How can you hope to define the waveform accurately if you sample it at only one point in every seventh cycle?

The key to this conundrum lies in the central fact that any sampled system must also have both an input filter (often called the anti-aliasing filter) and an output filter. The input filter defines the bandwidth of the signal and therefore sets the required sampling rate. The output filter has the critically important role of reconstructing a continuous waveform from the discrete samples. Thus the output filter is not just an incidental "smoothing" filter; it plays an essential role in making the sampling theorem work.

Specifically, in the case of a 10kHz signal, remember that it falls in a sub-band whose entire bandwidth extends from 10.0kHz to 10.7kHz. Anything outside this range, such as harmonics or overtones that might give the waveform a different shape, was eliminated from this sub-band by the input filter. (They are handled by other sub-bands, and the complex shape of the original input waveform will be reconstructed when all of the sub-bands are recombined in playback. This concept comes from the Fourier Theorem, which says that any continuous waveform, no matter how complex, is the sum of a series of sinewaves at various frequencies.)

Therefore, as far as the 10kHz sub-band is concerned, any signal that falls within this narrow band must be a sinewave. The only question is what its frequency is: 10.0, 10.2, 10.7, or whatever. That can easily be determined by recording one sample every seventh cycle. There is a direct mathematical correspondence between the frequency and the point in the cycle where the second and later samples are taken.
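That claim can be checked numerically. In the sketch below the band edges are placed at exact multiples of half the sampling rate (9800-10500Hz rather than the rounded 10.0-10.7kHz used above) so that the aliasing arithmetic comes out cleanly; the actual PASC re-sampling details are not public, so this is illustrative only. A 10.1kHz sinewave is sampled at just 1400Hz, the samples trace out a 300Hz alias, and adding the band's lower edge back on recovers the original frequency exactly.

```python
import numpy as np

# Illustrative bandpass-sampling sketch. The band edges (9800-10500 Hz) are
# chosen to be exact multiples of half the sampling rate so the arithmetic
# works out cleanly; the article's 10.0-10.7kHz figures are approximate.
FS_SUBBAND = 1400.0   # sub-band sampling rate, roughly twice the 700Hz width
BAND_LO = 9800.0      # lower edge of this illustrative sub-band
TONE_HZ = 10_100.0    # the sinewave to be captured

# One second of samples: only 1400 of them for a ~10kHz tone,
# i.e. roughly one sample for every seven cycles of the waveform.
n = np.arange(int(FS_SUBBAND))
samples = np.sin(2 * np.pi * TONE_HZ * n / FS_SUBBAND)

# The samples trace out a low-frequency sinewave (the alias)...
spectrum = np.abs(np.fft.rfft(samples))
alias_hz = np.argmax(spectrum) * FS_SUBBAND / len(samples)
print(alias_hz)            # 300.0 Hz

# ...but because we know which sub-band the signal came from, the original
# frequency is recovered by adding the band's lower edge back on.
print(BAND_LO + alias_hz)  # 10100.0 Hz
```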

These ideas are unfamiliar, but they are just a different application of the same information theory that forms the foundation of all digital recording. As far as I know this is the first use in consumer audio of filters whose bandwidths in Hertz are small compared to the frequencies in the band—which leads in turn to the strange notion of sampling less than once per cycle.

In summary, any complex waveform above 700Hz is effectively Fourier-analyzed by the digital input filter, which breaks it down into an assortment of sinewaves at various frequencies. Each frequency component falls into a different sub-band. Each sub-band is so narrow that it can contain only sinewaves, and a low sampling rate is adequate to define the exact frequency of each sinewave.
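As a rough illustration of that summary (using an FFT to stand in for an actual sub-band filter bank, and with made-up component frequencies), a waveform built from three sinewaves can be pulled apart and each component assigned to its roughly 700Hz-wide sub-band:

```python
import numpy as np

# Rough illustration only: an FFT stands in for the encoder's sub-band filters.
FS = 44_100
SUBBAND_WIDTH = 22_000 / 32                   # 687.5 Hz
components_hz = [1100.0, 5300.0, 10_100.0]    # made-up overtone frequencies

t = np.arange(FS) / FS                        # one second of signal
waveform = sum(np.sin(2 * np.pi * f * t) for f in components_hz)

# Pull the complex waveform apart into its sinewave components and note
# which ~700Hz-wide sub-band each one falls into.
spectrum = np.abs(np.fft.rfft(waveform))
peak_bins = np.argsort(spectrum)[-3:]         # the three strongest components
for f in sorted(peak_bins * FS / len(waveform)):
    print(f"{f:.0f} Hz falls in sub-band {int(f // SUBBAND_WIDTH)}")
```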

Earlier in this article, Peter van Willenswaard and I described the DCC as a system that records digital audio with an average of only four bits per sample. This picture is valid if we think only of the 44kHz sampling rate of the 16-bit PCM input circuit. We know that the system's data rate is 384,000 bits per second for stereo, or 192,000bps per channel. Dividing that by a 44kHz sampling rate yields an average sample size of 4.3 bits.

But the process appears in a different light when we see that the narrow bandwidth of the 32 sub-bands allows the signal in each sub-band to be re-sampled at a much lower rate, perhaps only 1.4kHz. Dividing the data rate of 192,000bps per channel by a sampling rate of 1.4kHz yields a total "bit pool" of 137 bits per sample in the 32 sub-bands. If all 32 were active, the sample size would indeed be only 4.3 bits for each sub-band. But if two-thirds of the sub-bands are empty at a given moment, 13 bits per sample can be assigned to each active band.
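Those figures can be verified with a few lines of arithmetic (the 1.4kHz per-sub-band rate remains an estimate rather than a published Philips number):

```python
# The article's bit-budget arithmetic as a worked example; the 1.4kHz
# per-sub-band rate is an assumed (unconfirmed) figure.
STEREO_BITRATE = 384_000  # bits per second, both channels
PCM_RATE = 44_100         # input PCM sampling rate, Hz
SUBBAND_RATE = 1400       # assumed per-sub-band sampling rate, Hz
NUM_SUBBANDS = 32

per_channel = STEREO_BITRATE / 2       # 192,000 bits per second
print(per_channel / PCM_RATE)          # ~4.35 bits per PCM sample
bit_pool = per_channel / SUBBAND_RATE  # ~137 bits per sub-band sampling instant
print(bit_pool)
print(bit_pool / NUM_SUBBANDS)         # ~4.3 bits if all 32 bands are active
print(bit_pool / (NUM_SUBBANDS / 3))   # ~13 bits if only a third are active
```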

In playback the slow-sampled code for each sub-band is used to determine the exact sinewave frequency of the signal in that band. Then the sinewaves from all active bands are combined to produce the harmonically complex output waveform. This is all done in the digital domain: the composite code from all of the sub-bands is oversampled and digitally refiltered to generate a conventional 16-bit PCM code that, after D/A conversion, yields the analog output. It's an amazingly complex and sophisticated design, but that's what it takes to deliver something like CD-quality sound using only one-fourth as many bits per second.—Peter W. Mitchell
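A much-simplified sketch of that synthesis step, using idealized sinewaves only (the real decoder recombines filtered sub-band signals and includes the oversampling and refiltering stages just described):

```python
import numpy as np

# Much-simplified synthesis sketch: each active sub-band contributes one
# idealized sinewave at its recovered frequency, and the sinewaves are summed
# at the output rate to rebuild the harmonically complex waveform.
OUTPUT_RATE = 44_100
recovered = {1: 1100.0, 7: 5300.0, 14: 10_100.0}  # band index -> frequency in Hz

t = np.arange(OUTPUT_RATE) / OUTPUT_RATE
output = sum(np.sin(2 * np.pi * f * t) for f in recovered.values())
print(output[:5])  # the recombined output waveform, before D/A conversion
```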

COMMENTS
Archimago

Kudos Stereophile for the writing all those years back! Great to see the writers really dig into the story and provide the technical details for those curious about the new technology and wishing to understand how these things worked...

A couple of thoughts as they do still resonate through all these years:

1. Lossy audio. Though nobody considers it "perfect" audio, clearly it sounds good. Remember that PASC is essentially MPEG-1 Layer I audio at 384kbps, and that type of compression has long since been superseded by Layers II and III. A modern MP3 (Layer III) at 320kbps, with modern psychoacoustic modeling, is significantly more accurate than PASC ever was. Despite all the negatives thrown at MP3 and lossy encoding (mostly earned by awful encoding software and low-bitrate 128kbps files back in the day), folks shouldn't be worried about 256kbps AAC (iTunes) or modern 320kbps MP3 music, even if lossless is still preferred when it's available. And for portable listening, walking around with headphones or riding the subway to work, something like hi-res lossless is surely a total waste of time and money.

2. Great job Philips for setting up actual A/B comparisons. And even going the extra mile of rewiring for L-R listening! The obvious modern analogy of a company selling an encoding scheme is with Meridian and MQA. So where is this level of openness with MQA these days? Where is the detailed A/B comparison of MQA? And where are the journalists willing to critique the claims with a proper demo?

Again great article from the archives. But it's also a sad reminder of how far the level of discourse has fallen these days in the audiophile media.

John Atkinson
Archimago wrote:
Kudos Stereophile for the writing all those years back!

Thank you. I am gradually working my way through all the great stuff in our archives. The photo, BTW, is of DCC tapes I have in my own collection, though I no longer have any means of playing them.

Archimago wrote:
The obvious modern analogy of a company selling an encoding scheme is with Meridian and MQA. So where is this level of openness with MQA these days? Where is the detailed A/B comparison of MQA?

I report on a detailed series of A/B comparisons, both with commercial recordings and with some of my own that I had mastered in MQA, in the September issue. You can also find comparative testing at www.audiostream.com/content/mqa-reviewed.

John Atkinson
Editor, Stereophile
