Squeezing the Music 'til the Bits Squeak

In this space last January, I enthused about the sound of linear 20-bit digital recordings which, I felt, preserve the quality of a live microphone feed. "I have heard the future of audio—and it's digital!" I proclaimed, which led at least a couple of readers to assume I had gone deaf. Putting to one side the question of my hearing acuity, 20-bit technology has been rapidly adopted in the professional world as the standard for mastering. The remaining debate concerns how best to preserve what those 20 bits offer once they've been squeezed down to the 16 that CD can store. Sony's Super Bit Mapping algorithm and Harmonia Mundi Acustica's redithering device have been joined by new black boxes from Apogee Electronics, Lexicon, and Meridian; it appears likely that, in next to no time at all, all CD releases will be offering close to 20-bit resolution—at least in the upper midrange, where the ear is most sensitive.

Other than CD, however, the future of high-quality consumer audio looks grim. The LP is all but dead, and there is huge pressure from the cable TV and cinema industries to abandon linear PCM audio, with its profligate demands for data transmission bandwidth, and replace it with various kinds of compressed digital audio. Already we have two data-compressed media: DCC from Philips and Matsushita, and MD from Sony. MD is also being promoted as a smaller, cheaper replacement for the radio industry's cart tape machines, while the DMX cable radio system—and its European equivalent, mentioned by Ken Kessler in this month's "Industry Update"—use a data-compression scheme developed by Dolby Labs.

Dolby's AC-3 data-compression algorithm is also an important ingredient in both their Dolby Stereo Digital film sound system and the domestic version, Dolby Surround Digital. DTS, the competing, lower-cost digital film sound system, also uses data compression, this time an algorithm developed in Northern Ireland by APT. Heck, even the Sound Blaster card in my PC offers real-time music data compression as an optional extra.

I'm not saying that all such compression is bad. In computers, where every bit is sacrosanct, programs such as Stacker and MS-DOS 6.0's DoubleSpace work by packing the data more efficiently. If, as happens in low-level music, each 16-bit word starts with 00000000 for quite a while, a lot of space can be saved by recoding that string of zeros as "8 0s." Such "lossless" compression gives up to a 50% reduction in the space or transmission bandwidth required by the data. All the audio data-reduction algorithms use some kind of more efficient packing, although there's some argument about how much. It's nice to be able to double the size of your hard disk; but for audio, even a 2:1 compression ratio is not nearly enough, given the limited bandwidth available for two-channel broadcasting or multi-channel film sound.
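The "8 0s" idea above is run-length coding in its simplest form. A minimal sketch in Python follows; the function names are mine, and real lossless packers (Stacker, or the entropy-coding stages in audio codecs) are considerably more sophisticated than this:

```python
def rle_encode(bits):
    """Run-length encode a bit string: '00000000...' becomes [('0', 8), ...]."""
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            # Extend the current run of identical bits.
            runs[-1] = (b, runs[-1][1] + 1)
        else:
            # Start a new run.
            runs.append((b, 1))
    return runs

def rle_decode(runs):
    """Reverse the encoding exactly -- no information is lost."""
    return "".join(b * n for b, n in runs)

# A quiet 16-bit sample: the eight leading zeros collapse to one pair.
word = "0000000010110011"
encoded = rle_encode(word)
assert rle_decode(encoded) == word
```

The round trip is exact, which is what distinguishes this kind of packing from the "lossy" reduction discussed below: nothing is thrown away, so the savings depend entirely on how redundant the data happens to be.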

One channel of linear 16-bit PCM demands a capacity of 705.6 kilobits per second; DCC's PASC, 192kb/s; DTS's APT X-100, 175kb/s; MD's ATRAC, 141kb/s; Dolby's AC-3, 64kb/s. To achieve these huge compression ratios requires the use of "lossy" data reduction—ie, the actual throwing-away of information. It's assumed by those who design them that, if these algorithms have been properly implemented with the right psycho-acoustic assumptions, the musical information represented by the lost data will not be missed by most listeners.
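As a back-of-the-envelope check on those figures (this is arithmetic only, and says nothing about how each codec spends its bits):

```python
# One channel of 16-bit linear PCM at the CD sampling rate of 44.1kHz.
PCM_KBPS = 44_100 * 16 / 1000  # = 705.6 kilobits per second

# Per-channel rates quoted in the text, in kb/s.
rates = {"PASC": 192, "APT X-100": 175, "ATRAC": 141, "AC-3": 64}

for name, kbps in rates.items():
    print(f"{name}: {PCM_KBPS / kbps:.1f}:1 compression")
```

The implied ratios run from roughly 3.7:1 for PASC to about 11:1 for AC-3, far beyond the 2:1 or so that lossless packing can reliably deliver.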

That's a mighty big "if." Depending on what information is discarded, the process can either be transparent, or severely degrade the original. Most who have heard the various schemes feel that PASC is pretty transparent (although DCC is effectively dead, to judge from the inactivity at EMI's Illinois duplication plant and the fact that Philips's flagship DCC-900 recorder is a close-out special in the latest Damark catalog), while ATRAC and Dolby AC-3 are not yet 100% transparent—and might never be!

The main problem with all data-reduction algorithms is that, in order to work, they must dispense with the hitherto fundamental idea of preserving the shape of the musical waveform. The continuous waveform is split into discrete segments of time, within which all timing information is sacrificed in order to perform the necessary frequency analysis. This is why repeating the algorithm results in further degradation: the chances of the reconstructed music data being split up into exactly the same time slices are infinitesimal.
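The segmentation problem can be illustrated with a toy sketch. The block size of 512 samples and the 100-sample offset are my assumptions for illustration; actual codecs use their own window sizes and overlaps:

```python
import numpy as np

def frames(signal, block=512, offset=0):
    """Split a signal into fixed-size analysis blocks starting at `offset`."""
    usable = (len(signal) - offset) // block * block
    return signal[offset:offset + usable].reshape(-1, block)

# Stand-in for a stretch of music: random samples with a fixed seed.
rng = np.random.default_rng(0)
music = rng.standard_normal(4096)

first_pass = frames(music)                # slices used by a first encoding
second_pass = frames(music, offset=100)   # a re-encode starting 100 samples later

# The two passes analyze entirely different slices of time, so the
# frequency-domain approximations made in the first pass cannot line
# up with those made in the second -- each pass adds its own errors.
assert not np.array_equal(first_pass[0], second_pass[0])
```

Unless the second encoder happens to cut the waveform at exactly the same sample boundaries as the first, it quantizes a different set of spectra, which is why cascaded lossy encodings degrade cumulatively.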

If all future music carriers except one, the CD, are to be compromised, the question arises whether the High End has a future. If the source signal's timing information is compromised, why should a Jim Thiel or a John Dunlavy continue to develop ever more time-coherent loudspeakers? And if the source signal's residual errors are above the threshold of audibility, why should electronics designers develop amplifiers that are any better than that source?

Some would point out that at least Dolby's AC-3 will be offered in conjunction with five full-range channels of information rather than two—what we lose in signal purity we might gain in spatial envelopment. And of this magazine's contributors, Corey Greenberg, J. Gordon Holt, and Peter W. Mitchell have all gone on record saying that they think that tradeoff acceptable.

I'm not convinced. For the first time in audio history, the momentum at the cutting edge of technology seems to be toward serious sonic compromise. A domestic digital system which, at best, is only just good enough, and at worst miserably fails to live up to expectations, is in danger of being supplemented with one that is deliberately compromised in quality for commercial reasons, with almost no hope of it getting significantly better.

Am I being too pessimistic? I don't think so. The topic of data reduction has received significant coverage in Stereophile. This has not been because we regard ourselves as a "mid-fi" magazine, as has been stated elsewhere, but because we feel the dangers of data reduction should be both understood and taken more seriously by the High End community than they currently appear to be.