Lossy Compression: the Sonic Dangers Page 2
His repeated demeaning references to audiophiles revealed a bitterness that went far beyond holding an opposing point of view. An example: "When the term 'audiophile' replaced 'hi-fi freak,' I immediately thought of necrophiles [sic] and pedophiles. Perhaps I wasn't far off."
Once the conference started, I was particularly interested in a series of presentations on digital audio data compression, a technique also called bit-rate reduction. These are digital audio encoding schemes in which the number of bits used to represent an analog audio signal is substantially reduced. Some such schemes result in less than a tenth the amount of data used to encode music on CDs. This is accomplished by using more efficient coding techniques, but also by throwing out "inaudible" musical information. In theory, the correctly coded wanted signal will mask (cover) the huge errors resulting from encoding with so few bits.
The word "compression" is somewhat misleading. It is possible to compress computer data for storage and later decompress it with no loss of information, but digital audio bit-rate reduction systems are data-elimination techniques that choose which information to discard based on a psychoacoustic masking model.
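The distinction can be illustrated with a minimal Python sketch. It is not how any of the coders discussed here work: the sample values are made up, and simple truncation of the low-order bits stands in for the psychoacoustic model that a real bit-rate reduction system would use to decide what to discard.

```python
import zlib

# Hypothetical 16-bit sample values standing in for a short stretch of audio.
samples = [0, 1203, -2047, 15999, -32768, 32767, 512, -7]

def to_bytes(values):
    """Pack signed 16-bit integers as little-endian bytes."""
    return b"".join(v.to_bytes(2, "little", signed=True) for v in values)

def from_bytes(data):
    """Unpack little-endian bytes back into signed 16-bit integers."""
    return [int.from_bytes(data[i:i + 2], "little", signed=True)
            for i in range(0, len(data), 2)]

# Lossless compression: compress, then decompress. Every sample returns intact.
lossless_roundtrip = from_bytes(zlib.decompress(zlib.compress(to_bytes(samples))))

def drop_low_bits(values, bits_dropped=8):
    """Crude stand-in for data elimination: zero the low-order bits.

    Assumption for illustration only; real coders choose what to discard
    with a masking model, not a fixed truncation.
    """
    return [(v >> bits_dropped) << bits_dropped for v in values]

# Data elimination: the discarded bits cannot be recovered afterwards.
lossy_roundtrip = drop_low_bits(samples)

print("lossless round trip identical:", lossless_roundtrip == samples)
print("lossy round trip identical:   ", lossy_roundtrip == samples)
```

The first comparison succeeds because nothing was thrown away; the second fails because the low-order bits are gone for good, which is the sense in which "compression" is a misleading word for these systems.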
Bit-rate reduction systems are the basis for many future audio formats including Philips's Digital Compact Cassette (DCC), Sony's Mini Disc (MD), and Digital Audio Broadcasting (DAB), a system destined to replace AM and FM radio. There is also talk about using bit-rate reduction systems during the making of master recordings. The motivation to reduce the amount of digital data representing the music signal boils down to economics and convenience: transmission and storage systems need far less bandwidth (ie, cost) than is required by conventional encoding techniques. Similarly, the physical limitations of formats like DCC and MD dictate using lower data rates.
The drive to implement these schemes raises a question of paramount interest to the audiophile and music lover: What do they do to the music?
It was thus with great interest that I attended the conference's sessions on bit-rate reduction systems. The six-hour session consisted of an overview of fundamental principles of bit-rate reduction and how they will be used in broadcasting, technical presentations on four competing bit-rate reduction systems by their respective designers, and the results of the officially sanctioned subjective testing of low bit-rate encoders.
The first presenter set the afternoon's tone: Whatever the subjective effects of reducing the bit-rate, they are more than offset by the economic advantages, he implied. Among the first benefits cited was that more audio channels could be read simultaneously from professional hard-disk recording systems. (Hard disks are rapidly replacing tape recorders in the studio.) This statement, made seconds into the presentation, confirmed my worst fears: Low bit-rate encoding will be used to make master recordings and not be confined to low-end consumer products. The paper's text is explicit in this regard: "There are also potential applications for bit-rate reduced audio in recording, where a lower data rate per channel may allow a longer recording time or a greater number of channels on a disk..."
Although the presenter mildly cautioned about professional uses of bit-rate reduction, he nevertheless said that the "effects on sound quality are not that great" at 64kb/s (kilobits per second per monaural channel), a rate 1/11th the CD's 705kb/s data rate. He also cited research that indicated "90% [of the musical information] may be discarded," and repeatedly used the word "transparent" in describing the effects of these systems on the musical signal.
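The arithmetic behind the 1/11th figure can be checked with a short Python sketch. It assumes the CD's standard 44.1kHz sampling rate and 16-bit word length, neither of which is stated in the presentation; the function name is purely illustrative.

```python
def pcm_rate_kbps(sample_rate_hz, bits_per_sample):
    """Data rate of one linear PCM channel, in kilobits per second."""
    return sample_rate_hz * bits_per_sample / 1000

# Assumed CD parameters: 44.1 kHz sampling, 16 bits per sample, one channel.
cd_mono_kbps = pcm_rate_kbps(44_100, 16)

coded_kbps = 64  # the per-channel rate cited by the presenter
ratio = cd_mono_kbps / coded_kbps
fraction_kept = coded_kbps / cd_mono_kbps

print(f"CD, one channel: {cd_mono_kbps:.1f} kb/s")
print(f"Reduction ratio at {coded_kbps} kb/s: {ratio:.1f}:1")
print(f"Share of the original bits retained: {fraction_kept:.1%}")
```

Under these assumptions the CD rate works out to 705.6kb/s per channel, so 64kb/s keeps roughly 9% of the bits, in line with the "90% may be discarded" figure.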
He also related his comparison of the coded signal with that of the ignored signal: "In passing it should be noted that the sound of the material which is rejected during data compression (that is, that which lies under the masking threshold) is most interesting to listen to, since one would expect that 85% of the original signal would sound rather more significant than it in fact does. The redundant material [!RH] has a sound very much like pumping broadband noise, following the rhythmic pattern of the original audio signal, but is really fairly inconsequential when auditioned in comparison to the transmitted 15% or so which appears to be all that is required for transparent coding" (emphasis added).
After that introduction to data compression, the next four presentations outlined the specific technical details of four different approaches. The first was on apt-X100, a broadcasting encoder that "...aims to code transparently and in real-time very high quality digital audio signals at 4 bits per sample." Its designer stated a fundamental principle of his system: "Periodic waveforms are redundant and don't need to be encoded." The apt-X100 is currently used worldwide in studio-transmitter microwave links.
Louis Fielder of Dolby Labs then presented the technical aspects of Dolby's AC-2 family of low bit-rate encoders. In contrast to all the other statements made about bit-rate reduction, Mr. Fielder took a more cautious approach to these systems. He said, "No bit-rate reduction system is transparent and no one should use that word. No audio component is transparent; 16-bit linear PCM [encoding] isn't transparent."
Further, he suggested that there are "more limitations than knowledge in our understanding of masking theory," and that single-tone masking experiments are not sophisticated enough to draw conclusions about human musical perception. Reflecting these ideas, the AC-2 encoder uses a much more conservative masking model in deciding what musical information gets encoded and what doesn't. This more cautious approach was also revealed in Louis Fielder's perceptive comment: "These systems tend to be judged on their worst-case performance rather than their average performance."
A few weeks before the conference, Dolby made an AC-2 system available to Meridian's Bob Stuart, who performed critical listening evaluations. During a break in the conference, Bob described to me his experiences with the AC-2. The system reportedly is better sonically than one would expect, rather than worse. Bob said that if you walked into a room with music playing that had been processed by the AC-2, you wouldn't know it; only in comparison with the source was the AC-2 identifiable. Further, the changes in the signal rendered by the AC-2 are indescribable since the distortions it introduces bear no relationship to the kinds of changes for which there is an existing audiophile lexicon. A new vocabulary needs to be developed to describe what bit-rate reduction systems do to the music. I was encouraged by Bob's comments, and would like the opportunity to spend some time with the AC-2 myself. The AC-2 decoder will soon be available on a (stereo) chip for seven to ten dollars.
The next presentation described MUSICAM, a form of which is the basis for PASC encoding used in Philips's DCC. The MUSICAM system has been adopted as the standard by the Eureka 147 partners for DAB, the system destined to replace AM and FM radio transmission.
MUSICAM can operate at a variety of "compression" rates, from 4:1 (192kb/s) to approximately 11:1 (64kb/s), with 128kb/s per channel chosen for DAB. If Louis Fielder was the most conservative and cautious about bit-rate reduction, Yves-François Dehery, a designer of MUSICAM, was the least. He seemed the most zealous in getting data rates even lower, using terms like "informational irrelevance" and "psychoacoustic redundancy." He also believes that "92% of the signal is redundant," and suggested that MUSICAM is "suitable" for encoding original master recordings in digital audio workstations.
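As a rough check on these figures, the following Python sketch tabulates the three MUSICAM rates against one CD channel. The 705.6kb/s reference (44.1kHz, 16 bits) is an assumption, as is treating "92% of the signal" as 92% of the bits; the helper name is hypothetical.

```python
# Assumed reference: one CD channel, 44.1 kHz x 16 bits = 705.6 kb/s.
CD_MONO_KBPS = 44_100 * 16 / 1000

def reduction_summary(coded_kbps, reference_kbps=CD_MONO_KBPS):
    """Return (ratio, percent_discarded) for a coded rate against the reference."""
    ratio = reference_kbps / coded_kbps
    percent_discarded = 100 * (1 - coded_kbps / reference_kbps)
    return ratio, percent_discarded

# The three per-channel rates mentioned for MUSICAM.
for rate in (192, 128, 64):
    ratio, discarded = reduction_summary(rate)
    print(f"{rate:>3} kb/s  {ratio:4.1f}:1  {discarded:4.1f}% of the bits removed")

# Rate implied if exactly 92% of the bits were dropped (an interpretation
# of the "92% redundant" remark, not a figure given in the presentation).
implied_kbps = CD_MONO_KBPS * (1 - 0.92)
print(f"92% removed leaves about {implied_kbps:.1f} kb/s per channel")
```

On these assumptions the nominal "4:1" at 192kb/s is closer to 3.7:1, the 128kb/s DAB rate is about 5.5:1, and discarding 92% of the bits would imply a rate somewhat below the 64kb/s floor quoted above.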