Lossy Compression: the Sonic Dangers

The audience laughed uproariously at this joke and his many other derisive attacks on those who judge audio equipment quality by listening. After these comments, he attempted to disparage the integrity of audio magazines with this remark: "You can only say that ['bits is bits'] once, which is a problem if you have to publish a hi-fi magazine every month. It leaves an intellectual vacuum."

His repeated demeaning references to audiophiles revealed a bitterness that went far beyond holding an opposing point of view. An example: "When the term 'audiophile' replaced 'hi-fi freak,' I immediately thought of necrophiles [sic] and pedophiles. Perhaps I wasn't far off."

Once the conference started, I was particularly interested in a series of presentations on digital audio data compression, a technique also called bit-rate reduction. These are digital audio encoding schemes in which the number of bits used to represent an analog audio signal is substantially reduced. Some such schemes result in less than a tenth the amount of data used to encode music on CDs. This is accomplished by using more efficient coding techniques, but also by throwing out "inaudible" musical information. In theory, the correctly coded wanted signal will mask (cover) the huge errors resulting from encoding with so few bits.
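To make the masking idea concrete, here is a deliberately crude sketch of my own devising (not any real codec's algorithm): split the signal's spectrum into bins and discard everything more than a fixed margin below the loudest content, on the theory that the loud content covers it. Real encoders use far more sophisticated, frequency-dependent masking models.

```python
# Toy illustration of masking-based data elimination -- a sketch only.
import numpy as np

def toy_masking_filter(signal, margin_db=40.0):
    """Zero out spectral bins more than `margin_db` below the peak bin."""
    spectrum = np.fft.rfft(signal)
    magnitude_db = 20 * np.log10(np.abs(spectrum) + 1e-12)
    threshold = magnitude_db.max() - margin_db  # crude global "masking" level
    spectrum[magnitude_db < threshold] = 0.0    # "inaudible" content discarded
    kept = np.count_nonzero(spectrum)
    return np.fft.irfft(spectrum, n=len(signal)), kept

# A loud 1kHz tone plus a much quieter 5kHz tone: the quiet tone vanishes.
t = np.arange(44100) / 44100
x = np.sin(2 * np.pi * 1000 * t) + 0.001 * np.sin(2 * np.pi * 5000 * t)
y, kept = toy_masking_filter(x)
print(f"spectral bins kept: {kept} of {len(x) // 2 + 1}")
```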

The word "compression" is somewhat misleading. It is possible to compress computer data for storage and later decompress it with no loss of information, but digital audio bit-rate reduction systems are data-elimination techniques that choose which information to discard based on a psychoacoustic masking model.

Bit-rate reduction systems are the basis for many future audio formats, including Philips's Digital Compact Cassette (DCC), Sony's Mini Disc (MD), and Digital Audio Broadcasting (DAB), a system destined to replace AM and FM radio. There is also talk of using bit-rate reduction systems during the making of master recordings. The motivation to reduce the amount of digital data representing the music signal boils down to economics and convenience: transmission and storage systems need far less bandwidth (ie, cost) than is required by conventional encoding techniques. Similarly, the physical limitations of formats like DCC and MD dictate lower data rates.

The drive to implement these schemes raises the question of paramount interest to the audiophile and music lover: What do they do to the music?

It was thus with great interest that I attended the conference's sessions on bit-rate reduction systems. The six-hour session consisted of an overview of the fundamental principles of bit-rate reduction and how they will be used in broadcasting; technical presentations on four competing bit-rate reduction systems by their respective designers; and the results of the officially sanctioned subjective testing of low–bit-rate encoders.

The first presenter set the afternoon's tone: Whatever the subjective effects of reducing the bit-rate, they are more than offset by the economic advantages, he implied. Among the first benefits cited was that more audio channels could be read simultaneously from professional hard-disk recording systems. (Hard disks are rapidly replacing tape recorders in the studio.) This statement, made seconds into the presentation, confirmed my worst fears: Low bit-rate encoding will be used to make master recordings and not be confined to low-end consumer products. The paper's text is explicit in this regard: "There are also potential applications for bit-rate reduced audio in recording, where a lower data rate per channel may allow a longer recording time or a greater number of channels on a disk..."

Although the presenter mildly cautioned about professional uses of bit-rate reduction, he nevertheless said that the "effects on sound quality are not that great" at 64kb/s (kilobits per second per monaural channel), a rate 1/11th the CD's 705kb/s data rate. He also cited research that indicated "90% [of the musical information] may be discarded," and repeatedly used the word "transparent" in describing the effects of these systems on the musical signal.
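The arithmetic behind those figures: a CD carries 16-bit samples at a 44.1kHz sampling rate, so each monaural channel requires 44,100 x 16 = 705,600 bits per second.

```python
cd_rate = 44_100 * 16          # 705,600 bits/s per mono CD channel
reduced_rate = 64_000          # the 64kb/s rate cited by the presenter
print(cd_rate / reduced_rate)  # ~11.0 -- hence "1/11th the CD's data rate"
```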

He also related his comparison of the coded signal with that of the ignored signal: "In passing it should be noted that the sound of the material which is rejected during data compression (that is, that which lies under the masking threshold) is most interesting to listen to, since one would expect that 85% of the original signal would sound rather more significant than it in fact does. The redundant material [!—RH] has a sound very much like pumping broadband noise, following the rhythmic pattern of the original audio signal, but is really fairly inconsequential when auditioned in comparison to the transmitted 15% or so which appears to be all that is required for transparent coding" (emphasis added).

After that introduction to data compression, the next four presentations outlined the specific technical details of four different approaches. The first was on apt-X100, a broadcasting encoder that "...aims to code transparently and in real-time very high quality digital audio signals at 4 bits per sample." Its designer stated a fundamental principle of his system: "Periodic waveforms are redundant and don't need to be encoded." The apt-X100 is currently used worldwide in studio-transmitter microwave links.
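That claim about periodicity is the intuition behind predictive coding: if the decoder can predict the next sample from the previous ones, only the small prediction error needs to be transmitted. The sketch below uses a trivial two-tap linear extrapolator on a pure tone; it illustrates the general principle only, not apt-X100's actual algorithm.

```python
# Predictive coding in miniature: for a periodic signal, the prediction
# residual is far smaller than the signal itself, so it needs fewer bits.
import numpy as np

t = np.arange(2000) / 44100
signal = np.sin(2 * np.pi * 440 * t)        # a periodic 440Hz tone

predicted = 2 * signal[1:-1] - signal[:-2]  # linear extrapolation from 2 taps
residual = signal[2:] - predicted           # what actually needs encoding

print(f"signal RMS:   {np.sqrt(np.mean(signal ** 2)):.4f}")    # ~0.71
print(f"residual RMS: {np.sqrt(np.mean(residual ** 2)):.6f}")  # ~0.003
```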

Louis Fielder of Dolby Labs then presented the technical aspects of Dolby's AC-2 family of low–bit-rate encoders. In contrast to all the other statements made about bit-rate reduction, Mr. Fielder took a more cautious approach to these systems. He said, "No bit-rate reduction system is transparent and no one should use that word. No audio component is transparent—16-bit linear PCM [encoding] isn't transparent."

Further, he suggested that there are "more limitations than knowledge in our understanding of masking theory," and that single-tone masking experiments are not sophisticated enough to draw conclusions about human musical perception. Reflecting these ideas, the AC-2 encoder uses a much more conservative masking model in deciding what musical information gets encoded and what doesn't. This more cautious approach was also revealed in Louis Fielder's perceptive comment: "These systems tend to be judged on their worst-case performance rather than their average performance."

A few weeks before the conference, Dolby made an AC-2 system available to Meridian's Bob Stuart, who performed critical listening evaluations. During a break in the conference, Bob described to me his experiences with the AC-2. The system reportedly sounds better than one would expect, not worse. Bob said that if you walked into a room where music processed by the AC-2 was playing, you wouldn't know it; only in comparison with the source was the AC-2 identifiable. Further, the changes the AC-2 renders in the signal are indescribable, since the distortions it introduces bear no relationship to the kinds of changes for which there is an existing audiophile lexicon. A new vocabulary needs to be developed to describe what bit-rate reduction systems do to the music. I was encouraged by Bob's comments, and would like the opportunity to spend some time with the AC-2 myself. The AC-2 decoder will soon be available on a (stereo) chip for seven to ten dollars.

The next presentation described MUSICAM, a form of which is the basis for the PASC encoding used in Philips's DCC. The MUSICAM system has been adopted by the Eureka 147 partners as their standard for DAB.

MUSICAM can operate at a variety of "compression" rates, from 4:1 (192kb/s) to approximately 11:1 (64kb/s), with 128kb/s per channel chosen for DAB. If Louis Fielder was the most conservative and cautious about bit-rate reduction, Yves-François Dehery, a designer of MUSICAM, was the least. He seemed the most zealous in getting data rates even lower, using terms like "informational irrelevance" and "psychoacoustic redundancy." He also believes that "92% of the signal is redundant," and suggested that MUSICAM is "suitable" for encoding original master recordings in digital audio workstations.
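As an aside, the quoted ratios appear to mix two reference rates; this is my inference, not something stated in the presentation: 4:1 is exact against a 48kHz/16-bit studio channel (768kb/s), while 11:1 matches the CD's 705.6kb/s.

```python
studio_rate = 48_000 * 16     # 768,000 bits/s per 48kHz studio channel
cd_rate = 44_100 * 16         # 705,600 bits/s per CD channel
print(studio_rate / 192_000)  # 4.0    -> the "4:1" figure
print(cd_rate / 64_000)       # 11.025 -> the "approximately 11:1" figure
```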

COMMENTS
rpali

The editor's introduction incorrectly includes Monkey's Audio in the list of lossy compression methods. It is indeed lossless.

Rick.

John Atkinson

rpali wrote:
The editor's introduction incorrectly includes Monkey's Audio in the list of lossy compression methods. It is indeed lossless.

You are correct. Brain fart on my part. I have removed it from the list of lossy codecs.

John Atkinson

Editor, Stereophile

deftoejam

An article on 20+ year old lossy codecs, really? Is this the internet wayback machine?

For the good of high end audio, it's time to retire, Robert.  Stereophile, you can do WAY better for an article on lossy codecs.  The only danger is staying the course with Robert and his discussions of old codecs.

You wanna attract a larger high end market?  Write about how LAME 3.99.5 V2 or better is audibly essentially transparent with music compared to lossless.  Or write about OPUS 1.1 for low bitrate uses.  Or how spending wisely on a system like foobar2000 or JRiver playing FLAC or 3.99.5 V0 with a $30 Behringer DAC with a $60 calibrated microphone and REW, using Blue Jeans cables, will allow more funds for better, really cool speakers, multiple subwoofers and room treatments.

John Atkinson

deftoejam wrote:
An article on 20+ year old lossy codecs, really? Is this the internet wayback machine?

My goal is eventually to have everything that was published in Stereophile available in our free on-line archives. I thought this article from 22 years ago would be of interest.

deftoejam wrote:
Write about how LAME 3.99.5 V2 or better is audibly essentially transparent with music compared to lossless.  Or write about OPUS 1.1 for low bitrate uses.

With hard-drive prices at an all-time low and fat "pipes" becoming the domestic norm, why would anyone need their music encoded with a lossy codec at all if they intend to listen to that music seriously? Despite advances in codec technology, there has yet to be a lossy codec that is transparent to all people at all times on all systems with all types of music.

So why not just use FLAC or Apple Lossless for your music library and forget about the possible sonic compromises of a lossy codec, other than where it makes something possible that would otherwise be impossible, such as listening to a live concert from the UK's BBC 3 via the Internet?

John Atkinson

Editor, Stereophile

SAAudio

Did you not read John's intro? Did you not notice the published date of Dec 1, 1991? Robert has not written for Stereophile for years. He is the editor of another magazine.

dalethorn

Great article and essential reading for manufacturers and discerning consumers (i.e., audiophiles). Or as someone once said, "Those who don't have a grasp on the history of their circumstances are likely to repeat some of its mistakes."

After all, as in the case of LAME 3.99xyz, how many times have we heard that since the beginning of digital? I convert my FLAC tracks to 320k MP3s for playing outdoors and on public transport, but I don't lose the originals. As Cookie Marenco has advised, it's best to get the earliest master possible, even in digital.

Archimago

Interesting historical reference. But why is this "essential reading"? Nothing wrong with the contents, but in 2013 I think it's fair to say that we've all experienced it (MP3, AAC, Ogg Vorbis...), and there's really no big "danger" here, so let's not be too emotional about all of this.

I think many of us feel that 320kbps is indistinguishable quality-wise from lossless FLAC or ALAC, but that's different from advocating wholesale conversion of music archives to a lossy format! It does, however, at least help put our expectations into perspective, and when/if I need to go portable with my music, conversion to MP3 isn't treated hysterically as if it were some kind of "big deal". No great danger, no boogeyman, no monster.

This article talks about listening tests using MP3 at 128kbps. I think we're all quite aware that this bitrate is not enough to ensure CD-equivalent sonic integrity, and as far as I'm aware, no commercial service has sold music at this resolution for years... I suppose it's still used for streaming radio, but again, no audio lover I know would mistake this bitrate for true high fidelity.

Better to keep the attention on dynamics squashed by the "loudness wars" than to harp on the minimal effects of high-bitrate lossy compression these days.

dalethorn

The reason attention would be prioritized on lossy files over the loudness wars is that the former are permanent: the files get archived. Worse, I think, is that lossy files (on average) are more likely to be married to loudness adjustments than lossless files. Your experience could vary a lot, and I can't declare a hard-and-fast rule on any of this, but just ask the question around audiophile communities: "If loudness increases bother you, would you be more likely to find those loudness increases in lossy or lossless media?"

Archimago

I see no difference over the years between the loudness of MP3s vs. CD rips. If the original master is loud, it's loud. It's not like record companies release two versions - less compressed for the CD release, louder for Amazon/iTunes - as far as I can tell. It's the vinyl releases that tend to be less loud, in large part due to limitations of the technology.

Again, nobody's advocating archiving with a lossy process. MP3 works and serves its purpose with minimal if any perceivable sound degradation to human ears/brains at commonly used (256/320) bitrates in 2013.

John Atkinson

Archimago wrote:
Again, nobody's advocating archiving with a lossy process...

Back at the time this article was published, they were :-(

John Atkinson

Editor, Stereophile

malvrich

In the '80s we knew that cassette was an easily audible downgrade from vinyl, but many, if not most, of us were happy to record our albums and play the cassettes, not only in our cars but often at home, just to preserve the condition of the albums.

Music_Guy

What we are trying to do is capture a musical performance perfectly, store it permanently, and reproduce it perfectly later on.

There have always been constraints on the process, like size, cost, and the technological state of the art. In the digital domain, lossy compression was a response to some of those constraints, like storage space and transmission bandwidth.

If you want lots of music in a small digital player, or you want to stream music over limited-bandwidth media, you still need lossy compression of the digital source material. But as constraints fall away, we can usher in a new era of ever-increasing reproduction fidelity in the digital domain. Lossy compression will seem quaint in a few years.

I hope the recorders of the performance will step up and improve the quality of their source files (dynamic range, distortion, etc.). We as "audiophiles" can still pursue our pastime of reproducing those source files as accurately/musically as possible.

acuvox

The lossy codec tests suffer from the same flaw that all audiology has had since the 1930s, when researchers started using a vacuum-tube sine-wave oscillator: the test subjects listen to music predominantly, if not exclusively, through speakers. This means their psychoacoustic processing is acclimated to the temporal and spatial distortions of speakers. In fact, it appears from the description that the "expert listeners" are people who listen to speakers for a living.

Professional acoustic musicians hear differently. In particular, they hear phase and have a much richer perception of acoustic space. The latest research indicates that they have far more interconnecting neurons, especially traversing the corpus callosum, and that the increased neurogenesis is driven by focused listening to acoustic sounds in childhood. These are the only "expert listeners" to music (as opposed to reproduction) in the industrialized world.

I have been working with conservatory-trained musicians who have heard acoustic music at least two hours a day since childhood, and they agree that MP3, AAC, and internet-streaming codecs are not merely detectable but unlistenable. Even at 320k, I fatigue in under half an hour and take several hours to recover.

The information lost in bit reduction is largely the low-level discrete multi-bounce echoes that illuminate the space where the recording took place. I take exception to the test tracks on that basis.

Two are processed pop recordings made with mono, close-miked, deliberately colored center-electrode "vocal microphones" in a dead studio environment; two are artificial bass instruments with no acoustic reference. Of the transient signals, the fireworks were undoubtedly recorded at a large distance, which acts as an acoustic peak limiter by absorbing high frequencies and phase-shifting the remaining spectrum, and they carry no acoustic-space reference like a rectangular room. Besides, nobody hears fireworks often enough to remember what they sound like. The glockenspiel and castanet tracks likely share some of these characteristics.

Modern trumpet is typically played legato, with no transients to serve as time markers for echo decoding. That leaves the one jazz track, which I don't know, and male speech. If the male speech were recorded in near-coincident stereo in a reverberant environment it might indicate something, but that is unlikely.

Test signals should be pure acoustic recordings with no processing, complex and yet clear, like a chamber orchestra with one player per part. Staccato technique is essential to exercise the codec's response to real acoustics, including staccato acoustic bass instruments. Further, the test signal should be a voice with which the test subject is acclimated by recent experience (less than 24 hours). My favorite test signal is harpsichord, because it is the closest acoustic equivalent to a Dirac impulse and I hear it daily.
