MP3 vs AAC vs FLAC vs CD

As Wes Phillips recently reported on this website, CD sales are down and legal downloads of audio files are up. Stereophile has been criticized more than once for not paying enough attention to the subjects of MP3 and other compressed file formats, such as AAC, and for offering no guidance at all to readers about how to get the best sound quality from compressed downloads.

These criticisms are correct. We don't.

The reason is simple: Although they are universally described in the mainstream press as being of "CD quality," MP3s and their lossy-compressed ilk do not offer sufficient audio quality for serious music listening. This is not true of lossless-compressed formats such as FLAC, ALAC, and WMA lossless; in fact, it was the release of iTunes 4.5, in April 2004, which allowed iPods to play lossless files, that led us to welcome the ubiquitous Apple player to the world of high-end audio. But lossy files achieve their conveniently small size by discarding too much of the music to be worth considering.

In the past, we have discussed at length the reasons for our dismissal of MP3 and other lossy formats, but recent articles in the mainstream press promoting MP3 (examined in Michael Fremer's "The Swiftboating of Audiophiles") make the subject worth re-examining.

Lossless vs Lossy
The file containing a typical three-minute song on a CD is 30–40 megabytes in size. A 4-gigabyte iPod could therefore contain just 130 or so songs—say, only nine CDs' worth. To pack a useful number of songs onto the player's drive or into its memory, some kind of data compression needs to be used to reduce the size of the files. This will also usefully reduce the time it takes to download the song.
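
As a rough check on those figures, here is a minimal Python sketch of the arithmetic. It assumes the standard CD parameters (44.1kHz sampling, 16 bits, two channels) and decimal megabytes and gigabytes; the function names are illustrative only.

```python
# Rough size of uncompressed CD audio, assuming standard CD parameters.
SAMPLE_RATE_HZ = 44_100   # samples per second, per channel
BITS_PER_SAMPLE = 16
CHANNELS = 2


def pcm_bytes(seconds: float) -> float:
    """Bytes needed for `seconds` of uncompressed stereo CD audio."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS * seconds / 8


def songs_that_fit(capacity_bytes: float, song_bytes: float) -> int:
    """Whole songs of a given size that fit in a given capacity."""
    return int(capacity_bytes // song_bytes)


three_minute_song = pcm_bytes(180)
print(three_minute_song / 1e6)                  # about 31.8 (MB)
print(songs_that_fit(4e9, three_minute_song))   # 125
```

The result, 125 three-minute songs on a nominal 4GB player, is in line with the "130 or so" quoted above.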

Lossless compression is benign in its effect on the music. It is akin to LHA or WinZip computer data crunchers in packing the data more efficiently on the disk, but the data you read out are the same as went in. The primary difference between lossless compression for computer data and for audio is that the latter permits random access within the file. (If you had to wait to unZip the complete 400MB file of a CD's content before you could play it, you would rapidly abandon the whole idea.) You can get reduction in file size to 40–60% of the original with lossless compression—the performance of various lossless codecs is compared here and here—but that increases the capacity of a 4GB iPod to only 300 songs, or 20 CDs' worth of music. More compression is necessary.
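
The "same out as went in" property can be checked directly. The sketch below uses Python's general-purpose zlib compressor as a stand-in, since it behaves like the WinZip-style crunchers mentioned above; it is not an audio codec such as FLAC, and the test tone is an assumption chosen only for illustration.

```python
import math
import struct
import zlib

# One second of a 1kHz tone as 16-bit mono PCM (a stand-in for real audio).
samples = [round(10_000 * math.sin(2 * math.pi * 1000 * n / 44_100))
           for n in range(44_100)]
pcm = struct.pack(f"<{len(samples)}h", *samples)

packed = zlib.compress(pcm, 9)      # pack the data more efficiently
restored = zlib.decompress(packed)  # unpack it again

print(len(packed) / len(pcm))   # fraction of the original size
print(restored == pcm)          # True: every bit comes back
```

A steady tone compresses far better than real music would, so the size ratio printed here says nothing about what FLAC or ALAC achieve on actual recordings; only the bit-for-bit round trip is the point.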

The MP3 codec (for COder/DECoder) was developed at the end of the 1980s and adopted as a standard in 1991. As typically used, it reduces the file size for an audio song by a factor of 10; eg, a song that takes up 30MB on a CD takes up only 3MB as an MP3 file. Not only does the 4GB iPod now hold well over 1000 songs, each song takes less than 10 seconds to download on a typical home's high-speed Internet connection.
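
The MP3 figures can be checked the same way. This sketch assumes a constant 128kbps bit rate, a three-minute song, and a 3Mbps connection; the link speed in particular is an assumption, as the article does not give one.

```python
def cbr_bytes(kbps: float, seconds: float) -> float:
    """File size in bytes for constant-bit-rate audio (ignoring tags and headers)."""
    return kbps * 1000 * seconds / 8


def download_seconds(size_bytes: float, link_mbps: float) -> float:
    """Idealized transfer time over a link of `link_mbps` megabits per second."""
    return size_bytes * 8 / (link_mbps * 1e6)


mp3_song = cbr_bytes(128, 180)
print(mp3_song / 1e6)                    # 2.88 (MB)
print(int(4e9 // mp3_song))              # 1388 songs on a nominal 4GB player
print(download_seconds(mp3_song, 3.0))   # about 7.7 seconds at the assumed 3Mbps
```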

But you don't get something for nothing. The MP3 codec, and others that achieve similar reductions in file size, are "lossy"; ie, of necessity they eliminate some of the musical information. The degree of this degradation depends on the data rate. Fewer bits always equals less music.

As a CD plays, the two channels of audio data (not including overhead) are pulled off the disc at a rate of just over 1400 kilobits per second. A typical MP3 plays at less than a tenth that rate, at 128kbps. To achieve that massive reduction in data, the MP3 coder splits the continuous musical waveform into discrete time chunks and, using Transform analysis, examines the spectral content of each chunk. Assumptions are made by the codec's designers, on the basis of psychoacoustic theory, about what information can be safely discarded. Quiet sounds with a similar spectrum to loud sounds in the same time window are discarded, as are quiet sounds that are immediately followed or preceded by loud sounds. And, as I wrote in the February 2008 "As We See It," because the music must be broken into chunks for the codec to do its work, transient information can get smeared across chunk boundaries.
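
To put the size of that reduction in perspective, the short sketch below works out how many bits per audio sample a 128kbps stream has to spend, on average, compared with the CD's 16. It assumes standard CD parameters, and the names are illustrative.

```python
CD_BITS_PER_SECOND = 44_100 * 16 * 2   # 1,411,200: two channels, no overhead
MP3_BITS_PER_SECOND = 128_000


def average_bits_per_sample(bits_per_second: float,
                            sample_rate: int = 44_100,
                            channels: int = 2) -> float:
    """Average bit budget per sample, per channel, at a given data rate."""
    return bits_per_second / (sample_rate * channels)


print(CD_BITS_PER_SECOND / MP3_BITS_PER_SECOND)       # about 11: "less than a tenth"
print(average_bits_per_sample(CD_BITS_PER_SECOND))    # 16.0
print(average_bits_per_sample(MP3_BITS_PER_SECOND))   # about 1.45
```

An MP3 encoder does not literally store 1.45-bit samples; it codes spectral data per chunk. The average simply shows how little room is left once the rate has been cut this far.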

Will the listener miss what has been removed? Will the smearing of transient information be large enough to mess with the music's meaning? As I wrote in a July 1994 essay, "if these algorithms have been properly implemented with the right psycho-acoustic assumptions, the musical information represented by the lost data will not be missed by most listeners.

"That's a mighty big 'if.'"

And while lossy codecs differ in the assumptions made by their designers, all of them discard—permanently—real musical information that would have been audible to some listeners with some kinds of music played through some systems. These codecs are not, in the jargon, "transparent," as can be demonstrated in listening tests (footnote 1).

So to us at Stereophile, the question of which lossy codec is "the best" is moot. We recommend that, for serious listening, our readers use uncompressed audio file formats, such as WAV or AIF, or, if file size is an issue because of limited hard-drive space, a lossless format such as FLAC or ALAC. These will be audibly transparent to all listeners at all times with all kinds of music through all systems.

Putting Codecs to the Test
Do I have any evidence for that emphatic statement?

For an article published in the March 1995 issue of Stereophile, I measured the early PASC, DTS, and ATRAC lossy codecs and put four of the test signals used for that article on our Test CD 3 (Stereophile STPH006-2). For the present article, I used two of those signals, tracks 25 and 26 on Test CD 3. But first, to set a basis for comparison, I used that most familiar of test signals: a 1kHz tone.

The spectrum of this tone, played back from CD, is shown in fig.1. The tone is the sharply defined vertical green line at the left of the graph. There are no other vertical lines present, meaning that the tone is completely free from distortion. Across the bottom of the graph, the fuzzy green trace shows that the background noise is uniformly spread out across the audioband, up to the 22kHz limit of the CD medium. This noise results from the 16-bit Linear Pulse Code Modulation (LPCM) encoding used by the CD medium. Each frequency component of the noise lies around 132dB below peak level; if these are added mathematically, they give the familiar 96dB signal/noise ratio that you see in CD-player specifications.

Fig.1 Spectrum of 1kHz sinewave at –10dBFS, 16-bit linear PCM encoding (linear frequency scale, 10dB/vertical div.).
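
The relationship between the per-component noise level and the overall signal/noise ratio can be sketched numerically. Noise powers in separate frequency bins add, so the total is the per-bin level plus 10·log10 of the number of bins. The article does not state how many bins the analysis used; the 4096 below is purely an assumption.

```python
import math


def summed_level_db(per_bin_db: float, bins: float) -> float:
    """Total level when `bins` equal, uncorrelated noise components are added."""
    return per_bin_db + 10 * math.log10(bins)


def bins_needed(per_bin_db: float, total_db: float) -> float:
    """Number of equal components that would sum from per_bin_db to total_db."""
    return 10 ** ((total_db - per_bin_db) / 10)


print(round(bins_needed(-132, -96)))   # 3981
print(summed_level_db(-132, 4096))     # about -95.9
```

In other words, a floor of about -132dB per component is consistent with a 96dB overall figure if the spectrum is made up of roughly 4000 components.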

Fig.2 shows the spectrum of this tone after it has been converted to an MP3 at a constant bit rate of 128kbps. (The MP3 codec I used for this and all the other tests was the Fraunhofer, from one of the original developers of the MP3 technology.) The 1kHz tone is now represented by the dark red vertical line at the left of the graph. Note that it has acquired "skirts" below –80dB. These result, I believe, from the splitting of the continuous data representing the tone into the time chunks mentioned above, which in turn results in a very slight uncertainty about the exact frequency of the tone. Note also that the random background noise has disappeared entirely. This is because the encoder is basically deaf to frequency regions that don't contain musical information. With its very limited "bit budget," the codec concentrates its resources on regions where there is audio information. However, a picket fence of very-low-level vertical lines can be seen. These represent spurious tones that result, I suspect, from mathematical limitations in the codec. Like the skirts that flank the 1kHz tone, these will not be audible. But they do reveal that the codec is working hard even with this most simple of signals.

Fig.2 Spectrum of 1kHz sinewave at –10dBFS, MP3 encoding at 128kbps (linear frequency scale, 10dB/vertical div.).
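
The general idea that chopping a steady tone into a finite chunk blurs its frequency can be illustrated with a plain discrete Fourier transform. This is only a sketch of the principle: it uses a single abruptly cut 256-sample chunk, not the overlapping, windowed filterbank an MP3 encoder actually uses, and the chunk length and tone frequencies are assumptions.

```python
import cmath
import math


def spectrum_db(signal):
    """Magnitude spectrum of one chunk, in dB relative to its own peak."""
    n = len(signal)
    mags = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(signal))
        mags.append(abs(acc))
    peak = max(mags)
    # Floor the ratio so that exact zeros do not break the logarithm.
    return [20 * math.log10(max(m / peak, 1e-15)) for m in mags]


def tone(cycles_per_chunk: float, n: int = 256):
    """A sinewave completing `cycles_per_chunk` cycles within n samples."""
    return [math.sin(2 * math.pi * cycles_per_chunk * i / n) for i in range(n)]


fits_exactly = spectrum_db(tone(16.0))   # whole number of cycles in the chunk
cut_mid_cycle = spectrum_db(tone(16.5))  # chunk boundary falls mid-cycle

print(fits_exactly[40])    # hundreds of dB down: numerical noise only
print(cut_mid_cycle[40])   # a "skirt" only a few tens of dB below the peak
```

Real codecs taper and overlap their chunks precisely to push such skirts far lower than this crude example shows.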

But what about when the codec is dealing not with a simple tone, but with music? One of the signals I put on Test CD 3 (track 25) simulates a musical signal by combining 43 discrete tones with frequencies spaced 500Hz apart. The lowest has a frequency of 350Hz, the highest 21.35kHz. This track sounds like a swarm of bees, but more important for a test signal, it readily reveals shortcomings in codecs, as spuriae appear in the spectral gaps between the tones.
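
A signal of this kind is simple to describe in code. The sketch below builds the list of 43 frequencies and sums equal-amplitude sinewaves; the amplitudes and phases of the tones on the actual test track are not given in the article, so equal amplitude and zero phase are assumptions.

```python
import math


def multitone_frequencies(lowest: float = 350.0,
                          spacing: float = 500.0,
                          count: int = 43):
    """Frequencies in Hz of the tones making up the test signal."""
    return [lowest + spacing * i for i in range(count)]


def multitone_sample(t: float, freqs) -> float:
    """Sum of equal-amplitude sines at time t (seconds), scaled to stay within +/-1."""
    return sum(math.sin(2 * math.pi * f * t) for f in freqs) / len(freqs)


freqs = multitone_frequencies()
print(len(freqs), freqs[0], freqs[-1])   # 43 350.0 21350.0

# A tenth of a second at the CD sampling rate.
signal = [multitone_sample(n / 44_100, freqs) for n in range(4_410)]
```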



Footnote 1: Something I have rarely seen discussed is the fact that, because all compressed file formats, both lossless and lossy, effectively have zero data redundancy, they are much more vulnerable than uncompressed files to bit errors in transmission.
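
The vulnerability described in the footnote can be demonstrated with a small experiment. The sketch below again uses zlib as a stand-in for a compressed format, and pseudo-random low-level samples as a stand-in for audio; both are assumptions, and real audio codecs fail in their own ways (a dropped or garbled frame, for example) rather than exactly as zlib does.

```python
import random
import struct
import zlib


def flip_bit(data: bytes, byte_index: int, bit: int = 0) -> bytes:
    """Return a copy of `data` with a single bit inverted."""
    out = bytearray(data)
    out[byte_index] ^= 1 << bit
    return bytes(out)


rng = random.Random(0)
samples = [rng.randint(-200, 200) for _ in range(10_000)]
pcm = struct.pack(f"<{len(samples)}h", *samples)
packed = zlib.compress(pcm)

# Uncompressed: one flipped bit damages exactly one sample.
damaged = struct.unpack(f"<{len(samples)}h", flip_bit(pcm, 5_000))
changed = sum(a != b for a, b in zip(samples, damaged))
print(changed)   # 1

# Compressed: flip one bit at each of several places in the stream.
offsets = range(0, len(packed), max(1, len(packed) // 20))
failures = 0
for offset in offsets:
    try:
        intact = zlib.decompress(flip_bit(packed, offset)) == pcm
    except zlib.error:
        intact = False   # the stream no longer decodes at all
    failures += not intact
print(failures, "of", len(offsets), "single-bit errors broke the file")
```

In the uncompressed case the error stays confined to one sample; in the compressed case a single flipped bit can make the decoder reject the data or misread what follows it.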

COMMENTS
da5id

I would expect to find reputable audiophile discussions to be well-thought out and current with the technology at issue. This article fails to meet those expectations. Its conclusions are stated succinctly:

"If you want the maximum number of files on your iPod, therefore, you take less of a quality hit if you use AAC encoding than if you use MP3. But "CD quality"? Yeah, right! "

The hard-drive-based Classic iPod is still sold. It is, not unexpectedly, smaller than and technologically superior to the original, and it supports a proprietary Apple lossless format. It "holds" 260 GB. Your strawman has 4 GB. Almost no one makes a decent MP3 player with only 4 GB. In any event, all of the solid-state ones manufactured by reputable companies go up to 64 GB. All of these players support FLAC, either natively or with an app.

Now let us examine the author's numbers applied to existing technology. We'll use the worst-case scenario for FLAC, 60%, on an 800 MB CD. That equals 480 MB per CD and we'll round that up to 500 MB for easy calculations going forward. Those numbers would yield 128 CDs on a run-of-the-mill 64 GB flash drive based MP3 player and 520 CDs for the Classic iPod. And they can be swapped out every night, all ready for the next morning, with whatever you want. I'm sure there is an app for that.

I read an interesting article in Slate by an audiophile bemoaning not the easy acceptance of inferior MP3s but, rather, the general lack of appreciation of live music, and the failing, even fetish, of audiophiles who obsess over things like $1600 phono cartridges at the expense of that music.

– Gene Girard

PS: You can find all the technical detail you want, far exceeding your own, in the Wikipedia articles on the topics of FLAC and Ogg Vorbis. FLAC is not the only, let alone optimal, bit-for-bit lossless compression technology.

drblank

You took out a small sound bite from an article and you put down the entire article? Go read the entire article a couple of times.

Your comment on the article seems like you didn't read or comprehend the article in the first place.

fergusof

Thanks for the great article. It explains a lot about various audio formats that I didn't know.

Apparently, someone named da5id 'expects' to find 'reputable' audiophile discussions to be 'well-thought out' and 'current.' Perhaps da5id should note that the article is only current to 2008. Even so, perhaps da5id could give credit where credit is due and thank the author for spending much time and effort on this article instead of doing a 'drive-by' flaming. Da5id, it seems, doesn't understand basic politeness and seems to feel that his 'expert' knowledge allows him to insult anyone whose knowledge, to him at least, doesn't compare to his. If you disagree with anything in Mr. Atkinson's article, then, by all means, initiate a polite discourse. But don't act like a know-it-all loudmouth. A loud Bronx cheer to Mr. da5id.

Archimago

Anyone interested in participating in an online survey of high bitrate MP3 vs. the original CD audio, come check out:

http://archimago.blogspot.com

Instructions and samples to download and listen to!

Take it to the next level beyond reading about it and put your ears, brain, equipment to the test... Taking survey submissions until the end of January 2013.

Archimago

Happy New Year everyone!

A reminder with plenty of time to go before closing the data collection.

--- Originally posted on AudioAsylum ---
Since opening the high bitrate MP3 vs. CD test on Dec. 11th, I have received 41 responses so far to the detailed survey. I will not analyze the data until the end of January, but just eyeballing the spread of results (Set A, B, "same") is already quite interesting. The respondents have come from 4 continents so far and reading some of the comments, I really appreciate the time people have put into this!

Furthermore, it's great to see a nice spread of equipment used from inexpensive (but good) headphone gear all the way to megabuck $50K+ systems.

Although we "shoot the breeze" around here and have great discussions around the hardware (sometimes inflaming arguments), it is infrequent that we actually do something like this where we stand up and be counted based on the actual experience of listening. I know that this isn't strictly "scientific" and many variables cannot be controlled in an open test like this, but for us "non-pro's", this could be the closest we get to participating in something which I hope is educational and (hopefully) fun as a hobbyist beyond theoretical discussions.

If you haven't given this a try, I encourage you to take some time and give it a shot. Be involved in a simple "blind test" (perhaps the only time in one's life) knowing you've tried something like this and contributed to the data set (whether one believes it's significant or not).

Thanks again AbeC. for hosting the fast link! Much appreciated, bro.

PS: One request - could some of you who participate in audiophile discussions in Asia (India, China, Japan, Korea, Malaysia, Singapore, Indonesia, Philippines, etc...) spread the test around. Would love to get some data from those folks!

Get the test here:

http://archimago.blogspot.com

Archimago

Test complete with 151 respondents! Results up on the blog...  I think many would find the results surprising.

KUppiano

This is an interesting article, and the tests are worthy of consideration, but we have to consider why someone would use MP3 or AAC over FLAC or raw PCM to store their music. Certainly, many users have space constraints, but they still want to listen to their tunes on their desktop, laptop or portable machine. 30 years ago, they would have used a cassette machine, such as a Sony Walkman. 

Fast forward to MP3. Is there a perceptible difference from the original sound? Sometimes, if the bit rate is low enough, and/or if the listening environment or equipment is good enough. But once you get above 128 Kbps, those differences become quite insignificant. Not for all program material, not for all listeners, and all circumstances of course. But in general. I found cassettes to be virtually unlistenable, but at reasonable bit rates, I can usually enjoy listening to MP3 and AAC.

I would suggest that you run the same tests that you made with MP3s on cassette tape, and compare those results to MP3 or AAC. I think the digital formats even with all their faults, would win hands down. Wow, flutter, and frequency response -- as well as noise and distortion, would be much worse on cassette tape. Perspective, perspective, perspective.

You might also consider that, although the charts in your article look dramatically different, showing obvious disturbances in the force, the perceptual coders are just that: perceptual. You should expect to see differences when information is discarded. That's a given, and the charts will reflect that. The researchers who developed the algorithms worked very hard to minimize the perceptual trade-off. They did pretty well with MP3, and got much better with AAC. 

Ultimately, the question is, does it matter perceptually? And the answer is, "it depends". It depends on the bit rate, it depends on the perceptual model, it depends on the algorithm, it depends on the source material, it depends on the listening environment and the equipment, and most importantly perhaps, it depends on the listener - his tolerance for whatever the distortion is, and his skill at recognizing it.

I made a slightly different study, that I published on my blog. I subtracted the MP3 from the original sound in WAV format. The result was the difference between the original sound and the bit-rate-reduced version. It was quite interesting, and maybe a different way to look at this issue. 

Finally, with all types of storage prices dropping rapidly, I wonder how much longer perceptual coders such as MP3 or AAC will even matter (they might for downloading or streaming over the Internet, but even there, bandwidth is increasing too). We may continue to use perceptual coders, but if we can run them at 256 Kbps or even 512 Kbps or greater, will the losses even matter?

garyding2003

I did an interesting test. I played the same song in three formats (APE, FLAC, and 320kbps MP3) on my computer (Gigabyte motherboard with Realtek 889 onboard audio), an iTouch 4, and a Sony Dej011 portable player. My headphones are Creative Aurvana Live.

The CD player has the best sound quality: warm, very clear, and full of detail. Second is the iTouch 4: clear and full of detail, but dry. Last is my computer: dry, lacking in detail, and with a little distortion, even when playing APE or FLAC files.

supra-mp3

What I do is remaster the original, or not-too-bad, files before I put them into a 320kbps file, which sounds excellent on an iPod, in my car, or even on my home system. In some cases it's better than it was before. I believe that format is the way to go. It gives excellent quality and the files are not too large.

Liam McMahon

What's galling and upsetting about iTunes (and to a lesser extent Amazon) having created the market-standard audio formats (mp4 and mp3 respectively) is that, I would suggest, a majority of today's (younger) music buyers and listeners do not know the difference between the respective sound qualities of a CD and an mp3/mp4 (AAC). And iTunes does very little, if anything, to educate people. It is almost a pervasive urban myth that iTunes files are simply automatically CD-quality...and this is far from the case!

The galling, upsetting, and dangerous things this does are:

1. People become used to inferior sound recordings, and do not even know about the concept of "hi-fi".  Surely this must over time impact on the quality of music produced, as what is heard is inferior, what gets produced must presumably also suffer? 

2. For the prices that iTunes charges, one would SURELY expect CD quality music! If an album is, say, $US10, how in the world can they sell vastly inferior quality files?  A physical CD is the same price or not much more (around $13.50 on Amazon for a new CD).  How can iTunes sell a tenth of the file information for virtually the same price?

Finally, why in the world can iTunes not offer the option of lossless files? Even if you end up paying MORE than you would for a CD (which is patently an absurd situation) - heck, even if you pay a lot more - there is absolutely no doubt that professional DJs, discerning hi-fi listeners, and audiophiles in general would gobble up those lossless files, even for a premium.

The whole situation is appalling. The fact is that for online music retailing - which has long since far outstripped its physical counterpart - there is quite simply NO MERCHANT OFFERING A WIDE RANGE OF CD QUALITY MUSIC! Only at specialist dance music/DJ-orientated online retailers do you tend to have the option of WAV or other lossless (as far as I know - if there is a mass-market retailer akin to iTunes that offers lossless files, somebody please tell me about them!) This is a sickening, dreadful, unthinkable situation, utterly perplexing, bizarre and crazy.

cd.vs.mp3

In case you are interested, I have devised an online blind test CD vs AAC 256k. It is really a humbling experience for those who think compressed formats are not good enough. http://cdvsmp3.wordpress.com. Try it. I get good feedback from people who take it! There are also some interesting posts on HD audio using null testing.

pablolie

let me start by saying that i have ripped all my favorite CDs as FLAC files using dbPoweramp. i did it because given the price of storage it would be unreasonable to not *store* them in reference quality.

as to listening purposes, *psychologically* i like listening to a FLAC, knowing i am getting every bit delivered to my DAC. but i have done countless tests between listening to something in FLAC vs 320k MP3 (or latest generation 256k VBR encoder) on what is pretty revealing equipment, and the differences on even very well recorded albums are at best minimal. with average recordings utterly undetectable.

of course any sort of deviations from loss-less will result in easily *measurable* differences, but the big question is if we can really tell the difference *listening* under most circumstances. certainly i wouldn't be able while i am listening and attentively reading the album cover at the same time. i agree that it doesn't make sense, given the price of storage and the money that we spend on equipment, to listen to compromised material having access to a better original (but don't tell that to vinyl lovers :-D).

but i also have no doubt whatsoever that i *can* and *should* have a lot of fun listening to high-quality MP3s at 256-320k of material encoded as such. they sound pretty darn awesome.

rantydave

The 16-bit encoding on CDs means the output amplitude can be one of only 65536 values. Which is to say, it has a dynamic range of 96dB. So everything below -100dB on this graph can be thrown away as an irrelevancy, i.e. -100dB is the noise floor on the digital side. Look at the graphs again and only consider the bits above -100dB.

Calling 85dB "half the bits of CD resolution" is also entirely wrong; it's actually only two fewer bits (i.e. 14-bit, not 16).
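
The conversion between bits and decibels behind these figures can be sketched as follows. It uses the simple rule that each bit doubles the number of amplitude steps (about 6.02dB per bit) and ignores the extra 1.76dB sometimes quoted for a full-scale sinewave; the function names are illustrative.

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Ratio of the largest to the smallest amplitude step, in dB."""
    return 20 * math.log10(2 ** bits)


def equivalent_bits(db: float) -> float:
    """Number of bits whose step ratio corresponds to `db` decibels."""
    return db / (20 * math.log10(2))


print(dynamic_range_db(16))   # about 96.3
print(equivalent_bits(85))    # about 14.1
```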

And temporal masking left the realm of theory a long time ago.

evangraj

I get the lossless data and file choices from a listening perspective, but which of these would you choose so that the music files themselves are tagged with the metadata? It has happened at least twice in my life that my digital collection was erased, or drive failures meant that I had to rebuild my library again. Many CDs are gone or damaged the second time around, so I always end up losing. I have recently recovered again but have to rescan a few hundred CDs ... so in doing this, which format should I choose that ALSO supports tagging of the music? Thoughts?
