MP3 vs AAC vs FLAC vs CD Page 2

For reference, fig.3 shows the spectrum of the signal on the CD. Other than the well-defined green vertical lines representing the tones and the uniform background noise, the spectrum is clean. Important points to note with this graph are that 1) all musical fundamentals lie to the left of the 4000Hz (4kHz) mark; 2) the region between the next three divisions, 4kHz, 8kHz and 16kHz, is where musical harmonics and the "air" on a recording reside; and 3) the region above 16kHz—more than a quarter of the horizontal scale—will be inaudible to most adults.

Fig.3 Spectrum of 500Hz-spaced multitone signal at –10dBFS, 16-bit linear PCM encoding (linear frequency scale, 10dB/vertical div.).
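For readers who want to experiment with a similar stimulus, here is a minimal sketch of a 500Hz-spaced multitone at a peak level of –10dBFS, quantized to 16 bits. It is not the actual Test CD 3 track: the equal tone amplitudes, the zero starting phases, the 22kHz top tone, and the `multitone` name are all assumptions made for illustration.

```python
import math

SAMPLE_RATE = 44100   # CD sample rate in Hz
SPACING_HZ = 500      # tone spacing, from the figure caption
PEAK_DBFS = -10.0     # peak level, from the figure caption

def multitone(num_samples, top_hz=22000):
    """Return 16-bit integer samples of equal-amplitude tones every 500Hz.

    top_hz is an assumed upper limit, kept below the 22.05kHz Nyquist frequency.
    """
    freqs = range(SPACING_HZ, top_hz + 1, SPACING_HZ)
    raw = [
        sum(math.sin(2 * math.pi * f * n / SAMPLE_RATE) for f in freqs)
        for n in range(num_samples)
    ]
    peak = max(abs(x) for x in raw)
    # Scale so the largest sample sits 10dB below 16-bit full scale.
    target = 10 ** (PEAK_DBFS / 20) * 32767
    return [round(x * target / peak) for x in raw]

# 882 samples is 20ms, exactly ten cycles of the 500Hz tone.
samples = multitone(882)
```

Because every tone is a multiple of 500Hz, the waveform repeats every 2ms, so a short buffer like this can be looped to any length without a discontinuity.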

Fig.4 shows the spectrum of this demanding signal as preserved by lossless coding, in this case the popular FLAC codec (at its slowest "8" setting). To all intents and purposes, it is identical to the spectrum of the original CD. The lossless coding is indeed lossless, which I confirmed by turning the FLAC file back to WAV (LPCM) and doing a bit-for-bit comparison with the signal used to generate fig.3. The bits were the same—the music will also be the same!
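A bit-for-bit check of this kind can be sketched in a few lines, assuming both the original and the decoded FLAC are available as WAV files. The function name `pcm_identical` and any file paths passed to it are hypothetical; only the standard-library `wave` module is used, and this is not necessarily the tool used for the comparison described above.

```python
import wave

def pcm_identical(path_a, path_b):
    """Return True if two WAV files share a format and identical sample data."""
    with wave.open(path_a, "rb") as a, wave.open(path_b, "rb") as b:
        fmt_a = (a.getnchannels(), a.getsampwidth(), a.getframerate(), a.getnframes())
        fmt_b = (b.getnchannels(), b.getsampwidth(), b.getframerate(), b.getnframes())
        if fmt_a != fmt_b:
            return False
        # Compare the raw PCM bytes; header metadata is deliberately ignored.
        return a.readframes(a.getnframes()) == b.readframes(b.getnframes())
```

Comparing the decoded sample data rather than the files themselves matters, because two WAV files can carry different header chunks while holding exactly the same audio.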

Fig.4 Spectrum of 500Hz-spaced multitone signal at –10dBFS, FLAC encoding (linear frequency scale, 10dB/vertical div.).

How did the MP3 codec running at 128kbps cope with the multitone signal? The result is shown in fig.5. The dark red vertical lines represent the tones, and none are missing; the codec has preserved them all, even those at the top of the spectrum that will be inaudible to almost every listener. But the background noise components, which on the CD all lay at around –132dB, have all risen to the –85dB level. With its limited bit budget, the codec can't encode the tones without reducing resolution to almost half the CD's 16 bits. Even with the masking of this noise in the presence of the tones implied by psychoacoustic theory, this degradation most certainly will be audible on music. Yes, this kind of signal is very much a worst case, but this result is not "CD quality."
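The "almost half" figure follows from the rule of thumb that each bit of resolution is worth about 6dB. The sketch below converts the rise in the noise floor into an equivalent number of bits; it assumes both spectra were measured with the same analysis settings, so that only the difference between the two floors matters. The `bits_lost` helper is a name introduced here for illustration.

```python
import math

DB_PER_BIT = 20 * math.log10(2)   # about 6.02dB per bit of resolution

def bits_lost(floor_before_db, floor_after_db):
    """Equivalent bits of resolution lost when the noise floor rises."""
    return (floor_after_db - floor_before_db) / DB_PER_BIT

# Noise floor on the CD vs. MP3 at 128kbps, as read from the spectra.
lost = bits_lost(-132, -85)
print(round(lost, 1))        # 7.8
print(round(16 - lost, 1))   # 8.2
```

A 47dB rise therefore costs a little under 8 bits, leaving roughly 8 of the original 16 for this test signal.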

Fig.5 Spectrum of 500Hz-spaced multitone signal at –10dBFS, MP3 encoding at 128kbps (linear frequency scale, 10dB/vertical div.).

How about other lossy codecs? I looked at how the iTunes AAC codec (a version of MPEG-4, a later development than MP3) performed on this test, running at the same 128kbps. The result is shown in fig.6. At first it looks very similar to fig.5, but there are significant differences. Note that almost all the tones above 18kHz are missing and that those above 16kHz are increasingly rolled off. The designers of the codec obviously decided not to waste the limited bit budget by encoding information that would most probably not be heard even from the CD. Instead, they devoted those resources to a more accurate depiction of the musically significant regions at lower frequencies. You can see in this graph that, below 4kHz, the noise level is 10–20dB lower than with the MP3 codec (though perhaps more "granular"). In effect, in the frequency region that is musically most important, an AAC file with this test signal has 2–3 bits more resolution than an MP3 file with the same bit rate. The AAC noise floor is higher than the MP3 noise floor between 8kHz and 18kHz, but given the physics of human hearing, this is insignificant.

Fig.6 Spectrum of 500Hz-spaced multitone signal at –10dBFS, AAC encoding at 128kbps (linear frequency scale, 10dB/vertical div.).

The degradation depends on bit rate: the higher the bit rate, the bigger the bit budget the codec has to play with and the less data must be discarded. I therefore repeated these tests with both lossy codecs set to 320kbps. The file size is two and a half times that at 128kbps, though still significantly smaller than a lossless version, but are we any closer to "CD quality"?
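To put the bit rates in perspective, here is a rough sketch of the arithmetic for a constant-bit-rate file, ignoring container overhead and tags. The four-minute track length and the `size_mb` helper are assumptions chosen for illustration; the uncompressed CD rate follows directly from 44.1kHz, 16 bits, and two channels.

```python
def size_mb(kbps, seconds):
    """Approximate file size in megabytes for a constant bit rate."""
    return kbps * 1000 * seconds / 8 / 1_000_000

CD_KBPS = 44100 * 16 * 2 / 1000   # uncompressed CD audio: 1411.2kbps
TRACK_SECONDS = 240               # assumed four-minute track

ratio = 320 / 128                 # nominal size ratio between the two lossy rates

for label, rate in (("128kbps", 128), ("320kbps", 320), ("CD PCM", CD_KBPS)):
    print(label, round(size_mb(rate, TRACK_SECONDS), 2), "MB")
```

For the assumed track this gives about 3.8MB at 128kbps, 9.6MB at 320kbps, and 42.3MB for uncompressed PCM. A lossless file falls somewhere below the PCM figure, by an amount that depends on the music.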

Fig.7 shows the spectrum produced by the MP3 encoder running at 320kbps. (This is the format used by Deutsche Grammophon for its classical downloads.) Again, all the tones are reproduced correctly, and the noise has dropped by around 6dB at higher frequencies and by up to 15dB at lower frequencies. But it is still not quite as low as AAC at 128kbps below 1kHz or so.

Fig.7 Spectrum of 500Hz-spaced multitone signal at –10dBFS, MP3 encoding at 320kbps (linear frequency scale, 10dB/vertical div.).

AAC at 320kbps now encodes all the tones, even the inaudible ones at the top of the audioband (fig.8). The noise floor is quite high above 18kHz, but—and it's a big "but"—the noise-floor components have dropped to below –110dB below 16kHz, and to below –120dB for the lower frequencies. Though some spectral spreading can be seen at the bases of the vertical lines representing the tones, it is relatively mild. Given the bigger bit budget at 320kbps, the AAC codec produces a result that may well be indistinguishable from CD for some listeners some of the time with some music. But the spectrum in fig.8 is still not as pristinely clean as that of the original CD in fig.3.

Fig.8 Spectrum of 500Hz-spaced multitone signal at –10dBFS, AAC encoding at 320kbps (linear frequency scale, 10dB/vertical div.).

For my final series of tests, I used Test CD 3's track 26, which replaces some of the tones in track 25 with silence. The spectrum of the CD original is shown in fig.9. You can see clean vertical lines representing the tones, with silence in between, and the random background noise lying below –130dB, as expected. Also as expected, encoding with FLAC gave the identical spectrum, so I haven't shown it.

Fig.9 Spectrum of multitone signal with frequency gaps at –10dBFS, 16-bit linear PCM encoding (linear frequency scale, 10dB/vertical div.).

MP3 at 320kbps gave the spectrum shown in fig.10. All the tones are present, but if you look closely, you can see some extra ones, at low levels. The noise also leaks into the spaces between the groups of tones. AAC at 320kbps gave the spectrum in fig.11. Again, there is much more noise and less resolution above 18kHz, where it doesn't really matter. Again, the noise around the groups of tones is lower than with MP3 at the same bit rate. Some low-level spurious tones can be seen in the spaces between the groups of tones; though there are more than with MP3, these are all lower in level. The noise floor between the groups is also higher in level than with MP3, but is still low in absolute terms.

Fig.10 Spectrum of multitone signal with frequency gaps at –10dBFS, MP3 encoding at 320kbps (linear frequency scale, 10dB/vertical div.).

Fig.11 Spectrum of multitone signal with frequency gaps at –10dBFS, AAC encoding at 320kbps (linear frequency scale, 10dB/vertical div.).

What does all this mean?
Basically, if you want true CD quality from the files on your iPod or music server, you must use WAV or AIFF encoding or FLAC, ALAC, or WMA Lossless. Both MP3 and AAC introduce fairly large changes in the measured spectra, even at the highest rate of 320kbps. There seems little point in spending large sums of money on superbly specified audio equipment if you are going to play sonically compromised, lossy-compressed music on it.

It is true that there are better-performing MP3 codecs than the basic Fraunhofer (many audiophiles recommend the LAME encoder), but the AAC codec used by iTunes has better resolution than MP3 at the same bit rate (if a little noisier at the top of the audioband). If you want the maximum number of files on your iPod, therefore, you take less of a quality hit if you use AAC encoding than if you use MP3. But "CD quality"? Yeah, right!
