DVD, Yes. 96kHz, No!
The Digital Versatile Disc was originally proposed by a consortium of Toshiba, Matsushita, and Time Warner as a carrier of digital video. Sony and Philips, co-inventors of the Compact Disc, were developing their own high-density format but eventually agreed to a joint format with the Toshiba-led group.
DVD would replace both laserdisc and the play-only function of VHS tape—movie sales and rentals. The format's massive data capacity—up to 17 times that of a conventional CD—also made DVD ideal for a high-quality audio-only format that would free us from the bottleneck of the CD's 44.1kHz sampling rate and 16-bit resolution.
The great musical promise of a consumer format with 20-bit resolution and a high sampling rate has a bitter flip side: The millions of listeners who continue to use their existing CD players to play 44.1kHz CDs will hear degraded performance.
Why? The Toshiba-led consortium wants the audio-only format to have a sampling frequency of 96kHz. If 96kHz becomes the standard for the consumer release format, professional recorders will also operate at 96kHz. Because 44.1kHz CDs will be with us for a long time, high-resolution digital master recordings must be sample-rate–converted to 44.1kHz for the conventional CD release. For at least a decade, most listeners will play 44.1kHz CDs.
The problem is that unless unrealistically complex interpolating digital filters are used, sample-rate–converting from 96kHz to 44.1kHz will degrade sound quality. This point was illustrated by Dr. James Moorer, co-founder of Sonic Solutions, who presented a paper at the Audio Engineering Society Convention in November 1996 explaining mathematically why converting 96kHz to 44.1kHz can introduce audible artifacts (footnote 1). The paper shows the example of a 1kHz sinewave originally sampled at 96kHz after it is converted to 44.1kHz. The conversion process produces distortion products spaced 300Hz apart around the 1kHz tone (fig.1). In other words, all those spikes you see sticking up in the illustration are distortion components added to the music by sample-rate conversion.
Fig.1 Error spectrum after downsampling 1kHz tone from 96kHz to 44.1kHz sampling (after Moorer) (dBFS vs frequency; note linear frequency scale).
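The 300Hz spacing is consistent with simple arithmetic on the two sampling rates, sketched below. This is an illustration only, not a reproduction of Dr. Moorer's analysis; the variable names are assumptions made for the example. The largest frequency that divides evenly into both 96kHz and 44.1kHz is 300Hz, so the conversion ratio reduces to the awkward fraction 147/320, whereas 88.2kHz to 44.1kHz reduces to 1/2.

```python
from fractions import Fraction
from math import gcd

SOURCE_RATE = 96_000   # Hz, the proposed high-resolution rate
TARGET_RATE = 44_100   # Hz, the CD rate

# Reduce the conversion ratio to lowest terms.
ratio = Fraction(TARGET_RATE, SOURCE_RATE)

# The largest frequency that divides both rates exactly.
common = gcd(SOURCE_RATE, TARGET_RATE)

# The same reduction for an 88.2kHz master.
simple = Fraction(TARGET_RATE, 88_200)

print(f"96kHz to 44.1kHz:   ratio {ratio.numerator}/{ratio.denominator}, "
      f"common divisor {common} Hz")
print(f"88.2kHz to 44.1kHz: ratio {simple.numerator}/{simple.denominator}")
```

A converter working from 96kHz must, in effect, interpolate by 147 and decimate by 320, and its pattern of filter phases repeats 300 times per second.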
Not all sample-rate conversion causes this harm to the music. Conversion from 88.2kHz to 44.1kHz does not need the interpolation filters required for 96kHz-to-44.1kHz conversion; it is simply a process of low-pass filtering and then, in effect, discarding every other sample. Consequently, converting 88.2kHz to 44.1kHz introduces no distortion artifacts. Moreover, 96kHz has no sonic advantage over 88.2kHz; the slight additional bandwidth provided by 96kHz sampling is negligible.
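The two-step process of low-pass filtering and discarding every other sample can be sketched in a few lines. This is a minimal illustration, not a mastering-grade converter: the filter length (63 taps), the Hamming window, the cutoff, and the function names `lowpass_taps` and `halve_rate` are all assumptions chosen for the example.

```python
import math

def lowpass_taps(num_taps=63, cutoff=0.225):
    """Hamming-windowed sinc low-pass; cutoff is in cycles per input sample."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            sinc = 2 * cutoff
        else:
            sinc = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalize to unity gain at DC

def halve_rate(samples, taps):
    """Low-pass filter the input, computing only every other output sample."""
    out = []
    for i in range(len(taps) - 1, len(samples), 2):
        acc = 0.0
        for k, t in enumerate(taps):
            acc += t * samples[i - k]
        out.append(acc)
    return out

RATE_IN = 88_200  # Hz
# A 1kHz test tone sampled at 88.2kHz.
tone = [math.sin(2 * math.pi * 1000 * n / RATE_IN) for n in range(2000)]
converted = halve_rate(tone, lowpass_taps())  # now at 44.1kHz
print(len(tone), "samples in,", len(converted), "samples out")
```

Because the ratio is exactly 2:1, every output sample coincides with an input sample instant, so no new sample positions have to be interpolated.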
The first audience question after the presentation of the Moorer paper came from the esteemed Dr. Stanley Lipshitz of the University of Waterloo. He asked why, given the problems Dr. Moorer had just described, 96kHz was even being proposed as a standard over 88.2kHz.
The answer has nothing to do with music, technology, mathematics, or the quality of reproduced sound. The decision to make 96kHz the standard is instead based on corporate politics. Toshiba wants the new DVD to have as little to do as possible with the Sony/Philips Red Book standard, even to the point of arbitrarily changing the sampling frequency. Publicly, Toshiba states that the audio-only DVD requires a 96kHz sampling frequency. They argue that because Dolby Digital AC-3 (the 5.1-channel discrete digital audio format chosen for DVD movies) is based on 48kHz, the audio-only DVD format must use a multiple of 48kHz. The theory is that new DVD combination players won't have clock frequencies based on 44.1kHz. This argument is, however, specious: Any DVD player that will play conventional CDs will have a clock related to 44.1kHz that could easily work for 88.2kHz-sampled DVD audio discs.
This situation is particularly ironic because 48kHz and 96kHz aren't related mathematically to any video frequencies, but 44.1kHz and 88.2kHz are. The frequency of 44.1kHz was chosen in the first place for the CD because it is related to the video frame rate: A 44.1kHz-sampled signal could be easily stored on videotape.
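The video connection can be checked with a little arithmetic. The figures below are the nominal ones commonly cited for video-based PCM recording (three audio samples per usable video line); they are supplied here as assumptions for illustration and do not come from the text above.

```python
# Nominal video figures, assumed for illustration:
# (fields per second, usable lines per field, audio samples per line)
SIXTY_FIELD = (60, 245, 3)   # 60-field (NTSC-type monochrome) video
FIFTY_FIELD = (50, 294, 3)   # 50-field (PAL-type) video

def pcm_rate(fields_per_s, lines_per_field, samples_per_line):
    """Audio samples stored per second when packed into video lines."""
    return fields_per_s * lines_per_field * samples_per_line

for name, params in (("60-field", SIXTY_FIELD), ("50-field", FIFTY_FIELD)):
    print(name, pcm_rate(*params), "Hz")
```

Under these assumptions both video systems arrive at the same rate, 44,100 samples per second.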
There's a compelling case for a 96kHz sampling rate if your company wants to break the back of the Sony/Philips Red Book standard. But for the millions of music lovers who will continue to listen to 44.1kHz CDs, 96kHz sampling is anathema. Every time someone plays a conventional CD whose content has been converted from 96kHz to 44.1kHz, they will suffer degraded sound quality.
The technical parameters of new formats should be decided by engineers in laboratories and recording studios, not by MBAs sitting around a board table. I don't expect companies to be altruistic; they always make decisions that are in their own best financial interests. But we're not talking about just any consumer commodity here. This is about music, an art form whose meaning is carried by the physical manifestation of sound. To diminish in the name of corporate politics the joy of the musical experience for millions of listeners around the world is unconscionable.
The irony is that the solution to the sampling-rate conversion problem is so simple: Make 88.2kHz the sampling frequency of DVD and professional digital audio recorders. But history shows that such choices are made not on the basis of what's best for the music, but of what will provide the greatest enrichment to the companies calling the shots.—Robert Harley
Postscript: A decade after Robert Harley wrote the above words, things didn't turn out so bad after all. Yes, the 96kHz–44.1kHz conversion is still mathematically demanding, but hardware solutions like the dCS 972 and software solutions like Barbabatch, Bias Peak 5, and Adobe Audition can perform the conversion in a transparent manner. (The website http://src.infinitewave.ca compares 2007 sample-rate converters on this conversion.) And modern DVD players have no problem playing back material recorded at 44.1kHz or 88.2kHz.—John Atkinson
Footnote 1: "Breaking the Sound Barrier: Mastering at 96kHz and Beyond," preprint 4357. AES paper preprints are available from the Audio Engineering Society, 60 E. 42nd St., New York, NY 10165-2520. Web: www.aes.org.—Robert Harley