Ringing False: Digital Audio's Ubiquitous Filter
A couple of years ago, when the issue of the "energy smear" caused by digital filters was red hot (it's still smoldering), I determined to address some of the controversies surrounding this issue through what I reckoned to be a rather neat experiment. It involved designing a set of related digital filters, notionally for 44.1kHz anti-aliasing purposes, and applying them to 24-bit/96kHz material transferred to hard disk from DVD-Video discs (mostly Classic Records DADs) using a DVD player that supports 24/96 via its S/PDIF output (I used a Pioneer DV-939A). The whole project went swimmingly until it came to the vital listening comparisons, which took place in circumstances that proved, in the event, way too informal to elicit useful results. With a pressing deadline, there was no opportunity for a reprise, so the article was written and no useful conclusion reached. It rankled, but, as other projects commanded my attention, the subject was relegated to the Pending tray.
And there it might have remained had not Roy George of Naim Audio mentioned the piece to me during a conversation at the AES UK Conference earlier this year, reopening the old wound. So I asked JA: Please can I do this again, for Stereophile, and get it right this time? He agreed, and here we are: confronting a still touchy issue in digital audio while exorcising a personal demon.
Since the earliest days of digital audio, engineers have had to confront the issue of high-order, low-pass filters and their possible effect on sound quality. Claude Shannon's sampling theorem (footnote 1), on which the whole of digital audio is based, states that the information contained within a continuous signal will be fully captured by a discrete time-sampling process provided that its amplitude is recorded at a rate at least twice that of the highest component frequency. To ensure this, and to prevent the aliasing distortion that results if this condition is breached, the first stage in any analog-to-digital converter is a low-pass filter, usually called an "anti-alias filter" to clarify its function.
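To make the aliasing condition concrete, here is a minimal Python sketch. The function names and the 30kHz test tone are my own illustrative choices, not anything taken from a converter design. It shows that a 30kHz cosine sampled at 44.1kHz yields the same sample values as a 14.1kHz cosine, which is why content above half the sampling rate has to be removed before sampling.

```python
import math

def alias_frequency(f_hz, fs_hz):
    """Frequency (Hz) at which a tone at f_hz appears after sampling at fs_hz."""
    nearest_multiple = round(f_hz / fs_hz) * fs_hz
    return abs(f_hz - nearest_multiple)

def sampled_cosine(f_hz, fs_hz, n_samples):
    """Sample values of a unit-amplitude cosine at f_hz, taken at rate fs_hz."""
    return [math.cos(2 * math.pi * f_hz * n / fs_hz) for n in range(n_samples)]

FS = 44100  # CD sampling rate in Hz

# A 30kHz tone lies above FS/2, so it violates the sampling condition.
ultrasonic = sampled_cosine(30000, FS, 64)

# The tone it masquerades as once sampled (hypothetical example values).
audible = sampled_cosine(alias_frequency(30000, FS), FS, 64)

# Largest sample-by-sample difference between the two sequences.
max_difference = max(abs(a - b) for a, b in zip(ultrasonic, audible))
print(alias_frequency(30000, FS), max_difference)
```

The difference between the two sequences is limited only by floating-point rounding: once sampled, the two tones cannot be told apart.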
There are two quite separate issues to take into account when determining whether such low-pass filtering is audible or not. First and most obvious, the removal of frequencies above the filter's corner frequency may itself have audible consequences. It has often been assumed that if the filtering removes frequencies only above the accepted upper limit of human hearing (nominally 20kHz), then no audible effect is possible. But various experiments have established that we do indeed respond to ultrasonic frequencies that, when presented separately, we cannot hear (footnotes 2, 3). So it can't be taken for granted that the removal of these frequencies is perceptually benign.
Second, there are side effects to the filtering that may have undesirable consequences within the passband. In particular, the filter may introduce phase distortion, and its oscillatory behavior ("ringing") may modify the signal in the time domain. These factors—phase distortion and the nature of the filter's impulse response—are not separate but intimately related.
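The ringing can be seen in a few lines of Python. What follows is only a sketch under stated assumptions: a textbook windowed-sinc design with a Hann window, a 21kHz cutoff, and 101 taps at a 96kHz sampling rate, all chosen for illustration and not the filters used in the experiment. Because the coefficients are symmetric about the centre tap, the filter has linear phase (no phase distortion), and the price is that the impulse response oscillates before the main peak as well as after it.

```python
import math

def windowed_sinc_lowpass(cutoff_hz, fs_hz, num_taps):
    """Linear-phase FIR low-pass: ideal sinc response shaped by a Hann window."""
    centre = (num_taps - 1) / 2
    fc = cutoff_hz / fs_hz  # cutoff as a fraction of the sampling rate
    taps = []
    for n in range(num_taps):
        x = n - centre
        # Ideal (infinitely long) low-pass impulse response, sampled at tap n.
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        # Hann window tapers the response to a finite length.
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unity gain at DC

# Illustrative parameters only (assumptions, not the article's filters).
taps = windowed_sinc_lowpass(cutoff_hz=21000, fs_hz=96000, num_taps=101)

# Index of the main peak of the impulse response.
peak_index = max(range(len(taps)), key=lambda i: abs(taps[i]))

# Count sign changes ahead of the peak: each one is a half-cycle of pre-ringing.
pre_ringing_sign_changes = sum(
    1 for a, b in zip(taps[:peak_index], taps[1:peak_index + 1]) if a * b < 0
)
print(peak_index, pre_ringing_sign_changes)
```

A minimum-phase filter with the same magnitude response would push all of that oscillation to after the peak, at the cost of a nonlinear phase response. That is the sense in which phase behavior and impulse response are two views of the same thing.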
These issues were not lost on the academic world at the time of digital audio's inception. So, as you would expect, listening tests were undertaken to establish formally that the anti-alias filtering required for a nominally flat response to 20kHz has no audible effect, even at the 44.1kHz sampling rate. That rate is the sternest test because the transition band from 20kHz to 22.05kHz is so narrow, demanding an extremely steep filter rolloff. Many informal listening tests were conducted too, often using Sony's PCM-F1, because it was the first 16-bit digital recorder that most people were able to lay hands on. The outcomes of this testing, formal and informal, were overwhelmingly positive. Many PCM-F1 users claimed that a signal passed through the machine's A-to-D and D-to-A stages was indistinguishable from the feed, and some still cite that experience as proof that 16-bit/44.1kHz audio, properly realized, is essentially perfect.
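A little arithmetic shows just how narrow that transition band is. The sketch below expresses the gap between a 20kHz passband edge and half the sampling rate in octaves, then divides an assumed stopband attenuation by it to get a crude average slope. The 96dB figure is my assumption, picked because it is roughly the dynamic range of a 16-bit system. The function names are likewise hypothetical, and a real filter's rolloff is not a straight line.

```python
import math

def transition_band_octaves(passband_edge_hz, fs_hz):
    """Width, in octaves, of the band from the passband edge to half the sampling rate."""
    nyquist_hz = fs_hz / 2
    return math.log2(nyquist_hz / passband_edge_hz)

def average_slope_db_per_octave(passband_edge_hz, fs_hz, attenuation_db):
    """Crude average rolloff needed to reach attenuation_db by half the sampling rate."""
    return attenuation_db / transition_band_octaves(passband_edge_hz, fs_hz)

ASSUMED_ATTENUATION_DB = 96.0  # assumption: roughly 16-bit dynamic range

for fs in (44100, 48000, 96000):
    octaves = transition_band_octaves(20000, fs)
    slope = average_slope_db_per_octave(20000, fs, ASSUMED_ATTENUATION_DB)
    print(f"{fs} Hz: {octaves:.3f} octaves, about {slope:.0f} dB/octave on average")
```

At 44.1kHz the transition band is about a seventh of an octave, whereas at 96kHz it is more than a full octave, so the slope required falls by nearly an order of magnitude.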
But even in the earliest days of domestic digital audio there were dissenting voices. Many hi-fi writers, myself included, were thoroughly underwhelmed by our initial experiences of Compact Disc, and so were some influential audio professionals, such as Doug Sax. Over a period of some years the intensity of this opposition to CD decreased somewhat, but many commentators and ordinary audio consumers concluded that there was something fundamentally amiss with 16/44.1 and 16/48 audio. Many of them voted with their feet, continuing to prefer the sound of the "obsolescent" LP.
Various explanations emerged as to why digital audio might be failing to convince a substantial proportion of enthusiast listeners. The importance of correct dithering during quantization and requantization became apparent. So, too, did the significance of jitter. Meanwhile, the development of psychoacoustically designed noiseshaping began to offer greater than 16-bit performance over critical parts of the audible spectrum.
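Why dither matters is easily shown with a toy example. This is a sketch only, assuming triangular-PDF (TPDF) dither spanning plus or minus one least significant bit, a rounding quantizer, and a constant input of 0.3 LSB. All three are illustrative choices of mine and do not describe any particular converter. Without dither the quantizer discards the sub-LSB signal entirely. With dither, the signal survives in the average of the output, traded for a little added noise.

```python
import math
import random

def quantize(x_lsb):
    """Round a value expressed in LSBs to the nearest integer level."""
    return math.floor(x_lsb + 0.5)

def tpdf_dither(rng):
    """Triangular-PDF dither spanning -1 to +1 LSB (difference of two uniforms)."""
    return rng.random() - rng.random()

rng = random.Random(1234)  # fixed seed so the run is repeatable
signal_lsb = 0.3           # constant input, smaller than half an LSB (assumption)
num_samples = 20000

# Without dither every sample rounds to zero, so the signal vanishes.
undithered = [quantize(signal_lsb) for _ in range(num_samples)]

# With dither the output toggles between levels, and its mean tracks the input.
dithered = [quantize(signal_lsb + tpdf_dither(rng)) for _ in range(num_samples)]

undithered_mean = sum(undithered) / num_samples
dithered_mean = sum(dithered) / num_samples
print(undithered_mean, dithered_mean)
```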
Footnote 1: Yes, I know it is now accepted that Shannon was not the first to describe this, but his is the name in most of the textbooks.
Footnote 2: T. Oohashi et al, "High-Frequency Sound Above the Audible Range Affects Brain Electric Activity and Sound Perception," Preprint 3207, 91st Audio Engineering Society Convention (1991).
Footnote 3: T. Oohashi et al, "Inaudible High-Frequency Sounds Affect Brain Activity: Hypersonic Effect," Journal of Neurophysiology 83 (2000): 3548–3558.