MQA: Aliasing, B-Splines, Centers of Gravity

The right thing at the wrong time is the wrong thing.—Joshua Harris

The sampling theory formulated by Claude Shannon in the late 1940s had a key requirement: The signal to be sampled must be band-limited; that is, it must have an absolute upper-frequency limit. With that single constraint, Shannon's work yields a remarkable result: If you sample at twice that frequency (two samples per period for the highest frequency the signal contains), you can reproduce that signal perfectly. Perfectly. That result set the foundation for digital audio, right up to the present. Cue the music.
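To see why the band limit matters, here is a minimal numerical sketch, not taken from Shannon's paper. It assumes a 44.1kHz sample rate and a 7kHz test tone, both chosen only for illustration. A tone above half the sample rate produces exactly the same samples as a tone below it, so once the signal is sampled the two cannot be told apart. That ambiguity is aliasing.

```python
import numpy as np

fs = 44_100.0            # sample rate in Hz (assumed: the CD rate)
n = np.arange(64)        # sample indices
t = n / fs               # sample instants in seconds

f_low = 7_000.0          # tone below fs/2 (assumed test frequency)
f_high = fs - f_low      # 37.1 kHz, above fs/2

low = np.cos(2 * np.pi * f_low * t)
high = np.cos(2 * np.pi * f_high * t)

# The two sampled sequences agree to within floating-point rounding.
max_diff = float(np.max(np.abs(low - high)))
print(f"largest difference between the two sample sets: {max_diff:.2e}")
```

Filtering out everything above half the sample rate before sampling removes the higher tone, which is how a conventional antialiasing filter resolves the ambiguity.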

However, in the 69 years since Shannon published "Communication in the Presence of Noise" in the Proceedings of the Institute of Radio Engineers (footnote 1), sampling theory has moved on. The new work began almost immediately, carried out by mathematicians and math-fluent engineers; indeed, some of it had already been done when that paper was first published, in January 1949. But even as Shannon's work was embraced by the digital audio community—in audio, Shannon sampling theory is the foundation for almost everything digital—that post-Shannon work remained hidden. As recently as the 1990s, when post-Shannon sampling was applied to digital signal processing, the focus was almost entirely on visual information—imaging. With just a handful of exceptions, the audio world was oblivious.

Post-Shannon sampling theory relaxes Shannon's requirement that a signal to be sampled—eg, a recording of music—be band-limited to half the sample rate. Relaxing that constraint restores the symmetry between the time and frequency domains that was missing from Shannon's theory. In the newer theory, it's fine to use an antialiasing filter, but it's not required. Post-Shannon sampling accepts aliasing as a matter of course while allowing its impact to be minimized in both the time and frequency domains.

Some years ago, Bob Stuart and Peter Craven—the creators of Master Quality Authenticated, or MQA—began exploring some similar ideas in an audio context. Their first article hinting at the technology that eventually became MQA (footnote 2) referred to some post-Shannon work, and in some later writings—including "MQA: Questions and Answers," published on the Stereophile website in August 2016—the references are fairly explicit. I began to wonder: Is MQA a rigorous application of post-Shannon sampling theory?

Considering the newer theory's relaxation of Shannon's absolute prohibition against aliasing, this seemed a reasonable entry point for a somewhat technical interview with MQA's Bob Stuart on aliasing and its effects in MQA. The interview, carried out mainly by e-mail, is presented here in slightly compressed, lightly edited, occasionally annotated form.

Jim Austin: Is MQA an application of post-Shannon sampling theory to audio coding?

Bob Stuart: Yes, very definitely, it is!

Austin: Is it the first such application?

Stuart: Yes. So far as we know, this is the first.

Austin: Is it a rigorous application of post-Shannon sampling theory?

Stuart: Yes, we believe so. MQA stands on a firm basis which synthesized intuition of desirable characteristics, the mathematics of sampling and reconstruction based on B-splines, losslessly reversible processing (used in flattening)—and was informed by empirical observations, auditory modeling, and hundreds of experiments.

Austin: MQA's critics have often focused on aliasing. In a patent application covering MQA technology, you claimed the invention of "a system . . . wherein . . . the asymmetric component of response of the decimation filter is characterized by an attenuation of at least 32dB at frequencies that would alias to the range 0–7kHz on decimation" (footnote 3). Is that specification (attenuation of at least 32dB at frequencies that would alias to the range 0–7kHz on decimation) realized in MQA's implementation?

Stuart: In all cases the aliasing heard by a listener with an MQA decoder will be well below that implied in the quoted patent claim and will be, we claim, either inaudible or nonexistent.

[To test this claim, beginning with a FLAC file containing white noise at –10dBFS peak, I pasted in 20 seconds from Talking Heads' "Girlfriend Is Better" at the 20s mark, 50dB below the noise, repeating the music and increasing the level by 10dB every 20 seconds. I could detect very faint drums at the one-minute mark, 30dB below the noise level and at least 30dB louder than the aliased-content level allowed by the MQA specification. By 20dB below the noise (40dB above the spec), I could hear the music clearly. My conclusion: The specified level of aliasing is not audible, with a comfortable safety margin. The use of white noise instead of a 1/f, music-like signal makes this a very conservative test.

Your mileage may vary, so do the test yourself with the embedded audio file. At first you will just hear the noise, but you will then start to hear the music beneath the level of the noise.]
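For readers who would rather build a similar test signal than rely on the embedded file, the procedure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script used for the article: the function name `build_test_signal` and its parameters are hypothetical, levels are interpreted as peak-relative, the noise is uniformly distributed white noise, and a plain sine tone stands in for the music clip. Writing the result to a FLAC file is left out.

```python
import numpy as np

def db_to_gain(db):
    """Convert a level in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)

def build_test_signal(music, fs, noise_peak_dbfs=-10.0, start_db=-50.0,
                      step_db=10.0, segment_s=20.0, n_steps=4, seed=0):
    """Mix a clip into white noise at stepped levels (hypothetical helper).

    The first segment is noise only. The clip is then added once per
    segment, starting start_db below the noise peak and rising by
    step_db each time. Levels are peak-relative (an assumption).
    """
    seg = int(round(segment_s * fs))
    rng = np.random.default_rng(seed)
    noise = rng.uniform(-1.0, 1.0, seg * (n_steps + 1))
    # Scale the noise so its largest sample sits at noise_peak_dbfs.
    noise *= db_to_gain(noise_peak_dbfs) / np.max(np.abs(noise))

    clip = np.asarray(music[:seg], dtype=float)
    clip = clip / np.max(np.abs(clip))      # normalise the clip to peak 1.0

    out = noise.copy()
    levels = []
    for k in range(n_steps):
        rel_db = start_db + k * step_db     # clip level relative to the noise
        start = (k + 1) * seg               # first clip begins after one segment
        out[start:start + len(clip)] += db_to_gain(noise_peak_dbfs + rel_db) * clip
        levels.append(rel_db)
    return out, levels

# Stand-in for the music: a 440 Hz tone at a low sample rate to keep this quick.
fs = 8_000
tone = np.sin(2 * np.pi * 440.0 * np.arange(20 * fs) / fs)
signal, levels = build_test_signal(tone, fs)
print(levels, len(signal) / fs, "seconds")
```

With the defaults, the clip enters at the 20s mark at 50dB below the noise and reaches 30dB below at the one-minute mark, matching the timeline described above. Substituting a real music clip and a 44.1kHz sample rate would bring it closer to the original test.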



Footnote 1: Proceedings of the IRE, January 1949, Vol.37 No.1, pp.10–21. Reprinted in Proceedings of the IEEE, February 1998, Vol.86 No.2, pp.447–45. See https://web.archive.org/web/20100208112344/http://www.stanford.edu/class/ee104/shannonpaper.pdf.

Footnote 2: J. Robert Stuart and Peter Craven, "A Hierarchical Approach to Archiving and Distribution," AES Paper 9178 (8 October 2014).

Footnote 3: Note that the claim is that aliased content will be attenuated by 32dB, not that it will be 32dB below the regular audio in the specified range. At CD sampling rates, at which aliasing would typically be strongest, aliasing at the top of this range will be reflected down from about 36kHz. In that frequency range, the music in an audio file is already very low in level. I estimate that if this specification is met, the aliased content will be at least 60dB below the musical information at 7kHz.—Jim Austin
