Red Shift: Doppler distortion in loudspeakers

In the world of digital audio, jitter has been a focus of audiophile attention for well over a decade. It is blamed for many of the sonic ills of which CD and other digital media have been accused. But here's a puzzle: The major source of frequency intermodulation distortion in audio systems—the loudspeaker—has largely escaped such withering inquiry. Why?

Two reasons spring to mind. First, the common origin of these distortions is obscured by the fact that they go by different names and are quantified differently. Whereas frequency intermodulation (FIM) in the digital context is called jitter and specified in units of time (because it is considered from the viewpoint of cause rather than of effect), in loudspeakers it is usually called Doppler distortion and specified, if at all, as a percentage figure, to align it with other forms of nonlinear distortion.

Never heard of Doppler distortion? Well, that reflects the second reason: Doppler effects in loudspeakers came and went as an issue between the 1960s and early 1980s, the consensus at the end of that period being that FIM distortion in typical hi-fi speakers is inaudible and therefore irrelevant. For many, that closed forever the subject of Doppler distortion. But accepting accepted wisdom can be a hazardous act of faith, particularly when it relies on listening tests conducted years ago and far away.

In what follows, therefore, I describe the results of my own recent efforts to assess the significance of Doppler distortion in loudspeakers, and determine whether the verdict of "irrelevant" is the right one. But before I do that, I must paint in some background: the origins of Doppler distortion in loudspeakers, the details of earlier research, and how Doppler relates to jitter. I also make a short detour into DiAural's "Doppler Decoding," which claims to exploit a loudspeaker's Doppler distortion to cancel that from the recording microphone(s).

Freight train, freight train...
The classic example of the Doppler effect is a locomotive sounding its horn as it passes a nearby listener. As the engine approaches, the wavelength of the sound is compressed and its pitch is consequently raised; as it passes the listener and recedes into the distance, the wavelength is stretched and the pitch drops. The result is the familiar Waa-ooo effect.

Something similar happens when a loudspeaker diaphragm reproduces a sound. While the diaphragm is moving toward the listener, the frequency of the radiated sound increases; as it moves away, the frequency decreases. But the effect is much less obvious here because, even during the large diaphragm excursions required at low frequencies, a loudspeaker cone moves relatively slowly. If a 100Hz tone is reproduced at a cone excursion of 1/2" from peak to peak of the waveform, for instance, the maximum cone velocity (as it passes through the resting position) will only be 4 meters per second—a leisurely 9mph. And as frequency increases, diaphragm velocity decreases for the same sound-pressure level.
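The cone-velocity figure quoted above is easy to check. The short Python sketch below assumes purely sinusoidal cone motion, so that peak velocity is 2πf times the peak displacement (half the peak-to-peak excursion). The function name and the unit-conversion constants are mine, chosen for illustration only.

```python
import math

INCH_TO_M = 0.0254          # metres per inch
MS_TO_MPH = 2.23694         # miles per hour per metre/second

def peak_cone_velocity(freq_hz, excursion_pp_m):
    """Peak velocity of a cone moving sinusoidally.

    For displacement x(t) = A*sin(2*pi*f*t), the velocity peaks at
    2*pi*f*A as the cone passes through its resting position.
    A is half the peak-to-peak excursion.
    """
    amplitude_m = excursion_pp_m / 2.0
    return 2.0 * math.pi * freq_hz * amplitude_m

# The example from the text: 100Hz at 1/2" peak-to-peak.
v = peak_cone_velocity(100.0, 0.5 * INCH_TO_M)
print(f"{v:.2f} m/s = {v * MS_TO_MPH:.1f} mph")
```

This comes out at just under 4 meters per second, in line with the figure in the text. Doubling the frequency while halving the excursion leaves the peak velocity unchanged.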

With a pure tone, the Doppler effect simply adds harmonic distortion: principally, relatively innocuous second harmonic at practicable diaphragm excursions. But with the complex signals of music, where the diaphragm must reproduce lower and higher frequencies simultaneously, the consequences are more serious. In effect, the higher tones are frequency-modulated by the lower tones, giving rise to intermodulation sidebands. If the 100Hz tone of the previous example were accompanied by another at 1kHz, then FIM sidebands would appear at the sum and difference frequencies (1100Hz and 900Hz, respectively). The amplitudes of these sidebands relative to the higher frequency component depend on the cone velocity generated by the lower tone, and increase with the frequency difference. So if the higher frequency were 3kHz rather than 1kHz, the level of the sidebands would be 3x (9.5dB) higher. Of course, the frequency content of the typical music signal is a great deal more complex than this.
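The scaling described above can be sketched with standard FM theory. The snippet below rests on several assumptions of mine that the text does not state: a speed of sound of 343 meters per second, a peak frequency deviation equal to the higher frequency times the ratio of peak cone velocity to the speed of sound, and the usual small-index approximation that each first-order sideband sits at half the modulation index relative to the carrier. The function names are illustrative.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, assumed (air at roughly 20 degrees C)
INCH_TO_M = 0.0254

def modulation_index(f_low, f_high, excursion_pp_m, c=SPEED_OF_SOUND):
    """FM modulation index for f_high riding on a cone moving at f_low."""
    v_peak = 2.0 * math.pi * f_low * (excursion_pp_m / 2.0)  # peak cone velocity
    delta_f = f_high * v_peak / c                            # peak frequency deviation
    return delta_f / f_low

def first_order_sideband_db(beta):
    """Small-index approximation: first-order sideband is about beta/2 re the carrier."""
    return 20.0 * math.log10(beta / 2.0)

excursion = 0.5 * INCH_TO_M
db_1k = first_order_sideband_db(modulation_index(100.0, 1000.0, excursion))
db_3k = first_order_sideband_db(modulation_index(100.0, 3000.0, excursion))
print(f"1kHz: {db_1k:.1f} dB, 3kHz: {db_3k:.1f} dB, difference: {db_3k - db_1k:.1f} dB")
```

Under these assumptions, the difference between the two cases is 20·log10(3), or about 9.5dB, which matches the 3x figure given in the text.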

So is the pattern of FIM products—frequency modulation generally produces an infinity of sidebands, not the single pair mentioned above. In the case of loudspeakers, the modulation index is usually so low that the higher-order sidebands are at lower amplitude, for which reason it is established practice to consider only the first-order sidebands. But that shouldn't be taken to mean that higher-order components can be ignored. Fig.1 shows a simulated spectrum for the 100Hz/3kHz example above (with a cone excursion of 0.5" peak–peak). Although the first-order sidebands—as predicted by theory, –15.2dB relative to the 3kHz component—have the highest amplitude, the higher-order sidebands are at a level that can hardly be called insignificant.

Fig.1 Doppler spectrum for a 3kHz tone reproduced by a loudspeaker diaphragm also radiating a 100Hz tone with a peak–peak excursion of 0.5". For a 200mm (8") cone, this is equivalent to an SPL of 92.5dB at 3m (10'), assuming free-space conditions.
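For readers who want to explore how quickly the higher-order sidebands fall away, here is a rough sketch based on the textbook result that the nth-order sideband of a sinusoidally frequency-modulated tone has an amplitude given by the Bessel function J_n of the modulation index. It is not the simulation used for Fig.1. The speed of sound (343 meters per second), the modulation-index formula, and all function names are my assumptions, and the Bessel function is evaluated from its power series to keep the example free of external libraries.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, assumed
INCH_TO_M = 0.0254

def bessel_j(n, x, terms=20):
    """Bessel function of the first kind, integer order n, via its power series."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k / (math.factorial(k) * math.factorial(n + k))
                  * (x / 2.0) ** (2 * k + n))
    return total

def sideband_levels_db(beta, max_order=4):
    """Level of each sideband order in dB relative to the modulated carrier (J_0)."""
    carrier = abs(bessel_j(0, beta))
    return {n: 20.0 * math.log10(abs(bessel_j(n, beta)) / carrier)
            for n in range(1, max_order + 1)}

# 100Hz/3kHz example with 0.5" peak-peak excursion.
f_low, f_high = 100.0, 3000.0
v_peak = 2.0 * math.pi * f_low * (0.5 * INCH_TO_M / 2.0)
beta = (f_high * v_peak / SPEED_OF_SOUND) / f_low

levels = sideband_levels_db(beta)
for order, level in levels.items():
    print(f"order {order}: {level:.1f} dB")
```

With these assumptions the first-order result lands within a few tenths of a dB of the –15.2dB quoted above; the small difference arises because the exact Bessel ratio departs slightly from the half-the-index approximation. Each successive order falls well below the one before it.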

This is an admittedly simplified description of Doppler distortion in loudspeakers. (For a more complete mathematical treatment, I recommend Siegfried Linkwitz's Web page at www.linkwitzlab.com/frontiers.htm#J.) Still, two key messages can be taken from it.

First, the amount of Doppler distortion generated by a loudspeaker is dependent on the amplitude of the signal's low-frequency content, because this is the principal determinant of diaphragm velocity. This means that higher levels of Doppler distortion will be generated by music program with strong bass content.

Second, Doppler distortion worsens as the frequency difference between the modulating and modulated frequencies increases. This inevitably means that, for a given diaphragm size, Doppler distortion will be worst in loudspeakers that use a single, full-range drive-unit. Two-way speakers will suffer less, and three-ways with a lower bass/midrange crossover frequency will suffer less still. My simulation therefore models a two-way speaker, which represents the worst case most of us are likely to encounter in an audiophile system.
