Loudspeaker Specifications: What Matters?

In a recent answer to a reader's letter, I somewhat bluntly stated that Stereophile's reviewers use "hi-fi" adjectives to describe loudspeaker sound because even good loudspeakers are too far removed from sounding "real" to be compared directly with live music. Upon reflection, this may have sounded too dismissive, so I will elaborate a little in this short essay.

What would be the specifications necessary for a loudspeaker to sound "real?" First, it should have a bandwidth as wide as possible; with the exception of a slight HF roll-off when you sit in the back row of the stalls at an orchestral concert, what you hear live is what the instruments emitted, so let's go for HF extension well above 20kHz at the top—40kHz, for example, giving us an octave's margin above the limit of hearing. At the low end, "DC" is not possible, though let's say a few Hz would be sufficient to minimize the effects of low-frequency group delay. The on-axis response should, of course, be absolutely flat.
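To put a rough number on the group-delay point, here is a minimal sketch that assumes the speaker's low-frequency roll-off behaves like a first-order high-pass filter. That model is a simplification chosen for illustration, not something stated in the essay; real sealed and ported boxes roll off at second and fourth order. The function name and the 40Hz and 4Hz corner frequencies are hypothetical.

```python
import math

def first_order_highpass_group_delay(f_hz, cutoff_hz):
    """Group delay in seconds of H(s) = s / (s + w0), evaluated at f_hz.

    Assumption: a first-order high-pass stands in for the speaker's
    bass roll-off; w0 is the corner frequency in rad/s.
    """
    w = 2 * math.pi * f_hz
    w0 = 2 * math.pi * cutoff_hz
    # phase = pi/2 - atan(w / w0); group delay = -d(phase)/dw
    return w0 / (w0 ** 2 + w ** 2)

# Compare the delay at 40 Hz for a 40 Hz corner and a 4 Hz corner.
for cutoff in (40.0, 4.0):
    delay_ms = 1000 * first_order_highpass_group_delay(40.0, cutoff)
    print(f"corner {cutoff:4.0f} Hz -> group delay at 40 Hz: {delay_ms:.2f} ms")
```

Under this simplified model, moving the corner from 40Hz down to 4Hz cuts the group delay at 40Hz from about 2ms to about 0.4ms, which is the sense in which extension to "a few Hz" keeps the delay out of the audible bass.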

There is still plenty of debate over whether a speaker should have a flat on-axis response or a flat power response. The latter means that the reverberant response in the room is flat with frequency, which implies a rising on-axis response with frequency in a conventional room. I would suggest that it is sufficient that the on-axis response be flat, the energy falling away smoothly as the listener moves increasingly off-axis, either vertically or horizontally, with no sharp discontinuities.

In live sound every frequency component constituting that sound reaches your ears at the same time, ie, a linear phase response. For the phase response of the loudspeaker, it is sufficient that it be identical to that of an electronic filter in that it should be minimum-phase; ie, having the appropriate relationship to the amplitude response so that flattening the latter flattens the former. For optimum amplitude stereo imaging, the loudspeaker must approximate either a point- or a line-source across the whole audio band.

Finally, measurements of the dynamics of live orchestral sound suggest that a peak SPL capability of 115–120dB, unweighted, would suffice. Regarding non-linearity, let's specify under 1% THD and IMD across the audio band at 96dB average levels.
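The decibel figures above can be restated as ratios. The sketch below is plain arithmetic. Treating the 96dB figure as the average level against which the 115–120dB peaks are compared is an assumption made for illustration, and the helper names are hypothetical.

```python
import math

def db_to_power_ratio(db):
    """Convert a level difference in dB to a power ratio."""
    return 10 ** (db / 10)

def percent_to_db(percent):
    """Express a distortion percentage (an amplitude ratio) in dB."""
    return 20 * math.log10(percent / 100)

AVERAGE_DB = 96  # assumption: used here as the reference average level
for peak_db in (115, 120):
    headroom_db = peak_db - AVERAGE_DB
    ratio = db_to_power_ratio(headroom_db)
    print(f"peak {peak_db} dB: {headroom_db} dB above average, "
          f"about {ratio:.0f}x the acoustic power")

# 1% distortion products sit 40 dB below the fundamental.
print(f"1% THD = {percent_to_db(1.0):.0f} dB")
```

On these assumptions a 120dB peak needs roughly 250 times the acoustic power of the 96dB average, while the 1% distortion limit puts the distortion products 40dB below the signal.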

There isn't a speaker yet designed that comes anywhere near these specifications!

But don't despair. By choosing a set of reasonable compromises, a talented designer can end up with a loudspeaker that satisfies musically on an individual basis. The user just mustn't forget that every area of reproduction has necessarily had to be compromised somewhat in order for the listener to be able to afford to buy something other than a Wilson WAMM or Infinity IRS V.

With a loudspeaker intended to sell for under $1000/pair, the compromises are of necessity greater than usual and it takes a designer of genius to provide a model offering an all-round musical performance in this price range (footnote 1).

This short article was published in January 1989. Later that year I started using DR Labs' MLSSA system to measure loudspeakers, which I discussed in a February 1990 article. I have been using MLSSA ever since, the result being a unique archive of more than 900 loudspeaker reviews where the measurements are presented in a consistent format. An article comparing the measured frequency responses for the first two years in which I used MLSSA can be found here and "Getting the Best From Your Loudspeakers," an article on loudspeaker set-up, here.

In 1997 I presented an AES paper on loudspeaker measurements, based on my experience. This paper, split into three parts, can be found here, here, and here. A related article can be found here.

Two videos in which I discuss audio measurements in general can be found here and here.

Footnote 1: Stereophile's founder J Gordon Holt wrote two articles on Subjective Loudspeaker Testing in the 1960s.

Hi-Reality's picture

Hi John, thank you indeed for this highly significant post. I have been working on the development of a holistic system comprising definitions and criteria of "Audio Realism" that relate to your points in this post. (Just for the record, in this work the topic of Audio Realism is much broader than Audiophile-ism; one could say that Audiophile-ism is a small subset of Audio Realism.)

Could you please shed more light on, or refer me to any existing material of yours about, this part you wrote, so I can make sure I have understood your point fully?


"For optimum amplitude stereo imaging, the loudspeaker must approximate either a point- or a line-source across the whole audio band."

I will start studying your AES papers; thank you for the links.

Regards, Babak
Founder, Hi-Reality Machines
Hi-Reality Sensorium (Youtube channel)

JRT's picture

Sound from a point source falls off at -6.02 dB SPL per doubling of distance.
20*log10(1/2) = -6.02

Sound from a line source falls off in the nearfield at -3.01 dB per doubling of distance.
10*log10(1/2) = -3.01

Sound from a line source propagates in the far field as a point source does. See comment about point source, above. The transition is related to line height and wavelength and propagation distance, so is more varied in spectral balance across a larger far field listening region, but is less varied in SPL across the near field as compared to a point source.
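The two falloff rates, and the way the near-field/far-field transition moves with frequency, can be sketched as follows. The transition formula used here, r = h^2 * f / (2c), is a commonly quoted rule of thumb and is an assumption for illustration; it does not come from the comment above. The function names, the 2 m line height, and c = 343 m/s are likewise hypothetical.

```python
import math

def point_source_change_db(r_near_m, r_far_m):
    """Level change moving from r_near_m to r_far_m for a point source."""
    return 20 * math.log10(r_near_m / r_far_m)

def line_source_nearfield_change_db(r_near_m, r_far_m):
    """Level change over the same move in the near field of a line source."""
    return 10 * math.log10(r_near_m / r_far_m)

def line_transition_distance_m(height_m, freq_hz, c=343.0):
    """Rule-of-thumb distance where a line source starts to behave like
    a point source (assumption: r = h^2 * f / (2 * c))."""
    return height_m ** 2 * freq_hz / (2 * c)

print(f"point source, 1 m -> 2 m: {point_source_change_db(1, 2):.2f} dB")
print(f"line source,  1 m -> 2 m: {line_source_nearfield_change_db(1, 2):.2f} dB")

# The transition distance rises with frequency, so at a fixed seat the
# bass may already be in the far field while the treble is not.
for freq in (100, 1000, 10000):
    r = line_transition_distance_m(2.0, freq)
    print(f"2 m line at {freq:>5} Hz: transition near {r:.1f} m")
```

Because the transition distance scales with frequency under this rule of thumb, a listener at a fixed distance from a tall line can hear different falloff rates in different parts of the spectrum, which is the variation in spectral balance described above.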

A discontinuous line array of separate radiating diaphragms can suffer interdriver interference at high frequencies, producing a comb filtered response at those frequencies, while magnitude and phase from each diaphragm might be varied in the sum.
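As a minimal illustration of that comb filtering, the sketch below sums two identical, in-phase sources whose path lengths to the listener differ by a fixed amount. Equal magnitude from both drivers, the 5 cm path difference, and the function names are all assumptions for illustration; a real array has more drivers with differing magnitude and phase.

```python
import math

def two_source_sum_db(freq_hz, path_difference_m, c=343.0):
    """Level of two equal in-phase sources, in dB relative to one source,
    when one arrival is delayed by path_difference_m / c seconds."""
    phase = 2 * math.pi * freq_hz * path_difference_m / c
    # |1 + exp(-j*phase)| simplifies to 2*|cos(phase/2)|
    magnitude = abs(2 * math.cos(phase / 2))
    if magnitude < 1e-12:
        return float("-inf")  # complete cancellation
    return 20 * math.log10(magnitude)

def first_null_hz(path_difference_m, c=343.0):
    """Lowest frequency at which the two arrivals cancel
    (path difference equal to half a wavelength)."""
    return c / (2 * path_difference_m)

path_difference = 0.05  # metres, hypothetical
print(f"first null near {first_null_hz(path_difference):.0f} Hz")
for freq in (500, 2000, 3430, 5000, 6860):
    level = two_source_sum_db(freq, path_difference)
    print(f"{freq:>5} Hz: {level:+.1f} dB re one driver")
```

The nulls repeat at odd multiples of the first null frequency and the peaks fall between them, giving the comb shape; shrinking the path difference pushes the first null higher in frequency.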

There is also some nonlinearity in the propagation, more so over longer propagation distances (not really applicable in small rooms), and more so at higher frequencies. For some detailed info on extreme examples of that, websearch for Dr. F. Joseph Pompei's work on his Audio Spotlight at the MIT Media lab (circa turn of the century), and the decades older underlying US Navy patents on sonar technologies on which his initial efforts were based. Pompei's Audio Spotlight uses complementary inverse predistortion of ultrasonic signals into ultrasonic transducers to counter some of the aforementioned nonlinear distortion, toward producing sound in the very audible telephonic spectrum at a distant listening position, narrowly beamed in the propagation, and inaudible over much of the distance, much like some sonars were doing many decades ago.

avanti1960's picture

peak SPL specification as you referenced. I agree, it does matter.
Problem is that few manufacturers publish it and publications do not test for it (to my knowledge).
It seems like a nice number to compare against- I would definitely find it relevant.
Can Stereophile begin testing for it as part of your reviews?

Awsmone0's picture

Although a delightful memory of old technology, don’t current loudspeaker measurement systems such as the Klippel NF scanner render such old-school measurements antiquated?

John Atkinson's picture
Awsmone0 wrote:
Although a delightful memory of old technology Don’t current loudspeaker measurements such as the Klippel NF scanner render such old school measurements antiquated?

The Klippel NFS is indeed a significant step forward in measuring loudspeaker performance. But as well as its cost being out of my reach, here in NYC the real estate it requires for its installation and operation is also out of my reach. :-(

But no, using MLSSA and other older tools is not "antiquated" providing that it gives me the correct answers. And that those answers are correct is something I work hard to ensure.

I have been thinking of using the MLSSA data to generate the same sort of "Spinorama" charts that the Klippel produces. However, as I write in this essay, I have currently shown the measurements for the several hundred loudspeaker reviews published on this website in a consistent manner, so that they can easily be compared. That, for me, is a strong argument for sticking with the current format.

John Atkinson
Technical Editor, Stereophile