Stereo & the Soundstage

The accuracy of a hi-fi system's "soundstage" reproduction seems to be of paramount importance these days, just as a component must now have "transparency" to possess hi-fi righteousness. If the system in which that component is used doesn't give good soundstage, then the system's owner has definitely fallen by the wayside. But what defines a good soundstage? Stereo imaging must have something to do with it, I hear you all cry. (I would have said stereo imagery until Larry Archibald pointed out that imagery has far less to do with hi-fi than with good writing, something I'm sure we agree has no place in a hi-fi magazine.) OK, what defines good stereo imaging?


A hand goes up at the back. Surely good stereo imaging is tied to a system's ability to present precisely positioned images in the lateral plane between the loudspeakers?

Well...yes and no. As AJ van den Hul points out in his interview in this issue, many hi-fi systems have good stereo (left-right) imaging, but the image is flat, like wallpaper. This, typically, is true for inexpensive CD players, which produce well-defined lateral stereo images but signally fail to provide the requisite degree of depth.

Smiles break out. Obviously, good soundstaging is dependent on the ability of a system to reproduce recorded reverberation tails, the ambience.

Well...yes and no. Mono 78s can accurately reproduce reverberation, yet no-one could accuse a mono system of having any soundstaging ability. It must have something to do, therefore, with the fact that all of us, without ever questioning it, have systems that use two signal channels, driving two loudspeakers to produce two sets of soundwaves that coincide at our two ears.

It's obviously time to dig up a few basics. Reviewed in this issue is an AES Anthology of historic papers on "stereo." It includes a document (celebrating its 55th anniversary this month) that pretty much defined the whole field of stereo reproduction, including the 45°/45° stereo groove and the moving-magnet stereo cartridge. That document, a 1931 British Patent Application written by the English engineer Alan Dower Blumlein, is worth quoting at length:

"The fundamental object of the invention is to provide a sound recording, reproducing and/or transmission system whereby there is conveyed to the listener a realistic impression that the intelligence is being communicated to him over two acoustic paths in the same manner as he experiences in listening to everyday acoustic intercourse and this object embraces also the idea of conveying to the listener a true directional impression...An observer in the room is listening with two ears, so that echoes reach him with the directional significance which he associates with the music performed in such a room...When the music is reproduced through a single channel the echoes arrive from the same direction as the direct sound so that confusion results. It is a subsidiary object of this invention so to give directional significance to the sounds that when reproduced the echoes are perceived as such." (footnote 1)

In other words, if you can record not only a sound but the direction in space it comes from, and can do so for every sound wave making up the soundstage, including all the reflected sound waves (the reverberation or "echoes"), then you will be able to reproduce a facsimile of the original soundstage, accurate in every detail. In addition, because the spatial relationship between the direct and the reflected sounds will be preserved, that reproduced soundstage will give a realistic illusion of depth.

The Blumlein Patent Application mentions that the simplest way of preserving the soundstage is to record with two microphones, spaced as far apart as the average pair of ears, and to play back over headphones: the "binaural" technique. This, however, makes headphone listening mandatory; history proves that headphones are about as popular as a headcold for relaxed, social listening. Blumlein was concerned with a system for playback over loudspeakers, and proposed a method of recording directional information as a ratio of amplitudes between the two signal channels.

A murmur comes from the floor: surely the ear/brain uses more than amplitude information to determine the direction of sound sources?

This is true. The brain does use the amplitude difference between the signals reaching the two ears above about 2kHz, but below about 700Hz, it determines direction by looking at the phase difference between the signals; ie, it uses time information. (Both frequencies are inversely proportional to head size, so there will be a spread among individuals.) Things get a bit ambiguous between those two frequencies, but there are two other mechanisms still at work: first, the frequency-response modifications due to the shape of the pinnae differ according to the direction of the perceived sound, and second, the head is in continual lateral motion, sharpening up all the mechanisms by introducing second-order (rate-of-change) information. The result is that human beings, and animals, are very good at determining where sounds come from (unless they happen to consist of pure tones in the "forbidden" region, which is why birds, for example, use such tones as warning signals).
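
To put rough numbers on that head-size dependence, here is a minimal Python sketch. It assumes the phase cue stays unambiguous only while half a wavelength is longer than the extra path around the head; that simplification, the path lengths tried, the 343m/s speed of sound, and the function name are all illustrative assumptions of mine, not figures from Blumlein.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 deg C (assumed value)

def phase_cue_limit_hz(ear_path_m, c=SPEED_OF_SOUND):
    """Frequency at which half a wavelength equals the ear-to-ear path.

    Above this frequency the interaural phase difference can exceed
    180 degrees, so phase alone no longer identifies the direction.
    """
    if ear_path_m <= 0:
        raise ValueError("ear_path_m must be positive")
    return c / (2.0 * ear_path_m)

# Try a small, a medium, and a large head (path lengths are guesses).
for path in (0.20, 0.23, 0.26):
    print(f"{path:.2f} m path -> about {phase_cue_limit_hz(path):.0f} Hz")
```

With a 0.23m path the limit falls a little under 750Hz, in the neighborhood of the 700Hz quoted above, and it drops as the head gets bigger.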

Blumlein's genius, however, lay in the fact that he realized that the low-frequency phase information can be replaced by corresponding amplitude information. If you have two independent information channels, each feeding its own loudspeaker, then the ratio of the signal amplitudes between those two loudspeakers will define the position of a virtual, phantom, sound source for a centrally placed listener equidistant from them. For any ratio of the sound levels of the two speakers, this virtual source occupies a dimensionless point somewhere on the line joining their acoustic centers. The continuum of these points, from that represented by maximum-left/zero-right to that represented by zero-left/maximum-right, makes up the conventional stereo image. If there is no reverberant information, then the brain will place the virtual image of the sound source in the plane of the speakers; if there is reverberation recorded with the correct spatial relationship to the corresponding direct sound, if it is "coherent," then the brain places the virtual image behind the speakers, the exact distance depending on the recorded direct-sound/reverberant-sound ratio.
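
The link between level ratio and image position can be sketched numerically. The Python below uses the stereophonic "law of sines," a textbook low-frequency approximation rather than anything quoted from the Patent Application; the speakers placed 30° either side of center and the function name are assumptions made for illustration.

```python
import math

def phantom_angle_deg(gain_left, gain_right, half_angle_deg=30.0):
    """Estimate the phantom image angle from the two speaker gains.

    Law of sines: sin(theta) / sin(theta0) = (gL - gR) / (gL + gR),
    where theta0 is the angle of each speaker from the center line.
    Positive results lie toward the left speaker.
    """
    total = gain_left + gain_right
    if gain_left < 0 or gain_right < 0 or total == 0:
        raise ValueError("gains must be non-negative and not both zero")
    ratio = (gain_left - gain_right) / total
    theta0 = math.radians(half_angle_deg)
    return math.degrees(math.asin(ratio * math.sin(theta0)))

# Sweep from hard left to hard right.
for g_left, g_right in ((1.0, 0.0), (2.0, 1.0), (1.0, 1.0), (0.0, 1.0)):
    print(g_left, g_right, round(phantom_angle_deg(g_left, g_right), 1))
```

Equal gains put the image dead center, and all the signal in one channel puts it at that speaker; intermediate ratios fall on the line between.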

Thus by recording amplitude information only in a two-channel system, we can create a virtual soundstage between and behind the loudspeakers.

Hands go up everywhere: but...but...but surely both ears receive the signal from both loudspeakers. Shouldn't this acoustic crosstalk work against the creation of a stereo image?

The facile answer is that, as the vast majority of people can perceive stereo images, it doesn't. The real answer is that, contrary to what you might have read in Polk's advertising, the brain is able to work out which signal is intended for which ear. If a wavefront reaches the left ear from the left speaker, the brain knows that that wavefront will reach the right ear around 0.7ms later, the time taken for the wave to travel around the head, and therefore can ignore it.
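
As a sanity check on that delay, a rough Python sketch follows. It uses Woodworth's rigid-sphere approximation for the extra path around the head, which is my choice of model and not something the argument above depends on; the 8.75cm head radius and the 343m/s speed of sound are assumed typical values.

```python
import math

def interaural_delay_ms(azimuth_deg, head_radius_m=0.0875, c=343.0):
    """Woodworth estimate of the arrival-time difference between the ears.

    azimuth_deg is the source angle from straight ahead (0 to 90 degrees).
    The result is in milliseconds.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth_deg must lie between 0 and 90")
    theta = math.radians(azimuth_deg)
    return 1000.0 * (head_radius_m / c) * (theta + math.sin(theta))

for angle in (0, 30, 60, 90):
    print(f"{angle:2d} deg -> {interaural_delay_ms(angle):.2f} ms")
```

Under these assumptions a source directly to one side arrives at the far ear about 0.66ms late, close to the figure quoted; a loudspeaker 30° off center gives a shorter delay, roughly 0.26ms.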

So there we have it: a perfect stereo image implies a perfect soundstage. All is rosy in the audiophile garden.

Hmm. A suspicious word, perfect. Where's the catch?

Well, we have only been discussing the interaction between the two loudspeakers and the listener. What about the amplitude-information-only, two-channel recording? Where does that come from?

When it comes to recording music, there are two mutually incompatible philosophies. One is to capture as faithfully as possible the acoustic sound produced by a bunch of musicians, in effect treating a performance as an event to be preserved in a documentary manner. The second, which is far more widespread, is to treat the recording itself as the event, the performance, using live sounds purely as ingredients to be mixed and cooked. This, of course, is how all nonclassical recordings are made. The sound of an instrument or singer is picked up with one microphone, and the resultant mono signal, either immediately or at a later mixdown session, is assigned a lateral position in the stereo image with a panpot. As this is a device which by definition produces a ratio of amplitudes between the two channels, it would seem that every recording made this way is a true amplitude-stereo recording, capable of producing a well-defined stereo image.
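
What a panpot does can be sketched in a few lines of Python. The sine/cosine "constant-power" law used here is one common choice, assumed for illustration; real consoles differ in their pan laws, and the function name and the -1 to +1 position scale are hypothetical.

```python
import math

def panpot(sample, position):
    """Split a mono sample into (left, right) with a constant-power law.

    position runs from -1.0 (hard left) through 0.0 (center)
    to +1.0 (hard right).
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must lie between -1 and +1")
    angle = (position + 1.0) * math.pi / 4.0  # maps position to 0 .. pi/2
    return sample * math.cos(angle), sample * math.sin(angle)

# The same mono signal placed at five points across the image.
for pos in (-1.0, -0.5, 0.0, 0.5, 1.0):
    left, right = panpot(1.0, pos)
    print(f"{pos:+.1f}: L={left:.3f} R={right:.3f}")
```

Nothing but the ratio of the two outputs changes as the control is turned, which is exactly the amplitude-only information the loudspeakers need.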

Do such recordings have a soundstage associated with that image, however?


When producing such a recording, the producer decides how much and what type of reverberation should be associated with each of the mono sound sources, and also decides where in space that reverberation should be positioned. There is no reason at all why the ambience surrounding, say, a centrally placed lead vocalist, should have any relationship with that around the drums. Or the guitar. Or the synthesizer. And if it doesn't, then the listener doesn't hear a soundstage. Rather, he hears a collage of individual musical events, bearing no spatial relationship to one another.

Early stereo rock recordings, such as the Airplane's After Bathing at Baxter's, illustrate this graphically: while such a recording can undoubtedly be satisfying musically, a soundstage it just doesn't have. Since the late '60s, producers have nearly always taken care to coordinate the artificial ambience on rock recordings so that it produces a convincing soundstage. Recordings from Paul Simon, Andreas Vollenweider, and Clannad, for example, create a wholly artificial, but nevertheless effective, soundstage hanging between and behind the speakers, which bears no relation to anything that might have existed in real life.

Footnote 1: Blumlein's bald paragraphs, written in inelegant, poorly punctuated, legalese, are concerned with preserving lateral directional information. John Shuttleworth, responsible for the excellent recordings on the Meridian label, has pointed out that, as the reverberation also has a contribution from reflections of the direct sound from the floor and ceiling of the room, the aural clues enabling a listener to infer image height will also be preserved. But this is dangerous ground; I will leave discussion of it for a future issue.

Footnote 2: The July 1986 issue of Studio Sound has a fascinating article by Michael Gerzon, of Ambisonics fame, outlining how boosting the bass of the difference signal in a true Blumlein amplitude stereo signal can beneficially increase stage width. This, again, is a direct corollary of the ideas suggested by Blumlein in 1931.


Comment by Doctor Fine:

I appreciate John's experience with miking technique.  What remains to be answered is the question of "just how much spatial information can a WELL recorded STEREO performance contain?"

Is there enough information so that not only left and right are in evidence but there is additional timing, reflection, and room reverberation information which would allow the human ear to determine DEPTH and spatial relationships in 3D between objects?

After all, the human ear is a hunting instrument.  If one is to launch an arrow at an object in space with the expectation of hitting something one must be in awe of the computational ability of the left-right ear to brain processing feat.  I don't know about YOU but I can estimate how far back on the stage something is during a live acoustic performance, with my eyes closed---IF my seats are ideal.

I might even be able to bring down a bird in flight with an arrow simply by the SOUND of its wings beating through the air.  And that requires THREE axis determinations.  LEFT-RIGHT-PLUS DISTANCE/DEPTH.

I maintain that were a recording engineer to possess monitors capable of 3D imaging, that engineer could cleverly assemble enough depth information to throw a 3D picture on the consumer's equipment.  IF the consumer ALSO has monitors capable of timing information...

And just how important is the pursuit of such playback?  Well, gee, let's think about this...  How about because we say we are STEREOPHILES we need to include three dimensional stereophonic results as part of our basic repertoire?  Or we are FAKES.

It makes me more than a little bit angry that this subject is never brought to the fore.  In my 50 years of building and selling price-no-object stereo systems I have never met anyone out there who thinks this criterion important---except myself.  I feel a little bit like the promise of "stereo" has never been truly realized.

I hear lip service paid to something called "imaging" yet I never walk into a three dimensional hologram of the soundstage when auditioning "the great work of HiFi giants" at shows, in demo rooms and the like.  My fellow hobbyists often drag me over to listen to their expensive gear which has for the most part simply been taken out of the box and stuck someplace it "looks right."  Most place a pile of expensive gear smack dab in between their speakers, thus hopelessly confusing the "image."

"Look at my expensive gear!" they say.  Isn't it AMAZING?  Well, no.  Not so much.

So sad.