The 2011 Richard C. Heyser Memorial Lecture: "Where Did the Negative Frequencies Go?" Case Study 1: Recording

Back in 1987, the AES published the anthology of historic papers on "Stereo" pictured above. It includes a document (celebrating its 80th anniversary this year) that pretty much defined the whole field of stereo reproduction, including the 45°/45° stereo groove and the moving-magnet stereo cartridge. That document, a 1931 British Patent Application written by the English engineer Alan Dower Blumlein, is worth quoting at length:

"The fundamental object of the invention is to provide a sound recording, reproducing and/or transmission system whereby there is conveyed to the listener a realistic impression that the intelligence is being communicated to him over two acoustic paths in the same manner as he experiences in listening to everyday acoustic intercourse and this object embraces also the idea of conveying to the listener a true directional impression. . . . An observer in the room is listening with two ears, so that echoes reach him with the directional significance which he associates with the music performed in such a room. . . . When the music is reproduced through a single channel the echoes arrive from the same direction as the direct sound so that confusion results. It is a subsidiary object of this invention so to give directional significance to the sounds that when reproduced the echoes are perceived as such."

In other words, if you can record not only a sound but the direction in space it comes from, and can do so for every sound wave making up the soundstage, including all the reflected sound waves (the reverberation or "echoes"), then you will be able to reproduce a facsimile of the original soundstage, accurate in every detail. In addition, because the spatial relationship between the direct and the reflected sounds will be preserved, that reproduced soundstage will give a realistic illusion of depth.

Incidentally, I mentioned earlier Hermann Bondi, one of Hoyle's collaborators on the Steady-State Universe Hypothesis. Like Blumlein, Bondi had worked on the British development of radar in World War II. When I worked in the research lab developing LEDs, in a corner office was a charming elderly gentleman, Dr. Henry Boot. Only years later did I learn that Henry was one of the people who invented the cavity magnetron, which was fundamental to the British development of radar. I suppose you could therefore say that there are just two degrees of separation between me and Alan Dower Blumlein.

The Blumlein Patent Application mentions that, when recording for playback over headphones, the simplest way of carrying out the preservation of the soundstage is to use two microphones spaced as far apart as the average pair of human ears: the "binaural" technique. This, however, makes headphone listening mandatory; until recently, headphones have been about as popular as a head cold for relaxed, social listening. Blumlein was concerned with a system for playback over loudspeakers, and proposed a method of recording directional information as a ratio of amplitude differences between the two signal channels.

The ear/brain, of course, uses more than amplitude information to determine the direction of sound sources. It uses the amplitude difference between the signals reaching the two ears above about 2kHz, but below about 700Hz, it determines direction from the phase difference between the signals; i.e., it uses time-of-arrival information. (Both frequencies scale inversely with head size, so there will be a spread among individuals.)
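
A quick back-of-envelope calculation makes these numbers plausible. The head width and speed of sound below are typical textbook values I have assumed, not figures from the lecture; the ambiguity frequency is the point at which half a wavelength equals the interaural path difference, so phase no longer identifies the direction uniquely:

```python
import math  # not strictly needed here, but kept for consistency with later sketches

SPEED_OF_SOUND = 343.0   # m/s at room temperature (assumed)
EAR_SPACING = 0.175      # m, a typical adult interaural distance (assumed)

# Maximum interaural time difference: a sound arriving from the side
# reaches the far ear later by roughly (ear spacing / speed of sound).
max_itd_ms = EAR_SPACING / SPEED_OF_SOUND * 1000.0

# Interaural phase becomes ambiguous once half a wavelength fits into
# the interaural path difference.
ambiguity_freq_hz = SPEED_OF_SOUND / (2.0 * EAR_SPACING)

print(f"Maximum ITD: {max_itd_ms:.2f} ms")
print(f"Phase cue ambiguous above roughly {ambiguity_freq_hz:.0f} Hz")
```

With these assumed dimensions the ambiguity sets in near 1kHz—squarely inside the 700Hz–2kHz "forbidden" region discussed below.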

Things get a bit ambiguous between those two frequencies, but there are two other mechanisms also at work: first, the frequency-response modifications due to the shape of the pinnae differ according to the direction of the perceived sound; and second, the head is in continual lateral motion, sharpening up all the mechanisms by introducing second-order (rate-of-change) information. The result is that human beings—and animals—are very good at determining where sounds come from (unless they happen to consist of pure tones in the "forbidden" region between 700 and 2000Hz, which is why birds, for example, use such tones as warning signals).

Blumlein's genius lay in the fact that he realized that the low-frequency phase information can be replaced by corresponding amplitude information. If you have two independent information channels, each feeding its own loudspeaker, then the ratio of the signal amplitudes between those two loudspeakers will define the position of a virtual sound source for a centrally placed listener equidistant from them. For any ratio of the sound levels of the two speakers, this virtual source occupies a dimensionless point somewhere on the line joining their acoustic centers. The continuum of these points, from that represented by maximum-left/zero-right to that represented by zero-left/maximum-right, makes up the conventional stereo image. If there is no reverberant information, then the brain will place the virtual image of the sound source in the plane of the speakers; if there is reverberation recorded with the correct spatial relationship to the corresponding direct sound—that is, if it is "coherent"—then the brain places the virtual image behind the speakers, the exact distance depending on the ratio of recorded direct sound to recorded reverberant sound.
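How an amplitude ratio maps to a position on the line between the speakers can be sketched with the stereophonic "tangent law," a standard textbook model for a centrally placed listener—my choice of model and the ±30° speaker placement are assumptions, not something stated in the lecture:

```python
import math

def image_angle(gain_left, gain_right, speaker_half_angle_deg=30.0):
    """Estimate the azimuth of the phantom source from the interchannel
    amplitude ratio via the stereophonic tangent law:
        tan(theta) / tan(theta0) = (gL - gR) / (gL + gR)
    where theta0 is the half-angle each speaker subtends (assumed 30 deg).
    Positive angles are toward the left speaker."""
    theta0 = math.radians(speaker_half_angle_deg)
    ratio = (gain_left - gain_right) / (gain_left + gain_right)
    return math.degrees(math.atan(ratio * math.tan(theta0)))

print(image_angle(1.0, 1.0))  # equal levels: dead center (0 degrees)
print(image_angle(1.0, 0.0))  # left channel only: image at the left speaker
print(image_angle(1.0, 0.5))  # intermediate ratio: image partway left
```

The continuum of ratios from maximum-left to maximum-right traces out exactly the continuum of virtual source positions the paragraph above describes.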

Thus, by recording and playing back just the amplitude information in a two-channel system, we can create a virtual soundstage between and behind the loudspeakers (footnote 6). And if, instead of capturing an original event, we record many individual sounds in mono and assign each one a lateral position in the stereo image with a panpot (along with any added reverberation or echo), when we mix down to stereo, again we have a true amplitude-stereo recording. It is fair to say that 99.99% of all recordings are made in this way. It is so fundamental to how recordings are now made that I doubt if anyone thinks about the fact that it is based on psychoacoustic sleight of hand: the substitution of amplitude ratios for time-of-arrival differences in the midrange and bass.
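The panpot itself can be sketched in a few lines. The constant-power (sine/cosine) law below is one common choice of pan law, assumed for illustration; real consoles differ in the exact center attenuation:

```python
import math

def constant_power_pan(position):
    """Return (left_gain, right_gain) for a pan position in [-1, +1]:
    -1 = hard left, 0 = center, +1 = hard right.
    Uses the common constant-power (sine/cosine) law, so that
    left_gain**2 + right_gain**2 == 1 at every position and the
    mono source keeps the same perceived loudness as it moves."""
    angle = (position + 1.0) * math.pi / 4.0   # map [-1, +1] onto [0, pi/2]
    return math.cos(angle), math.sin(angle)

left, right = constant_power_pan(0.0)
print(left, right)               # center: both gains ~0.707, the -3dB pan law
print(constant_power_pan(-1.0))  # hard left: all signal in the left channel
```

Scaling a mono track by these two gains and summing into the left and right buses is, in essence, all a panpot mixdown does—pure amplitude stereo.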

For many years, I was a hard-line Blumlein purist when it came to classical recording. I was attracted by the theoretical elegance of the M-S technique—a sideways-facing microphone with a cosine or figure-8 pickup pattern is spatially coincident with a forward-facing mike; sum-and-differencing the mike outputs gives you true amplitude stereo—and of two figure-8 microphones horizontally coincident at 90°, each positioned at 45° to the forward direction. Of all the "simple" techniques used to capture live acoustic music, these two, in all their ramifications, are the only ones to produce real stereo imaging from loudspeakers.
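The sum-and-difference at the heart of the M-S technique is simple enough to show directly. This is a minimal sketch; the 1/√2 normalization is a common convention I have assumed to keep levels roughly constant:

```python
import numpy as np

def ms_to_lr(mid, side):
    """Decode an M-S microphone pair (forward-facing 'mid' mike plus a
    spatially coincident, sideways-facing figure-8 'side' mike) into
    left/right amplitude stereo by sum-and-difference:
        L = M + S,  R = M - S
    The 1/sqrt(2) scale is a common level-matching convention."""
    left = (mid + side) / np.sqrt(2.0)
    right = (mid - side) / np.sqrt(2.0)
    return left, right

# A source dead ahead produces nothing in the sideways figure-8, so a
# mid-only input decodes to identical signals in both channels: a center image.
l, r = ms_to_lr(np.array([1.0, 0.5]), np.array([0.0, 0.0]))
print(l, r)
```

A source off to one side drives the figure-8 with one polarity or the other, tilting the L/R amplitude ratio accordingly—true amplitude stereo, as the paragraph above says.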

I used to dismiss with a snort recordings made with spaced microphones. After all, if the microphones are separated in space by a distance larger than the wavelength of most of the musical sounds—10', say—unless an instrument or voice is exactly halfway between the two microphones, there will be, in addition to the amplitude information, a time delay introduced between the electrical signal that voice or instrument produces in one channel and the signal it produces in the other. Such time information pulls the image of the source farther toward the nearest speaker, resulting in an instability of central imaging and a tendency for sources to "clump" around the speakers. Add to that the fact that the interchannel amplitude differences produced by spaced microphones do not have a linear relationship with the angular directions of the sound sources, and it is hard to see how a pair of spaced microphones can produce any image at all.
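The size of those interchannel time delays is easy to estimate from the geometry. The coordinates and distances below are illustrative values I have chosen (3.05m is roughly the 10' spacing mentioned above):

```python
import math

SPEED_OF_SOUND = 343.0  # m/s (assumed)

def interchannel_delay_ms(source_x, source_y, mic_spacing=3.05):
    """Time-of-arrival difference in milliseconds between two omnis
    spaced `mic_spacing` meters apart (3.05 m ~= 10 ft), for a source
    at (source_x, source_y) measured from the midpoint of the pair:
    x is lateral offset toward the right mike, y is distance in front
    of the line joining the mikes. Positive result = right mike leads."""
    half = mic_spacing / 2.0
    d_left = math.hypot(source_x + half, source_y)
    d_right = math.hypot(source_x - half, source_y)
    return (d_left - d_right) / SPEED_OF_SOUND * 1000.0

print(interchannel_delay_ms(0.0, 3.0))  # exactly centered source: zero delay
print(interchannel_delay_ms(1.0, 3.0))  # 1 m off-center: a couple of ms of delay
```

Delays of this magnitude are far larger than the sub-millisecond interaural differences the ear/brain evolved to interpret, which is why they drag the image toward the nearer speaker.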

Yet . . .

In 1992, we were recording two concerts for Stereophile featuring classical pianist Robert Silverman.

The main pickup was with a single stereo microphone, but I had put up a pair of omnis that I fed to a separate recorder. After the concert, it became apparent that the stereo mike had failed, so I was forced to use the spaced-omni recording for the CD release. Here is a short track from that album, Schubert's Moment Musicaux No.3:

[Play Schubert Moment Musicaux No.3, from Concert CD, Stereophile STPH005-2 (1994)]

There are two intriguing things about this recording. The first is that, for this lecture, I flipped the right channel's polarity one minute in. I doubt that anyone noticed—there is so much time disparity between the two channels that it cannot be considered a stereo recording at all; rather, it is two different recordings of the same performance that happen to be played back simultaneously. The second is that, despite that theoretical imperfection—for which I was duly castigated on Usenet—the CD sold quite well. People liked the sound.

I wasn't too surprised by that. Theoretically perfect amplitude stereo has served us well, but when I played people some of my classical recordings made in the appropriately purist manner, they often described the sound as "thin" or "cold" or "lacking bloom."

As I said earlier, when people say they like or dislike something, you should take notice. And in this instance, the late Michael Gerzon had discussed the matter in a paper he gave to the London AES Convention in 1987. Specifically, he had postulated that Blumlein's substitution of amplitude for phase differences at low frequencies is inadequate, that people prefer the sound when there is some time-difference information between the channels, presumably because the information their brains use to synthesize a model of the stereo image now has more in common with what they would have heard at the original event.

Gerzon had floated the idea of using two pairs of microphones to capture all the information the brain requires: spaced omnis below 1kHz; coincident figure-8s above, with a crossover between the two. I tried that, with disappointing results. However, after 1992, I used a similar miking technique with which I thought I could get the best of both worlds: the good amplitude stereo from coincident or quasi-coincident mikes, and the lower-frequency bloom from spaced omnis. Both mike pairs were used more or less full-range; the only EQ was a touch of top-octave rolloff on the omnis, and some first-order low-frequency boost on the cardioids to compensate for their premature bass rolloff when used distant from the source.
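The two-pair crossover idea Gerzon floated can be sketched for one channel as follows. The one-pole filter topology and the 1kHz cutoff are my illustrative assumptions, not Gerzon's published design:

```python
import math

def one_pole_lowpass(signal, cutoff_hz, sample_rate):
    """First-order (one-pole) IIR low-pass, -6dB/octave rolloff."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, y = [], 0.0
    for x in signal:
        y += a * (x - y)
        out.append(y)
    return out

def gerzon_style_mix(omni_ch, coincident_ch, cutoff_hz=1000.0, sample_rate=48000.0):
    """One channel of the two-pair scheme: the spaced omni supplies the
    band below the crossover, the coincident mike the band above it.
    The high-pass is the complement (input minus low-pass) of the same
    one-pole filter, so the two bands sum flat for identical inputs."""
    omni_lows = one_pole_lowpass(omni_ch, cutoff_hz, sample_rate)
    coinc_lows = one_pole_lowpass(coincident_ch, cutoff_hz, sample_rate)
    return [lo + (c - cl) for lo, c, cl in zip(omni_lows, coincident_ch, coinc_lows)]
```

My own technique after 1992, by contrast, ran both pairs essentially full-range and blended them at the mix, with only the gentle EQ described above.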

It was the acquisition of a Sonic Solutions Digital Audio Workstation in 1993 that allowed me to fine-tune this technique, because it became apparent that the two pairs of mikes needed to be time-aligned for the resultant stereo image to lock into place. This time alignment of mikes had been used by Denon and was described in an early 1990s AES convention paper, but I had no way of easily implementing this until I could slide individual tracks backward and forward in time to get the required synchronization.
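The time alignment itself amounts to finding the lag between the two pairs' tracks and sliding one of them to cancel it. A cross-correlation sketch of that idea follows—the function names and approach are illustrative, not the actual Sonic Solutions workflow, where the sliding was done by hand:

```python
import numpy as np

def find_lag_samples(reference, delayed):
    """Estimate the delay (in samples) of `delayed` relative to
    `reference` by locating the peak of their cross-correlation."""
    corr = np.correlate(delayed, reference, mode="full")
    return int(np.argmax(corr)) - (len(reference) - 1)

def align(reference, delayed):
    """Shift `delayed` back by the estimated lag so the tracks line up."""
    lag = find_lag_samples(reference, delayed)
    return np.roll(delayed, -lag)

# Demo: a noise burst delayed by 7 samples is detected and realigned.
rng = np.random.default_rng(0)
ref = rng.standard_normal(256)
delayed = np.roll(ref, 7)
print(find_lag_samples(ref, delayed))  # reports a lag of 7 samples
```

Once the spaced-omni tracks are slid into synchronization with the coincident pair, the two images reinforce rather than smear, and the stereo picture locks into place.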

Since then, I have made all my classical recordings in this manner. Here is a typical example: Minnesotan male-voice choir Cantus singing Eric Whitacre's Lux Aurumque in the glorious acoustic of Sauder Hall, at Goshen College, in Indiana. You can see the two pairs of mikes in this photograph.

Not shown in the photo is a third pair of mikes, omnis on a Jecklin disc, farther away from the singers, which I used in case it turned out that the main pickup was too dry. (When you are on location and the clock is ticking away your money, you cover your bases.)

[Play Cantus: Eric Whitacre, Lux Aurumque (excerpt), from While You Are Alive CD, Cantus CTS-1208 (2008)]

If you listen critically to this recording, you will hear that acoustic objects get a little larger, the farther away they are from the center of the stage. However, their spatial positions in the image are correct.

I tell this tale because it illustrates one of my points: that thinking you are right about something in audio doesn't mean you are right. No matter how much you think you know, there will always be new things that upset your world view. Einstein, for example, would be astonished to find that his "biggest blunder," the Cosmological Constant, turns out to be real—that we now are aware that something completely unknown to science is causing the expansion of the universe to accelerate. Physicists call it "Dark Energy," but that's just scientific shorthand for "We have no idea what it is."



Footnote 6: This creation of a virtual soundstage works only for sound sources to the front of the listener in a two-channel system. When the mixing engineer requires a virtual image to be placed to the listener's side in multichannel audio, the technique fails, for the simple reason that we do not have a pair of ears on the front and back of our heads.