Extraction vs Generation

Many years ago I bought the first model of the Audio/Pulse ambience synthesizer. Like many audiophiles, I was convinced (and still am) that the standard two-speaker stereo experience provides an unsatisfying concert-hall impression. But the Audio/Pulse didn't remain long in my stereo system. You see, at best the unit provided a fair reproduction of the sound of my upstairs bathroom, topped off with a nasty flutter echo. I already get that sound every morning in the shower.

Between 1973 and 1978, after escaping the execrable sound of the Audio/Pulse, I auditioned various other "ambience generators" but never bought another, finding that not a single home unit designed during that period had sufficient echo density or freedom from coloration to sound decent. Sure, some sounded pretty good when masked by typical orchestral music emanating from the front speakers, but when I played such music as solo harpsichord, solo voice, or almost any music with spaces in it, the reverberation was revealed to be what it really was—artificial.

Then, at a New York Audio Society meeting, former president John Marovskis (designer/developer of the Janis subwoofer) clued me in to a wonderful process he was using called "ambience extraction," also known as "ambience recovery" or "ambience decoding." As John explained it, his ambience extractor feeds the effects speakers through a single wide-bandwidth delay line of approximately 30 milliseconds. There is no recirculation, multiple echo, or reverberation effect; that is, no artificial ambience generation. The ambience extraction device works strictly according to the psychoacoustic Haas effect. In other words, a correlated sound (such as the direct sound from a musical instrument) leaves the front speakers and arrives at the listener's ear. Then, 30ms later, the repeat of that initial sound arrives from the location of the side speakers. Fortunately, the Haas effect lets the listener totally ignore the second sound as a separate event; his ear/brain still locates the source of the sound at the front loudspeakers. The net result is an apparent increase in sound level but no change in image location. The Haas effect remains valid for delay times up to about 40ms (depending on the percussive nature of the material), and the effects loudspeaker may be up to about 10dB louder than the main speakers. If either limit is exceeded, the fusion breaks down and the listener hears a discrete echo from the effects loudspeaker(s).
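For readers who think in samples rather than milliseconds, the single delay line described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the circuit of any actual unit: the function names, the 44.1kHz sample rate, and the idea of treating the 40ms and 10dB figures as hard thresholds are all mine.

```python
def delay_ms_to_samples(delay_ms, sample_rate):
    """Convert a delay in milliseconds to a whole number of samples."""
    return round(delay_ms * sample_rate / 1000)


def single_delay(signal, delay_samples):
    """One tap, no recirculation: the input shifted later in time.

    The output has the same length as the input and starts with silence.
    """
    if delay_samples < 0:
        raise ValueError("delay must not be negative")
    padded = [0.0] * delay_samples + list(signal)
    return padded[:len(signal)]


def within_fusion_limits(delay_ms, effects_gain_db,
                         max_delay_ms=40.0, max_gain_db=10.0):
    """Rough check against the limits quoted in the text.

    Assumption: the 'about 40ms' and 'about 10dB' figures are used here
    as hard thresholds, although the text says they depend on the material.
    """
    return 0 <= delay_ms <= max_delay_ms and effects_gain_db <= max_gain_db


# Hypothetical usage: a 30ms delay at an assumed 44.1kHz sample rate.
sample_rate = 44100
delay = delay_ms_to_samples(30, sample_rate)   # 1323 samples
front = [1.0, 0.5, 0.25, 0.0]                  # toy "front speaker" feed
side = single_delay(front, 2)                  # short delay so the shift is visible
```

Note that nothing is fed back into the delay line; each input sample emerges exactly once, which is the difference between this approach and a reverberation generator.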

But what does the Haas effect have to do with ambience? Well, the delay line is quite ecumenical: it treats the direct sound and the ambience in exactly the same way. But since ambience is uncorrelated (footnote 1), the ear/brain combination does not recognize its delayed replica as a distinct repeat; the ambience is therefore not masked, and the brain detects (extracts) the ambience coming out of the delayed loudspeakers (footnote 2). It turns out that the ambience-extraction ability of the brain is a very powerful working mechanism; unfortunately, only a few manufacturers have taken advantage of it, and compared with big companies like Yamaha, they are either on the fringe of the marketplace or out of business (eg, Benchmark, with its much praised ARU).

If ambience extraction boxes work so well, why are they as scarce as hen's teeth, while ambience generators, even the bad-sounding ones, are sprouting up on every corner? My explanation is that it is very easy to sell and demonstrate the capabilities of an ambience generator to the average audio customer, who hasn't been to a live concert in years. The typical consumer can't resist the hyped-up sound which the generator boxes are capable of producing. On the other hand, Francis Daniel (creator of the Benchmark ARU) lamented that he could hardly find a dealer capable of adequately demonstrating the subtleties of ambience extraction. Francis's box did not call attention to itself; it just reproduced a natural ambient field when playing well-recorded stereo records. Nowadays, with the popularity of Dolby surround boxes, Benchmark could stress its audio-for-video capabilities, and let thousands of potential video-oriented sales support the few sales to quality-minded audiophiles. But the box was very expensive to produce, and Benchmark will probably not go back in business.

The Marvels of the Benchmark
The unit John Marovskis told me about was a Sound Concepts modified for extended bandwidth (about 12kHz) and no recirculation. These are two of the keys to ambience extraction. If the Benchmark had not appeared I probably would have bought a Sound Concepts and modified it. But the Benchmark ARU (ambience recovery unit) had everything: six outputs, 12kHz bandwidth, and an L–R (Hafler) matrix, another important contributor to its sound. Its two rear outputs were derived from the mono L–R signal, but Francis inserted a polarity-inverting amplifier on the left rear output, having discovered that placing rear loudspeakers out of polarity (phase) with each other enhances the ambient effect (footnote 3). Its two stereo "side" outputs were simply delayed replicas of the front left and right channels. These, plus other features (including remote control), helped to justify the Benchmark's original premium list price of around $900.
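The signal routing just described (front pair, delayed side pair, and an L–R rear pair with one channel inverted) can be sketched as follows. This is a simplified illustration, not the Benchmark's actual circuit: the function and key names are hypothetical, the 12kHz bandwidth limit is ignored, and because the text does not say whether the rear feed is also delayed, the rear delay is left as an assumed parameter defaulting to zero.

```python
def single_delay(signal, delay_samples):
    """One tap, no recirculation: same length as the input, leading silence."""
    if delay_samples < 0:
        raise ValueError("delay must not be negative")
    padded = [0.0] * delay_samples + list(signal)
    return padded[:len(signal)]


def six_outputs(left, right, side_delay_samples, rear_delay_samples=0):
    """Derive six feeds from a stereo pair.

    Assumption: rear_delay_samples is not specified in the text; zero
    means the rear pair carries the undelayed difference signal.
    """
    if len(left) != len(right):
        raise ValueError("left and right must be the same length")
    # L-R (Hafler) difference: anything identical in both channels cancels.
    difference = [l - r for l, r in zip(left, right)]
    rear_feed = single_delay(difference, rear_delay_samples)
    return {
        "front_left": list(left),
        "front_right": list(right),
        "side_left": single_delay(left, side_delay_samples),
        "side_right": single_delay(right, side_delay_samples),
        "rear_right": rear_feed,
        # Polarity inversion on the left rear output, as described above.
        "rear_left": [-s for s in rear_feed],
    }


# Hypothetical toy signal: only the second sample differs between channels.
outputs = six_outputs([1.0, 0.5, 0.25, 0.0], [1.0, -0.5, 0.25, 0.0], 1)
```

In this toy case the rear feeds are silent wherever the two channels agree and carry signal only where they differ, which is why a dead-center mono source contributes nothing to the rear pair.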

The Benchmark box worked very well in my system until I started playing Dolby Stereo movies at home. Its signal/noise ratio, which was good enough for classical music, proved to be inadequate for movie sound effects, which are played at a higher level than ambience. After those jet planes stopped flying around my living room, I noticed a field of hiss, unfortunately not part of the intended sound effect.

The Rise (and Fall) of the Phoenix
I've finally settled on a little-known device costing only $250. It is the Phoenix Systems P-250 Delay Enhanced L–R Decoder—and that's exactly what it does. Phoenix Systems used to sell it as a kit, but since designer John Roberts sold Phoenix to Rhoades, I don't know what will become of that company. That's sad, because the box is cheap, very quiet, and has almost a 20kHz bandwidth. All in all, the Phoenix is a good compromise for the audiophile who wants the best in ambience extraction and also watches Dolby Stereo movies. You'd have to spend over $800 to get a superior ambience decoder, such as the Yamaha DSP-1 in "Surround 1" mode. A better unit for the Dolby films is the Shure HTS, whose extended logic circuits exhibit great channel separation, but poor ambience extraction, in my opinion, compared with the other boxes.

Listening tests: Yamaha's DSP-1 as Ambience Generator
Which brings us to a sonic evaluation of the Yamaha DSP-1, recently obtained on loan. Bill Sommerwerck has already described the unit's superior ambience-generation capabilities in his review in June 1987. I was familiar with the DSP-1's potential for reverberation synthesis, having used its first cousin, Yamaha's "professional" SPX-90, in the recording studio. Frankly, the experience with the studio unit prejudiced me against the Yamaha. In one 24-track studio I have numerous digital reverberation processors, many costing over $10,000. One of them, Lexicon's new model 480, uses 18-bit processing and sophisticated room modeling to produce a very convincing ambient field. In that studio, we only use the SPX-90 for special effects: flanging reverb, multiple echo, pitch change, and so on. As a standard reverb, however, the Yamaha's artificial space is less convincing than that of the Lexicon, which in turn is still distinguishable from a good acoustic reverberation chamber.

In studio use, we depend on a reverberation generator to create the entire ambience of the recording (footnote 4), but in home use (as Bill Sommerwerck stresses), we will be using the Yamaha in a subtle manner to supplement the natural reverb already on the recording and spread it around the space of the listening room. So I tried not to be prejudiced during listening tests of the DSP-1's ambience generator, hoping that its sound would be masked by the natural ambience on the original recording.

Footnote 1: When used in audio, the term "cross-correlation" is a mathematical way of describing the degree to which a sonic event is related to another event occurring at a previous or succeeding moment in time. Ambience has a very low degree of cross-correlation. Mathematically inclined readers will find a discussion of cross-correlation in Madsen's article in the October 1970 JAES.
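As a rough numerical illustration of the idea in footnote 1, the sketch below measures how closely a signal resembles a time-shifted copy of itself (strictly speaking an autocorrelation). The test signals are my own assumptions: a steady tone stands in for a highly correlated sound, and seeded random noise is a crude stand-in for reverberant ambience. Real music and real hall sound are far less tidy than either.

```python
import math
import random


def correlation_at_lag(signal, lag):
    """Normalized correlation between a signal and itself shifted by `lag` samples.

    Values near +1 or -1 mean the later portion strongly resembles the
    earlier one; values near 0 mean little relationship.
    """
    earlier = signal[:len(signal) - lag]
    later = signal[lag:]
    numerator = sum(a * b for a, b in zip(earlier, later))
    denominator = math.sqrt(sum(a * a for a in earlier) *
                            sum(b * b for b in later))
    return numerator / denominator if denominator else 0.0


# Assumed test signals, 4000 samples each.
period = 100
tone = [math.sin(2 * math.pi * n / period) for n in range(4000)]
rng = random.Random(0)                      # seeded so the run is repeatable
noise = [rng.uniform(-1.0, 1.0) for _ in range(4000)]

# Lag of exactly one period: the tone lines up with itself, the noise does not.
tone_correlation = correlation_at_lag(tone, period)
noise_correlation = correlation_at_lag(noise, period)
```

The tone's value comes out very close to 1, while the noise's value stays close to 0, which is the sense in which ambience is said to have a low degree of correlation.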

Footnote 2: Bill Sommerwerck describes the brain's ability to extract ambience in his June DSP-1 review. But since he prefers ambience generation to extraction, I have provided my own explanation of the latter.

Footnote 3: He could have specified that the user install one loudspeaker out of polarity, but Francis did not want to complicate the instruction manual.

Footnote 4: This is generally true for pop music recording, where we usually use multi-mike techniques, and (hopefully) less true for classical music recorded in natural spaces with simple mike techniques. Bernstein's recent West Side Story recording was made in a relatively dry studio with multiple microphones and artificial reverb. Do you think this is a natural-sounding recording?