Audio, Precision, & Measurement: Richard Cabot Page 4

They listened to it and found it sounded terrible. They tried turning off the compression and simply passing the signal through their system and back out again. It still sounded terrible. They ultimately determined that the problem was all the filters that broke the signal apart into bands and put it back together again. They were all linear-phase filters, and when you plotted the frequency response, it looked ruler-flat from DC to Nyquist (footnote 2).

But the problem was that the tiny ripples in the frequency response, just fractions of a dB peak to peak, really represented the preshoot and ringing of the filters. They had very sharp filters to get the 40Hz-wide bands. The ringing showed up not only as ripples of about 0.01dB in the frequency domain; it was also visible in the time domain. The narrower you make your filter, the longer it rings in the time domain. These very small ripples in the frequency domain translated into a pre-echo, 50dB down, that arrived many milliseconds before the signal. Fifty milliseconds before a loud transient came along, you heard this thing 40-50dB down. It was not a hard effect to hear.
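The link between filter sharpness and pre-ringing can be illustrated with a small sketch. It is not the filter bank described in the interview: it assumes a Hamming-windowed sinc design, a 44.1kHz sampling rate, and the textbook rule of thumb that such a filter needs roughly 3.3 × fs / transition-width taps. Because a linear-phase FIR filter has a symmetric impulse response, half of its length arrives before the main peak.

```python
import math

def windowed_sinc_lowpass(cutoff_hz, num_taps, fs):
    """Linear-phase low-pass FIR: an ideal sinc response shaped by a Hamming window."""
    fc = cutoff_hz / fs                 # cutoff in cycles per sample
    mid = (num_taps - 1) / 2            # center of symmetry (num_taps must be odd)
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def taps_and_pre_ring_ms(transition_hz, fs):
    """Rule-of-thumb Hamming length, and the time from first tap to the main peak."""
    num_taps = math.ceil(3.3 * fs / transition_hz) | 1   # forced to an odd length
    return num_taps, 1000.0 * ((num_taps - 1) / 2) / fs

FS = 44100  # assumed sampling rate, not stated in the interview

# A symmetric impulse response means the filter rings before its peak as well as after.
taps = windowed_sinc_lowpass(1000, 101, FS)
is_symmetric = all(abs(taps[i] - taps[-1 - i]) < 1e-12 for i in range(len(taps)))
print("impulse response symmetric about its peak:", is_symmetric)

# Narrower transition bands need longer filters, so the ringing starts earlier.
for width_hz in (4000, 400, 40):
    n, ms = taps_and_pre_ring_ms(width_hz, FS)
    print(f"{width_hz:5d} Hz transition: {n:5d} taps, ringing starts {ms:6.2f} ms before the peak")
```

Under these assumptions a 40Hz transition band puts the start of the ringing roughly 41ms ahead of the main peak, the same order of magnitude as the pre-echo described above.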

A similar thing was found by Floyd Toole when he was studying the audibility of very-high-Q, very sharp resonances in speakers. He set up an experiment where he took a signal, delayed it and passed it through a very-high-Q filter, then added it back into the audio. He adjusted the amplitude of the signal before it got added back into the audio to find out at what level you could hear this delayed resonance for a given filter sharpness. He changed the width, the amplitude, and the delay. He could create delayed resonances that were clearly audible on some types of program material. But when you looked at the frequency response, you saw 0.1dB to 0.5dB of ripple, a very small amount that would be visible when you did a simple amplitude vs frequency plot. But if you looked at the impulse response, you would see the ripples delayed in time. The ear picked up very well that ring trailing along in time.
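To get a feel for how little ripple a clearly delayed signal produces, here is a simplified calculation. It assumes a plain delayed copy added to the original at relative amplitude a, rather than Toole's high-Q resonance; the combined magnitude then swings between 1 + a and 1 - a, so the peak-to-peak ripple is 20·log10((1 + a) / (1 - a)).

```python
import math

def ripple_db(echo_level_db):
    """Peak-to-peak frequency-response ripple caused by one added echo.

    echo_level_db is the echo level relative to the direct signal (negative).
    Simplifying assumption: the echo is a plain delayed copy of the signal.
    """
    a = 10 ** (echo_level_db / 20)          # linear amplitude of the echo
    return 20 * math.log10((1 + a) / (1 - a))

for level in (-30, -40, -50):
    print(f"echo at {level} dB -> {ripple_db(level):.3f} dB peak-to-peak ripple")
```

In this simplified model, echoes 30 to 40dB down produce ripple of a few tenths of a dB, the same range as the 0.1dB to 0.5dB quoted above.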

And with digital audio, you have not only the post-echo—the ring afterward that you get from loudspeaker cabinet reflections as the resonances die away—but you get the pre-echo, because everyone tries to make linear-phase filters. The pre-echo becomes even more audible because you haven't had the loud signal to mask it out. Whatever happens happens in dead quiet.

Harley: How far are we into the learning curve in digital audio? Is there a body of knowledge waiting to be discovered, or will improvements come from refining the performance of known parameters?

Cabot: There is certainly a lot we can improve. I think that you'll find that digital signal processing is going to go through a lot of growth in what it can do. It has a lot of nifty things it can offer, but there will be a lot of mistakes made in the first few years.

Because it's a relatively new field, people arbitrarily say, "Well, it's all digital—there won't be side effects." I think there's a lot to be found in digital audio. How much further is CD playback itself going to mature? I'm sure improvements will be made as people get the bugs out of these single-bit, highly oversampled converters.

I'm amazed to continually find oddities in digital audio, and say, "Why would they ever do that? They could have done it in a smarter, better way." There's an R-DAT recorder on the market that I measured at the European AES. R-DATs are supposed to record data as data. I wrote the AES paper on measuring digital interfaces, and I'd written a program that measured digital error rates. The purpose of the program was to check out digital interfaces, find problems, detect dropouts, and things like that on the tape—the kinds of problems that would create a data error. We hooked up the tape machine to the System One and we were getting errors continually. We were running digital data out of our system and into the recorder and taking the digital signal back out again. There were errors every sample.

I said, "What's wrong?" It turns out that this recorder puts a digital filter in line with the digital data all the time. It is impossible to record digital audio on the tape without modification. It doesn't record the digital data that you feed it; it records filtered digital data. They have a 0.5Hz high-pass filter. You may think that half a hertz isn't going to matter a lot, because it's so low in frequency you won't hear it. But the digital filter was not designed as well as it should have been. If you know anything about digital filters, you know that as you go lower in frequency, they're harder to make. You have to look at more and more samples and the round-off errors become a bigger and bigger problem. Because you are looking over thousands of samples, the error adds up to become a big error. That gives you noise-floor and distortion problems in the filter. And when we measured this thing (digital input to digital output, no converters involved), we found that, instead of the -98dB noise floor it should have had for an ideal 16-bit system, the noise floor varied between -93 and -95dB. It would go up and down and move around at a rate of about half a hertz. This is a current-generation, portable, professional recorder that's on the market.
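The numbers in this passage can be checked with a rough sketch. The recorder's actual filter design is not known; the code below assumes a simple one-pole recursive high-pass, a 48kHz sampling rate, and 16-bit (Q15) coefficients, all of which are illustrative choices. It uses the standard 6.02·N + 1.76dB figure for an ideal N-bit quantizer.

```python
import math

def ideal_noise_floor_db(bits):
    """Quantization noise of an ideal converter relative to a full-scale sine."""
    return -(6.02 * bits + 1.76)

def one_pole_radius(cutoff_hz, fs):
    """Pole radius of a one-pole recursive filter with the given corner frequency."""
    return math.exp(-2 * math.pi * cutoff_hz / fs)

FS = 48000                      # assumed sampling rate
r = one_pole_radius(0.5, FS)    # 0.5 Hz corner, as quoted in the interview
gap = 1 - r                     # distance of the pole from the unit circle
q15_step = 2 ** -15             # smallest step of a 16-bit fractional coefficient

memory_samples = 1 / gap        # time constant: how many samples the filter "remembers"
noise_gain_db = 10 * math.log10(1 / (1 - r * r))  # gain seen by round-off noise in the feedback loop

print(f"ideal 16-bit noise floor: {ideal_noise_floor_db(16):.2f} dB")
print(f"pole sits {gap / q15_step:.2f} coefficient steps from the unit circle")
print(f"filter memory: about {memory_samples:.0f} samples")
print(f"round-off noise gain through the pole: {noise_gain_db:.1f} dB")
```

With these assumptions the pole lies only about two coefficient steps inside the unit circle and the filter's memory spans roughly fifteen thousand samples, which illustrates why round-off errors accumulate in very-low-frequency filters unless extra precision is carried internally.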

These things down at low frequencies, which you would have thought from the frequency-response plot were below where they would be an issue, create problems in the noise floor in the audio band at several kilohertz, because the noise floor is moving around. I would not be surprised if that was audible.

Filtering the audio in a digital recorder doesn't seem reasonable. You can see it in the noise floor: several dB of noise-floor pumping. If you were to dub from one machine to another through a couple of generations, you're no longer talking about a -95dB noise floor; now you're into the -86dB or -88dB region. It just doesn't seem reasonable on a digital machine to have the noise floor get worse as we dub digitally from machine to machine. That's what digital is all about: once you get it into bits, you're not supposed to screw it up. Here is somebody screwing it up.
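As a rough illustration of how a noise floor degrades over successive dubs, the sketch below assumes each pass through the recorder adds independent noise of equal power, so the floor rises by 10·log10(n) after n passes. That model is an assumption for illustration, not a measurement from the interview.

```python
import math

def floor_after_passes(single_pass_floor_db, passes):
    """Noise floor after several passes, assuming uncorrelated equal-power noise per pass."""
    return single_pass_floor_db + 10 * math.log10(passes)

for start in (-95.0, -93.0):
    for passes in (1, 2, 4, 8):
        print(f"{start:.0f} dB per pass, {passes} passes -> {floor_after_passes(start, passes):.1f} dB")
```

Under this model the floor climbs about 3dB for every doubling of the number of passes, so a handful of generations is enough to reach the -86dB to -88dB region.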

A fellow that I know in Britain who has played around with digital audio quite a bit did some experiments measuring the digital data coming off the disc and looked for signs of poor-quality A/Ds. He developed some ways of spotting them which I found interesting. He wrote a program that produced histograms of the digital audio data, showing how often each digital word had occurred. When you look at the histogram plots for discs that were made with poor A/D converters, you can find bins that are either too tall or too short. If you look at a long piece of music, an entire song or a movement in a symphonic work, you're dealing with an awful lot of data, and the distribution of that data ought to be some sort of smooth curve when it's averaged over a whole song. If you look at neighboring bins and one of them is significantly higher or significantly lower than those next to it, it means that this sample value happened a lot more or a lot less often than it should have, which in turn means that the converter had a non-linearity there. The bit level was off or discontinuous. He also found that if you look at groups of bins from codes that are near each other, you can spot other patterns having to do with limited slew rate in the sample-and-hold circuits. He figured out ways of analyzing this that were really quite interesting. Looking at several commercial discs, he found that the histograms indicate the quality, the rough performance, of the A/D converter.
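The histogram idea can be sketched in a few lines. This is not the program described in the interview, whose details are not given; the function name, the neighbor window, the 25% tolerance, and the simulated "wide code" fault are all hypothetical choices for illustration. Each code's count is compared with the median count of its nearest neighbors, and codes that stand out are flagged.

```python
import random
from collections import Counter
from statistics import median

def suspicious_codes(hist, half_window=4, tolerance=0.25, min_count=1000):
    """Return codes whose count differs sharply from the median of nearby codes.

    hist maps sample code -> number of occurrences. Sparse regions of the
    histogram (expected count below min_count) are skipped as too noisy.
    """
    flagged = []
    for code in range(min(hist), max(hist) + 1):
        neighbors = [hist.get(code + k, 0)
                     for k in range(-half_window, half_window + 1) if k != 0]
        expected = median(neighbors)
        if expected >= min_count and abs(hist.get(code, 0) - expected) > tolerance * expected:
            flagged.append(code)
    return flagged

# Simulated program material: a smooth, bell-shaped distribution of sample codes.
random.seed(1)
ideal = [round(random.gauss(0, 30)) for _ in range(400_000)]

# Hypothetical converter fault: code 16 never occurs, its samples land on code 15.
faulty = [15 if s == 16 else s for s in ideal]

print("ideal converter :", suspicious_codes(Counter(ideal)))
print("faulty converter:", suspicious_codes(Counter(faulty)))
```

The smooth distribution should produce no flags, while the simulated fault shows up as one bin that is too tall next to one that is empty. Real program material is far less tidy than this Gaussian stand-in, so thresholds would need tuning against long stretches of actual audio.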

I've got a CD at home on which I can hear low-level distortion plain as day when I play it. There are a lot of low-level problems in digital audio. I think it's all stuff that people know how to improve but haven't felt the need to give their attention.

Footnote 2: The Nyquist frequency is half the sampling rate; it is the highest frequency that can be correctly represented at a given sampling rate.