A Chatbot's Take on Hi-Fi Issues

There has been much discussion lately about ChatGPT, the machine-learning-based chatbot from OpenAI. Some experts say it will soon make human writers obsolete. Will that include human hi-fi reviewers?

I decided to engage ChatGPT in an exploratory conversation; think of it as a sort of job interview. What follows has been intensively edited; many words have been removed, but none have been altered. I'm including my own comments in italics.

ChatGPT: Hi-fi, or high fidelity, refers to audio equipment and systems that are designed to reproduce sound as accurately and faithfully as possible. This typically involves using high-quality components and minimizing distortion, noise, and other artifacts.

Jim Austin: Some people prefer "analog" sources—vinyl records and reel-to-reel tape—while others prefer digital technologies, including high-resolution digital files. What is your take?

ChatGPT: [Analog] formats can produce a warm and natural sound, and many audiophiles and music enthusiasts prefer the tactile experience of handling and playing physical media. However, analog formats can also be prone to noise, such as pops, clicks, and hiss, and they can degrade over time. Digital audio formats, on the other hand, use digital signals to encode and decode audio information. Digital audio offers several advantages over analog formats, including the ability to store large amounts of music in a compact space, the ability to easily copy and transfer audio files, and the ability to more easily manipulate the audio signal. Digital audio is generally considered to be more resistant to degradation over time than analog formats.

ChatGPT is making some solid points but leaving out crucial technical information, such as digital's superior S/N ratio and dynamic range capability.
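
To put a rough number on that dynamic range point, here is a minimal Python sketch of the textbook figure for an ideal PCM system: the signal-to-noise ratio of a full-scale sine wave against quantization noise, about 6.02 dB per bit plus 1.76 dB. The function name is made up for illustration, and the result is a theoretical ceiling under that idealized noise model, not a measurement of any real converter or recording.

```python
import math


def ideal_pcm_snr_db(bits: int) -> float:
    """Theoretical SNR (dB) of a full-scale sine in ideal N-bit PCM.

    Assumes uniform quantization noise; real hardware falls short of this.
    """
    # 20*log10(2**N) is about 6.02 dB per bit; 10*log10(1.5) is about 1.76 dB.
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


if __name__ == "__main__":
    for bits in (16, 24):
        print(f"{bits}-bit PCM: {ideal_pcm_snr_db(bits):.1f} dB")
```

For 16-bit audio, the CD standard, this works out to roughly 98 dB; for 24-bit files, roughly 146 dB.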

Jim Austin: One key controversy in hi-fi is about how audio performance should be judged, between those who value measurements most and those who think that to judge music reproduction you must listen.

ChatGPT: Proponents of [the objectivist] approach argue that objective measurements can provide a reliable and unbiased way to evaluate audio performance and that they can help identify the best products and techniques for achieving high-quality sound. [The subjectivist] approach relies on the listener's subjective judgment of factors like the clarity, balance, timbre, and overall musicality of the sound. Proponents of this approach argue that subjective listening is essential to understanding the musical experience and that objective measurements do not always accurately reflect the perceived quality of the sound.

I asked what parts of a hi-fi system are most important. ChatGPT refused to commit, but gave surprising prominence to cables:

ChatGPT: There is scientific evidence that suggests that cables can have a measurable impact on the quality of the audio signal. For example, cables can introduce resistance, capacitance, and inductance into the signal path, which can affect the frequency response and phase relationships of the signal. Cables can also pick up electromagnetic interference (EMI) and radio frequency interference (RFI) from other electronic devices, which can introduce noise and distortion into the signal. These effects can be quantified using objective measurements, and they suggest that cables can have a measurable impact on the sound.

On the other hand, some people argue that the differences introduced by cables are too small to be audible and that the perceived differences in sound quality are due to psychological factors or to differences in the other components of the hi-fi system. Some studies have attempted to quantify the audible differences between different types of cables using double-blind listening tests, but the results of these studies have been mixed and are often disputed.
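
The frequency-response effect ChatGPT mentions can be estimated with a first-order model: the source's output resistance and the cable's capacitance form an RC low-pass filter with a corner frequency of 1/(2πRC). The Python sketch below uses that model. The component values (200 pF of cable capacitance, 100 ohm and 10k ohm source resistances) are assumptions chosen for illustration, not measurements of any particular cable or component, and the model ignores inductance and the load.

```python
import math


def rc_cutoff_hz(source_ohms: float, cable_farads: float) -> float:
    """-3 dB corner of the low-pass formed by source resistance and cable capacitance."""
    return 1.0 / (2.0 * math.pi * source_ohms * cable_farads)


def loss_db(freq_hz: float, cutoff_hz: float) -> float:
    """Attenuation (positive dB) of a first-order low-pass at freq_hz."""
    return 10.0 * math.log10(1.0 + (freq_hz / cutoff_hz) ** 2)


if __name__ == "__main__":
    cable_c = 200e-12  # assumed: 200 pF total cable capacitance
    for source_r in (100.0, 10_000.0):  # assumed source output resistances
        fc = rc_cutoff_hz(source_r, cable_c)
        print(f"R = {source_r:>7.0f} ohm: corner {fc / 1e3:8.1f} kHz, "
              f"loss at 20 kHz {loss_db(20e3, fc):.3f} dB")
```

Under these assumptions, the low-impedance source puts the corner in the megahertz range, far above the audioband, while the high-impedance source loses about a quarter of a decibel at 20kHz. That is measurable, as ChatGPT says; whether it is audible is the disputed part.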

A reasonable take. I then chose a hi-fi topic at random—MQA—to see how far ChatGPT would follow me into hi-fi's weeds.

ChatGPT: MQA technology uses a combination of techniques to capture and reproduce high-resolution audio, including advanced audio compression algorithms, digital filtering, and time-stretching. MQA files are typically encoded at a higher bit depth and sample rate than standard audio files, which allows them to capture a greater range of audio information and produce a more detailed and accurate reproduction of the original sound.

ChatGPT, here, is confused about several points, but the term "time-stretching" is especially curious. I asked for more information.

ChatGPT: In MQA technology, time-stretching is used to adjust the length of an audio signal in order to match the original recording. This is done by analyzing the audio data and identifying the musical structure of the signal, such as the tempo and the rhythms of the individual instruments. The audio data is then processed to stretch or compress the signal as needed to match the original recording.

I commented: "MQA does talk about time-smearing, but I've never heard 'time-stretching' in the context of MQA." This led to the first of several apologies.

ChatGPT: I apologize for the error. You are correct that MQA technology uses a process called "time-smearing" rather than "time-stretching." Time-smearing is a technique that is used in MQA technology to reduce the amount of data required to represent an audio signal, while still preserving the full range of audio information from the original recording.

ChatGPT now seems to be confusing time-smearing with Audio Origami. I pointed out the error. First another apology, then:

ChatGPT: You are correct that time-smearing in the context of MQA technology refers to the effect of digital-to-analog and analog-to-digital conversions on transients in the music. To address this issue, MQA technology uses a technique called "time-smearing" to compensate for the effects of time-smearing on transients in the music.

Wait, MQA uses time-smearing to compensate for time-smearing?

ChatGPT: Yes, that's correct! MQA technology uses a technique called "time-smearing" to compensate for the effects of time-smearing on transients in the music.

I came away impressed by ChatGPT's compositional abilities—its ability to summarize and present information coherently—but I was surprised (even stunned) by its strange errors and how it persisted in defending those errors almost as if trying to save face—not unlike certain audio reviewers. Still, it will be a while before Stereophile employs chatbots to write its reviews.

COMMENTS
Lars Bo's picture

One "overall musicality" and also a "the musical experience" ain't bad for a robot. :-)

cognoscente's picture

I think this is a really nice and interesting post, thank you

ChatGPT is something like navigation in a car, people just drive into a river or ravine. Always think critically yourself whether it is correct, after all, it is no more than the sum of the average opinion that can be found on the internet, what people say there. I never use reviews like TripAdvisor. The majority is certainly not always right. As far as I'm concerned, hardly ever, but you can accuse me of being elitist. Whatever, I'm even proud of that. Moreover, (AI) remains no more than just and only calculations and our brains work differently.

cognoscente's picture

in other words, I care more about what Jim thinks about audio than ChatGPT (but then I really make my own opinion about it).

Poor Audiophile's picture

"Still, it will be a while before Stereophile employs chatbots to write its reviews."
Maybe they can edit it.

rschryer's picture

I giggled

georgehifi's picture

ChatGPT: "[Analog] formats can produce a warm and natural sound, and many audiophiles and music enthusiasts prefer the tactile experience of handling and playing physical media. However, analog formats can also be prone to noise, such as pops, clicks, and hiss"

He's correct for older vinyl music; it sounds warmer and more natural.
But play those same songs on a CD taken from the master tape, and suddenly you have up to 120dB of channel separation, and it sounds very left-right ping-pong. With vinyl you have at best 35dB of channel separation through the mids, virtually 0dB in the bass, and around 15dB in the highs because of the phono cartridge, and it sounds better with those albums.

I found there is a cure for getting that vinyl-like warmth and richness with those early recordings, and losing the ping-pong effect with digital CD.
What you do is blend the left and right analog outputs of the CD player, to purposely bring the 120dB channel separation down to 30dB top to bottom, using in my case a small 1k ohm (or was it 100 ohm?) resistor between the left and right outputs. This doesn't exactly copy the channel separation of a phono cartridge, but you'll get the idea.

Cheers George

ChrisS's picture

Spend a little more money, get better equipment and better recordings, and you don't have to do any of that!

georgehifi's picture

You need to read a little better and let it sink in, it's the poor channel separation of vinyl phono cartridges that mono'izes, that's the secret to it sounding richer/fuller with the early stuff.
Read my post again with an open mind and this time let it sink in.

Cheers George

ChrisS's picture

...convincing.

Chatbots don't have stereo systems.

ok's picture

..all second-hand tech talk and no first-hand hearing impressions.
Its views on cables are interesting though :-)

cognoscente's picture

I don't see ChatGPT as "it" but as "we". After all, it is the average of what we all say on the internet.

That's (looks like) a strength, but I think it's actually a weakness. I only want to read the opinion of experts I trust and have chosen myself, such as Jim, and not of everyone, the opinion of the street so to speak.

I also notice that ChatGPT mentions both sides so strikingly. That seems sympathetic ("charm offensive?"). It is now still objective (is there such a thing as objective?), so this is all we say about it and "make up your own mind". But therein lies the danger. What if a regime or tech company wants to influence the outcome for political and commercial reasons? The way Google and FB work (or WHAT*HIFI?), both no more than advertising agencies. Or in countries like China, Russia, Iran, Turkey, Hungary & Poland (and then we're already getting close) ... actually the entire so-called non-Western world we can already imagine well. But also in the so-called free countries we know that the government does things and does not want some things in public, so that will not be found with ChatGPT, which means that it is immediately not objective.

Since the emergence of navigation in cars, I notice that people no longer have any geographical insight independently. People have completely lost their sense of direction. And no longer think independently, totally dependent on the navigation (and it costs!). The same thing will happen here with ChatGPT. We are becoming more and more stupid and most of all paying slaves. Exaggerated? Perhaps but how much from 0 to 100%? Just think about it and "make up your own mind".

cognoscente's picture

but on the other hand, it is perhaps better that we make ourselves, humans, superfluous. A blessing to the earth. After all, we're just an intermediate stage...as dinosaurs once were, only they didn't bring about their own extinction.

the flatearther's picture

GPT is a language model and has no idea what it is talking about. it does not know the difference between my grandmother and a speaker cable. they have read in everything published the last 2-3 years on the internet and in order to be able to do that they use a kind of lossy semantic compression. when they synthesize the answers they make assumptions to fill in the blanks. when the subject is obscure it can go very wrong - the correct term they use to describe this is that GPT "hallucinates". if I remember correctly the level of hallucinations on complex answers is 20% right now. they claim the next version is imminent and will reduce the level of hallucinations. when GPT is used to produce web content it will become interesting when GPT will try to interpret its own synthesized output. over the years it will be lossy compression on lossy compression on ...

Jorgitok's picture

Eventually ChatGPT will be very easily managed by the corresponding editorial committee to, for example, do positive reviews and help to sell equipment, with no conflict or rebellions. Now we are not so far, right?

DH's picture

Is that saying:
MQA technology uses a technique called "time-smearing" to compensate for the effects of time-smearing on transients in the music.

is actually correct, although the author of this piece and some others who write for the magazine refuse to acknowledge it.

Archimago's picture

ChatGPT got it more right than apparently some magazine writers realize even though the facts have been out there for ages. ;-)

ok's picture

since mqa is merely a form of loudness.

vicweast's picture

it's actually just nonsense in part because its definition is circular.

The article really illuminates the difference between regurgitating facts in a semi-intelligent manner -with- true understanding of the domain.

Hint: It's also hard to identify such lack of understanding in actual people.

GPT-3 (which is what ChatGPT is based on) famously answered the question of how to stop a door from dragging on the floor by stating that you need to bring a table saw to the door and while the door is still mounted use the table saw to trim the bottom of the door... This is *not* understanding.

Ktracho's picture

It reminds me of comparing photos of real people vs. fake photos a few years back. The fakes looked pretty realistic until you look at the small details, like teeth that are fused together or impossible folds in the clothing. It'll get better till it reaches the point of replacing Wikipedia.

bhkat's picture

I wonder what ChatGPT's unfiltered alter ego would say about these topics.
Its name is DAN (Do-Anything-Now), and it gives non-politically-correct answers if it is repeatedly told to stay in character.

Honesuki's picture

A superficial grasp of subjects, parroting what it has been told, and acting defensively when confronted with error.

Indydan's picture

Nice way for Stereophile to give MQA more free publicity...

Either way, that Chatbot sounds less full of shit than Bob Stuart and his MQA brethren.
