Sony SA-Z1 nearfield active speaker system

On top of a desk, an audio system must be able to deliver satisfying sound in a nonoptimal environment: a flat, reflective plane (the desktop) cluttered with a keyboard (or maybe a laptop computer); a mouse; assorted papers, books, and trash; and, perpendicular to that, another flat, reflective plane (the computer display), which, if it's not a tiny laptop screen, will block some of the soundwaves emanating from the speakers. Such sonically inhospitable spaces can consign 3D stereo imaging (and other desirable sonic traits) to the realm of the imagination.

My workspace is especially challenging. My two-tier triangular desk is wedged into a corner and bisected by a diagonal roof line that cuts across the listening space at a 142° angle. You'd think that would boost the bass, but somehow, instead, it traps it.

I decided to try Sony's SA-Z1 nearfield active desktop speaker system ($7999) to see how well it could cope with my workaday acoustical nightmare.

The only thing it shoots is sound
The SA-Z1 speaker system, with a unique appearance that evokes some futuristic weapon from a Marvel cinematic thriller, is the latest addition to Sony's Signature series, which also includes a headphone amp (TA-ZH1ES), two high-end Walkmans (NW-WM1Z, NW-WM1A), a couple of high-end headphones (MDR-Z1R, IER-Z1R), and the DMP-Z1 Digital Music Player that John Atkinson reviewed in the August 2019 issue of Stereophile. The series originated during the 75th anniversary year—2016—of the company that brought you the Walkman and is dedicated to creating the ultimate personal listening experience. Yoshiyuki Kaku (Kaku San), who designed Sony's SS-AR1, SS-AR2, SS-NA2ES, and SS-NA5ES speakers and appears in a seductive video on Sony's SA-Z1 website, also designed the SA-Z1, in cooperation with electrical designer Masaki Sato.


The SA-Z1 system consists, of course, of two speakers. Each one contains a ¾" tweeter with a titanium-sputtered soft dome flanked vertically by two smaller, similar tweeters, all mounted on a narrow, tapered plate in front of the front baffle. Mounted on that front baffle is a 4" forward-firing woofer with an anodized aluminum cone; behind that is a similar 4" woofer that faces to the rear. Various buttons and controls sit atop each speaker.

The SA-Z1 system took three years to develop. "That's a really long time for Sony," Greg Carlsson, a San Diego–based electrical engineer and member of Sony's senior staff who participated in one of the SA-Z1's prototype listening stages, told me during an extended chat, which also included Kevin Portaro, Sony's very helpful senior product marketing specialist. "It's a really complicated digital-hybrid design that took an hour's conversation with Masaki Sato for me to fully understand," Carlsson told me.

Like most desktop speaker systems, the SA-Z1 includes a DAC, but it does not use an off-the-shelf, chip-based DAC. At its heart is an FPGA that converts all music data to PWM—pulse-width modulation, the generic term for DSD, which is a trademark. PWM and class-D amplification have many similarities: Both are single-bit technologies, meaning that at any given time, the signal occupies one of two levels, high or low, fluctuating between those two states at very high frequencies.
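To make "single-bit" concrete, here is a minimal Python sketch of a first-order delta-sigma modulator, the general technique used to produce DSD-style one-bit streams. It is an illustration only: the modulator order, the function names, and the block-average filter are assumptions, and nothing here represents what Sony's FPGA actually does.

```python
import math

def one_bit_modulate(samples):
    """First-order delta-sigma modulator (illustrative, not Sony's design).

    Takes floats in [-1.0, 1.0] and returns a stream of +1.0 / -1.0 values,
    i.e. a signal that only ever occupies one of two levels.
    """
    integrator = 0.0
    previous_bit = 0.0
    bits = []
    for x in samples:
        # Accumulate the difference between the input and the last output.
        integrator += x - previous_bit
        previous_bit = 1.0 if integrator >= 0.0 else -1.0
        bits.append(previous_bit)
    return bits

def block_average(values, width):
    """Crude low-pass filter: the mean of each consecutive block of `width` values."""
    return [sum(values[i:i + width]) / width
            for i in range(0, len(values) - width + 1, width)]

# A slow sine wave, heavily oversampled (4096 one-bit samples per cycle).
sine = [0.5 * math.sin(2 * math.pi * n / 4096) for n in range(4096)]
bits = one_bit_modulate(sine)
recovered = block_average(bits, 64)  # 64 averaged points that roughly trace the sine
```

The stream in `bits` contains only two values, yet averaging it over short blocks roughly recovers the original waveform, which is why a low-pass filter after the switching stage is enough to turn such a stream back into audio.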

Sony takes advantage of the similarity between the two technologies to combine digital conversion and amplification. "Sato-san calls it a 'power DAC' because it amplifies the PWM or DSD signal before filtering it," Carlsson explained. "The amplifier and DAC are together as one; they are not separate stages. This is effectively a discrete DAC and Class-D amplifier design.

"In parallel, the PWM or DSD signal is routed through another DAC, the output of which is amplified in the analog domain and used as a feed-forward signal for error correction. It's a very unconventional and challenging design, but the result is a lower noise floor and lower distortion. Since there are two paths, we call it a Digital Analog Hybrid, or D.A. Hybrid, amplifier. It's quite complicated, and I'm struggling to explain it in a simple way because it's hard to wrap your head around."
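The feed-forward idea Carlsson describes can be sketched numerically. The Python below is a generic illustration, not Sony's circuit: the cubic distortion term, the gain of 10, and the 95% accuracy of the correction path are all made-up values, chosen only to show the principle that a clean parallel reference lets you measure the power stage's error and subtract it.

```python
GAIN = 10.0  # hypothetical voltage gain, chosen for illustration

def power_stage(x):
    """Hypothetical main amplifier: ideal gain plus a small cubic error."""
    return GAIN * x - 0.5 * x ** 3

def reference_path(x):
    """Hypothetical low-power parallel path, assumed to be distortion-free."""
    return GAIN * x

def feed_forward_output(x, correction_accuracy=0.95):
    """Subtract the measured power-stage error from the power-stage output."""
    main = power_stage(x)
    error = main - reference_path(x)            # what the power stage got wrong
    return main - correction_accuracy * error   # imperfect correction leaves a residue

x = 0.8
uncorrected_error = abs(power_stage(x) - GAIN * x)
corrected_error = abs(feed_forward_output(x) - GAIN * x)
```

With these invented numbers the residual error shrinks to a small fraction of the uncorrected error; how well a real design does depends on how accurate the correction path is.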

Rather than employing MOSFET transistors, which Sony says can produce ringing distortion due to slow switching speeds, the amps use gallium nitride (GaN) transistors, which allow faster switching and so less ringing. Sony claims that with GaN, "amplification errors are significantly reduced, even before the signal is error-corrected by the feed-forward analog amplifier."

Yes, there's an analog amplifier, too, another way in which the SA-Z1 is a digital/analog hybrid. Every woofer has its own amplifier pair, as does the main tweeter; the two assist tweeters are driven together by the fourth amplifier pair. That's eight amplifiers per side.

"The SA-Z1's D.A. Hybrid Amplifier design was initially deployed in our high-end TA-ZH1ES Signature-series headphone amp," Carlsson said. "We don't think that there's anybody else out there that's been able to pull off this kind of design. A lot of people speak about high-powered FPGA or gallium nitride MOSFETs, but this whole D.A.–hybrid amplifier design is quite unique."


"With a conventional speaker system, getting the room-speaker-listener positioning just right to achieve an expansive soundstage that extends to all the way beside you or even further is very difficult," Carlsson stated. "The SA-Z1 system achieves an amazingly large soundstage easily, and there is much less influence from the room when you listen in nearfield," because "in the nearfield, there is a lot less impact from early reflections."

Sony calls its unique coaxial tweeter array "I-Array." "A larger dome tweeter can produce a lot of sound pressure, but it has narrow directivity, which means that the high frequency response starts to drop off as you move off-axis," Carlsson explained. "Smaller tweeters such as our assist tweeters can't produce as much power (SPL), but they have much wider directivity and higher bandwidth, in this case up to 100kHz. Putting the three tweeters together in the I-Array creates a larger and more transparent soundstage and combines the strengths of the larger- and smaller-aperture tweeters in a coaxial layout for better imaging. ... [T]he coaxial design allows us to achieve coherence in the nearfield.

"Time coherence and a flat, broad frequency response are also required to achieve a coherent impulse response. Typical multiway speakers can have good frequency response, but their impulse response is compromised, because getting the wave fronts aligned in time between driver units is very hard due to driver placement and passive crossover networks. We also use the FPGA to synchronize all drivers and achieve full time-alignment, which ensures a coherent soundfield. Any music signal can be described by convolution of an impulse, which means [that], if the speakers can reproduce an impulse accurately, you can listen to the sound as it was recorded. This is a key detail in the SA-Z1's design."
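Carlsson's point about impulses can be illustrated with a few lines of Python. In the standard linear-systems view, what a speaker outputs is the input signal convolved with the speaker's impulse response, so a perfect impulse response returns the input unchanged. The sample values and the "smeared" response below are invented for illustration and are not measurements of any speaker.

```python
def convolve(signal, impulse_response):
    """Direct-form discrete convolution of two lists of floats."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

music = [0.0, 0.5, 1.0, 0.25, -0.75, -0.5, 0.0]  # made-up input samples

ideal = [1.0]              # a perfect impulse: all energy arrives at once
smeared = [0.6, 0.3, 0.1]  # hypothetical response with energy spread over time

through_ideal = convolve(music, ideal)      # identical to the input
through_smeared = convolve(music, smeared)  # transients are blurred and lengthened
```

The closer a speaker's impulse response comes to the single-sample case, the closer its output is to the recorded signal, which is the argument for time-aligning the drivers.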

The SA-Z1's two 4" woofers are placed back to back in a layout that mimics that of the traditional Japanese "tsuzumi" drum (and also any number of subwoofer and loudspeaker designs). Sony claims that the two woofers' vibrations cancel out when both are active. (On one setting, the rear woofers are deactivated; see below.) The layout is said to reduce enclosure resonance and deliver precise imaging. Bass disperses forward and through side vents located on either side of the rear-facing woofer.

The SA-Z1's aluminum enclosure is made of two different aluminum alloys; each of its six panels has a different thickness. Specially designed trapezoidal rubber dampers between panels prevent vibration transmission and reduce resonance. An aluminum bridge between the front-speaker section and the amplifier and digital processing circuitry at the speakers' rear, together with a 5mm-thick steel plate, creates a "frame beam wall" chassis intended to prevent vibration from reaching the electronics.

Doing the numbers
The choice of digital input determines the maximum PCM and DSD sampling rates that the SA-Z1 can accept. I used the most versatile input, USB, which can accept native DSD up to 22.4MHz (that's 8×DSD), DoP (DSD over PCM) up to 5.6MHz, and PCM up to 32/768. The Walkman/Xperia input accepts DSD native up to 11.2MHz, DoP up to 5.6MHz, and PCM up to 32/384, while the poor old optical input is limited to PCM up to 24/96.
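The rounded megahertz figures above follow from simple arithmetic: DSD rates are multiples of the CD sampling rate of 44.1kHz, and DoP packs 16 DSD bits into each 24-bit PCM sample. A short Python sketch (the function names are illustrative):

```python
CD_RATE_HZ = 44_100  # "Red Book" CD sampling rate

def dsd_rate_hz(multiple):
    """Bit rate of DSD64, DSD128, etc.: `multiple` times the CD rate."""
    return multiple * CD_RATE_HZ

def dop_carrier_rate_hz(multiple):
    """PCM sample rate needed to carry that DSD stream as DoP.

    DoP places 16 DSD bits in each 24-bit PCM sample (the other 8 bits are a marker).
    """
    return dsd_rate_hz(multiple) // 16

for multiple in (64, 128, 256, 512):
    print(f"DSD{multiple}: {dsd_rate_hz(multiple) / 1e6:.4f} MHz, "
          f"DoP carrier {dop_carrier_rate_hz(multiple) / 1e3:.1f} kHz")
```

DSD512 works out to 22.5792MHz, which the text gives as 22.4MHz, and it is eight times the DSD64 rate. The 5.6MHz DoP limit corresponds to DSD128, which needs a 352.8kHz PCM carrier.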

The SA-Z1 offers several options for digital playback: automatic upsampling of "Red Book" PCM files to high-rate PCM (32/384 for files input by USB; note, however, that all data eventually ends up as PWM, aka DSD); resampling of PCM to high-rate DSD (DSD256 when input via USB; this is called DSD-RE, for DSD Remastering); straight playback of DSD files; or what Sony calls "optional enhancement of compressed music" via the company's "Digital Sound Enhancement Engine." "DSEE-HX" attempts to restore what's been lost from compression with a combination of upsampling and what Sony calls "harmonic restoration."


Neither the DSEE-HX nor the DSD-RE function operates with the SA-Z1's analog inputs, which should convince you that it's best to use the system's digital inputs instead. Just try feeding a high-resolution signal from your computer to a high-quality DAC (I used Mytek's Manhattan II) and connecting its RCA outputs to the SA-Z1's analog inputs. I did, and I found the sound rather flat and disappointing. "SA-Z1 is designed as a digital system," Carlsson told me. "As such, it definitely sounds best with digital inputs. It won't benefit from an external DAC like a purely analog system would; in fact, it will not be able to perform at its best. The analog inputs are provided for flexibility."


The two speakers are connected by a digital sync cable (supplied), which attaches to each speaker's rear, across from the power-cable inlet. Speaker A's front panel includes power (on/off), input selection, and DSEE-HX and DSD-RE buttons. There's also a volume control knob and an LED readout that displays input, volume level and muting state, bit/sample rate, and a few other things.

Sony Electronics Inc.
16535 Via Esprillo
San Diego, CA 92127

jimtavegia's picture

I wish they would get creative about another under $1K SACD player. "If you build it, they will come."

JRT's picture

You can extract the DSD layer from SACDs and store that data on an inexpensive network attached server (Kal knows how to do this).

teched58's picture

JVS is very excited at the performance of these $8K desktop speakers, although he's clearly puzzled that Sony doesn't use power conditioners in their labs. (Fortunately, Mr. Carlsson doesn't disabuse him of the benefits of power cords and cables.)

Meanwhile, from deep in the recesses of the audiophile equivalent of the Federal Reserve, JA1 provides his perspective in plain language that's accessible to longtime readers of his valuable measurements:


Sony has an impressive pedigree in conventional loudspeaker design (footnote 3), so I was intrigued to see how a specialized design like the SA-Z1 would perform in the test lab.

stereophileuser2020's picture

What's the significance of that quote?

tonykaz's picture

... not made in JAPAN ?

Who is the intended customer for these things ?

Did SONY see a pair of Devialet Phantoms and decide to compete by making something in Malaysia selling at 2 to 4 times the price with a two year warranty but no beautifully designed packaging?

The SONY Brand is no longer 2D4, besides these things are ugly. ( doncha think ? )

Will our wonderful whistler be keeping these Demo's for extended evaluations ?

Tony in Venice Florida

ps. frequency range : 10hz to 200khz ---- 200,000 hz -- wow, it's almost RF. is this a Class D rig with large heatsinks ?

JRT's picture

Some radio transmission frequencies can also get pretty low. The ITU designates the "ELF, extremely low frequency" band of radio frequencies as 3 Hz to 30 Hz. While the low bandwidth forces slow transmission of information, the extremely low frequencies are capable of penetrating ground and sea water.

tonykaz's picture

Ok, thanks, I might've been attempting a bit of exaggeration concerning the propensity of Japanese to exaggerate dubious performance properties.

It might seem appropriate for an $8,000 Asian desktop device to have a bit more bragging rights than a modest Genelec Active 820 system costing about $1,000.

Thanks for writing back with a bit of science related to wave propagation, it's nice to know that proper Scientists read Stereophile too!

Tony in Venice Florida

Jack L's picture


Good question! Who really wants to drop $8,000 for a pair of desktop loudspeakers !?

Only some very few rich & famous would want to put those not-so-handsome looking loudspeakers on their desks assuming they were sound crazy like some of us !

There are far smarter & less costly ways to enjoy music on one's workdesk. Why $8,000 made-in-Malaysia Sony? Where would Sony stand on the really Hi-end audio wagon anyway ?

Jack L

tonykaz's picture

Thanks for writing back,

Who is the intended ? Maybe these things are designed for some industrial build application like the ceiling of a Bus or the interior of a small Corporate Jet.

It's puzzling !

Tony in Venice Florida

Long-time listener's picture

For anyone who would like a deep soundstage on their desktop at a lower price, I can enthusiastically recommend the Buchardt Audio S300 speaker. Provided you can accommodate its larger size, it's ideal. It has a very even off-axis response, which gives you a wide (and deep) sweet spot as you move around while working at your desk. I find its sound comparable to the Dynaudio Special 40. The Special 40 is slightly better at resolving midrange textural detail, but the Buchardt has a more even response (especially off-axis), and sounds a little more open, with equally good bass weight. I have no connection with Buchardt Audio; I just really enjoy these speakers whenever I listen.

jimtavegia's picture

I have certainly enjoyed my pair of 305's for way less money for years now.

Ortofan's picture

... have something completely different?

Start with a pair of LS3/5a-esque near-field monitors.
Take your pick from Graham, Harbeth, Spendor, Stirling, etc.

Add a tube-type integrated amp - the PrimaLuna EVO 100, for example.

Complete the system with a DAC, such as the Chord Qutest or the Schiit Yggdrasil.

For a similar total cost, have JVS see (and hear) how such a combo might measure up - in a manner of speaking.

Charles E Flynn's picture

I hope to see a review by John Atkinson of the KEF KC62 subwoofer used with the LS50 Metas. It would be interesting to see how this combination handles the bass on the Seattle Symphony's Also Sprach Zarathustra.

I would also like to see Mr. Serinus review this recording:

Jason Victor Serinus's picture

not by me. By the time I asked about reviewing it, it had already been claimed. So I chose other goodies. Stay tuned...


Charles E Flynn's picture

Thanks for the preview of coming attractions.

Jack L's picture

......... Also Sprach Zarathustra." quoted C E Flynn

Why should I spend $3,000 for these 3 KEF minis to play Also Sprach on my workbench ASSUMING these minis could make miracles in a tabletop environment ?

For serious music performances as such, I would definitely play them properly on my audio rig - no less ! We should show some respect to its composer: Richard Strauss !

I've 4 or 5 LPs on the Also Sprach title & the best 2 out of those are the DGG label: Karajan conducting the Berlin Philharmonic. Only one of these two sounds to my satisfaction: starting with some clean deep deep low crawling bass notes followed immediately by the CLEAN & powerful beating of the kettle drums !!!

An acid test of transient response of any sound system: clean low low crawling bass notes, IMMEDIATELY followed by CLEAN & forceful kettle drums beating of Also Sprach Zarathustra on LP.

Listening is believing

Jack L

stereophileuser2020's picture

if they weren't so boxy. Sony's industrial design is usually as good as Apple's, but these speakers are not in that vein.

JRT's picture

RAAL Requisite SR1a ribbon headspeakers and Schiit Jotunheim R amplifier (the set, $4.0k from Moon) in combination with an Okto Research Dac8 Pro (sub-$1.2k direct) and a pair of SVS SB16 Ultra powered subwoofers ($2.0k/each, $3.8k/pair, direct), and the alternative system comes in at a little under $8k. Everything mentioned has been reviewed here at Stereophile in the not too distant past (oldest being the subwoofer back in 2017).

The use of headphones or headspeakers avoids the myriad of issues with interference from reflections from objects on and near the desk and nearby room boundaries, as well as interference from diffraction around edges of those objects.

The RAAL ribbon headspeakers are at or near the top of the headphone game, the weaker aspects being lack of isolation and limited low frequency output. The lack of isolation is a problem that is also shared with loudspeaker setups. The limited low frequency output can be solved with use of subwoofers. The subwoofers could also be utilized to augment a loudspeaker setup, when not using headspeakers/headphones.

SVS SB16 Ultra is a low distortion powered sub capable of high volume velocity on wide bandwidth (for a sub) from a sealed alignment exhibiting much lower group delay relative to bass reflex, and includes significantly capable DSP.

If applying separate upstream DSP to the RAAL headspeakers, the subs could receive a separately processed or unprocessed stream through the Okto 8 channel DA converter while the Okto also provides a master multichannel volume control.

That leaves four channels on the Okto which could be utilized to feed another pair of power amplifiers and loudspeakers plus another different headphone amplifier and headphones. Or the four channels could be used to integrate a pair of small satellite monitors (e.g., KEF LS50 Meta) with a pair of bespoke woofer bins (maybe talk to Lee Taylor or Jim Salk about what they can build for you): either some side-firing midwoofers located behind the computer monitor if used with desktop monitor loudspeakers, or some woofers in bass bins used as floor stands for your stand-mounted monitors set up elsewhere in the room.

The Okto multichannel DA converter with multichannel volume control, especially in combination with upstream DSP and/or external AD converters, enables much system flexibility.

Jack L's picture

....... with interference from reflections...." quoted JRT

Maybe so in theory. I don't experience such "interference from reflection" in my 700sq ft basement audio den withOUT any elaborate acoustic treatment at all !

So yr acoustics 'science' can't blind my critical ears !

BUT, headphones or headspeakers FAIL to deliver the same live 3-D spatial environment offered by any floor/stand loudspeaker. You may not worry about it but many live concert goers, like yours truly, do !

That's why I personally do not go for any headphone music !

You know why? Hint: this is the nature of our ears BOTH sharing the same soundwaves !

Listening is believing

Jack L