
Senior Editor's Viewpoint
Nibbling on the Audio Food Chain
by Dick Olsher
The Rules of the Game:
Mono, Stereo, and Surround Sound
Just ask Gorky, my pet mini Godzilla, who like Yoda is constantly in tune with the
Force. He's particularly apt to wax poetic after polishing off his plutonium-laced cookie
treats. He'll tell you in a heartbeat that knowing the rules determines your chances of
success in whatever game you play. Most audiophiles struggle for years to arrive at but
never really reach the "it's live" plateau of sound reproduction. They simply
fail to understand the playing field they're operating in. Being wedded to a particular
audio format defines the ultimate level of sonic enjoyment possible. Therefore, my goal
this month is to examine the big picture: the past and present state of audio formats as
they pertain to music reproduction in the home.
Since home theater has redefined multi-channel audio, I'll survey the technology as
it pertains to home theater. But I'm really less interested in DTS or Dolby Digital
encoded music CDs and DVD-Audio than I am in the application of surround sound to enhance
the reproduction of standard stereo CDs and LPs. Digital music formats are presently in
a state of flux. The Sony Super Audio Compact Disc (SACD) will be battling it out with
DVD-Audio in the marketplace for supremacy in the audiophile music format sweepstakes.
What isn't clear at the moment is whether the average consumer gives a hoot about super
audio anything. I'm not aware of public dissatisfaction with the sound quality of the
standard audio CD. In fact, analog cassette sales are doing well as are CD transfers to
cassette. Hence, I find it incredible to imagine a world where car audio and boom boxes
become DVD-Audio or SACD ready. And if that's the case, I seriously doubt the willingness
of the music industry to double stock titles in various formats.
Those consumers who have already invested in home theater with a fully operational
5.1-channel surround setup may be tempted to nibble at the DTS 20-bit music CDs. There
are now over 100 titles available (check out the DTS store at www.dtsonline.com). However, most music lovers already
have extensive libraries of LPs and standard music CDs. The essential question for us
becomes: is it possible or even worthwhile to squeeze new life out of standard stereo CDs
and records? I'll be describing new possibilities of enhancing the listening experience by
recreating a more realistic soundfield at the listening seat. Audiophiles have spent
enormous sums of money in the hopes of recapturing the magic of live music in their homes.
Many of these investments have ended in frustration because no matter how much money you
might throw at two-channel audio, its potential is limited. And to compound the problems
faced by stereo in the home is the basic fact that typical speakers are designed to
measure well in an anechoic chamber and lack the proper power response and directionality
to perform well in a real-world listening room.
In the Beginning...
Man has been endowed with a marvelous auditory system. As I pound my computer's
keyboard, I am aware of and am able to localize birds chirping outside the window to my
left. The wind blows the curtains gently as the sound of Django Reinhardt's gypsy guitar
streams down the hall from behind me. That I can map out a three-dimensional soundfield is
no miracle. We are all blessed with spatial hearing by virtue of binaural hearing and
pretty sophisticated processing of the signals present at our ear drums. We instinctively
paint a picture of the external world from a complex stream of sounds processed inside our
heads. We can discern the distance and direction of auditory events and the type of
acoustic we're in. We can also discriminate timbre, and segregate specific events from a
mixture of sounds.
The latter attribute is quite impressive. How is it that even with my eyes blindfolded
I can tell, for example, that a violin and a viola are playing the same tune -- even if
both musicians are perfectly in sync and are physically centered in front of me? Since the
signal arriving at both ears is identical, it might seem quite logical that I should
homogenize the sound and perceive a single instrument whose timbre is a hybrid of the two
actual instruments. But that is not the case. I'm in fact able to segregate the sound into
two perceptual streams. Overtones from each instrument are grouped together on the basis
of harmonicity and commonality in vibrato and tremolo. This is far in advance of what
machines can do today. It took many years of research before voice recognition software
could even reliably parse out speech phonemes. Add the complexity of emotions and many
voices and you've just blown machines out of the water. Yet, we possess the power to
resolve not only individual voices in a chorus, but also subtle volume, pitch, and
rhythmic modulations that imprint a variety of emotions onto the human voice.
Mono as in Monophonic
All early recordings and playback systems were inherently monophonic. Imagine Enrico
Caruso hunched over an Edison acoustic horn recorder singing in top voice so as to scratch
out what is at best a noisy and compressed tin-foil analog track. Even with the advent of
electronic recording and radio, the order of the day was to capture a single clean track.
Reproduction in the home was via a single horn or radio loudspeaker. The speaker was
regarded simply as a sound producer. Its job was to approximate with reasonable fidelity
the timbre of various instruments. Mono was ideally suited to the needs of early radio
broadcasts because it was simple and robust. Radio sets in the 30s became the family
entertainment center. And since sound was piped into the living room, much like water from
a faucet, it was quite obvious that you were being visited by a happening from far away.
It was much like reading a good novel. The listener supplied a bit of imagination in order
to transport himself into another dimension.
The Advent of Stereo Sound
In the 1930s basic research at Bell Telephone Labs and by Blumlein in England laid the
foundation for stereo recording and playback. It was recognized that both time and
intensity differences could be used to record and recreate an approximate soundstage --
with at least a convincing left-to-right spread -- using only two speakers. It took
another 20 years before the technology to record and multiplex two audio channels onto a
vinyl groove was ready for prime time. In the late 50s stereo recordings finally became
widely available and by the late 60s stereo routed mono in the marketplace.
Stereo seemed to creep into every facet of audio. The advent of FM stereo and TV stereo
sound helped to forge a technological icon so pervasive that the word itself became
synonymous with an audio system. These days, if you're shopping for a sound system, try
walking into your local electronics emporium and ask for a monophonic system. Good luck.
The sales clerk's initial puzzlement and confusion will turn into a confident smile when you
explain that you're actually after a stereo. Stereo is now ubiquitous. Even boom boxes,
rack systems, and systems intended for book shelf mounting are outfitted with a pair of
speakers for stereo sound reproduction. The public has long forgotten, and the industry
doesn't bother to explain, that stereo is only effective and only makes sense when the set
up is optimal. If the speakers are only minimally spaced apart, as with built-in speakers
on most TV sets, the stereo effect is lost. What you have in effect is glorified mono.
Because the left and right channels are essentially coincident, the sensation of
soundstage width is lost. I also believe that for book shelf mounting or in situations
where the listening seat is far off axis, mono is sonically preferable to stereo.
Unfortunately, very few receivers and preamps still feature a mono switch, so you can't
feed both speakers the same signal.
Is there Life after Stereo?
Are there fresh new vistas beyond two-channel audio? Is it possible to glimpse an audio
future more exciting and more involving than that of plain-vanilla stereo? The answer is
an emphatic yes! In general, surround sound technology deploys additional audio channels
(e.g., rear and/or side channels) to enhance the realism of the soundfield at the
listening seat. The basic idea is to immerse the listener in a soundfield that produces a
much more believable ambient signature than is possible from two-channel audio. With
proper implementation, such technology has the potential to create a virtual sonic reality
in the middle of your living room. This exciting prospect has fired the imagination of
many audio designers for the past 30 years.
It is important first to understand stereo sound's limitations. Audiophile magazines in
general have had very little to say about binaural and surround sound, having turned a
deaf ear not only to home theater, but also to surround sound in the context of audio-only
music reproduction systems. The implication is that stereo is nearly perfect; perfection
being apparently just another expensive amp or exotic cable away. This impression is
strongly reinforced by speaker reviews that are peppered with superlatives along the lines
of "billowing" or "cavernous" soundstage, "holographic"
imaging, and a bunch of assorted trumpet calls about a reviewer being "blown
away" or leaving a "wet spot" on the listening room couch. There's of
course a kernel of truth in all of this hyperbole. With some speakers, located in the
right spot in a compatible room, two-channel audio can weave considerable magic. However,
most of the time the illusion of being there is not particularly strong.
The basic problem with stereo is that all of the audio information is presented from in
front of the listener. Contrast this with the soundfield in a concert hall where
the listener is bathed with reflected energy from all directions. We now know from the
work of Barron and Marshall that lateral reflections delayed by at least 10 milliseconds
are crucial to forming a desirable spatial impression in the concert hall. These
reflections are responsible for adding body and fullness to the music. This effect is very
different from reverberation, which tends to provide a sense of envelopment in the sound
and an impression of distance from the source. Barron in his 1971 paper in the Journal of
Sound and Vibration relates the description of spatial impression given by the manager of
the Concertgebouw Orchestra of Amsterdam, who described it as the difference between
feeling inside the music and looking at it, as through a window. The basic feature of the
window analogy is the sensation of being on the outside of the original acoustic.
Standard stereo's frontal presentation has very little chance of generating the right
time signature of reflected energy in a listening room. As a consequence, stereo's
ultimate promise is a clean window on the sound. Many of us baby boomers who matured
during the golden age of stereo are obviously entrenched in the status quo and are
reluctant to consider change. This is understandable. Neither am I suggesting that stereo
is dead or obsolete. If surround sound is too threatening, the listening room too small to
accommodate additional speakers, or you just don't want to invest in additional
electronics and speakers, then stereo clearly remains a viable alternative.
However, none other than J. Gordon Holt, founder of Stereophile Magazine and
audio guru extraordinaire, is literally no longer a stereophile. He converted his audio
system to surround sound several years ago. Gordon uses two pairs of identical Tannoy
10-inch coaxial speakers in a standard quadraphonic arrangement. The front pair is fed the
usual stereo left and right channels, while the rear speakers reproduce ambient
information derived from a Lexicon DC-1 digital controller/processor (check it out at
www.lexicon.com) with a suitable time delay. The Lexicon derives a left minus right
difference signal which represents the recording's incoherent or out of phase information.
This by definition is hall ambience; reflected energy that has visited many surfaces and
is no longer correlated with the direct sound. This is not a new idea. David Hafler's
shuffler circuit (AKA the poor man's quadraphonic or QUAD circuit) performed the same
function (still available and cheaper than ever! See Lynn Olson's article in
this issue on this web page). However,
it lacked a provision for time delay -- a crucial element in obtaining the correct time
signature for hall ambience.
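The difference-plus-delay idea can be sketched in a few lines of Python. This is only a rough illustration of the principle described above, not a model of what the Lexicon or the Hafler circuit actually does internally. The function name `derive_rear_channel` and its parameters are my own assumptions, and the audio is represented as plain lists of sample values.

```python
def derive_rear_channel(left, right, sample_rate_hz, delay_ms):
    """Return a delayed left-minus-right difference signal.

    Hypothetical helper for illustration: `left` and `right` are
    equal-length lists of samples from a two-channel recording.
    """
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")

    # Convert the requested delay from milliseconds to whole samples.
    delay_samples = round(sample_rate_hz * delay_ms / 1000.0)

    # Anything identical in both channels (e.g. a centered soloist)
    # cancels; what remains is the uncorrelated, out-of-phase content.
    difference = [l - r for l, r in zip(left, right)]

    # Delay by prepending silence, then trim back to the original length.
    delayed = [0.0] * delay_samples + difference
    return delayed[:len(left)]
```

Feeding it two identical channels returns silence, which is the point: a dead-center mono source contributes nothing to the rear speakers. Setting `delay_ms` to zero reduces the sketch to the difference signal alone, the case the article describes as lacking a provision for time delay.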
When Gordon's system is properly tweaked for front-rear balance and with a suitable
rear-channel delay, it is impossible to localize the rear channels. One is simply aware of
heightened spaciousness and depth perspective. Gordon listens almost exclusively to
classical music, and because he regularly records the Boulder (Colorado) Symphony
Orchestra in concert, his collection of remarkable master tapes is considerable. I should
state for the record that these are two-channel stereo recordings using a purist recording
technique. Consider also that Gordon's room is quite ordinary as living rooms go. Nothing
fancy here. The only absorptive materials present, as I recall, were drapes, carpeting,
and an old sofa. So I settled down for a listen with Lucy the cat in tow. Absolutely great
stuff, methinks, and I wonder silently how Gordon is getting such great sound with all that
solid-state gear, Tannoy speakers, and ordinary room acoustics. So I ask him to turn off
the rear channels, to hear the difference between surround and straight stereo, and I just
about fall out of my chair. It was as if somebody punctured a balloon: all of the air went
out of the soundstage. Depth, musical textures, and image size all took a big hit. Instead
of being able to focus on the musical message I found myself analyzing the sound's many
failings. Holy Cow! From enticing to miserable in one giant step. So I beg Gordon to
restore the rear channels, and he obliges me with a big grin. One essential moral from
this story is that equipment seems to matter less in surround sound. I personally would
have trashed Gordon's entire setup had I been obliged to listen in stereo, whereas in surround, I
was content to enjoy the music.
Binaural Sound

This is a good time to mention the headphone listening experience and binaural recordings. Such
recordings attempt to capture and then duplicate the sound signals generated at the ear
canals in the original acoustic by placing microphone capsules in the "ears" of
an artificial or dummy head. The following figure shows Bill and Fred, a pair of MIT
students, attending to Keith -- the MIT version of a dummy head -- inside an anechoic
chamber. Well, no dummy head exactly replicates your pinna or the shape of your particular
ear canal. But the simulation is sufficiently realistic for the dummy head to act as a
surrogate listener. The microphone feeds from each of the ears become the stereo left and
right channels. When fed to a pair of headphones, such recordings can quite accurately
simulate the sensation of being there. I vividly remember a binaural demo CD distributed
by Stax many years ago. Helga opens a door stage left, walks up to me and proceeds to
whisper something sweet in my right ear. Quite convincing, but note that binaural
recordings can't accurately localize front to back information and, of course, when you
move your head around the entire sonic universe rotates with you -- unlike what actually
happens in the real world.
Note also that binaural recordings don't work well on conventional loudspeakers. The
problem is that there's crosstalk between channels. The left and right speakers are heard
by both the left and right ears. For example, the left ear not only picks up the left
channel but also a copy of the right channel. This destroys the original set of spatial
cues and the sensation of being located in the original recording space. The field of
transaural audio is concerned with creating the impression of surround sound using an
ordinary stereo playback system. During recording, the left and right signals are
processed to reduce crosstalk and imprint a set of spatial cues to, for example,
synchronize sound movement with the action sequence of a computer or video game. The one
demo I've heard of transaural sound was pretty effective -- as long as I kept my head
centered at just the right spot between the speakers. As soon as I moved my head a bit,
the sonic presentation collapsed to that of ordinary stereo.
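To make the crosstalk problem concrete, here is a deliberately crude Python sketch. It assumes that each ear hears its own speaker at full level plus a copy of the opposite speaker scaled by a single number, `crosstalk_gain`. Real crosstalk varies with frequency, arrives slightly late, and changes with head position, so treat the function names and the single-gain model as my own simplifying assumptions rather than how any transaural processor actually works.

```python
def ear_signals(left_spk, right_spk, crosstalk_gain):
    """What each ear receives: its own speaker plus leakage from the other."""
    left_ear = left_spk + crosstalk_gain * right_spk
    right_ear = right_spk + crosstalk_gain * left_spk
    return left_ear, right_ear


def cancel_crosstalk(left_target, right_target, crosstalk_gain):
    """Pre-process the speaker feeds so the ears receive the target signals.

    This inverts the 2x2 mixing in ear_signals(); it only works while
    the assumed crosstalk_gain matches what happens at the listener's head.
    """
    g = crosstalk_gain
    det = 1.0 - g * g
    if det == 0.0:
        raise ValueError("crosstalk_gain of +/-1 cannot be inverted")
    left_spk = (left_target - g * right_target) / det
    right_spk = (right_target - g * left_target) / det
    return left_spk, right_spk
```

Under this model the pre-processed feeds deliver the intended signal to each ear only while the assumed gain is correct. Move your head and the real leakage no longer matches the assumption, which is consistent with the collapse I heard in the demo.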
Early Surround Attempts
The 70s saw the introduction of the infamous quadraphonics or QUAD sound system. Two
rear speakers were added to the standard stereo front channel complement and for the first
time "steering" of sound became possible so as to actually surround the listener
with sound. Lack of time delay, poor channel separation, and problems with the
needle-in-the-groove software killed QUAD for music playback. The results certainly did
not justify the added expense and inconvenience. But the story didn't end there. Dolby
Labs, the same folks who revolutionized tape recording in the late 60s and early 70s with
Dolby A and Dolby B noise reduction, re-engineered quad for the movie theater with Dolby
Stereo.
Dolby Stereo and Dolby Surround
Dolby Stereo revolutionized the film industry in 1977 when Star Wars the movie
blasted off the screen. Multi-channel sound was placed onto affordable film prints with
optical soundtracks. Dolby Stereo encodes four audio channels on the two optical tracks
available on standard 35mm film. These channels are decoded as left, center, right, and
surround channels to correspond to the basic speaker layout in a theater. The main
speakers in a theater are typically mounted on scaffolding behind the screen, while the
surround speakers are arrayed on the sides and back of the auditorium. Dialogue is
recovered by the decoder as a mono signal and is fed to the center speaker, while ambient
information is suitably delayed and fed to the surround speakers. The purpose of the delay
is to ensure that frontal sound reaches the listener before the surround sound does.
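How four channels can ride on two tracks is easier to see with a toy sum-and-difference matrix in Python. This is a simplified sketch and not the actual Dolby Stereo encoder: the real system involves additional processing (phase shifting among other things) that is omitted here. The names `encode` and `decode` and the -3 dB gain `G` are assumptions chosen for illustration.

```python
import math

# Assumed gain of about -3 dB for signals shared between the two tracks.
G = math.sqrt(0.5)


def encode(left, center, right, surround):
    """Fold four channel samples into two track samples (Lt, Rt).

    Center goes equally and in phase to both tracks; surround goes
    equally but with opposite polarity.
    """
    lt = left + G * center + G * surround
    rt = right + G * center - G * surround
    return lt, rt


def decode(lt, rt):
    """Recover four feeds from two tracks with a passive matrix.

    The sum of the tracks recovers the center, the difference
    recovers the surround.
    """
    left = lt
    right = rt
    center = G * (lt + rt)
    surround = G * (lt - rt)
    return left, center, right, surround
```

Encode a center-only signal and the decoder returns it at full level in the center feed and nothing in the surround, but a copy about 3 dB down also shows up in both left and right. That leakage between adjacent channels is the price of squeezing four channels through two.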
A version of Dolby Stereo was introduced by Dolby Labs in 1982 for the home market and
dubbed Dolby Surround to differentiate it from the pro Dolby Stereo cinema system. It
remains the most popular surround sound encoding for consumer applications. Thousands of
theatrical films on home video, many TV shows, audio cassettes, and CDs are Dolby Surround
encoded.
A home theater system with Dolby Surround can take many forms. However, the minimum
hardware outlay is a decoder/amplifier, two front speakers, and one rear speaker, though
typically two rear speakers are used. You'll also need a HiFi VCR to play back
Dolby-Surround encoded video tapes. Instead of providing a discrete center channel feed,
these early decoders split the center channel equally between the left and right channels
to create a phantom center image. That's fine if you're watching a movie alone and are
centered between the front speakers, but if there's a crowd, listeners off axis will tend
to localize the dialogue in the speaker they're nearest to. The surround channel is
delayed on the order of 15 to 20 milliseconds to blend it more realistically with the front
channels. Unfortunately, the surround channel is mono and of limited audio bandwidth. In
terms of sound quality, it approximates that of an AM table radio (100Hz to 7kHz).
Therefore, it makes little sense to pour too much money into the rear channels of an
analog Dolby Surround setup. Instead, put most of the bucks into high-quality left and
right front speakers.
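A little arithmetic shows why the surround delay matters in a living room, where the rear speakers often sit closer to the listener than the fronts. The Python sketch below assumes sound travels at roughly 343 meters per second in room-temperature air. The function names and the example distances are hypothetical.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # approximate value for air at about 20 C


def arrival_time_ms(distance_m, electronic_delay_ms=0.0):
    """Time for sound to reach the listener, plus any decoder delay."""
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0 + electronic_delay_ms


def surround_lag_ms(front_distance_m, surround_distance_m, surround_delay_ms):
    """How long after the front sound the surround sound arrives.

    A negative result means the surround speaker is heard first.
    """
    return (arrival_time_ms(surround_distance_m, surround_delay_ms)
            - arrival_time_ms(front_distance_m))
```

With the fronts 3.43 meters away and the surrounds 1.715 meters away, an undelayed surround arrives about 5 milliseconds early. Add a 20 millisecond decoder delay and it trails the front sound by about 15 milliseconds instead.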
The art improved somewhat in 1987 when Dolby Labs unveiled its second generation
decoder -- the Pro Logic. Pro Logic decoders derive a separate center channel to keep
dialogue and other central sounds firmly localized on the video screen for all listeners.
The center channel also carries a significant share of other on-screen sounds, special
effects, and music. Therefore, it is recommended that the center speaker and its
amplification be matched with the right and left speakers to prevent timbre and tonality
changes as sounds move from one channel to the other. Pro Logic also supplies higher
separation among all four channels and more accurate sound positioning, which along with
the center channel provide a greatly expanded listening area. As an added convenience, a
built-in test signal generator and level adjustments are provided to balance the four
channels.
Digital Surround Sound
First introduced to moviegoers in 1992 with Batman Returns, Dolby Digital
ushered in surround sound's digital era. Whereas Dolby Stereo works off the two analog
optical audio tracks, Dolby Digital provides six discrete audio channels encoded as a
digital track directly on the film in the space between the sprocket holes. In this manner
Dolby Digital coexists with the analog tracks without involving any other media such as a
CD, making it simple for theater owners to handle.
The current format for digital discrete surround is known as the "5.1
channel" system. There are five channels of full-bandwidth audio (20 Hz to 20 kHz):
left, center, right, left surround, and right surround. The sixth channel, also known as
the .1 channel or subwoofer channel, covers a narrow bandwidth (3 Hz to 120 Hz) and may
contain deep bass to enhance the impact of explosions, crashes, and the like. Taken
together, such a sound system is said to have 5.1 channels.
To make all of this technically feasible, Dolby developed a new form of digital audio
coding or compression, often known as perceptual coding. The idea is to allow the use of
lower data rates to save on storage space with only a minimum subjective degradation of
sound quality. Dolby's third generation audio coding algorithm (AC-3) is used during the
encoding process to give about a 10:1 compression ratio, allowing all of the 5.1 audio
channels to fit in the space taken up by one standard audio CD track. That means that 90%
of the sampled data is thrown away and only 10% is retained. The reason such coding
schemes work at all is that at any moment only a small percentage of the audio information
can be perceived. For example, loud sounds mask soft sounds, so if one removed the masked
sounds entirely you wouldn't in theory miss a thing. Data compression is also used in
video, with the MPEG-2 standard accepted universally for digital video delivery systems
via cable, satellite, terrestrial broadcasting, and even digital video disc.
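The data-rate arithmetic behind the 10:1 figure is easy to check for yourself. The Python sketch below uses the standard audio CD parameters (two channels, 44.1 kHz, 16 bits). The 48 kHz, 16-bit figures for the uncompressed surround channels are my assumption for illustration only, the low-bandwidth .1 channel is ignored, and the function names are hypothetical.

```python
def pcm_bit_rate_kbps(channels, sample_rate_hz, bits_per_sample):
    """Uncompressed (PCM) data rate in kilobits per second."""
    return channels * sample_rate_hz * bits_per_sample / 1000.0


def compressed_bit_rate_kbps(raw_kbps, compression_ratio):
    """Data rate after perceptual coding at the given ratio (e.g. 10 for 10:1)."""
    return raw_kbps / compression_ratio


def discarded_fraction(compression_ratio):
    """Share of the original data that does not survive the coding."""
    return 1.0 - 1.0 / compression_ratio


# Standard stereo CD: 2 channels x 44,100 samples/s x 16 bits.
cd_stereo_kbps = pcm_bit_rate_kbps(2, 44100, 16)

# Five full-bandwidth channels at an ASSUMED 48 kHz / 16 bits.
surround_raw_kbps = pcm_bit_rate_kbps(5, 48000, 16)
surround_coded_kbps = compressed_bit_rate_kbps(surround_raw_kbps, 10)
```

Under these assumptions the five coded channels need about 384 kbps, against roughly 1,411 kbps for a stereo CD and about 706 kbps for a single CD channel. The `discarded_fraction` helper simply restates the ratio: at 10:1, nine tenths of the data is gone.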
On the movie theater front, the digital 5.1 surround format takes three forms: Dolby
Stereo Digital (DSD), DTS System from Digital Theater Systems, and Sony Dynamic Digital
Sound (SDDS). With more than 21,000 digitally equipped screens worldwide and almost 1,900
Dolby Digital films released or announced, Dolby Digital has to be acknowledged as the
leader in digital surround. However, DTS is closing the gap with over 8,000 theaters
around the world. Jurassic Park was the first film encoded with DTS. Since then well
over 100 movies have been DTS encoded. A major advantage of DTS (and also SDDS) over DSD
is that it uses far less compression, using a 5:1 rather than a 10:1 perceptual coding
scheme. A unique feature of DTS is that the audio track is not on the film itself but is
played off CD-ROM. A time code is put onto the film between the optical soundtrack and
film frame which ensures that the correct sound is played for each frame of film
projected.
I have heard these digital surround systems in various theaters, and it seems to me
that the weakest link in the sound system continues to be the quality of the front
speakers. While I enjoy the surround effects, I invariably complain about treble sizzle
and muddy bass. My latest exposure to an SDDS THX sound system was rather painful. The fact
that the movie in question ("Event Horizon") was a dud didn't help, but the
lower treble was so hot that we had to beg the management to at least turn down the
volume.
What About THX?
The THX sound system originated in 1982 during the production of Lucasfilm's
"Return of the Jedi". Since the system was developed by Tomlinson Holman, it was
dubbed the Tomlinson Holman eXperiment or THX for short. It is first and foremost a set of performance
specifications designed to ensure that film sound tracks are reproduced as they were
originally created by the filmmakers. The Home THX system is designed to reproduce
surround sound accurately in the home. The driving force behind the creation of Home THX
was the observation that conventional audio components could not accurately reproduce film
soundtracks in the home. Two basic problems were identified.
First, a need to correct the tonal and spatial errors caused by the playback of
soundtracks designed in and for large theaters in the much smaller home listening
environment. Second, the need to more accurately reproduce the complex soundfield of
multi-channel surround sound. The bottom line is that Home THX is compatible with both
Dolby Digital and DTS and offers the technology to reproduce such movie soundtracks
accurately.
A Home THX system may incorporate one or more THX certified components such as
speakers, amps, a room equalizer, or an acoustically transparent projection screen. There
is also a Home THX Laser Disc player that performs to the highest audio and video
standards. The heart of the system, however, is the Home THX Controller which includes the
multi-channel circuitry and various electronic equalization and processing circuits.
Rather than recommend five matched speakers plus a subwoofer, the Home THX standard
calls for the use of dipole speakers for the surround channels. That makes a lot of sense,
because dipoles (speakers that radiate front and back) are able to recreate a more diffuse
soundfield than conventional box speakers -- especially when surround placement in the
home is limited to a couple of speakers close to the listener. Surround cues and pans are
less likely to collapse into the speaker box with dipoles. I've also had good success
using omnidirectional speakers for the surround channels, again for the same reasons.
The Future
For home theater applications, the 5.1 format's two discrete digital surround tracks
are a major improvement over the original Dolby Surround's single mono surround track. But
four discrete digital surrounds would be much better yet. The ability to truly envelop
the listener in a soundfield is a function of the number of available channels. Virtual
reality, according to a German study, requires at least 18 channels to correctly reproduce
the directionality of the diffuse field. Of course, that amounts to an impractical data
stream during encoding and decoding and also a horrendous mess of speakers and amps for
playback. A more modest and practical leap forward would be represented by a 7.1 format,
which should certainly be feasible say 10 years down the line. Such a new format would
increase the realism of surround sound by at least a factor of four. The two additional
surround channels would most likely feed left and right side speakers and allow more
realistic pans and flyovers from front to back. As usual, the practical disadvantage of
the new scheme would be the requirement for two more speakers and two more amplifier
channels.
In the context of a music-only home system, I expect to see renewed interest in
surround sound methodology as a means to expand on the soundfield limitations of ordinary
stereo. A well-known audio writer (who shall remain nameless for now) has stated that
until the front two channels reach perfection, which is to say until stereo achieves
perfection, he will remain uninterested in surround sound. This is absolutely dumb and
dumber! In my opinion, stereo can never reach perfection because of its inherent
limitations in properly fleshing out a soundfield. In truth, stereo is not the ideal
format for music reproduction in the home. As a music lover, I'm more interested in
enjoying the music than defending entrenched audiophile dogma. So I plan to take a cue from
J. Gordon Holt and experiment with surround sound. My own research over the past two years
in the area of Ceiling Boundary Ambience Enhancement has convinced me of the importance of
treating the speaker as a soundfield reproducer. I for one am ready to transcend ordinary
stereo.