Just ask Gorky, my pet mini Godzilla, who like Yoda is constantly in tune with the Force. He's particularly apt to wax poetic after polishing off his plutonium-laced cookie treats. He'll tell you in a heartbeat that knowing the rules determines your chances of success in whatever game you play. Most audiophiles struggle for years to reach the "it's live" plateau of sound reproduction but never really arrive. They simply fail to understand the playing field they're operating in. Being wedded to a particular audio format defines the ultimate level of sonic enjoyment possible. Therefore, my goal this month is to examine the big picture: the past and present state of audio formats as they pertain to music reproduction in the home.
Since home theater has redefined multi-channel audio, I'll give an overview of the technology as it pertains to home theater. But I'm really less interested in DTS or Dolby Digital encoded music CDs and DVD-Audio than I am in the application of surround sound to enhance the reproduction of standard stereo CDs and LPs. Digital music formats are presently in a state of flux. The Sony Super Audio Compact Disc (SACD) will be battling it out with DVD-Audio in the marketplace for supremacy in the audiophile music format sweepstakes. What isn't clear at the moment is whether the average consumer gives a hoot about super audio anything. I'm not aware of public dissatisfaction with the sound quality of the standard audio CD. In fact, analog cassette sales are doing well, as are CD transfers to cassette. Hence, I find it hard to imagine a world where car audio and boom boxes become DVD-Audio or SACD ready. And if that's the case, I seriously doubt the willingness of the music industry to double stock titles in various formats.
Those consumers who have already invested in home theater with a fully operational 5.1-channel surround set up may be tempted to nibble at the DTS 20-bit music CDs. There are now over 100 titles available (check out the DTS store at www.dtsonline.com). However, most music lovers already have extensive libraries of LPs and standard music CDs. The essential question for us becomes: is it possible or even worthwhile to squeeze new life out of standard stereo CDs and records? I'll be describing new possibilities of enhancing the listening experience by recreating a more realistic soundfield at the listening seat. Audiophiles have spent enormous sums of money in the hopes of recapturing the magic of live music in their homes. Many of these investments have ended in frustration because no matter how much money you might throw at two-channel audio, its potential is limited. Compounding the problems faced by stereo in the home is the basic fact that typical speakers are designed to measure well in an anechoic chamber and lack the proper power response and directionality to perform well in a real-world listening room.
In the Beginning...
Man has been endowed with a marvelous auditory system. As I pound my computer's keyboard, I am aware of and am able to localize birds chirping outside the window to my left. The wind blows the curtains gently as the sound of Django Reinhardt's gypsy guitar streams down the hall from behind me. That I can map out a three-dimensional soundfield is no miracle. We are all blessed with spatial hearing by virtue of binaural hearing and pretty sophisticated processing of the signals present at our ear drums. We instinctively paint a picture of the external world from a complex stream of sounds processed inside our heads. We can discern the distance and direction of auditory events and the type of acoustic we're in. We can also discriminate timbre, and segregate specific events from a mixture of sounds.
The latter attribute is quite impressive. How is it that even with my eyes blindfolded I can tell, for example, that a violin and a viola are playing the same tune -- even if both musicians are perfectly in sync and are physically centered in front of me? Since the signal arriving at both ears is identical, it might seem quite logical that I should homogenize the sound and perceive a single instrument whose timbre is a hybrid of the two actual instruments. But that is not the case. I'm in fact able to segregate the sound into two perceptual streams. Overtones from each instrument are grouped together on the basis of harmonicity and commonality in vibrato and tremolo. This is far in advance of what machines can do today. It took many years of research before voice recognition software could even reliably parse out speech phonemes. Add the complexity of emotions and many voices and you've just blown machines out of the water. Yet, we possess the power to resolve not only individual voices in a chorus, but also subtle volume, pitch, and rhythmic modulations that imprint a variety of emotions onto the human voice.
Mono as in Monophonic
All early recordings and playback systems were inherently monophonic. Imagine Enrico Caruso hunched over an Edison acoustic horn recorder singing in top voice so as to scratch out what is at best a noisy and compressed tin-foil analog track. Even with the advent of electronic recording and radio, the order of the day was to capture a single clean track. Reproduction in the home was via a single horn or radio loudspeaker. The speaker was regarded simply as a sound producer. Its job was to approximate with reasonable fidelity the timbre of various instruments. Mono was ideally suited to the needs of early radio broadcasts because it was simple and robust. Radio sets in the 30s became the family entertainment center. And since sound was piped into the living room, much like water from a faucet, it was quite obvious that you were being visited by a happening from far away. It was much like reading a good novel. The listener supplied a bit of imagination in order to transport himself into another dimension.
The Advent of Stereo Sound
In the 1930s basic research at Bell Telephone Labs and by Blumlein in England laid the foundation for stereo recording and playback. It was recognized that both time and intensity differences could be used to record and recreate an approximate soundstage -- with at least a convincing left-to-right spread -- using only two speakers. It took another 20 years before the technology to record and multiplex two audio channels onto a vinyl groove was ready for prime time. In the late 50s stereo recordings finally became widely available and by the late 60s stereo routed mono in the marketplace.
Stereo seemed to creep into every facet of audio. The advent of FM stereo and TV stereo sound helped to forge a technological icon so pervasive that the word itself became synonymous with an audio system. These days, if you're shopping for a sound system, try walking into your local electronics emporium and asking for a monophonic system. Good luck. The sales clerk's initial puzzlement and confusion will turn into a confident smile when you explain that you're actually after a stereo. Stereo is now ubiquitous. Even boom boxes, rack systems, and systems intended for bookshelf mounting are outfitted with a pair of speakers for stereo sound reproduction. The public has long forgotten, and the industry doesn't bother to explain, that stereo is only effective and only makes sense when the set up is optimal. If the speakers are only minimally spaced apart, as with built-in speakers on most TV sets, the stereo effect is lost. What you have in effect is glorified mono. Because the left and right channels are essentially coincident, the sensation of soundstage width is lost. I also believe that for bookshelf mounting or in situations where the listening seat is far off axis, mono is sonically preferable to stereo. Unfortunately, very few receivers and preamps still feature a mono switch, so you can't feed both speakers the same signal.
Is there Life after Stereo?
Are there fresh new vistas beyond two-channel audio? Is it possible to glimpse an audio future more exciting and more involving than that of plain-vanilla stereo? The answer is an emphatic yes! In general, surround sound technology deploys additional audio channels (e.g., rear and/or side channels) to enhance the realism of the soundfield at the listening seat. The basic idea is to immerse the listener in a soundfield that produces a much more believable ambient signature than is possible from two-channel audio. With proper implementation, such technology has the potential to create a virtual sonic reality in the middle of your living room. This exciting prospect has fired the imagination of many audio designers for the past 30 years.
It is important first to understand stereo sound's limitations. Audiophile magazines in general have had very little to say about binaural and surround sound, having turned a deaf ear not only to home theater, but also to surround sound in the context of audio-only music reproduction systems. The implication is that stereo is nearly perfect; perfection being apparently just another expensive amp or exotic cable away. This impression is strongly reinforced by speaker reviews that are peppered with superlatives along the lines of "billowing" or "cavernous" soundstage, "holographic" imaging, and a bunch of assorted trumpet calls about a reviewer being "blown away" or leaving a "wet spot" on the listening room couch. There's of course a kernel of truth in all of this hyperbole. With some speakers, located in the right spot in a compatible room, two-channel audio can weave considerable magic. However, most of the time the illusion of being there is not particularly strong.
The basic problem with stereo is that all of the audio information is presented from in front of the listener. Contrast this with the soundfield in a concert hall where the listener is bathed with reflected energy from all directions. We now know from the work of Barron and Marshall that lateral reflections delayed by at least 10 milliseconds are crucial to forming a desirable spatial impression in the concert hall. These reflections are responsible for adding body and fullness to the music. This effect is very different from reverberation, which tends to provide a sense of envelopment in the sound and an impression of distance from the source. Barron in his 1971 paper in the Journal of Sound and Vibration relates the description of spatial impression given by the manager of the Concertgebouw Orchestra of Amsterdam, who described it as the difference between feeling inside the music and looking at it, as through a window. The basic feature of the window analogy is the sensation of being on the outside of the original acoustic.
Standard stereo's frontal presentation has very little chance of generating the right time signature of reflected energy in a listening room. As a consequence, stereo's ultimate promise is a clean window on the sound. Many of us baby boomers who matured during the golden age of stereo are obviously entrenched in the status quo and are reluctant to consider change. This is understandable. Neither am I suggesting that stereo is dead or obsolete. If surround sound is too threatening, the listening room too small to accommodate additional speakers, or you just don't want to invest in additional electronics and speakers, then stereo clearly remains a viable alternative.
However, none other than J. Gordon Holt, founder of Stereophile Magazine and audio guru extraordinaire, is literally no longer a stereophile. He converted his audio system to surround sound several years ago. Gordon uses two pairs of identical Tannoy 10-inch coaxial speakers in a standard quadraphonic arrangement. The front pair is fed the usual stereo left and right channels, while the rear speakers reproduce ambient information derived from a Lexicon DC-1 digital controller/processor (check it out at www.lexicon.com) with a suitable time delay. The Lexicon derives a left minus right difference signal which represents the recording's incoherent or out of phase information. This by definition is hall ambience: reflected energy that has visited many surfaces and is no longer correlated with the direct sound. This is not a new idea. David Hafler's shuffler circuit (AKA the poor man's quadraphonic or QUAD circuit) performed the same function (still available and cheaper than ever! See Lynn Olson's article in this issue on this web page). However, it lacked a provision for time delay -- a crucial element in obtaining the correct time signature for hall ambience.
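For readers who like to see the arithmetic, here is a minimal sketch of the difference-plus-delay idea. It is not the Lexicon's actual processing, which is surely more elaborate. The function names, the sample values, and the 20 millisecond delay are my own illustrative assumptions.

```python
def ms_to_samples(delay_ms, sample_rate_hz):
    """Convert a delay in milliseconds to a whole number of samples."""
    return round(delay_ms * sample_rate_hz / 1000)


def derive_rear_channel(left, right, delay_samples):
    """Return a delayed left-minus-right difference signal.

    Hypothetical helper: anything identical in both channels (correlated,
    centered sound) cancels, leaving only the out-of-phase residue.
    """
    if delay_samples < 0:
        raise ValueError("delay_samples must be non-negative")
    difference = [l - r for l, r in zip(left, right)]
    # Delay by prepending silence, then trim back to the original length.
    delayed = [0.0] * delay_samples + difference
    return delayed[:len(difference)]


# Toy sample values (assumed): samples 0 and 2 are identical in both
# channels, so they vanish from the difference signal.
left = [1.0, 0.5, 0.25, 0.0]
right = [1.0, 0.0, 0.25, 0.5]
rear = derive_rear_channel(left, right, delay_samples=1)

# An assumed 20 ms delay at the CD sample rate of 44,100 Hz.
delay_at_cd_rate = ms_to_samples(20, 44100)
```

In a real system the delay would be set by ear, and the rear level balanced against the front pair, until the rear speakers can no longer be localized.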
When Gordon's system is properly tweaked for front-rear balance and with a suitable rear-channel delay, it is impossible to localize the rear channels. One is simply aware of heightened spaciousness and depth perspective. Gordon listens almost exclusively to classical music, and because he regularly records the Boulder (Colorado) Symphony Orchestra in concert, his collection of remarkable master tapes is considerable. I should state for the record that these are two-channel stereo recordings using a purist recording technique. Consider also that Gordon's room is quite ordinary as living rooms go. Nothing fancy here. The only absorptive materials present, as I recall, were drapes, carpeting, and an old sofa. So I settled down for a listen with Lucy the cat in tow. Absolutely great stuff, methinks, and I wonder silently how Gordon is getting such great sound with all that solid-state gear, Tannoy speakers, and ordinary room acoustics. So I ask him to turn off the rear channels, to hear the difference between surround and straight stereo, and I just about fall out of my chair. It was as if somebody punctured a balloon: all of the air went out of the soundstage. Depth, musical textures, and image size all took a big hit. Instead of being able to focus on the musical message I found myself analyzing the sound's many failings. Holy Cow! From enticing to miserable in one giant step. So I beg Gordon to restore the rear channels, and he obliges me with a big grin. One essential moral from this story is that equipment seems to matter less in surround sound. I personally would have trashed Gordon's entire set up had I been forced to listen in stereo, whereas in surround, I was content to enjoy the music.
This is a good time to mention the headphone listening experience and binaural recordings. Such recordings attempt to capture and then duplicate the sound signals generated at the ear canals in the original acoustic by placing microphone capsules in the "ears" of an artificial or dummy head. The following figure shows Bill and Fred, a pair of MIT students, attending to Keith -- the MIT version of a dummy head -- inside an anechoic chamber. Of course, no dummy head exactly replicates your pinna or the shape of your particular ear canal. But the simulation is sufficiently realistic for the dummy head to act as a surrogate listener. The microphone feeds from each of the ears become the stereo left and right channels. When fed to a pair of headphones, such recordings can quite accurately simulate the sensation of being there. I vividly remember a binaural demo CD distributed by Stax many years ago. Helga opens a door stage left, walks up to me and proceeds to whisper something sweet in my right ear. Quite convincing, but note that binaural recordings can't accurately localize front to back information and, of course, when you move your head around the entire sonic universe rotates with you -- unlike what actually happens in the real world.
Note also that binaural recordings don't work well on conventional loudspeakers. The problem is that there's crosstalk between channels. The left and right speakers are heard by both the left and right ears. For example, the left ear not only picks up the left channel but also a copy of the right channel. This destroys the original set of spatial cues and the sensation of being located in the original recording space. The field of transaural audio is concerned with creating the impression of surround sound using an ordinary stereo playback system. During recording, the left and right signals are processed to reduce crosstalk and imprint a set of spatial cues to, for example, synchronize sound movement with the action sequence of a computer or video game. The one demo I've heard of transaural sound was pretty effective -- as long as I kept my head centered at just the right spot between the speakers. As soon as I moved my head a bit, the sonic presentation collapsed to that of ordinary stereo.
Early Surround Attempts
The 70s saw the introduction of the infamous quadraphonic or QUAD sound system. Two rear speakers were added to the standard stereo front channel complement and for the first time "steering" of sound became possible so as to actually surround the listener with sound. Lack of time delay, poor channel separation, and problems with the needle-in-the-groove software killed QUAD for music playback. The results certainly did not justify the added expense and inconvenience. But the story didn't end there. Dolby Labs, the same folks who revolutionized tape recording in the late 60s and early 70s with Dolby A and Dolby B noise reduction, re-engineered quad for the movie theater with Dolby Stereo.
Dolby Stereo and Dolby Surround
Dolby Stereo revolutionized the film industry in 1976 when Star Wars the movie blasted off the screen. Multi-channel sound was placed onto affordable film prints with optical soundtracks. Dolby Stereo encodes four audio channels on the two optical tracks available on standard 35mm film. These channels are decoded as left, center, right, and surround channels to correspond to the basic speaker layout in a theater. The main speakers in a theater are typically mounted on scaffolding behind the screen, while the surround speakers are arrayed on the sides and back of the auditorium. Dialogue is recovered by the decoder as a mono signal and is fed to the center speaker, while ambient information is suitably delayed and fed to the surround speakers. The purpose of the delay is to ensure that frontal sound reaches the listener before the surround sound does.
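To get a feel for how four channels can ride on two tracks, consider the following simplified sum-and-difference matrix. Treat it strictly as a sketch under my own assumptions. It leaves out the phase shifting, band-limiting, delay, and steering logic of real encoders and decoders. The function names and the -3 dB mixing coefficient are illustrative and are not taken from Dolby's specifications.

```python
import math

# Assumed mixing gain of about -3 dB for the center and surround feeds.
ATTENUATION = 1 / math.sqrt(2)


def encode(left, center, right, surround):
    """Fold four channel samples into two 'total' track samples.

    The center is added in phase to both tracks; the surround is added
    with opposite polarity so that a difference can recover it later.
    """
    left_total = left + ATTENUATION * center + ATTENUATION * surround
    right_total = right + ATTENUATION * center - ATTENUATION * surround
    return left_total, right_total


def passive_decode(left_total, right_total):
    """Recover four feeds using nothing but a sum and a difference."""
    return {
        "left": left_total,
        "right": right_total,
        "center": ATTENUATION * (left_total + right_total),
        "surround": ATTENUATION * (left_total - right_total),
    }


# A center-only sample (dialogue) lands in the decoded center feed and
# cancels completely out of the surround feed.
dialogue = passive_decode(*encode(0.0, 1.0, 0.0, 0.0))

# A left-only sample leaks into both the center and surround feeds,
# which shows how limited the separation of a plain matrix is.
left_only = passive_decode(*encode(1.0, 0.0, 0.0, 0.0))
```

In this toy version a sound panned hard left still shows up in the center and surround feeds, only about 3 dB down. That leakage is inherent in squeezing four channels through two.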
A version of Dolby Stereo was introduced by Dolby Labs in 1982 for the home market and dubbed Dolby Surround to differentiate it from the pro Dolby Stereo cinema system. It remains the most popular surround sound encoding for consumer applications. Thousands of theatrical films on home video, many TV shows, audio cassettes, and CDs are Dolby Surround encoded.
A home theater system with Dolby Surround can take many forms. However, the minimum hardware outlay is a decoder/amplifier, two front speakers, and one rear speaker, though typically two rear speakers are used. You'll also need a HiFi VCR to play back Dolby-Surround encoded video tapes. Instead of providing a discrete center channel feed, these early decoders split the center channel equally between the left and right channels to create a phantom center image. That's fine if you're watching a movie alone and are centered between the front speakers, but if there's a crowd, listeners off axis will tend to localize the dialogue in the speaker they're nearest to. The surround channel is delayed on the order of 15 to 20 milliseconds to blend it more realistically with the front channels. Unfortunately, the surround channel is mono and of limited audio bandwidth. In terms of sound quality, it approximates that of an AM table radio (100Hz to 7kHz). Therefore, it makes little sense to pour too much money into the rear channels of an analog Dolby Surround setup. Instead, put most of the bucks into high-quality left and right front speakers.
The art improved somewhat in 1987 when Dolby Labs unveiled its second generation decoder -- the Pro Logic. Pro Logic decoders derive a separate center channel to keep dialogue and other central sounds firmly localized on the video screen for all listeners. The center channel also carries a significant share of other on-screen sounds, special effects, and music. Therefore, it is recommended that the center speaker and its amplification be matched with the right and left speakers to prevent timbre and tonality changes as sounds move from one channel to the other. Pro Logic also supplies higher separation among all four channels and more accurate sound positioning, which along with the center channel provide a greatly expanded listening area. As an added convenience, a built-in test signal generator and level adjustments are provided to balance the four channels.
Digital Surround Sound
First introduced to moviegoers in 1992 with Batman Returns, Dolby Digital ushered in surround sound's digital era. Whereas Dolby Stereo works off the two analog optical audio tracks, Dolby Digital provides six discrete audio channels encoded as a digital track directly on the film in the space between the sprocket holes. In this manner Dolby Digital coexists with the analog tracks without involving any other media such as a CD, making it simple for theater owners to handle.
The current format for digital discrete surround is known as the "5.1 channel" system. There are five channels of full-bandwidth audio (20 Hz to 20 kHz): left, center, right, left surround, and right surround. The sixth channel, also known as the .1 channel or subwoofer channel, covers a narrow bandwidth (3 Hz to 120 Hz) and may contain deep bass to enhance the impact of explosions, crashes, and the like. Taken together, such a sound system is said to have 5.1 channels.
To make all of this technically feasible, Dolby developed a new form of digital audio coding or compression, often known as perceptual coding. The idea is to allow the use of lower data rates to save on storage space with only a minimum subjective degradation of sound quality. Dolby's third generation audio coding algorithm (AC-3) is used during the encoding process to give about a 10:1 compression ratio, allowing all of the 5.1 audio channels to fit in the space taken up by one standard audio CD track. That means that 90% of the sampled data is thrown away and only 10% is retained. The reason such coding schemes work at all is that at any moment only a small percentage of the audio information can be perceived. For example, loud sounds mask soft sounds, so if one removed the masked sounds entirely you wouldn't in theory miss a thing. Data compression is also used in video, with the MPEG-2 standard accepted universally for digital video delivery systems via cable, satellite, terrestrial broadcasting, and even digital video disc.
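The arithmetic behind these compression ratios is easy to check. The sketch below is illustrative only. The helper names are mine, and the 48 kHz sample rate and 16-bit word length assumed for the uncompressed 5.1 stream are my own round numbers, not figures from Dolby. Only the CD parameters (two channels, 44.1 kHz, 16 bits) are standard.

```python
def retained_fraction(compression_ratio):
    """Fraction of the original data kept at a given N:1 ratio."""
    return 1 / compression_ratio


def pcm_bit_rate(channels, sample_rate_hz, bits_per_sample):
    """Uncompressed PCM data rate in bits per second."""
    return channels * sample_rate_hz * bits_per_sample


# A 10:1 scheme keeps 10% of the data; a 5:1 scheme keeps 20%.
kept_at_10_to_1 = retained_fraction(10)
kept_at_5_to_1 = retained_fraction(5)

# Standard stereo CD: 2 channels x 44,100 Hz x 16 bits.
cd_rate = pcm_bit_rate(2, 44100, 16)

# Assumed uncompressed 5.1 stream, counting the .1 channel as a full
# channel for simplicity: 6 channels x 48,000 Hz x 16 bits.
surround_rate = pcm_bit_rate(6, 48000, 16)

# The same stream after an assumed 10:1 reduction.
compressed_surround_rate = surround_rate * retained_fraction(10)
```

Under these assumptions the uncompressed six-channel stream runs at more than three times the data rate of a stereo CD, while the 10:1 version comes in at roughly a third of it.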
On the movie theater front, the digital 5.1 surround format takes three forms: Dolby Stereo Digital (DSD), DTS System from Digital Theater Systems, and Sony Dynamic Digital Sound (SDDS). With more than 21,000 digitally equipped screens worldwide and almost 1,900 Dolby Digital films released or announced, Dolby Digital has to be acknowledged as the leader in digital surround. However, DTS is closing the gap with over 8,000 theaters around the world. "Jurassic Park" was the first film encoded with DTS. Since then well over 100 movies have been DTS encoded. A major advantage of DTS (and also SDDS) over DSD is that it uses far less compression, using a 5:1 rather than a 10:1 perceptual coding scheme. A unique feature of DTS is that the audio track is not on the film itself but is played off CD-ROM. A time code is put onto the film between the optical soundtrack and film frame, which ensures that the correct sound is played for each frame of film projected.
I have heard these digital surround systems in various theaters, and it seems to me that the weakest link in the sound system continues to be the quality of the front speakers. While I enjoy the surround effects, I invariably complain about treble sizzle and muddy bass. My latest exposure to an SDDS THX sound system was rather painful. The fact that the movie in question ("Event Horizon") was a dud didn't help, but the lower treble was so hot that we had to beg the management to at least turn down the volume.
What About THX?
The THX sound system originated in 1982 during the production of Lucasfilm's "Return of the Jedi". Since the system was developed by Tomlinson Holman, it was dubbed the Tomlinson Holman eXperiment or THX for short. It is first and foremost a set of performance specifications designed to ensure that film sound tracks are reproduced as they were originally created by the filmmakers. The Home THX system is designed to reproduce surround sound accurately in the home. The driving force behind the creation of Home THX was the observation that conventional audio components could not accurately reproduce film soundtracks in the home. Two basic problems were identified.
First, there was a need to correct the tonal and spatial errors caused by the playback of soundtracks designed in and for large theaters in the much smaller home listening environment. Second, there was a need to more accurately reproduce the complex soundfield of multi-channel surround sound. The bottom line is that Home THX is compatible with both Dolby Digital and DTS and offers the technology to reproduce such movie soundtracks accurately.
A Home THX system may incorporate one or more THX certified components such as speakers, amps, a room equalizer, or an acoustically transparent projection screen. There is also a Home THX Laser Disc player that performs to the highest audio and video standards. The heart of the system, however, is the Home THX Controller which includes the multi-channel circuitry and various electronic equalization and processing circuits.
Rather than recommend five matched speakers plus a subwoofer, the Home THX standard calls for the use of dipole speakers for the surround channels. That makes a lot of sense, because dipoles (speakers that radiate front and back) are able to recreate a more diffuse soundfield than conventional box speakers -- especially when surround placement in the home is limited to a couple of speakers close to the listener. Surround cues and pans are less likely to collapse into the speaker box with dipoles. I've also had good success using omnidirectional speakers for the surround channels, again for the same reasons.
For home theater applications, the 5.1 format's two discrete digital surround tracks are a major improvement over the original Dolby Surround's single mono surround track. But four discrete digital surrounds would be better yet. The ability to truly envelop the listener in a soundfield is a function of the number of available channels. Virtual reality, according to a German study, requires at least 18 channels to correctly reproduce the directionality of the diffuse field. Of course, that amounts to an impractical data stream during encoding and decoding and also a horrendous mess of speakers and amps for playback. A more modest and practical leap forward would be represented by an 8.1 format, which should certainly be feasible say 10 years down the line. Such a new format would increase the realism of surround sound by at least a factor of four. The two additional surround channels would most likely feed left and right side speakers and allow more realistic pans and flyovers from front to back. As usual, the practical disadvantage of the new scheme would be the requirement for two more speakers and two more amplifier channels.
In the context of a music-only home system, I expect to see renewed interest in surround sound methodology as a means to expand on the soundfield limitations of ordinary stereo. A well-known audio writer (who shall remain nameless for now) has stated that until the front two channels reach perfection, which is to say that stereo achieves perfection, he will remain uninterested in surround sound. This is absolutely dumb and dumber! In my opinion, stereo can never reach perfection because of its inherent limitations in properly fleshing out a soundfield. In truth, stereo is not the ideal format for music reproduction in the home. As a music lover, I'm more interested in enjoying the music than defending entrenched audiophile dogma. So I plan to take a cue from J. Gordon Holt and experiment with surround sound. My own research over the past two years in the area of Ceiling Boundary Ambience Enhancement has convinced me of the importance of treating the speaker as a soundfield reproducer. I for one am ready to transcend ordinary stereo.