Brad Meyer began with a transparency detailing the steps through which a piece of classical music reaches the ears of the audiophile. First, of course, is the composer. Next come the players (and their instruments) and the conductor, who realize the composer's ideas in acoustic form. The sound goes out through the air into a hall, and is changed by microphones into an electrical signal. These electrical signals go through a mixer, and thence into a master recorder. Then comes the editing, and perhaps processing, reverb, and remixing. The result is transferred to some consumer music carrier: LP once upon a time, today analog cassette or CD. Next comes the home player, followed by connecting wires, preamp and a power amp, speaker wires and loudspeakers, where the signal is transformed back into sound. Finally there is the listening room. Mark Fishman noted, to laughter, that Meyer had left out the influence of the power plant and the ac lines.
Meyer drew a box around the links in the chain over which the audiophile has control: the sequence from the music carrier through the listening room. Within this box lies the subject matter for consumer audio publications.
Tonight, however, Meyer was going to focus his attention on the very end of the chain: the listener. Many things affect how a listener perceives the sound. Among them are hearing limits, experience, fatigue, mood (one's own and that of others), and pharmacological substances both medical and recreational. Ambient lighting affects mood and is thus a factor, as Meyer discovered early on in his audio pursuits. Lighting just the speakers and leaving the rest of the room dark makes the sound more vivid and dramatic. Meyer suggested that people try this, and also darkening the whole room.
Listener hearing acuity is, of course, a major factor. Meyer mentioned that one now can get tested out to 20 kHz instead of just the standard 8 kHz [see the May 1991 meeting summary, in v19/1. -PSH]. Meyer had had his ears' hearing thresholds (not the same thing as frequency response) measured and found that he has measurable loss in low-level detection at 12 kHz and a lot more above that. He found it a sobering experience, as have many of us.
Meyer pointed out that the ear's equal-loudness curves tend to bunch at the frequency extremes. This means that once the highest and lowest sounds are above the hearing threshold, a small change in level will produce a larger change in loudness than a similar change in the mid-band. This became painfully obvious during his high-frequency hearing tests. For example, at 18 kHz Meyer's threshold is 106 dB SPL. At 104 dB he cannot hear the tone at all, and yet at 106 dB he yanked the phones from his head. Fishman quoted Bob Berkovitz as saying that if the sound is not audible it does not damage the ear even if its level is quite high.
Returning to the playback chain, Meyer went on to say that typically the audiophile can affect only a small part of it: the playback system and the listening room (both acoustically and how it may be made to influence mood). There is no control over the recording process, although Meyer suggested that those who have the opportunity to do live recording really should try it; it is dismaying how much influence microphone choice and placement have on the recorded sound.
Meyer speculated that so much attention has been paid by audiophiles to trivial aspects of the playback chain such as the cables and ac power because the advent of the CD has eliminated the audible distortion introduced in the process of getting the signals from the master tape to the playback preamp, hitherto an area ripe for great fussiness. Things are a lot less interesting now for those looking for controllable detail.
Mark Fishman brought up an interesting comment from J. Gordon Holt on memory: Holt now has better memory than hearing. His memory now hampers his enjoyment of many musical performances because he misses the sheen of the violin and the delicacy of the cymbals and triangle, which he remembers but no longer hears. The discrepancy bothers him. David Moran suggested that Holt might find it helpful to employ wider-dispersion tweeters and, theoretically, some judicious equalization, to get more audible treble into the reverberant field.
Alvin Foster reported a more cheerful result, saying that his own memory helps add sheen to the strings rather than detracting from his current listening enjoyment. Dan Banquer commented that, as a musician, he has always felt that nothing is like being in the middle of the music. No matter how many millions of dollars of equipment one has, it cannot recreate the experience of performing. Meyer added that he has a BSO violinist friend who complains that the BSO broadcasts do not have enough string sound. Meyer asked him how often he has listened to the BSO from out in the audience. [This again poses the question of what "viewpoint" the sound should be created for and/or played back from. -PSH]
The ABX Comparator
The ABX comparator switches between two sources. The box has three buttons on the remote and three LEDs on the front panel, labeled A, B, and X (hence the product name). There is another pair of buttons, labeled Down and Up, which change the numeric display on the unit. When the box is powered on, it generates 100 random assignments of X to either A or B, one for each possible displayed number on a two-digit readout (00 to 99). A Reset button on the main control unit returns the sequence to test number 01. Pushing A connects source A to the output, and likewise for button B. Pushing X connects the box-selected source, which is either A or B. Neither the operator of the box nor the listeners have any notion of which source is X until the answers are read out at the end of the test. This kind of test is called double-blind, as neither the tester nor the tested knows the answers.
During the test the subjects (or the tester) switch among A, B, and X and then mark on an answer sheet whether X is A or B. The test is repeated for a series of separate trials. At the end of a series, pushing the Answer button reveals the identities of X for all trials. In the answer mode, X is on together with the selected source: if X was A for trial number 01, for example, the LEDs for X and A will both be lit.
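The trial bookkeeping described above can be sketched in a few lines of Python. This is only a toy model, not the actual logic of the ABX unit: the class name `ABXBox`, its methods, and the `score` helper are all invented for illustration, and the sketch models only which source gets routed, not the audio switching itself.

```python
import random


class ABXBox:
    """Toy model of the comparator's trial bookkeeping (hypothetical, not the real unit)."""

    def __init__(self, n_trials=100, seed=None):
        rng = random.Random(seed)
        # One hidden assignment of X (either "A" or "B") per displayed trial number.
        self._assignments = [rng.choice("AB") for _ in range(n_trials)]

    def select(self, trial, button):
        """Return the source routed to the output when a button is pressed."""
        if button in ("A", "B"):
            return button
        if button == "X":
            # In the real box the listener only hears this source; its
            # identity stays hidden until answer mode.
            return self._assignments[trial]
        raise ValueError("button must be 'A', 'B', or 'X'")

    def answers(self):
        """Answer mode: reveal the identity of X for every trial."""
        return list(self._assignments)


def score(box, guesses):
    """Count how many written-down guesses match the revealed answers."""
    return sum(g == a for g, a in zip(guesses, box.answers()))


# Usage: a listener who writes "A" for every one of 16 trials.
box = ABXBox(seed=1)
guesses = ["A"] * 16
print(score(box, guesses), "correct of", len(guesses))
```

Because the assignments are fixed before anyone listens and are revealed only afterward, neither the operator nor the listeners can leak the answer during a trial.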
The ABX box is designed to determine how reliably the listener can detect differences. Preconceptions affect perception and conclusions [in other words, not only is seeing believing, but believing is also seeing-Ed.], hence the need for single blindness. Double-blind testing is required because the tester almost invariably (and unpredictably) influences the test subject(s). One of many well-known examples occurred when a group of psychology students tested many subjects for IQ. The subjects were impartially tested for IQ beforehand, and then sorted into two groups with similar IQ ranges. The testers were told that group A was exceptionally intelligent while group B was not. For each group, the testers were to read the same script while administering the test. The result was that the group touted as smart to the test-givers scored statistically significantly better than the group labeled stupid. Somehow the testers conveyed their expectations about performance while reading the same instructions to the two groups, and the groups responded to the cues.
Meyer handed out a sheet photocopied from the ABX manual which showed typical level-matching required for reliable detection of differences between sources with 1/3-octave frequency-response aberrations. When the aberrations span a wider spectrum, level-matching becomes increasingly critical, dropping to less than 1/3 of a dB, especially in the ear-sensitive 2-5 kHz region. Acuity (the ability to hear a difference) also sometimes depends on how close the level at a given frequency is to the threshold of hearing. At threshold, a small increase in level will make the sound audible and enable the listener reliably to distinguish A and B when different.
Steve Owades noted that the use of the ABX box does not reduce bias in results due to peer pressure when the box is used with more than one listener at a time. Visible or audible reactions from surrounding listeners may influence a subject's answer. Such bias makes the answers dependent: what one listener chooses is influenced by what his or her peers choose. This may invalidate the result for statistical analysis, which requires that the trials be independent.
Next Meyer inserted a Technics SH-9010 parametric equalizer in the B loop and set the 3 kHz slider for a 3 dB boost. The Q knob was set to 0.7 (the broadest setting, for a bandwidth of about two octaves). Playing pink noise through the system makes this alteration easy to hear, and the group got a score of 18/18 without difficulty. With choral music, whose broad frequency range makes it a good test for response aberrations, the score was 16/17.
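As a rough check on the Q-to-bandwidth figure quoted above, the following Python sketch converts Q to bandwidth in octaves. It assumes the common textbook definition of Q as center frequency divided by the -3 dB bandwidth, with the center at the geometric mean of the band edges; how the SH-9010 itself defines Q is not stated here and may differ. The function names are illustrative only.

```python
import math


def q_to_octaves(q):
    """Bandwidth in octaves for a given Q (assumes Q = f0 / (f2 - f1), f0 = sqrt(f1 * f2))."""
    return (2 / math.log(2)) * math.asinh(1 / (2 * q))


def octaves_to_q(n):
    """Inverse conversion: Q for a bandwidth of n octaves, same assumptions."""
    return math.sqrt(2 ** n) / (2 ** n - 1)


def band_edges(f0, octaves):
    """Lower and upper band edges placed symmetrically (in octaves) around f0."""
    half = 2 ** (octaves / 2)
    return f0 / half, f0 * half


n = q_to_octaves(0.7)
low, high = band_edges(3000.0, n)
print(f"Q = 0.7 is about {n:.2f} octaves")           # about 1.92 octaves
print(f"band edges around 3 kHz: {low:.0f} Hz to {high:.0f} Hz")
print(f"exactly two octaves would be Q = {octaves_to_q(2):.3f}")
```

Under these assumptions a Q of 0.7 comes out just under two octaves, consistent with the figure given for the broadest setting.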
The next test was much tougher: The 9010 was left in the circuit, but with all sliders set to their midpoints. Unlike some consumer equalizers, the semi-pro Technics has controls that really do what they say (boost, cut, or stay flat), and the response is quite flat in this condition except for a slight droop in the top octave. To make things more difficult, we heard only the choral music for this trial. The group got 7/17 correct.
The last two trials were bypass tests of the Sony PCM-F1 digital processor. The F1's video output was looped back to the input and the processor was set to a gain of 1.0 and connected to input B. The signal source was an LP made by Meyer and Peter Mitchell of organist James Johnson, the same production whose digital version has been excerpted on the first and second Stereophile test CDs. The LP was made from an analog master, so we really were comparing an analog source directly with an F1-digitized version. The results on the two trials were 9/15 and 7/15; the total was 16/30, 53% correct.
Depending on the number of trials, there is a definite number of correct answers beyond which one can say that the probability of a listener's getting that number by chance is less than five percent. This is what is known as a 95% confidence level. Assuming independence, with six trials one has to get all six correct to satisfy this criterion. With 24 trials, 17 correct answers is the threshold. The percentage of correct answers needed to qualify for "reliably hearing differences" decreases as the number of independent trials increases.
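These thresholds follow from the binomial distribution for a listener who is purely guessing (a 50% chance of being right on each trial). The short Python sketch below shows one way to compute them; the function names are illustrative, and it assumes independent trials and a one-sided test, as in the paragraph above.

```python
from math import comb


def p_at_least(k, n, p=0.5):
    """Probability of k or more correct answers in n independent guesses."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_correct(n, alpha=0.05):
    """Smallest score whose chance probability is below alpha (None if no score qualifies)."""
    for k in range(n + 1):
        if p_at_least(k, n) < alpha:
            return k
    return None


print(min_correct(6))    # 6  (all six must be correct)
print(min_correct(24))   # 17
for n in (6, 24, 100):
    k = min_correct(n)
    print(f"{n} trials: {k} correct needed ({100 * k / n:.0f}%)")
```

With fewer than five trials no score can meet the criterion, since even a perfect score on four trials happens by chance one time in 16.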
Stereophile carried out a double-blind test and then examined the results of only those subjects who got high scores. They concluded that this group had demonstrated the ability to hear differences. This, however, is statistically invalid: even for randomly generated answers, in a large group 1 out of 20 subjects would be expected to satisfy the 95% criterion by chance alone. (This group represents the 5% that you're 95% confident that a given subject doesn't fall into.) To ascertain whether there really is a golden-eared group, they should have selected the high scorers and used them for another series of trials.
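The size of this effect is easy to estimate. The sketch below is an illustration under stated assumptions, not a reconstruction of the Stereophile test: the group size of 40 is hypothetical, and the 24-trial series with a pass mark of 17 is borrowed from the example earlier in this summary. Because scores are whole numbers, the chance pass rate at that mark is actually a little under 5%.

```python
from math import comb


def chance_pass_rate(n_trials, threshold):
    """Probability that a pure guesser scores at least `threshold` of `n_trials`."""
    return sum(comb(n_trials, k) for k in range(threshold, n_trials + 1)) / 2 ** n_trials


def expected_lucky(n_subjects, n_trials, threshold):
    """Expected number of pure guessers who pass by luck alone."""
    return n_subjects * chance_pass_rate(n_trials, threshold)


def prob_any_lucky(n_subjects, n_trials, threshold):
    """Probability that at least one pure guesser in the group passes."""
    return 1 - (1 - chance_pass_rate(n_trials, threshold)) ** n_subjects


# Hypothetical group: 40 subjects, 24 trials each, pass mark 17.
rate = chance_pass_rate(24, 17)
print(f"chance pass rate per subject: {rate:.3f}")
print(f"expected lucky subjects in 40: {expected_lucky(40, 24, 17):.2f}")
print(f"probability of at least one: {prob_any_lucky(40, 24, 17):.2f}")
# Retesting the high scorers: a guesser must get lucky twice.
print(f"chance of passing two series in a row: {rate ** 2:.4f}")
```

In this hypothetical group of guessers one would expect roughly 1.3 "golden ears" to turn up by luck, while the chance of the same guesser passing a second independent series is about one in a thousand, which is why retesting the high scorers is the informative step.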
The tests we took showed clear audibility to a confidence level well over 95% for the first three tests, and null results for the last three. The tests were conducted patiently and fairly, under generally good conditions; for example, there was a minimum of cross-comment.
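For readers who want to check the scores reported in this summary against the 95% criterion, here is a minimal Python sketch. It treats each group score as a series of independent coin-flip guesses, which is an assumption (group listening may violate independence), and the labels in the table are descriptive names chosen for this example.

```python
from math import comb


def p_value(correct, trials):
    """One-sided probability of scoring at least `correct` by coin-flip guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


# Scores as reported in this summary.
reported = {
    "3 dB boost at 3 kHz, pink noise": (18, 18),
    "3 dB boost at 3 kHz, choral": (16, 17),
    "equalizer set flat, choral": (7, 17),
    "PCM-F1 bypass, both trials pooled": (16, 30),
}

for label, (correct, trials) in reported.items():
    p = p_value(correct, trials)
    verdict = "meets the 95% criterion" if p < 0.05 else "consistent with guessing"
    print(f"{label}: {correct}/{trials}, p = {p:.2g} ({verdict})")
```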
Meyer noted that people typically get touchy, even grouchy, when two blind-compared pieces of equipment are very similar. It must be noted here that some high-end reviewers have said that long-term listening to each piece of equipment produces more reliable answers than short-period ABX switching, which they feel is less revealing, although there is good evidence that, to the contrary, quick comparison increases acuity. In any case, contrary to popular misconception, there is no law against leaving the ABX box in position A for a month, then switching to B the next month, and finally to X during a third month.
This article is copyrighted © by the author or the Boston Audio Society.