Blind Speaker Testing at NRC
Shortly, I'll be enjoying a kind of homecoming. For almost
25 years, I was a regular participant in the speaker testing program run by the National
Research Council in Ottawa, but I haven't been back since NRC-based magazine reviews
ceased more than five years ago. Now I'll have an opportunity to exercise my ears again,
for the benefit of readers of MastersonAudio.com.
I've detailed the history of the program, developed by Dr.
Floyd Toole. Now it might be useful to describe the tests themselves, at least as they
pertained to data intended for magazine publishing. They consisted of formal measurements
in the anechoic chamber, controlled blind-listening tests, and a final all-is-revealed
comparison of the two.
The anechoic measurements included things like a speaker's
impedance curve, its total harmonic distortion, and a "directivity index" that
shows something of the speaker's dispersion characteristics. The main curves were of
frequency response, on axis and at various points off-axis, because most of the major
differences between speakers are spectral.
Originally, a number of separate curves were drawn, but in
later years computer-averaged curves (taken at many more points) indicated the direct
sound, the near-field reflections, and the balance of the sound radiated into the
listening room's reverberant field. A total radiated power curve was produced as well. The
closer all these curves were in shape, the smoother the overall sound would be.
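As a rough illustration of how averaged curves and a directivity index relate, here is a minimal Python sketch. It is not NRC's procedure: the function names and all the decibel values are made up for the example, and the plain unweighted average stands in for a true sound-power calculation, which would weight each measurement point by the area of the sphere it represents. It relies only on the standard definition of the directivity index as the on-axis level minus the sound-power level, in decibels.

```python
import math

def energy_average_db(curves):
    """Average several dB curves point by point in the power domain."""
    n_points = len(curves[0])
    averaged = []
    for i in range(n_points):
        # Convert dB to relative power, take the mean, convert back to dB.
        mean_power = sum(10 ** (curve[i] / 10) for curve in curves) / len(curves)
        averaged.append(10 * math.log10(mean_power))
    return averaged

def directivity_index(on_axis_db, power_db):
    """On-axis level minus power level, in dB, at each frequency point."""
    return [a - p for a, p in zip(on_axis_db, power_db)]

# Hypothetical levels in dB at three frequencies (low, mid, high).
on_axis = [85.0, 85.0, 85.0]
off_axis_30 = [85.0, 83.0, 80.0]
off_axis_60 = [85.0, 80.0, 75.0]

# Simplification: an unweighted average over three angles, not a full sphere.
power_estimate = energy_average_db([on_axis, off_axis_30, off_axis_60])
di = directivity_index(on_axis, power_estimate)

for label, value in zip(["low", "mid", "high"], di):
    print(f"{label}: DI = {value:.1f} dB")
```

In this toy case the index is zero where the speaker radiates equally in all measured directions and rises as the off-axis output falls away, which is the sense in which it says something about dispersion.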
The measurements were usually made first, but the
results were withheld until after the listening tests. The listening took place in a specially
modified room, the acoustics of which are typical of a normal living room.
At the back of the room, the electronics were operated by a
technician; in the center were three or four numbered comfortable chairs, depending on the
number of listeners participating; at the front was an acoustically transparent but
visually opaque screen, behind which were placed the speakers under test.

A PSB Image loudspeaker in the NRC's anechoic chamber.

The NRC's blind-listening room. At the front of the photo are
two listening chairs with the calibration microphone placed between them. At the rear is the
acoustically transparent screen placed in front of the speakers so the identity of each
speaker under evaluation is concealed.
Listening took place in a number of rounds,
each about half an hour long. The number of rounds was dictated by the need for listeners to
hear each speaker in the batch (plus a few extras) against every other. Also, because
position in the room affects the sound, each speaker had to be auditioned in each position.
Listeners changed chairs as well.
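The position rotation can be pictured with a short Python sketch. This is only an assumed scheme (a simple cyclic shift), not a record of how NRC actually drew up its rounds, and the function name and speaker labels are invented for the example. It guarantees that each speaker occupies each position exactly once over as many rounds as there are speakers.

```python
def rotation_schedule(speakers):
    """Return one speaker-to-position layout per round, using a cyclic shift."""
    n = len(speakers)
    rounds = []
    for r in range(n):
        layout = [None] * n
        for i, speaker in enumerate(speakers):
            # In round r, the speaker that started in position i moves r places along.
            layout[(i + r) % n] = speaker
        rounds.append(layout)
    return rounds

# Hypothetical batch of four speakers, identified only by letter.
schedule = rotation_schedule(["A", "B", "C", "D"])
for round_number, layout in enumerate(schedule, start=1):
    print(f"Round {round_number}: {layout}")
```

The same function could be applied to a list of listeners to rotate them through the numbered chairs.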
Says Toole, who maintains a similar program at Harman
International in California: "We try to ensure by repeated tests that speakers are
heard in a number of typical locations; the listener positions are known and loudspeaker
locations are known, so that when different listeners offer opinions, we're sure that they
have heard the same sounds."
A series of musical selections was used for the test,
chosen for their ability to reveal sonic differences. "Some pieces of music are very
revealing of these differences while others are not," says Toole. For the magazine
tests, pink noise was added at the end of the music program.
"At the very least it's a blind test," says
Toole. "The basic principle is to allow the listeners to focus as much as possible on
the sound itself and the differences in the sounds they are listening to, and to be
prejudiced as little as possible by other factors.
"We did some blind-versus-sighted tests," he
adds, "and they showed that when you saw the product that you were listening to, that
fact changed the ratings more than the sound. The hard core of us believe that, but there
are still a lot of people out there who remain unconvinced."
That psychological reaction has been observed informally
for years, and in 1994 Toole and colleague Sean Olive put it to the test. They performed a
series of blind listening tests in which the listeners didn't know what speakers they were
hearing, and then repeated them exactly but with the speakers visible. In the conclusion
of the paper they presented to the Audio Engineering Society, they said, "when
listeners knew what they were listening to, the opinions were dictated more by the product
identity than by the sound. . . . That an effect of this kind should be observed is not
remarkable, nor is it unexpected. What is surprising is that the effect is so strong, and
that it applies about equally to experienced and inexperienced listeners."
A maximum of four speakers was included in each round, with
levels equalized so that the familiar effect of louder speakers seeming better couldn't
occur. An illuminated display at the front of the room indicated by number which speaker
was playing, as the technician switched among them. In the beginning, the listeners did
the switching but, as Toole points out, "With multiple listeners, it's hard enough to
focus on the differences in the sound without having to take care of the switching too.
Also we found that some listeners just didn't know how to switch effectively."
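The arithmetic behind level matching is simple, and a minimal Python sketch may help. It assumes each speaker's level has been measured at the listening position with the same test signal; the measured values, the function names, and the choice of trimming everything down to the quietest speaker are all assumptions for illustration, not details from the NRC tests.

```python
def level_trims(measured_db, target_db=None):
    """Return the gain change in dB that brings each speaker to the target level."""
    if target_db is None:
        # Assumed policy: attenuate everything to the quietest speaker,
        # so no channel ever needs to be boosted.
        target_db = min(measured_db.values())
    return {name: target_db - level for name, level in measured_db.items()}

def db_to_voltage_gain(trim_db):
    """Convert a trim in dB to a linear voltage gain factor."""
    return 10 ** (trim_db / 20)

# Hypothetical levels in dB SPL measured at the listening position.
measured = {"speaker 1": 88.0, "speaker 2": 85.0, "speaker 3": 91.0}
trims = level_trims(measured)

for name, trim in trims.items():
    print(f"{name}: {trim:+.1f} dB (voltage gain x{db_to_voltage_gain(trim):.3f})")
```

With these made-up numbers, the loudest speaker is cut by 6 dB, which is roughly half the drive voltage.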
During each round, each listener had to fill out a form
that rated various aspects of performance for each speaker: such things as clarity,
brightness, distortions and so forth. In addition, the speakers had to be rated on a 1-10
scale for pleasantness and fidelity, and there was an area for descriptive comments on how
each speaker handled the various musical selections. Early on, some of the more
experienced listeners also began sketching a rough frequency response curve on the forms,
and this was formally incorporated in the final version.
Over the years, a body of listeners with considerable
expertise grew up at NRC. As Toole points out, "inevitably we always
have a certain percentage of new listeners, but we always have a hard core of
well-practiced listeners because they give us the most rapid and most accurate scores.
Their scores do not necessarily differ from those listeners who are unpracticed, but they
get the answers faster." Over the years, the results of the anechoic measurements and
the listening tests have tended to correlate extremely well.
...Ian G. Masters
ian@mastersonaudio.com