Hearing Distortion
An audio signal travels a very
tortuous path on its way to our living rooms. What a piano string or a voice does -- or
indeed any source of sound -- is set up a series of compressions and rarefactions of air
in its immediate surroundings. The pattern of these over time is the "waveform"
of the sound, and it can be as simple as a sine wave or as complicated as a symphony
orchestra going full tilt.
When it comes to turning that sound into audio and back
again, a microphone converts the variations in air pressure to an analogous electrical
signal which may then be increased or decreased in magnitude, stored as a series of pits
on a CD or used to modulate a high-frequency carrier and sent over the air. At the end of
the chain, an electrodynamic device converts the varying electrical signal back to
air-pressure differences, which finally reach our eardrums. At every step along the
way, the waveform of an ideal audio signal should be exactly the same as the original; any
variation, however caused, is called "distortion."
Much of the effort of audio designers over
the past half-century or so has been aimed at reducing the amount of such distortion that
creeps into a high-fidelity signal; that they have done their work well is evident in any
set of specs, where distortion figures of hundredths or even thousandths of a percent are
common. But no one has yet come up with an audio component that adds nothing of its
own -- distortion is present at every stage, and it is cumulative throughout the audio
chain.
But how much does it matter? The quest for ever-lower
distortion levels is a theoretical end in itself, of course, and there will probably never
be a time when designers will be entirely satisfied, but there is a point below which we
can no longer hear distortion, even though it may be measurable in the lab. What we don't
know for sure, however, is the threshold below which a given sort of distortion is no
longer audible.
Virtually everything that affects an audio signal as it
passes through the system is a form of distortion, although some forms are known by more
specialized names. Two main categories exist: non-linear and linear distortion.
The non-linear variety is what usually carries the name
distortion. Most familiar, perhaps, is "harmonic" distortion, where the
equipment generates spurious signals related to the wanted signal harmonically. Since most
music is rich in harmonics, this sort of distortion is often masked by the program
material, unless it is high enough to change the character of the instruments themselves.
Also contributing new sounds is intermodulation distortion (IM), which generates
frequencies at the numerical sum and difference of two sounds in the program material.
Because these are unlikely to be harmonically related to the music, they are less likely
to be masked.
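To make the arithmetic concrete, here is a minimal Python sketch of where these spurious frequencies land. The function names and the two example tones (440Hz and 587Hz) are illustrative assumptions, not anything taken from the tests described in this article, and only the simplest sum and difference products are shown.

```python
def harmonic_frequencies(fundamental_hz, count=4):
    """Return the first `count` harmonics above the fundamental (2f, 3f, ...)."""
    return [fundamental_hz * n for n in range(2, count + 2)]


def intermod_frequencies(f1_hz, f2_hz):
    """Return the difference and sum frequencies produced by two tones."""
    return (abs(f1_hz - f2_hz), f1_hz + f2_hz)


# Harmonic distortion of a 220Hz tone adds energy at whole-number multiples.
harmonics = harmonic_frequencies(220)

# Two hypothetical tones, chosen only for illustration.
im_products = intermod_frequencies(440, 587)

print("Harmonics of 220Hz:", harmonics)
print("IM products of 440Hz and 587Hz:", im_products)
```

The harmonics fall on multiples of the original tone, where musical overtones already sit, while the IM products (147Hz and 1027Hz here) are not multiples of either tone.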
In linear distortion, nothing is added, but the
relationships between different parts of the signal are altered in some fashion. The most
prevalent of these is rarely termed distortion; when an audio component deals with some
frequencies more efficiently than others -- emphasizes them -- we normally talk of
frequency response irregularities (or non-linearities) rather than distortion. But it is a
form of distortion nonetheless.
By the same token, some types of audio equipment tend to
delay certain frequencies with respect to others. This is known as "phase shift"
or "group delay," and is most often a physical attribute of speakers, although
it has always been a measurable factor in the performance of filters, crossover networks,
equalizers, and the like.
To try to find out what we can and can't hear, David Clark
of DLC Designs in Michigan conducted a series of listening tests some years ago. As
listening subjects, Clark called on the worthy members of the Southwest Michigan Woofer
and Tweeter Marching Society -- SWMWTMS (pronounced "smootums") -- one of the
United States' most active and enthusiastic audio clubs. I participated as an observer.
Clark decided to use one type of nonlinear distortion --
total harmonic distortion (or "grunge") -- and two forms of linear distortion:
phase shifting at several levels of severity and a frequency-response peak in the
sensitive midrange.
For each of the three major sections of the test, a test
signal was used, both because distortion is easier to detect when nothing masks it, and
because it would educate the listeners as to what they should be listening for. In the
case of harmonic distortion, a 220Hz sine wave was chosen; for frequency response, pink
noise was used; and for phase shift a repeating pulse was selected.
Similarly, for each type of distortion, an
"optimal" short piece of music was selected that had proved itself to be
revealing of the particular problem involved. In practice, only short bits of all these
were auditioned, rather than whole selections, as only parts of them were suitable for the
test. Finally, for all three sections of the test, a single "natural" selection
was auditioned, to represent the average run of music that one might listen to or use to
demonstrate an audio system. This was chosen specifically because it did not particularly
reveal any of the kinds of distortion under investigation, and therefore would
suggest the levels that might be audible in normal listening.
For each type of distortion, and for each source, some 20
tests were performed. First, the "natural" selection was played at a fairly
gross level of distortion, and listeners could compare that with the undistorted version.
Then this was repeated with the distortion level reduced; four levels were used in all.
The whole process was then repeated using the "optimal" selection, and then with
the test signal.
A standard listening test device -- the ABX comparator --
was used throughout, in which listeners had to decide whether what they were
listening to at any moment was the "straight" signal or the distorted one. The
purpose was to determine whether the listeners really heard distortion at the various
levels, or simply thought they did. If the number of correct answers in a given test was
no more than about half of the total, for instance, it could be concluded that the subjects were
merely guessing, and were hearing no real differences. If there were more than 75 percent
correct answers in one test, one could conclude with reasonable certainty that the
subjects were hearing differences. Between the two, there is less certainty, and
scores in the lower part of that range are not enough to conclude that
the listeners were hearing anything.
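The reasoning behind such cut-offs can be sketched with a little binomial arithmetic. The Python below is a minimal illustration, assuming each answer made by pure guessing is a fair coin flip; the 16-trial run is a hypothetical figure chosen for the example, not the number of trials used in these tests, and the function name is my own.

```python
from math import comb


def chance_of_score_by_guessing(correct, trials):
    """Probability of at least `correct` right answers in `trials` coin-flip guesses."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Hypothetical run of 16 trials: 12 right is 75 percent, 8 right is 50 percent.
p_12_of_16 = chance_of_score_by_guessing(12, 16)
p_8_of_16 = chance_of_score_by_guessing(8, 16)

print(f"12 or more of 16 by guessing: {p_12_of_16:.3f}")
print(f" 8 or more of 16 by guessing: {p_8_of_16:.3f}")
```

Under these assumptions, a guesser would reach 75 percent in fewer than four runs out of a hundred, while a score of half or better turns up more often than not. With more trials the same percentage becomes harder to reach by luck, and with fewer it becomes easier.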
The first test was for harmonic distortion, or
"grunge." Four levels of THD were used, beginning with 8%, then 4%, 2%, and 1%.
Using the natural music, virtually all listeners could hear 8%, but at 1%, the scores were
very close to pure chance, suggesting that this is the minimum distortion level audible
with this musical selection. Repeating the test with the optimal music showed that the
grunge was much easier to hear, at least at the higher levels, the subjects showing
virtual certainty down to 2%. But the 1% results, while slightly higher than with the
natural music, still do not indicate that this level was audible.
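Distortion percentages are often easier to picture as a level in decibels below the wanted signal. The conversion below is a minimal sketch, assuming the percentage expresses the amplitude of the distortion products relative to the fundamental; the article does not say exactly how the figures in this test were defined, and the function name is my own.

```python
import math


def thd_percent_to_db(thd_percent):
    """Level of the distortion products relative to the fundamental, in dB."""
    return 20 * math.log10(thd_percent / 100)


# The four distortion levels used in the test.
levels_db = {pct: round(thd_percent_to_db(pct), 1) for pct in (8, 4, 2, 1)}

for pct, db in levels_db.items():
    print(f"{pct}% THD is about {db} dB relative to the fundamental")
```

On that assumption, the four steps work out to roughly 22, 28, 34 and 40 dB below the fundamental, each halving being a drop of about 6 dB.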
Similar results were evident when frequency-response
irregularities were tested. For this, a broad peak was added, centered on 3kHz, at +5dB,
+2dB, +1dB and +0.5dB. With the optimal music, listeners were quite certain down to
the +1dB peak, dropping below the chance level only at the +0.5dB peak. Using the test
signal (pink noise) the response peak was audible even at the lowest level.
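To give a feel for how small these peaks are, the standard decibel formula turns each one into a plain amplitude ratio. This is only a sketch of that conversion at the centre of the peak; it says nothing about the width of the peak used in the test, and the function name is my own.

```python
def db_to_amplitude_ratio(gain_db):
    """Convert a gain in decibels to a linear amplitude (voltage) ratio."""
    return 10 ** (gain_db / 20)


# The four peak heights used in the test.
peak_ratios = {db: round(db_to_amplitude_ratio(db), 3) for db in (5, 2, 1, 0.5)}

for db, ratio in peak_ratios.items():
    print(f"+{db}dB raises the amplitude by about {(ratio - 1) * 100:.0f} percent")
```

The +5dB peak is an amplitude increase of roughly 78 percent, while the +0.5dB peak amounts to about 6 percent.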
Phase shift turned out to be the most difficult form of
distortion to hear. Four amounts of phase shift were used. The greatest was some 2700°;
if this were to be encountered in a speaker, the woofer would have to be seven feet
behind the rest of the speaker. This was progressively reduced to 2150°, 1620° and
1080°. None of these would ever be encountered in a normal audio component. The order of
tests had to be altered in this case, with the test signal being auditioned first, just so
the subjects would know what to listen for.
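A rough way to picture these figures is to treat the shift as a pure time delay at a single frequency and ask how far sound travels in that time. The sketch below does that in Python. The 1kHz frequency and the 343 metres per second speed of sound are assumptions of mine; the article does not state the frequency at which its shifts were specified, so the distances printed are illustrative only.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: air at roughly room temperature
FEET_PER_METRE = 3.28084
ASSUMED_FREQ_HZ = 1000.0        # assumed: not stated in the article


def phase_to_delay_s(phase_deg, freq_hz):
    """Time delay equivalent to a phase shift at one frequency."""
    return (phase_deg / 360.0) / freq_hz


def delay_to_distance_m(delay_s):
    """Distance sound travels in air during the given delay."""
    return delay_s * SPEED_OF_SOUND_M_PER_S


# The four amounts of phase shift listed for the test.
for phase_deg in (2700, 2150, 1620, 1080):
    delay_s = phase_to_delay_s(phase_deg, ASSUMED_FREQ_HZ)
    distance_m = delay_to_distance_m(delay_s)
    print(f"{phase_deg} degrees: {delay_s * 1000:.2f} ms, "
          f"{distance_m:.2f} m ({distance_m * FEET_PER_METRE:.1f} ft)")
```

At the assumed 1kHz, 2700° is seven and a half cycles, or 7.5 milliseconds, which sound covers in a little over two and a half metres. Under this simple model the distance shrinks as the frequency rises, so the seven-foot figure quoted above would correspond to a somewhat higher frequency.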
Once educated, however, the majority of listeners had
little difficulty hearing phase shift at all levels, as long as the test signal was used.
Only with the least amount of shift did responses drop to the 73%-correct level, but this
amount was probably still audible. With the optimal music, which was not dissimilar to the
test pulse, but was masked to some extent by reverberation on the disc, the scores
dropped, only reaching some certainty at 1620°; above and below this amount, the scores
were close to chance. With the "natural" music, even the greatest amount was not
audible.
Listening sessions of this sort are very difficult for the
subjects, as they require a new set of aural skills, as well as specialized materials. The
members of SWMWTMS who volunteered to take part in this series of experiments found the
process fatiguing, partly because they had to strain to hear anything, and partly because
they had to learn as they went along. There does seem to be some evidence that sensitivity
to some types of distortion can be learned; one listener in this group, who has had more
than passing contact with phase shift, scored almost perfectly in some rounds where the
other subjects were straining to hear anything. The chances of an average listener being
sensitized this way are remote, however, so the likelihood of his being bothered by the
sorts of distortion in this investigation is small.
The amount of distortion used in the tests was much greater
than would normally be encountered in any audio equipment that a serious buyer might
consider. Even so, it only became audible to our listeners -- when it was audible at all
-- by using test signals and short musical passages specially chosen to reveal the
particular type of distortion. When "normal" music was substituted, the
audibility vanished at all but the highest distortion levels.
Other music, other types of distortion, other listeners, or
other equipment might yield different results. But the carefully set up conditions of this
series of tests were designed to yield a level of audibility much lower than would be
encountered in real-life situations. Without these conditions and without the learning
process that the subjects went through, it is very improbable that any of us need be
troubled even by the amounts of distortion used in the tests, let alone those we normally
encounter.
...Ian G. Masters
ian@mastersonaudio.com