MASTERS ON AUDIO AND VIDEO

April 1, 2004

 

Hearing Distortion

An audio signal travels a very tortuous path on its way to our living rooms. What a piano string or a voice does -- or indeed any source of sound -- is set up a series of compressions and rarefactions of air in its immediate surroundings. The pattern of these over time is the "waveform" of the sound, and it can be as simple as a sine wave or as complicated as a symphony orchestra going full tilt.

When it comes to turning that sound into audio and back again, a microphone converts the variations in air pressure to an analogous electrical signal, which may then be increased or decreased in magnitude, stored as a series of pits on a CD or used to modulate a high-frequency carrier and sent over the air. At the end of the chain, an electrodynamic device converts the varying electrical signal back to air-pressure differences, which finally reach our eardrums. At every step along the way, the waveform of an ideal audio signal should be exactly the same as the original; any variation, however caused, is called "distortion."

Most of the major effort of audio designers over much of the past half-century or so has been aimed at reducing the amount of such distortion that creeps into a high-fidelity signal; that they have done their work well is evident in any set of specs, where distortion figures of hundredths or even thousandths of a percent are common. But no one has yet come up with an audio component that adds nothing of its own -- distortion is present at every stage, and it is cumulative throughout the audio chain.

But how much does it matter? The quest for ever-lower distortion levels is a theoretical end in itself, of course, and there will probably never be a time when designers will be entirely satisfied, but there is a point below which we can no longer hear distortion, even though it may be measurable in the lab. What we don't know for sure, however, is the threshold below which a given sort of distortion is no longer audible.

Virtually everything that affects an audio signal as it passes through the system is a form of distortion, although some forms are known by more specialized names. Two main categories exist: non-linear and linear distortion.

The non-linear variety is what usually carries the name distortion. Most familiar, perhaps, is "harmonic" distortion, where the equipment generates spurious signals at whole-number multiples of the frequencies in the wanted signal. Since most music is rich in harmonics, this sort of distortion is often masked by the program material, unless it is high enough to change the character of the instruments themselves. Also contributing new sounds is intermodulation distortion (IM), which generates signals at the sum and difference of the frequencies of two sounds in the program material. Because these are unlikely to be harmonically related to the music, they are less likely to be masked.
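As a rough illustration of where these spurious signals land, here is a minimal Python sketch. The function names are hypothetical, the 1000Hz and 1300Hz tones are arbitrary examples rather than anything used in the tests, and only the simplest (second-order) intermodulation products are shown.

```python
def harmonics(fundamental_hz, count=4):
    """Frequencies of the 2nd through (count + 1)th harmonics of a tone."""
    return [fundamental_hz * n for n in range(2, count + 2)]


def intermod_products(f1_hz, f2_hz):
    """Second-order IM products: the sum and difference of two tone frequencies."""
    return (f1_hz + f2_hz, abs(f1_hz - f2_hz))


# Harmonics of a 220Hz tone sit at whole-number multiples of 220Hz.
print(harmonics(220))                  # [440, 660, 880, 1100]

# Two example tones (arbitrary values) produce products at 2300Hz and 300Hz,
# neither of which is a harmonic of either tone.
print(intermod_products(1000, 1300))   # (2300, 300)
```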

In linear distortion, nothing is added, but the relationships between different parts of the signal are altered in some fashion. The most prevalent of these is rarely termed distortion; when an audio component deals with some frequencies more efficiently than others -- emphasizes them -- we normally talk of frequency response irregularities (or non-linearities) rather than distortion. But it is a form of distortion nonetheless.

By the same token, some types of audio equipment tend to delay certain frequencies with respect to others. This is known as "phase shift" or "group delay," and is most often a physical attribute of speakers, although it has always been a measurable factor in the performance of filters, crossover networks, equalizers, and the like.

To try to find out what we can and can't hear, David Clark of DLC Designs in Michigan conducted a series of listening tests some years ago. As listening subjects, Clark called on the worthy members of the Southwest Michigan Woofer and Tweeter Marching Society -- SWMWTMS (pronounced "smootums") -- one of the United States' most active and enthusiastic audio clubs. I participated as an observer.

Clark decided to use one type of nonlinear distortion -- total harmonic distortion (or "grunge") -- and two forms of linear distortion: phase shifting at several levels of severity and a frequency-response peak in the sensitive midrange.

For each of the three major sections of the test, a test signal was used, both because it would be easier to detect unmasked distortion, and because it would educate the listeners as to what they should be listening for. In the case of harmonic distortion, a 220Hz sine wave was chosen; for frequency response, pink noise was used; and for phase shift a repeating pulse was selected.

Similarly, for each type of distortion, an "optimal" short piece of music was selected that had proved itself to be revealing of the particular problem involved. In practice, only short bits of all these were auditioned, rather than whole selections, as only parts of them were suitable for the test. Finally, for all three sections of the test, a single "natural" selection was auditioned, to represent the average run of music that one might listen to or use to demonstrate an audio system. This was chosen specifically because it was not particularly revealing of any of the kinds of distortion under investigation, and therefore would suggest the levels that might be audible in normal listening.

For each type of distortion, and for each source, some 20 tests were performed. First, the "natural" selection was played at a fairly gross level of distortion, and listeners could compare that with the undistorted version. Then this was repeated with the distortion level reduced; four levels were used in all. The whole process was then repeated using the "optimal" selection, and then with the test signal.

A standard listening test device -- the ABX comparator -- was used throughout, with which listeners had to decide whether what they were hearing at any moment was the "straight" signal or the distorted one. The purpose was to determine whether the listeners really heard distortion at the various levels, or simply thought they did. If the number of correct answers in a given test was no more than about half of the total, for instance, it could be concluded that the subjects were merely guessing, and were hearing no real differences. If there were more than 75 percent correct answers in one test, one could conclude with reasonable certainty that the subjects were hearing differences. Between the two, there is less certainty, and scores in the lower part of that range are insufficient to lead to a sure conclusion that the listeners were hearing anything.
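The reasoning behind those score ranges can be sketched with a simple binomial calculation: how likely is it that a listener who is purely guessing would score at least a given number correct? The Python below is only an illustration. The function name is hypothetical, and the 16-trial round is an assumed figure, since the number of trials behind each score is not given here.

```python
from math import comb


def guess_probability(correct, trials):
    """Chance of getting at least `correct` right out of `trials` by pure guessing.

    Assumes each trial is an independent 50/50 guess.
    """
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Hypothetical 16-trial round (the real number of trials is an assumption).
trials = 16
for correct in (8, 10, 12, 14):
    p = guess_probability(correct, trials)
    print(f"{correct}/{trials} correct: probability by guessing = {p:.3f}")
```

Under these assumptions, a score of half correct is what guessing would be expected to produce, while a score of 75 percent would turn up by guessing only about four times in a hundred.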

The first test was for harmonic distortion, or "grunge." Four levels of THD were used, beginning with 8%, then 4%, 2%, and 1%. Using the natural music, virtually all listeners could hear 8%, but at 1%, the scores were very close to pure chance, suggesting that this is the minimum distortion level audible with this musical selection. Repeating the test with the optimal music showed that the grunge was much easier to hear, at least at the higher levels, the subjects showing virtual certainty down to 2%. But the 1% results, while slightly higher than with the natural music, still do not indicate that this level was audible.

Similar results were evident when frequency-response irregularities were tested. For this, a broad peak was added, centered on 3kHz, at +5dB, +2dB, +1dB and +0.5dB. With the optimal music, listeners were quite certain with the peak as low as +1dB, their scores only dropping below the chance level at the +0.5dB peak. Using the test signal (pink noise), the response peak was audible even at the lowest level.
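To give a feel for how small these peaks are, the decibel figures can be converted to plain amplitude ratios with the standard relation ratio = 10^(dB/20). The short Python sketch below does only that; the function name is hypothetical and nothing here models the shape or width of the peak actually used.

```python
def db_to_amplitude_ratio(level_db):
    """Convert a level change in decibels to a voltage (amplitude) ratio."""
    return 10 ** (level_db / 20)


# The four peak heights used in the frequency-response test.
for level_db in (5, 2, 1, 0.5):
    ratio = db_to_amplitude_ratio(level_db)
    print(f"+{level_db}dB is an amplitude increase of about {(ratio - 1) * 100:.0f}%")
```

On this basis the +5dB peak raises the signal amplitude around 3kHz by roughly three-quarters, while the +0.5dB peak raises it by only about 6 percent.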

Phase shift turned out to be the most difficult form of distortion to hear. Four amounts of phase shift were used. The greatest was some 2700°; if this were to be encountered in a speaker, the woofer would have to be seven feet behind the rest of the speaker. This was progressively reduced to 2150°, 1620° and 1080°. None of these would ever be encountered in a normal audio component. The order of tests had to be altered in this case, with the test signal being auditioned first, just so the subjects would know what to listen for.

Once educated, however, the majority of listeners had little difficulty hearing phase shift at all levels, as long as the test signal was used. Only with the least amount of shift did responses drop to the 73%-correct level, but this amount was probably still audible. With the optimal music, which was not dissimilar to the test pulse, but was masked to some extent by reverberation on the disc, the scores dropped, only reaching some certainty at 1620°; above and below this amount, the scores were close to chance. With the "natural" music, even the greatest amount was not audible.
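The comparison between 2700° of phase shift and a woofer set seven feet back can be sketched numerically. A phase shift in degrees corresponds to a number of cycles (degrees divided by 360), and each cycle is one wavelength long. The Python below is an illustration under stated assumptions: the function name is hypothetical, the speed of sound is taken as 343 metres per second, and the 1200Hz frequency is a guess, since the frequency at which the shift was specified is not given here.

```python
SPEED_OF_SOUND_M_S = 343.0   # assumed value for air at room temperature
METRES_PER_FOOT = 0.3048


def offset_feet(phase_shift_deg, frequency_hz):
    """Distance, in feet, that sound travels during the given phase shift."""
    cycles = phase_shift_deg / 360.0
    wavelength_m = SPEED_OF_SOUND_M_S / frequency_hz
    return cycles * wavelength_m / METRES_PER_FOOT


# 1200Hz is a hypothetical frequency, not a figure from the tests.
for shift_deg in (2700, 1080):
    print(f"{shift_deg} degrees at 1200Hz: about {offset_feet(shift_deg, 1200):.1f} feet")
```

With these assumed values, 2700° works out to roughly seven feet, and the equivalent distance would be longer at lower frequencies and shorter at higher ones.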

Listening sessions of this sort are very difficult for the subjects, as they require a new set of aural skills, as well as specialized materials. The members of SWMWTMS who volunteered to take part in this series of experiments found the process fatiguing, partly because they had to strain to hear anything, and partly because they had to learn as they went along. There does seem to be some evidence that sensitivity to some types of distortion can be learned; one listener in this group, who has had more than passing contact with phase shift, scored almost perfectly in some rounds where the other subjects were straining to hear anything. The chances of an average listener being sensitized this way are remote, however, so the likelihood of his being bothered by the sorts of distortion in this investigation is small.

The amount of distortion used in the tests was much greater than would normally be encountered in any audio equipment that a serious buyer might consider. Even so, it only became audible to our listeners -- when it was audible at all -- by using test signals and short musical passages specially chosen to reveal the particular type of distortion. When "normal" music was substituted, the audibility vanished at all but the highest distortion levels.

Other music, other types of distortion, other listeners, or other equipment might yield different results. But the carefully set up conditions of this series of tests were designed to yield a level of audibility much lower than would be encountered in real-life situations. Without these conditions and without the learning process that the subjects went through, it is very improbable that any of us need be troubled even by the amounts of distortion used in the tests, let alone those we normally encounter.

...Ian G. Masters
ian@mastersonaudio.com


All Contents Copyright © 2004
Schneider Publishing Inc., All Rights Reserved.
Any reproduction of content on
this site without permission is strictly forbidden.