MASTERS ON AUDIO AND VIDEO

October 1, 2004

 

Measuring Audio -- Part Three

This month, we present the conclusion of a roundtable discussion that I conducted about 30 years ago with Dr. Floyd Toole and Errol J. Byers on the challenges of measuring audio equipment. These two experts were responsible for the actual testing that lay behind the audio evaluation program of AudioScene Canada magazine, of which I was one of the editors. For the background to the discussion, please see "Part One" in August.

We ended "Part Two" with some insights into distortion and its ramifications.

Ian Masters: This brings up the question of the reliability of the test equipment itself.

Floyd Toole: Yes, of course. But in the case of the electronic portion of the test apparatus, we do have ways of checking accuracy and performance. At least it is a more tractable set of measurements than the testing of the phono cartridges themselves. We can insert whatever test signals we like, at will and in our own time. We don't have to rely on unknown sources for the test signals.

IM: But is there a way to check whether a given piece of test equipment that claims to have, say, a distortion level of 0.05%, does in fact have such a level?

Errol Byers: When making distortion measurements in amplifiers, you use a low-distortion oscillator. To guarantee that it's low enough to make the measurement, you have to measure its own distortion, and there's obviously a limit to how far down you can measure.

The spectrum analyzers and other test equipment that you would use to check this sort of distortion can probably resolve components 80 or 90dB below the fundamental. This represents about 0.01% or 0.003% distortion, respectively.
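The arithmetic behind those two figures can be sketched in a few lines of Python. This is only an illustration and not part of the original discussion: the function name is invented for the example, and it assumes that distortion is expressed as an amplitude (voltage) ratio, so that every 20dB corresponds to a factor of ten.

```python
def db_below_to_percent(db_below: float) -> float:
    """Convert a level in dB below the fundamental to a percentage.

    Assumes an amplitude (voltage) ratio, i.e. 20 dB per factor of ten.
    """
    return 100.0 * 10.0 ** (-db_below / 20.0)


if __name__ == "__main__":
    for level in (80, 90):
        percent = db_below_to_percent(level)
        print(f"{level} dB below the fundamental is about {percent:.4f}%")
```

Under that assumption, 80dB down works out to 0.01% and 90dB down to roughly 0.0032%, which matches the rounded figures quoted above.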

IM: So it's not too important if there are inaccuracies below such levels.

EB: No. General practice is to use test instruments that are ten times better than what you are measuring. So if you determine that your oscillator has 0.03% distortion, then the lowest distortion you should attempt to measure with it is 0.3%.

A little variation in the 0.03% figure doesn't bother you particularly. The accuracy of the measurement check needn't be all that good, as long as you can guarantee that it is well below what you consider to be your bottom limit.
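The factor-of-ten rule of thumb can be written down as a small check. This is a minimal sketch, assuming the margin is applied as a simple multiplier on the instrument's own residual distortion; the function names and the default margin parameter are invented for the illustration.

```python
def measurement_floor(instrument_residual_percent: float, margin: float = 10.0) -> float:
    """Lowest distortion reading worth trusting, given the instrument's own residual."""
    if instrument_residual_percent < 0 or margin <= 0:
        raise ValueError("residual must be non-negative and margin must be positive")
    return instrument_residual_percent * margin


def reading_is_trustworthy(reading_percent: float,
                           instrument_residual_percent: float,
                           margin: float = 10.0) -> bool:
    """True if the reading sits at or above the floor set by the safety margin."""
    return reading_percent >= measurement_floor(instrument_residual_percent, margin)


if __name__ == "__main__":
    floor = measurement_floor(0.03)  # oscillator residual from the example above
    print(f"Measurement floor: {floor:.2f}%")
    print(reading_is_trustworthy(0.5, 0.03))   # comfortably above the floor
    print(reading_is_trustworthy(0.05, 0.03))  # too close to the oscillator's own residual
```

With the 0.03% oscillator from the example, the floor comes out at 0.3%. A less generous margin can be tried by passing a smaller value for the hypothetical `margin` argument.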

FT: This factor of ten is unusual, really. Not everybody adheres to this rather generous safety margin.

EB: That's true.

IM: To continue our discussion of unmade measurements, let's talk about FM tuners. Do you find that the lack of up-to-date standards is a problem?

EB: The main problem is that a lot of the measurement standards that are used, such as IHF sensitivity, deal with mono FM only, while in this day and age, the vast majority of transmissions are in stereo. In many tuners and receivers, there is a great difference between the monophonic sensitivity and the sensitivity you get with the stereo signal. People are just beginning to do some measurements in this area, but there are no standards that I know of.

IM: So the problem is not that the measurements are difficult, but rather in knowing what to measure.

EB: The procedures should be virtually identical to the monophonic ones, except that you are dealing with two channels.

To measure usable sensitivity, for example, you would apply an RF signal, and measure the signal-to-noise ratio on each channel as you reduce the input RF level. You would then see at what RF level the S/N ratio becomes some specified value -- say, 30dB down. The procedure is essentially the same, then, but it hasn't been done for stereo until recently. [The use of dBf measurements, 50dB quieting, and stereo noise performance were introduced some time after this discussion. See "Some FM Basics" -- IGM]

This applies to other tuner measurements as well. In manufacturers' specifications in general, it's very difficult to discover whether distortion measurements, for example, have been made on a monophonic or stereophonic signal. When frequency response is quoted as ±3dB, 50 to 10,000Hz, there is seldom any indication as to whether that is in the mono or stereo mode -- and there are differences in a lot of tuners.
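The stereo sensitivity procedure described above (reduce the RF input step by step, and note the level at which the signal-to-noise ratio on either channel falls to the specified value) can be sketched as follows. This is an illustration only, not a standard method: the function name, the choice of microvolts as the unit, and the readings in the usage example are all hypothetical placeholders, not measurements of any real tuner.

```python
from typing import Iterable, Optional, Tuple

# (RF input level in microvolts, left-channel S/N in dB, right-channel S/N in dB)
Reading = Tuple[float, float, float]


def usable_sensitivity(readings: Iterable[Reading],
                       target_snr_db: float = 30.0) -> Optional[float]:
    """Return the lowest RF level at which both channels still meet the target S/N.

    The readings are walked from the strongest signal downward, and the search
    stops at the first level where either channel drops below the target.
    Returns None if even the strongest signal fails.
    """
    lowest_ok = None
    for rf_uv, left_db, right_db in sorted(readings, reverse=True):
        if min(left_db, right_db) < target_snr_db:
            break
        lowest_ok = rf_uv
    return lowest_ok


if __name__ == "__main__":
    # Made-up readings purely to exercise the function.
    placeholder_readings = [
        (100.0, 55.0, 54.0),
        (30.0, 45.0, 44.0),
        (10.0, 36.0, 34.0),
        (5.0, 31.0, 29.0),
        (3.0, 26.0, 24.0),
    ]
    print(usable_sensitivity(placeholder_readings))
```

Taking the worse of the two channels at each step is a design choice made for this sketch; a real procedure might report each channel separately.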

IM: How about testing software? Is testing of magnetic tape in any way standardized, or even reliable?

EB: The difficulty in testing tape is obtaining some sort of reference tape that could be considered a standard. There are tapes available that are considered standard by particular groups, and all we can really do is compare a tape to this standard.

You can, however, get a tape that, compared to such a "standard" tape, has a falling frequency response, but a slight adjustment of the bias on your tape recorder can make it a perfectly acceptable tape to you. It would not be the same as the standard, but that doesn't necessarily mean that it is any better or worse.

IM: So for such a standard to be meaningful, all recorders would have to have the same bias.

EB: Yes, but they usually don't. Most manufacturers set up their bias for the variety of tape they favor, so when you buy a recorder you should find out what it has been adjusted for, and use only that tape if you want to get the best results. Various other tapes, of course, will be quite similar, but the average user cannot really pick out these tapes without some sort of measurement.

IM: What would constitute a "bad" tape for a user to buy?

EB: Something with ragged edges or holes, perhaps. That's about as far as the user can go.

IM: Yes, but there are differences in surface finish, for example, and this affects the wear on the head. Is there any way a prospective tape purchaser can select a high-quality tape in this respect?

EB: I don't think so, without the guidance of some test results.

IM: Is anybody testing for this particular characteristic?

EB: For wear? Not that I know of -- not in regular reports at least.

IM: Are there ways of doing it that you know of?

EB: Yes, there are. They are rather involved. I think that, in general, the tape manufacturers are probably doing these measurements, but I don't see very much quoted except, perhaps, with computer tapes or instrumentation tapes, in which there are quotations of abrasiveness and dropouts.

IM: And dropouts are a separate factor?

EB: Yes. The dropouts are usually quoted with digital computer tape, but not very often with audiotape.

IM: What about the mechanical reliability of cassettes? Is there any simple criterion that could be applied?

EB: There is some equipment made for testing the digital cassette -- which is similar to the audiocassette. This is a torque-measuring instrument, and it will measure the torque exerted by the reel of tape in the cassette, with the other reel free, and with some holdback tension applied. You can get a "torque profile" by running the cassette through the machine.

If you were to operate the cassette a number of times, you would notice some difference in its torque characteristics. This is something that I haven't yet seen done, but there is equipment available to do it. This could be quite a useful test.

IM: To return for a moment to a point brought up earlier -- the difficulty of relating actual measurements to the audible effects of the problems they disclose -- is there room for subjective evaluation, say, a sort of average of a number of listeners' reactions, in the formal measuring process? Might this not really be the most important evaluation tool of all?

FT: It's probably more reliable than trying to interpret different measurements. Yes.

IM: Presumably cold measurements with instruments are objective . . .

FT: . . . and they're reproducible, technically exact . . . and safe.

IM: But might there be a way to incorporate some sort of subjectivity into the testing procedure? In the first place, have you found there to be any sort of consistency in the reactions of different groups of listeners?

FT: In a gross sense, there is certainly a lot of consistency. A panel of listeners, in my experience, does a good job of differentiating good, fair, and poor products. But, within these categories, there may be products that, by some absolute measure that I don't have but can imagine, have equal deficiencies -- but different deficiencies. And, in such cases, one finds that different people, for different kinds of music, may be able to "live with" different deficiencies better.

So if you're restricted to one category of equipment because of price, or size, or some other constraint, it becomes a question of personal preference. But it's a preference of a sort that says, "I can better live with the defects of this product than of that product -- although I may well recognize that both are about equally imperfect."

EB: Do you think that various people's preferences will average out, for the most part? Or do we end up with a whole population that likes, say, boomy bass?

FT: I've not found that. They do average out to the extent that there is a clear differentiation, as I said earlier, between good, fair, and poor. A group of people will, on a numerical scale, tend to lump certain products in the "good" category, certain others in the "fair" category, and others in the "poor" category. But individuals may prefer one product over another within one category.

IM: So this could be used as a rough guide.

FT: Certainly a rough guide, and a reliable one. It's a question of resolution of the judgment.

IM: Could there be a way of relating the actual measurements to these subjective tests?

FT: You would have to establish a single-number "quality scale" for each parameter, such as distortion or frequency response. In other words, a certain piece of equipment could be said to have a frequency-response rating of nine out of ten, or 90%, and in distortion, it racks up a score of 70%, and in terms of transient response, it's 75 or 80%.

You can combine these scores as you wish, according to your weighting factors, which must be determined. If you can determine such weighting factors, it may be possible to come up with an absolute score -- a "box score" -- to describe the overall performance of that product. Then you're up against the matter of relating this box score to the box score produced by a panel of listeners.
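The "box score" idea amounts to a weighted average of the per-parameter ratings. The sketch below uses the example ratings mentioned above (90% for frequency response, 70% for distortion, 75% for transient response). The weights are pure placeholders, since the discussion makes clear that real weighting factors have yet to be determined, and the function name is invented for the illustration.

```python
from typing import Dict


def box_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-parameter ratings (0 to 100) into one weighted-average score."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


if __name__ == "__main__":
    scores = {
        "frequency_response": 90.0,
        "distortion": 70.0,
        "transient_response": 75.0,
    }
    # Hypothetical weights: equal importance, then frequency response counted double.
    equal = {"frequency_response": 1.0, "distortion": 1.0, "transient_response": 1.0}
    favor_response = {"frequency_response": 2.0, "distortion": 1.0, "transient_response": 1.0}

    print(f"Equal weights: {box_score(scores, equal):.2f}")
    print(f"Frequency response doubled: {box_score(scores, favor_response):.2f}")
```

The same ratings give about 78.33 with equal weights and 81.25 when frequency response counts double, which shows how strongly the final number depends on weights that nobody yet knows how to choose.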

IM: This heads us back to the question of whether everything that should be measured is being measured, to make any scoring system valid.

FT: I suspect not. As I said earlier, we tend to make the measurements that are convenient to make. This is one of our sins.

But even with the tests that are made, the problem arises of how to weight the relative importance of the various dimensions. Is, for example, a 1 or 2dB variation in frequency response more or less serious than 2 or 3% distortion over the same frequency range? How do you relate the importance of 5% distortion at 10kHz to 5% distortion at 30Hz, or 50Hz? These are very difficult questions to answer. We need psychoacoustic data to sort it out, and it doesn't exist at the moment.

IM: Are there programs seeking such information?

FT: Very little is being done.

IM: So what is the solution? Is there any way for the average consumer to be able to tell, from test results, whether or not a piece of audio equipment is any good?

FT: On his own, he will find it difficult. The best solution is to have the measurements interpreted for him, both objectively and subjectively.

IM: This is where audio magazines come in, and perhaps we can end on this note.

Measurements of audio equipment, important as they are, are only raw data. Some audiophiles can read numerical test results, and glean considerable information from them, to be sure, but the average consumer wants to know what the measurements mean in terms of how a unit will sound, and whether a given piece of gear is a good buy.

Equipment reviewers are, or should be, in the position of working with the team actually performing the tests, so that the results can be combined with the reviewer's "feel" for that equipment, in a rounded evaluation.

Our reports, for instance, begin with a test of the performance of a unit under laboratory conditions. Then the reviewer takes it home and "lives with" it for a period of time, and writes what is, ideally, a sort of blend of his own subjective impressions and of the cold data.

And perhaps this is what we were discussing: a way to weight the measurements to take account of the subjective side of things. In preparing a test report, the "report" is just as important as the "test."

...Ian G. Masters
ian@mastersonaudio.com


All Contents Copyright © 2004 Schneider Publishing Inc., All Rights Reserved.
Any reproduction of content on this site without permission is strictly forbidden.