If the Quad philosophy is that an amplifier should be a straight wire with gain....

Given the state of play in modern amplification I doubt if distortion in the electronics is of much significance in the overall scheme of things especially when compared to the microphone and loudspeaker interfaces.

'Any road up', thinking about the expression 'a straight wire with gain', I doubt if the 'straight' bit is important either :0)
 
I've recently been made aware of a 2023 paper by Kunchur which has some very interesting things to say about the human auditory system and implications for high end audio systems:

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4437822



TL;DR: timing is more important than frequency, and we are one or perhaps two orders of magnitude more sensitive to timing errors than previously recognised. System response times and the ability to reproduce transients (including transient frequencies well above the audible range) are more important for the perception of music than frequency response. Old people, whose hearing may be 2 octaves down on ideal hearing (ie, hearing little or nothing >4.5kHz), are still able to perceive differences in the reproduction of transients. Also, system measurements (including cable measurements) may not capture sufficient information about rise and fall times.
This paper will be approached with caution by anyone who actually cares whether any of this is really true, as opposed to merely casting around for something to support their assumptions. Kunchur is a physicist hifi enthusiast not a perceptual scientist and he has form for getting stuff painfully wrong.
 
Ah, thanks. My professional expertise is in an entirely different field (regulatory policy and law), and I wouldn't have access to much of the reference material he cites, so am not in a position to argue for, or debunk. I read the paper with interest because it seems to provide some support for my own empirical, subjective experiences.
 
Given the state of play in modern amplification I doubt if distortion in the electronics is of much significance in the overall scheme of things

I think we would all agree that any musical recording incorporates a degree of distortion from the recording process by necessity.

Where we may differ is in the level of distortion that we think is acceptable to add from our replay chain.

Is adding distortion from our replay chain merely additive, though, or does it compound?


'Any road up', thinking about the expression 'a straight wire with gain', I doubt if the 'straight' bit is important either :0)

Though if curved, the wire is longer and therefore exhibits greater measured resistance :0)
 
If you design a product for zero, or extremely low levels of distortion it will sound shite.

Or merely an accurate reproduction of how the artist recorded it ?

Maybe artists assume that audiophiles are going to add distortion with their replay chains and record with near zero distortion to compensate 🫣.

I have actually been to some live performances where I thought the artist may have just engineered their sound to be a distorted mess.
Including the purpose built sound theatre that is the Bridgewater Hall.

Maybe listening to artists with abysmal sound systems leads us to think that audio should sound like that ?
 
Post which model please; not all Topping DACs were state of the art, as can be seen from the IMD hump posted above.
As previously posted (from 2018) things have progressed greatly, in terms of measured accuracy.
When I see something like the ESS IMD hump, I wonder if they are using something like dither to mask low level distortions.
These masking tricks are sometimes to trick the measuring equipment.
I see this in the SMPS world with spread spectrum clock modulation to pass EMC testing. The noise is still there, but spread all over the place. Sometimes this still matters.
 
The question of "what level is audible?" is not well understood at all AIUI. I have seen it written in published technical texts that the only design target with any justification is zero.

There have been tests on the audibility of Total Harmonic Distortion (THD) but for music reproduction the results are not that relevant. That's because underlying a system's THD measurement is non-linearity in the system's transfer function. THD is a symptom, not the underlying problem. If you hit the underlying problem with a broadband signal (like music) this also produces Intermodulation Distortion (IMD). It has been known for decades that total IMD in that situation is 20 dB higher than THD (x10 in voltage terms).
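
A minimal sketch of the "symptom, not the underlying problem" point, using only the Python standard library. The transfer function and its coefficients are made up for illustration, and the sketch does not attempt to reproduce the 20 dB figure quoted above. It only shows that the same non-linearity that yields harmonics from one tone yields intermodulation products (at f2 - f1 and 2*f1 - f2) as soon as two tones are present.

```python
import math

FS = 48_000   # sample rate in Hz (assumed for the sketch)
N = 4_800     # 0.1 s of signal, so every multiple of 10 Hz lands exactly on a DFT bin

def nonlinear(x):
    """Hypothetical weakly non-linear transfer function (coefficients are invented)."""
    return x + 0.01 * x**2 + 0.01 * x**3

def tones(freqs_hz, amplitude):
    """Sum of equal-amplitude cosine tones, N samples long."""
    return [
        sum(amplitude * math.cos(2 * math.pi * f * k / FS) for f in freqs_hz)
        for k in range(N)
    ]

def amplitude_at(signal, freq_hz):
    """Amplitude of the component at freq_hz, via a single-bin DFT."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * k / FS) for k, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * k / FS) for k, s in enumerate(signal))
    return 2 * math.hypot(re, im) / N

if __name__ == "__main__":
    single = [nonlinear(x) for x in tones([1000], 1.0)]
    pair = [nonlinear(x) for x in tones([1000, 1100], 0.5)]

    # One tone in: distortion appears only at harmonics of 1 kHz.
    print("single tone, 2 kHz harmonic:", amplitude_at(single, 2000))
    print("single tone, 100 Hz        :", amplitude_at(single, 100))

    # Two tones in: new components appear at the difference frequencies.
    print("two tones, 100 Hz (f2 - f1)  :", amplitude_at(pair, 100))
    print("two tones, 900 Hz (2*f1 - f2):", amplitude_at(pair, 900))
```

With a single tone the 100 Hz bin is empty; with two tones it is not, even though nothing about the "amplifier" has changed.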

I have noticed that Wolfgang Klippel has published quite a lot of material on the subject of distortion, including IMD, produced by various non-linearities in loudspeakers. Some of that includes assessment of how annoying distortion is for each non-linearity. However I have only glanced at that material so I don't know how far the understanding has got in that case.

THD/SINAD is an overly simplistic metric that mashes total harmonic distortion and noise without informing how these two types of distortion behave throughout the audio spectrum.
It’s unfit for purpose.
For proper characterisation of performance we need a very comprehensive set of measurements.
 
This is taken from a post on Vinyl Engine some years ago. A poster named desktop, who used to work for Disney designing sound systems.

"When my hearing was last tested there was a noise floor calibration test done as part of it. But I asked the tech to continue with quieter and quieter signals, until they were much quieter than the noise floor. I could still pick out signals @-10db below the noise floor. Noise floors are complex and audio tones are not, so detecting the tones was easy for me. The tech herself tried to hear these tones that far below the noise floor and wasn't able to do it. I applaud Whitneyville because it sounds like he has preserved his hearing as well as possible (considering he is a shooting enthusiast). But since I have not ever heard that it is physically impossible for anyone in the world to ever use their ears to detect frequencies above 35KHz, I obviously need to go back to school on that subject."


You can test something similar below:

 
When I see something like the ESS IMD hump, I wonder if they are using something like dither to mask low level distortions.
These masking tricks are sometimes to trick the measuring equipment.
I see this in the SMPS world with spread spectrum clock modulation to pass EMC testing. The noise is still there, but spread all over the place. Sometimes this still matters.

HQPlayer's developer Jussi Laako has been very vocal about the impact of ultrasonic noise in digital audio.
I'm currently using an Intona USB Isolator between Mac computer and RME DAC and the level of audibility of the improvement surprised me.
 
Why I was never keen on DSD and SACD - a wall of ultrasonic noise by design. Also to be found in many class D designs
 
Or merely an accurate reproduction of how the artist recorded it ?

Maybe artists assume that audiophiles are going to add distortion with their replay chains and record with near zero distortion to compensate 🫣.

I have actually been to some live performances where I thought the artist may have just engineered their sound to be a distorted mess.
Including the purpose built sound theatre that is the Bridgewater Hall.

Maybe listening to artists with abysmal sound systems leads us to think that audio should sound like that ?

certainly how the recording and mastering engineers thought it should sound given the environment they recorded it in.

knowingly adding distortion ‘to taste’ during playback i am not convinced by.
 
certainly how the recording and mastering engineers thought it should sound given the environment they recorded it in.

knowingly adding distortion ‘to taste’ during playback i am not convinced by.
At the volume levels common in studios, speaker low order harmonic distortion will be strong
 
Ah, thanks. My professional expertise is in an entirely different field (regulatory policy and law), and I wouldn't have access to much of the reference material he cites, so am not in a position to argue for, or debunk. I read the paper with interest because it seems to provide some support for my own empirical, subjective experiences.
Well quite. You will then surely see that he would not be a qualified expert witness if the issue were relevant to a legal determination. The person one would want to call would more likely be JJ ie the person who debunked his nonsense back in 2007. The person whose views I would really like to hear is Brian Moore (the Emeritus Professor of Auditory Perception at Cambridge, not the novelist and certainly not the rugby player).

I don't claim any expertise but I do actually care whether any of this stuff is true and back when I could be bothered I couldn't find any evidence of temporal resolution requirement within one channel in the microsecond range. The consensus view seemed to be back in 2007 that Kunchur had conflated inter-aural time differences with monaural time differences. I don't have easy access to the small number of relevant references in Kunchur's article, but it is noteworthy that they are not recent so there is nothing to suggest any change in the knowledge base in the relevant field. It's also noteworthy that they are dealt with in a rather cursory way (cf the more detailed approach to the bit about angular resolution in stars which I'm sure he does know lots about.) There are other references eg one to Fourier uncertainty which set the alarm bells ringing.

Nothing would give me greater pleasure than to learn that there was a genuine evidence base in psychoacoustic research which indicated a need for something more than 16/44 and that human hearing really was much finer than imagined. But I have spent a lot of time and bandwidth chasing this stuff in the past and it has generally turned out to be exaggerated at best.

One thing of which I am quite sure is that the gossamer thread of evidence which might at best support this stuff is many orders of magnitude weaker than the overwhelming evidence of the unreliability of individual reports by people outwith structured tests of what they think they perceive. The relevant rules of evidence in the field (ie the threshold test of what constitutes evidence at all) are clear. No point trying to fool oneself about what the empirical evidence is. There is no mystery why no one rushes out to rethink electronic engineering or psychoacoustics based on audiophile reports.
 
...
Nothing would give me greater pleasure than to learn that there was a genuine evidence base in psychoacoustic research which indicated a need for something more than 16/44 and that human hearing really was much finer than imagined. But I have spent a lot of time and bandwidth chasing this stuff in the past and it has generally turned out to be exaggerated at best.
...
I have not read the paper (yet). The subject raised is very interesting to me because I think the human hearing system deviates far from the principle of being a "straight wire (with gain)" linear system.

The idea that the very non-linear mechanism of human hearing might result in some independence between time perception and frequency perception comes up every now and then. In linear systems this does not work theoretically AFAICS. But I don't know much about non-linear systems.

However I am glad to see someone independently counselling caution with this author. I read a paper of his some time ago and I thought the conclusion he drew could not be supported because the experimental method left the door open to other more likely possibilities. I thought it might have just been my misunderstanding. But if others see problems with his audio publications then perhaps there might be something in my concerns after all.
 
I have not read the paper (yet). The subject raised is very interesting to me because I think the human hearing system deviates far from the principle of being a "straight wire (with gain)" linear system.

The idea that the very non-linear mechanism of human hearing might result in some independence between time perception and frequency perception comes up every now and then. In linear systems this does not work theoretically AFAICS. But I don't know much about non-linear systems.

However I am glad to see someone independently counselling caution with this author. I read a paper of his some time ago and I thought the conclusion he drew could not be supported because the experimental method left the door open to other more likely possibilities. I thought it might have just been my misunderstanding. But if others see problems with his audio publications then perhaps there might be something in my concerns after all.
There are undoubtedly some non-linear (non-LTI) effects. But I'm particularly wary of the "human hearing is somewhat non-linear, therefore I can imagine whatever I like about it" gambit.

For example the ear's dynamic range adjusts when loud sounds are played so it is time variant. This is taken into account in the overall 120dB range but actually you couldn't hear 1 phon anytime soon after hearing at 120dB. In reality the 120dB dynamic range rather overstates things in practical terms.

Also, masking might be seen as a form of additive non-linearity in the case of simultaneous masking, and of time variance in the case of temporal masking.

But neither of these ideas supports any idea of human super hearing. Quite the contrary.
 
I started to wonder whether I was going mad and had misremembered that the temporal resolution results I could recall were in the millisecond range. Random search pulled up this
comparing musicians and non-musicians with 3 different temporal resolution tests -duration discrimination, pulse train duration discrimination and gap detection. I think you can see from figs 1 and 2 that they lie in the ms range for both musicians and non-musicians.

I claim no expertise but does this support the notion that humans have temporal resolution abilities which could not be captured by 16/44? That humans have unlimited and uncharted temporal resolution. That there is soooooooo much "we" don't know about the uncanny hearing ability of humans?
 
I started to wonder whether I was going mad and had misremembered that the temporal resolution results I could recall were in the millisecond range. Random search pulled up this
comparing musicians and non-musicians with 3 different temporal resolution tests -duration discrimination, pulse train duration discrimination and gap detection. I think you can see from figs 1 and 2 that they lie in the ms range for both musicians and non-musicians.

I claim no expertise but does this support the notion that humans have temporal resolution abilities which could not be captured by 16/44? That humans have unlimited and uncharted temporal resolution. That there is soooooooo much "we" don't know about the uncanny hearing ability of humans?
Not really, to capture a 1ms interval, you just need 2 kHz sampling. 16/44 would sample the briefest interval heard about 50 times.
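
The arithmetic behind that reply can be sketched as follows. The function names are hypothetical, and the "two samples per interval" rule is simply the assumption used in the post above.

```python
CD_SAMPLE_RATE_HZ = 44_100  # the "44" in 16/44

def min_sample_rate_hz(interval_s: float) -> float:
    """Lowest rate that places two samples inside the interval (assumption from the post)."""
    return 2.0 / interval_s

def samples_in_interval(interval_s: float, sample_rate_hz: float = CD_SAMPLE_RATE_HZ) -> float:
    """Number of sample periods that fit inside the interval at the given rate."""
    return interval_s * sample_rate_hz

if __name__ == "__main__":
    interval = 1e-3  # a 1 ms interval, as in the temporal resolution results discussed
    print(f"minimum rate: {min_sample_rate_hz(interval):.0f} Hz")
    print(f"samples per interval at CD rate: {samples_in_interval(interval):.1f}")
```

For a 1 ms interval this gives 2 kHz and roughly 44 samples at the CD rate, which is the "about 50" above in round numbers.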
 
But if others see problems with his audio publications then perhaps there might be something in my concerns after all.

If you are referring to MK then yes, he has a track record of perception-related papers dressed up in impressive-looking methodology and statistics invariably coming with suspect or outright wrong conclusions in support of 'high-end audio'. At one point in time one even had to wonder if the author properly understood sampling theory.

Before bashing MQA got popular there were whole audio tribes bashing MK on the web forums.

A lot of caution is, indeed, advised.
 
"I could still pick out signals @-10db below the noise floor. Noise floors are complex and audio tones are not, so detecting the tones was easy for me."

It should be fairly easy for anyone, including with (dithered) digital.

The mistake often made is that the total signal strength of the noise 'floor' is compared to the signal strength of the single tone.
But the noise signal is summed over the bandwidth of interest, say 20kHz, whereas the tone is singular.

The portion of the noise signal that is spectrally near enough the tone so as to have masking power over it is but a fraction of the total noise signal. Hence the tone can be picked up relatively far below the (summed) noise. There is no mystery here.
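
A worked example of that comparison, as a rough sketch. It assumes the noise is spectrally flat (white) across the 20 kHz bandwidth, and the 200 Hz width of the band "near enough the tone" is an illustrative figure, not a measured masking bandwidth. The function names are hypothetical.

```python
import math

def band_noise_offset_db(total_bandwidth_hz: float, band_hz: float) -> float:
    """How far (in dB) flat noise inside band_hz sits below the noise summed over total_bandwidth_hz."""
    return 10.0 * math.log10(total_bandwidth_hz / band_hz)

def tone_vs_local_noise_db(tone_re_total_noise_db: float,
                           total_bandwidth_hz: float,
                           band_hz: float) -> float:
    """Tone level relative to only the noise that falls inside the band around it."""
    return tone_re_total_noise_db + band_noise_offset_db(total_bandwidth_hz, band_hz)

if __name__ == "__main__":
    # Tone 10 dB below the summed noise floor, 20 kHz total bandwidth,
    # assumed 200 Hz band of noise around the tone.
    offset = band_noise_offset_db(20_000, 200)
    local = tone_vs_local_noise_db(-10.0, 20_000, 200)
    print(f"in-band noise is {offset:.1f} dB below the summed noise")
    print(f"tone is {local:+.1f} dB relative to the in-band noise")
```

Under those assumptions a tone that is 10 dB below the summed noise is still 10 dB above the noise in its own neighbourhood.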


It gets challenging when the sound of interest and the spectrally-near portion of the ambient noise get similar in strength, so that masking does occur. Then you are in the domain of trained sonar operators, skilled in unearthing meaning from noise, based on their prior experience/expectations of the sonic fingerprints of their targets of interest.
 

