The truth about bit depth in digital

gez

pfm Member
An explanation (and proof) as to how digital bit depth has no impact on the quality of the audio, except to impact the level of the noise floor.

As someone who used to be a fervent believer that more bits meant more "detail" and "resolution" in digital audio, this demonstration was a big eye opener, especially where he inverts one of the sources and plays both at the same time, and is left with just noise. NB: if there was any difference in the musical signal between the two sources you would hear some kind of music-related sound (heavily distorted or frequency limited in some way). The fact that the result is pure noise proves that the musical signal in both sources is identical.
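For anyone who wants to try the null test without the video, here is a rough sketch in plain Python. It is not the video's method: the 440 Hz test tone, the 8-bit target, the TPDF dither and all the function names are my own assumptions. It quantises the tone with dither, subtracts the original (the same as inverting and summing), and checks that what is left is uncorrelated with the tone and sits at about half an LSB RMS.

```python
import math
import random

def tpdf(rng):
    """Triangular dither spanning +/-1 LSB: the sum of two uniform values."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

def quantize(x, bits, rng):
    """Round x (nominal range -1..+1) to a `bits`-bit grid after adding dither."""
    step = 2.0 / 2 ** bits
    return step * math.floor(x / step + tpdf(rng) + 0.5)

def correlation(a, b):
    """Pearson correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    den = math.sqrt(sum((p - ma) ** 2 for p in a) * sum((q - mb) ** 2 for q in b))
    return num / den

rng = random.Random(1)  # fixed seed so the run is repeatable
music = [0.5 * math.sin(2 * math.pi * 440 * i / 44100) for i in range(20000)]
low_res = [quantize(s, 8, rng) for s in music]

# Null test: invert the original and sum it with the 8-bit copy.
residual = [q - s for q, s in zip(low_res, music)]

step = 2.0 / 2 ** 8
residual_rms = math.sqrt(sum(r * r for r in residual) / len(residual))
residual_vs_music = correlation(residual, music)

print("correlation of residual with the music:", round(residual_vs_music, 4))
print("residual RMS in LSBs:", round(residual_rms / step, 3))
```

A correlation near zero is the numerical version of "you only hear noise": nothing of the tone survives in the difference.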

 
AIUI ... bit depth affects the noise floor - whereas kHz affects resolution & the HF cutoff.
 
Yes. If this were not true then DSD would not work. Also, any modern DAC will in fact be converting using only a few bits (albeit at a very high sample rate). The really clever thing is that not only is the error caused by the limited bit depth only noise (provided the input is dithered), but you can actually decide what sort of noise it is so as to make it less audible or even inaudible.
@Werner did a beautiful demonstration of this years ago. I wonder whether I can find the link.
 
Ah, here it is.
If you listen to the 3 links (undithered 4 bit, dithered 4 bit, and noise-shaped dithered 4 bit) then you will see the point. These are made with only ordinary 44.1 kHz sampling (but each sample is only at 4 bit resolution), whereas with a (much) higher sample rate even 4 bit sampling can be made to sound as good as 16 bit 44.1 kHz CD resolution.
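The same three cases can be imitated numerically. This is only a sketch under my own assumptions, not how the linked files were made: a 441 Hz tone (so it repeats exactly every 100 samples at 44.1 kHz), TPDF dither, and the simplest first-order error-feedback shaper. It checks two things: whether the error repeats with the tone (distortion) or not (noise), and how much error is left at low frequencies after a crude moving-average filter.

```python
import math
import random

BITS = 4
STEP = 2.0 / 2 ** BITS   # quantiser step for a -1..+1 range
PERIOD = 100             # 441 Hz at 44.1 kHz repeats every 100 samples

def tpdf(rng):
    """Triangular dither spanning +/-1 LSB."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

def requantize_error(signal, dither=False, shape=False, seed=0):
    """Return (output - input) for a 4-bit requantiser."""
    rng = random.Random(seed)
    prev_err = 0.0
    errors = []
    for x in signal:
        v = x - prev_err if shape else x      # first-order error feedback
        d = tpdf(rng) if dither else 0.0
        y = STEP * math.floor(v / STEP + d + 0.5)
        prev_err = y - v                      # error made on this sample
        errors.append(y - x)
    return errors

def lag_correlation(e, lag):
    """Correlation between the error and itself `lag` samples later."""
    a, b = e[:-lag], e[lag:]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    den = math.sqrt(sum((p - ma) ** 2 for p in a) * sum((q - mb) ** 2 for q in b))
    return num / den

def lowband_power(e, width=16):
    """Mean power after a crude moving-average low-pass filter."""
    smoothed = [sum(e[i:i + width]) / width for i in range(len(e) - width)]
    return sum(s * s for s in smoothed) / len(smoothed)

tone = [0.5 * math.sin(2 * math.pi * i / PERIOD) for i in range(20000)]
plain = requantize_error(tone)
dithered = requantize_error(tone, dither=True)
shaped = requantize_error(tone, dither=True, shape=True)

print("undithered error repeats with the tone:", round(lag_correlation(plain, PERIOD), 3))
print("dithered error repeats with the tone:  ", round(lag_correlation(dithered, PERIOD), 3))
print("low-band noise, shaped / dithered:     ", round(lowband_power(shaped) / lowband_power(dithered), 3))
```

The undithered error locks to the tone, the dithered error does not, and the shaped version has less error left in the low band at the cost of more at high frequencies. Real shapers use higher-order, psychoacoustically weighted filters rather than this one-tap version.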
 
An explanation (and proof) as to how digital bit depth has no impact on the quality of the audio, except to impact the level of the noise floor.

I see a contradiction right there, 'proof' that you should never trust a piece starting with "The Truth About...".
 
The video is excellent.

If you use the correct techniques you can preserve in digital audio the detail and resolution of the original (band limited) analogue audio other than adding a random noise floor that depends on the bit depth you choose.

This means that digital audio done right works exactly the same as analogue audio done right. The only difference is the different noise floors added by analogue electronics and by the analogue medium.

It is interesting that the noise floor of pristine vinyl of the very best quality is rather higher than the noise floor of properly dithered 16-bit digital audio (reference including high-signal-level consideration too). This is true but to me does not matter - I have heard excellent quality vinyl reproduction that is perfectly good enough to fully enjoy the music it carries without worrying about noise. The paradox is that there's a lot of chatter and worry on audiophile venues about digital noise and much less about analogue noise.
 
I've read a few times that a higher noise floor may actually sound nice to some people (euphonic).

The high(ish) noise floor one finds in some digitised vintage analogue recordings of classical music is a bit obvious and detracts from the realism somewhat in my experience, particularly when compared to a good modern digital recording.
I don't really know what would be the best adjective to describe it, maybe 'hazy'?
 
Dan Lavry of Lavry Engineering wrote the following about noise floor many years ago:

Can you hear under the noise floor?

The engineering definition of the "noise floor" is a measure of the total combined noise energy
residing at all frequencies simultaneously. While such a definition can serve well for comparing
signals or equipment, it may be rather misleading as an indicator for hearing capabilities. The ear
"does not listen" to the energy present at all frequencies simultaneously. The ear-brain combination
is a very competent tool for comparing the energy at a given frequency to the energy levels in the
surrounding frequencies. It can hear much lower signals than indicated by the total noise figure. The
frequency plots (FFT plots) provide a very good estimate of how far down you can hear a steady tone.
If you can see a tone above the noise floor, you will hear it. In other words, the proper "yard stick"
is the noise floor density (energy at each frequency).

Undithered signals behave erratically. We have shown a case where an almost negligible signal is
"amplified" to become 1 LSB square wave. One may hear a -200 dB tone with a 16 bits undithered
system. One may choose to view the opposite case, when both the signal and noise are gated off,
as the hearing threshold of a 16 bit undithered system (about -96dB). Both are "special cases" and
an undithered system cannot provide a good standard for hearing sensitivities.

Dithered signals provide a "constant" noise floor, independent from signal and DC offset.
Measuring a triangular dither with a "dB meter" shows a reading of -93 dB for 16 bits. The energy
density (at each frequency) is at about -125 dB. Can you hear under the noise floor? You can hear
30dB below your "meter", all the way down to the noise density in the surrounding frequencies.
Reexamination of the frequency plots shows that you can hear a 16 bits dithered signal down to about
-125 dB under full scale.

Some manufacturers choose to view the special case of undithered signal gating as the 16 bit
hearing threshold. One should not confuse the "gating threshold" of an undithered system with the noise
density of a dithered one. The "special gating case with 1/2 LSB of DC" occurs at about -96 dB. The
noise density (per frequency) for dithered signals is almost 30 dB lower.
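(Side note, not part of Lavry's text.) The -93 dB meter reading can be checked with a few lines of arithmetic, assuming the usual model: quantisation noise power of step²/12, TPDF dither adding step²/6, and a full-scale sine as the 0 dB reference. The per-frequency figure depends on how many FFT bins the noise is spread over. The excerpt does not state the FFT length, so the 1024 and 2048 bin counts below are assumptions chosen only to show the scale.

```python
import math

def tpdf_noise_dbfs(bits):
    """Total noise of a TPDF-dithered quantiser relative to a full-scale sine, in dB."""
    step = 2.0 / 2 ** bits                        # full scale spans -1..+1
    noise_power = step ** 2 / 12 + step ** 2 / 6  # quantisation + dither = step^2 / 4
    sine_power = 0.5                              # sine of amplitude 1
    return 10 * math.log10(noise_power / sine_power)

def per_bin_dbfs(total_db, bins):
    """Level in each FFT bin if the noise is spread evenly over `bins` bins."""
    return total_db - 10 * math.log10(bins)

total = tpdf_noise_dbfs(16)
print("total noise, 16 bit TPDF:", round(total, 1), "dB")
for bins in (1024, 2048):   # assumed FFT sizes, not taken from the paper
    print(bins, "bins:", round(per_bin_dbfs(total, bins), 1), "dB per bin")
```

With those assumptions the total comes out near -93.3 dB, and the per-bin level lands roughly 30 to 33 dB lower, which is in the same region as the "about -125 dB" quoted above.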

The ability of the dither "to bring the gated signal back" is shared by all types of dither.
Rectangular, Nyquist and triangular all perform the task within about 3 dB of each other (a range of
about 1/2 a bit). Beware of claims for "a special ability" of a specific type of dither to provide "3 -4
more bits". The 30 dB or so of dynamic range "beyond" the gating threshold is not unique to one type
of dither. It is shared by all types of dither and is not to be confused for additional bits. The proper
criteria for dither quality is its ability to eliminate distortions and noise modulation.

Noise shaping improves the noise floor:

The ears can hear music energy only while above the energy in surrounding frequencies. The
noise floor limitation (due to limited available bits) can be reduced by a noise shaper. The noise
shaper reshapes the frequency content of the quantization errors: it moves noise from hearing
sensitive regions (such as 2-4 kHz) to less sensitive regions (such as 15-22 kHz). The process
trades off a better noise density floor where it counts for increased noise where it matters less. We
will not deal with how to choose from the selection of available noise shaping "curves",
all based on psychoacoustic research under various conditions.

The following discussion will use a noise shaper from a paper titled "Minimally Audible Noise
Shaping" by Stanley Lipshitz, John Vanderkooy and Robert A. Wannamaker (the leading experts in
the fields of dither and noise shaping). The plot below shows that the noise is "shaped" according to
some hearing sensitivities curve. The process includes dither and the shape of the noise is constant
and is not dependent on the signal or DC offsets.

The initial introduction of the modern concept of noise shaping encountered some resistance from
dither hardware manufacturers, confusing some users by comparing it to equalization and claiming
that it is unnecessary because most recording work yields no better than 90 dB outcome. Noise
shaping does not equalize the signal. The quantized signal is left untouched. Noise shaping
processes only the quantization errors (noise). As explained earlier, the "90 dB of dynamics"
argument is improper because it addresses the combined total energy across the audio frequency
band, while the real goal is to improve the noise density at each frequency.


"20 BIT EQUIPMENT FOR 16 BIT WORK?", Dan Lavry 1997
 
James D. "JJ" Johnston on ASR:

If we're talking about a listening room in a home, 130dB is entirely excessive. For instance, if the masking level in a room is 20dB SPL, and the peak level out of your stereo speakers is 115dB, you only need 115-20 = 95dB noise floor relative to peak.

Yeah, we have rooms (industrially built) that are a whole lot quieter than that. But we're also not using average equipment there, either, we're not.

The only thing, of course, that one must recall, is that generally both room noise and equipment noise floors are not flat, i.e. they are not white noise, or anything like that. Ergo, it is likely that you'll need more than that 95dB *if* your noise is, for instance, mostly lower frequencies, below 200Hz or so. (this is typical)

On the other hand, if you're old like me, you probably don't hear much at 0dB SPL any more, either.

In a near-perfect setup, where the noise floor in the room is the noise floor of the atmosphere, which works out to about +6dB SPL white noise, or about -15dB SPL in the band around the most sensitive part of the ear, then you may need more total SNR. *BUT* most recording venues will just be adding in the noise you're not.

I have calculated, several times, that about 18 bits will suffice for home listening under almost any circumstance at all. That works out to 18*6.02 dynamic range, even taking into account noise floor with a common spectrum.
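The arithmetic in that quote is easy to lay out. A minimal sketch, where the function names are mine and the 6.02 dB-per-bit figure is the one used in the quote:

```python
def required_range_db(peak_spl_db, room_masking_spl_db):
    """Dynamic range needed between the loudest peak and the room's masking level."""
    return peak_spl_db - room_masking_spl_db

def bits_to_db(bits, db_per_bit=6.02):
    """Approximate dynamic range offered by a given word length."""
    return bits * db_per_bit

home_room = required_range_db(115, 20)   # the living-room example from the quote
print("needed in the example room:", home_room, "dB")
for bits in (16, 18, 24):
    print(bits, "bits:", round(bits_to_db(bits), 2), "dB")
```

So 18 bits works out to roughly 108 dB, which leaves a margin of about 13 dB over the 95 dB needed in the example room. That fits the point about needing extra headroom when the noise is not spectrally flat.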
 
I once posted a link to a speech intelligibility test in background noise.

Digit Triplet Test
This test determines how well you can understand speech in background noise. It is simple and quick to complete. It can be performed with headphones or loudspeakers, but headphones are recommended.
During the test you will hear different 3-digit combinations in background noise that will vary in volume.
Input the 3-digit combinations into the keypad, if you are unsure of what you have heard then make a guess.


https://pinkfishmedia.net/forum/thr...d-the-text-with-background-noise-test.274142/
 
I'm posting to check that I understand this stuff. Correct me if I'm wrong.

Bit depth is word length, right? So when people talk about 'quantisation error' they are talking about the last digit (or so) of a long string of ones and zeros. Digital is binary - so it has to decide whether that last digit (or so) is a 1 or a 0. It can't be 'mostly 1'. So some signal is lost as quantisation error, and this loss is greater when the word length/bit depth is low. Fortunately, dither compensates for this loss by introducing noise to cover the quantisation error. In effect, you swap out the last ones and zeroes with random values, giving you noise instead of quantisation error. So with low bit depths, because the word length is shorter, you get greater quantisation error and you need to dither to cover this with greater noise. Have I got this right?

I've also read that at 24-bit, the existing noise in electronic circuits (thermal roar, etc) is sufficient to do the same job as dither - the quantisation error is low enough that it is buried beneath the noise floor of the electronics. Therefore, whether you need to dither at 24-bit is a digital audio topic of debate.
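One way to test that picture is with a tone smaller than one LSB. The sketch below is my own and works in LSB units; the quarter-LSB tone and the TPDF dither are assumptions. In it the dither is added before rounding, not swapped in for the last bits afterwards. Without dither the tone rounds to zero and is simply gone. With dither the output is noisy, but the tone is still in there at its full level.

```python
import math
import random

def quantize_lsb(x, rng=None):
    """Round x (in LSB units) to an integer, adding TPDF dither first if rng is given."""
    if rng is None:
        d = 0.0
    else:
        d = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    return math.floor(x + d + 0.5)

def recovered_gain(output, reference):
    """Least-squares estimate of how much of `reference` is present in `output`."""
    return sum(o * r for o, r in zip(output, reference)) / sum(r * r for r in reference)

rng = random.Random(7)  # fixed seed so the run is repeatable
# A tone whose peak is only a quarter of one LSB.
tone = [0.25 * math.sin(2 * math.pi * i / 50) for i in range(50000)]

undithered = [quantize_lsb(x) for x in tone]
dithered = [quantize_lsb(x, rng) for x in tone]

print("undithered gain:", recovered_gain(undithered, tone))
print("dithered gain:  ", round(recovered_gain(dithered, tone), 3))
```

A gain near 1 for the dithered case means the sub-LSB tone is preserved on average. What dither costs is a steady noise floor, not lost signal.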
 
It’s like say you have a ruler divided up into 16 small divisions.

You draw a random graph on paper and use the ruler to measure how high or low the graph is.

Well, your 16 graduations aren’t much, so each time you measure you have to pick the closest one. It’s close, but not quite correct. Each time there’s a little error, sometimes more, sometimes less if your graduation is closer. The errors look random: the measurement could be really close to the line or it could be nearly 1/2 a graduation out.

So if you take your measurements and draw out a fresh graph from the original it’s close but a little bit different. If you were to compare the two and carefully plot the precise difference you’d get a third graph - that’s the noise.

It’s easy to see that if, with some time on your hands, you made up a new set of graphs with a ruler using 32 or 64 graduations, each time those little errors would be smaller.

And the third noise graph would be lower.

It’s called ‘quantization’ as you’re turning the graph into a number ‘quantity’ and the error depends on how many numbers you make available to use for the measurement.

You were right in that it’s like deciding whether it’s a 1 or a 0, but really it’s whether it’s a 255 or a 256 when really it was 255.87.

It’s amazing to me that 16 bit is good, that gives 65,536 lines in your ruler. To me I wouldn’t think that’s enough.
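The ruler picture is easy to try out. In this sketch the random "graph heights" and the function name are just my own illustration: it measures the same heights with rulers of 16, 32, 64 and 65,536 graduations and prints the RMS of the "third graph" (the error).

```python
import math
import random

def rms_error(levels, samples):
    """RMS difference between each sample and its nearest ruler graduation."""
    step = 1.0 / levels
    errors = [step * round(x / step) - x for x in samples]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

rng = random.Random(0)  # fixed seed so the run is repeatable
heights = [rng.random() for _ in range(50000)]   # graph heights between 0 and 1

for levels in (16, 32, 64, 65536):
    print(levels, "graduations: RMS error", rms_error(levels, heights))

# The 255.87 example: the nearest whole graduation is 256, an error of about 0.13.
print(round(255.87), round(round(255.87) - 255.87, 2))
```

Each doubling of the graduations halves the error, which is where the roughly 6 dB per bit comes from. For evenly spread heights the RMS error is close to one step divided by the square root of 12.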

The noise shaping I don’t understand (or have forgotten), but the gist is that you look at the noise graph that comes out of the process and find a cunning way to move the noise into an area where it can’t be heard and is simpler to filter away (i.e. higher up).

That’s how DSD works, there’s a ton of noise but it gets pushed up into the many kHz and filtered off. (I think)
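Here is a toy version of the idea, with the caveat that it is a first-order 1-bit modulator of my own, not what real DSD encoders use (those are higher order). The 64x oversampling ratio, the 1 kHz test tone and the block average standing in for a proper low-pass filter are all assumptions. Every output sample is just +1 or -1, yet averaging the bitstream gets the tone back.

```python
import math

OSR = 64                 # oversampling ratio relative to 44.1 kHz (an assumption)
FS = 44100 * OSR

def sigma_delta(signal):
    """First-order 1-bit modulator: the output is +1 or -1 for every input sample."""
    integrator = 0.0
    bits = []
    for x in signal:
        y = 1.0 if integrator >= 0 else -1.0   # 1-bit quantiser
        integrator += x - y                     # accumulate the running error
        bits.append(y)
    return bits

def block_means(values, width):
    """Crude low-pass filter and decimator: average each block of `width` samples."""
    return [sum(values[i:i + width]) / width
            for i in range(0, len(values) - width + 1, width)]

tone = [0.5 * math.sin(2 * math.pi * 1000 * i / FS) for i in range(OSR * 200)]
bits = sigma_delta(tone)

wanted = block_means(tone, OSR)
got = block_means(bits, OSR)
worst = max(abs(a - b) for a, b in zip(wanted, got))

print("distinct output values:", sorted(set(bits)))
print("worst block error:", round(worst, 4))
```

Because the integrator keeps feeding the error back in, the error cancels out over any long run of samples, so it ends up concentrated at high frequencies where the filter removes it. A higher-order modulator and a better filter push the in-band error down much further than this sketch manages.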

If anyone’s got a good explanation I’d like to be reminded of it. Thanks!🙏
 
I suspect the ‘quality’ of noise you get in a digital system is different to that in an analogue system. And I suspect that the brain reacts to the ‘unnatural’ nature of digital noise, whereas it is more capable of ‘tuning out’ analogue noise from vinyl sources because it has its origins in physical processes which the brain encounters in nature.

I also think the Lavry piece, upthread, is interesting because I believe the sort of improvements I’m hearing in my system, due to supports and cabling, affect the noise floor. And there is musical information below the noise floor which it is possible to perceive, so lowering that floor is likely to bring it out better.

I also think that the biggest issue with noise is probably some form of intermodulation with the signal.
 
You’re also correct that 24 bits can’t be fully reproduced or, if you like, that the noise or error in the measurements is smaller than the noise the kit produces itself.

24 bits gives 16,777,216 graduations on the measurement ruler.
 
There is nothing magic about digital quantisation noise compared to thermal noise. Badly implemented digital where the noise gets modulated can possibly be audible.
 
There was a nice old demonstration up at my University in St Andrews physics department where you could turn a dial and change the number of bits. You could hear the noise and artifacts coming in; it sounded much as you’d imagine ‘digital’ problems would sound. I’ve no idea how it worked, maybe by just truncating off the lower significant bits, but it was a fun demonstration.

@Jim Audiomisc might remember it.
 
[Image: vector-illustration-groundhog-day.jpg]
 
When the digital signal is read back the error on the amplitude equates to an error in the time domain, the quantisation noise, and this limits the bit depth in practice so that your 16 bits ain't worth the paper they're written on. That's not to say that you won't still have enough bit depth to capture the dynamics of the vast majority of recordings.
 

