

DACs -- bit perfect + filters

For each sample, the sum of all the sincs ahead of it in time and behind it is calculated. Theoretically you should sum the sincs to plus and minus infinity either side of the sample, and that would produce a perfect brick-wall filter response. But the far-away ones have a very small effect on the value of the present sum, so people live with a finite number of filter taps.
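The sum described above can be sketched in a few lines of Python. This is an illustrative toy, not how a real DAC chip is coded: each output instant is the sum of every in-window sample scaled by a sinc centred on that sample, with a finite tap count standing in for the infinite theoretical sum.

```python
import math

def sinc(x):
    """Normalised sinc: sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, t, taps=128):
    """Estimate the signal at fractional time t (in sample periods)
    by summing the sincs of the nearest `taps` samples -- the finite
    window that stands in for the infinite theoretical sum."""
    centre = int(round(t))
    half = taps // 2
    return sum(samples[n] * sinc(t - n)
               for n in range(max(0, centre - half),
                              min(len(samples), centre + half)))

# Sample a tone at 0.05 cycles/sample, then read it back midway
# between two samples; the finite sum lands close to the true value.
samples = [math.sin(2 * math.pi * 0.05 * n) for n in range(400)]
mid = reconstruct(samples, 200.5)
expected = math.sin(2 * math.pi * 0.05 * 200.5)
```

At the sample instants themselves the sum reproduces the stored values exactly, because every other sinc in the sum is passing through zero there.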
 
I assume GT was referring to this

From the article:
It’s unlikely to be the final word. Joseph Nagyvary, a professor emeritus at Texas A&M University in College Station who studies the chemistry of violins, isn’t convinced. He says that players need to play a violin for weeks to evaluate it fully, and that the study did not take into account that Stradivariuses vary in tone. “Experts know well that the 600 or so extant Strads vary vastly in their tone quality due to their playing and preservation history,” says Nagyvary, who is also a maker of modern-day recreations of Stradivarius and Guarnerius violins.

However, Giora Schmidt, one of the musicians who took part in the study, says he doesn’t believe it was biased. “One can argue that the old instruments selected were intentionally ‘weaker’ than the new ones chosen, or that the new instruments were set up optimally versus the old, which were ‘tired’,” he says. “But I think these are elements that players face each time they walk into a violin shop.”

Fritz says some soloists were frustrated that they did not get to see the violins at the end of the study, and were surprised to learn that their favourite violins were new ones.


It's highly likely the musicians got used to the sound of their own, newer violins, hence they preferred them. If you get to see a top violinist like Anne-Sophie Mutter (she owns two Stradivariuses, by the way), you will hear why these instruments are so highly revered.
 
Comparison is the thief of joy, and that article, like so many people in hifi, is so obsessed with price (specs) that it forgets the value that transfers into magical music.
 
@GordonM Have you tried reading DSPGuide? I think that will probably explain it. Most DACs have way fewer than 44k taps and they don't actually use a sinc function as such. But each output value is indeed the sum of the output at that time of each of the taps with the filter applied to it. Don't forget that the "sinc" of each sample value will in your case be 44k values (although strictly it should be an infinite number, as @John Phillips points out). In any event, I don't think it has been done traditionally in the way you suggest (although it could be nowadays). Anyway, have a look and see whether you find the answer in DSPGuide. If not, @Jim Audiomisc may be able to point you the right way.
[Edit I can’t now find the post to which I was replying ]
 
Sorry, I deleted it because I thought it was a dumb question on re-reading, and it would be shot down in flames.

Found DSPGuide, but it's very technical and a brief look didn't seem to turn up what I was after; I'll have a more in-depth read later.
 
Perhaps worth adding that ADCs need to include 'added noise' when sampling. This then 'dithers away' the quantisation into being 'noise' at about the 1-bit level. Using a sinc reconstruction AND an ADC that applies a sinc filter when sampling gives the theoretically 'perfect' result of signal+noise transfer. But as said above, real ADCs and DACs come close, but can't be perfect. Nor can anything else. Including listening live. :)

In reality the choice of mic can do far more to 'alter' the sound than a reasonable ADC + DAC. As can your speakers and room. As can the dimwit waggling the sliders in the recording studio. :) If you want better recordings - get a better studio engineer, etc. :)
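The 'added noise' idea is easy to demonstrate with a toy Python sketch (illustrative only, not any real converter's design): a plain quantiser snaps to grid values, but adding triangular (TPDF) dither of about one step before rounding turns the quantisation error into benign noise whose long-run average tracks the input, even between grid points.

```python
import random

def quantize(x, step=1.0):
    """Round to the nearest quantisation step (a 1-bit-level grid)."""
    return step * round(x / step)

def quantize_dithered(x, step=1.0, rng=random):
    """Add triangular (TPDF) dither before rounding -- the usual trick
    that 'dithers away' quantisation distortion into noise."""
    dither = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    return quantize(x + dither * step, step)

# Without dither, 0.3 always quantises to 0.0. With dither, the
# average over many conversions homes in on the true value.
rng = random.Random(0)
x = 0.3
n = 20000
mean = sum(quantize_dithered(x, rng=rng) for _ in range(n)) / n
```

Each individual dithered output is still a grid value; only the statistics carry the fine detail, which is why the quantisation ends up behaving like noise at about the 1-bit level.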
 
It wasn’t a dumb question at all. I’m really glad when I come across someone who is actually interested in understanding how stuff works. I’m not a professional but I am sufficiently interested to have taught myself the basics using some text books. Unfortunately, being science/maths, it does require some effort, and one does have to understand one thing before one understands the next.
If you are interested, perhaps you could try reading @Jim Audiomisc ‘s text book “Information and Measurement”, which is available for free. (Sorry, don’t have the link to hand.) I think the DSPGuide is good too, but it might help to start at the beginning.
The sampling theorem is amazingly elegant, but it is also completely unintuitive IMHO.

I would add one further word of caution- don’t expect it to tell you much about what you “hear”. That’s something else altogether.
 

It's kind of you to say that, thanks. Although I still think my deleted post was way off the mark.
So I had a read of "Information and Measurement" and the relevant part is Chapter 7 Sampling Theorem.

And the sinc(x) function is:
[image: plot of the sinc(x) function]

The key sentence is: "Given a set of samples, xi , taken at the instants, ti , we can now use expression 7.15 to calculate what the signal level would have been at anytime, t, during the sampled signal interval."

This is probably obvious to anybody who's delved into it but not us newbies!

There are 65 samples in the example, but what I still don't understand is where these 65 samples are inserted into the formula to regenerate the waveform.
Is it the left-hand part of the formula, x(t) = ?

It's nearly 60 years since I did calculus at school so it's all a blur now.
(As an aside, the only subject I failed was German and it's the only subject I still remember quite well!).

In real life there will obviously be more than 65 samples; is this the same as "taps"?

Again, in real life, if we have a stream of music about 1 minute long, then that's about 2.6M samples.
Does the DAC take each sample and generate the waveform between it and the previous sample using x samples before and after, then move on to the next sample and repeat?
All I want is a general idea of what's happening and I'm sure it would help others who are in the same boat as me.
 
Is it the left-hand part of the formula, x(t) = ?

Yes. It means that at *any* instant within the time range covered by the samples you use all the sampled values as per the sinc function sum. The sampling is a 'complete record' from a pure maths POV. Hence the significance of the sinc function in any consideration.

People do also tend to use terms like 'taps' for the 'samples in scope'. This is because most practical DACs use a finite number of samples 'near' to the 'current instant', i.e. they don't do the sinc calculation using *all* the samples. This is why there is a tendency to tweak the filter 'shape' away from being a perfect sinc, and the result is a non-flat frequency response. Usually a slight 'droop' at HF.

This is also all why some designers like Rob Watts go for *lots* of 'taps' - to get as close as possible to the sinc ideal of using them *all*. Using them all gets a tad impractical for something like a Bruckner symphony or The Ring Cycle. 8-]
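The taps trade-off can be illustrated with a toy Python comparison (a sketch, not any particular DAC's filter): truncating the sinc sum to fewer nearby samples leaves a larger error at points between the samples, which is exactly why designs chasing the ideal pile on taps.

```python
import math

def sinc(x):
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def interp(samples, t, taps):
    """Truncated-sinc interpolation using only `taps` nearby samples."""
    c, half = int(round(t)), taps // 2
    return sum(samples[n] * sinc(t - n)
               for n in range(c - half, c + half))

# A 0.05 cycles/sample tone; interpolate halfway between two samples
# with a small and a large tap count and compare to the true value.
samples = [math.sin(2 * math.pi * 0.05 * n) for n in range(1000)]
truth = math.sin(2 * math.pi * 0.05 * 500.5)
err_16 = abs(interp(samples, 500.5, 16) - truth)
err_256 = abs(interp(samples, 500.5, 256) - truth)
# err_256 is typically far smaller than err_16.
```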
 
The R2R thread got me thinking: if we feed a DAC a bit-perfect FLAC/WAV file but that DAC uses filters, is the sound still bit perfect, or a true representation of what was in the original source file?

Also, I find it strange that ASR get all excited about SNR when anything better than 90 dB is moot. A much better measurement, or something I'd be more interested in, is the DAC's slew rate, delay and settling time.

Personally I'm not an objectivist so DAC choices are made by listening to them for a few weeks but the theory still interests me.

PS - No I won't post this question on ASR as I'm not a member and can't stand the place :)

If I break this down it might be easier to understand. The signal being recreated must pass through the voltages indicated by the sample points, but the question is: what should happen *between* the sample points? There's an infinite number of possible signals that pass through these values, doing everything conceivable between those points.

Sampling theory however has another restriction - the signal being sampled must contain no frequencies greater than half the sample rate (so for 44.1 kHz CD quality, that means no frequencies greater than 22.05 kHz). With this one extra constraint, we end up with only one possible continuous signal that passes through all of the samples and contains no higher frequencies. This is the job of the DAC: to create this continuous signal. The maths as to why there is only one continuous signal that meets this requirement is difficult, but for those interested, go dig into the Nyquist-Shannon sampling theorem. Interesting stuff if your maths is up to it.

So, the job of the filtering in the DAC is to attempt to make the continuous signal closer to this correct continuous function - notice there is no choice about it; that is the one and only valid representation of the original signal.
 
To add just a bit more (AIUI, at least): in equation 7.15 from @GordonM's image, the continuous analogue audio, x(t), is reconstructed by taking each digital sample, xi, multiplying it by an analogue sinc function (see my avatar) whose big peak is aligned with the sample time. Then you add together all of these sinc functions, one per sample, to get your analogue audio.

This first of all means that each big sinc peak becomes the analogue value of the audio signal you want to reconstruct, at the time of the sample.

Then the amazing magic of the sinc function is that at all other sample times it has a value of zero.

Adding together these scaled, shifted sinc functions, each with its big peak aligned to its specific sample time, the audio signal you construct sees zero contribution from a given sample at any time other than its own sample time; at its own sample time, the contribution to the reconstructed audio is exactly the sample value.

Thus you can guarantee that the reconstructed audio waveform is equal in value to the sample value at all sample times. No more, no less. And as is written above, this reconstruction is unique under the condition that the audio signal only has frequency content below half the sample frequency.

I never cease to be amazed how elegant and useful this piece of mathematics is.
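That property of the sinc function is trivial to check numerically; a tiny Python illustration:

```python
import math

def sinc(x):
    """Normalised sinc: sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

# Unity at its own sample instant...
peak = sinc(0)
# ...and (to within floating-point rounding) zero at every other
# sample instant, so the summed sincs never 'interfere' there.
others = [abs(sinc(n)) for n in range(1, 10)]
```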
 
For those interested in code, I wrote a very simple sinc interpolator a few years back, which can be found here:


The fancy fast implementations are overkill for some uses and aren't very good as a teaching aid. I wrote the above one with the intention that it be fast enough for the uses it was aimed at (resampling samples in a synthesiser) and simple enough to be understood by a DSP engineer interested in learning about this stuff. Of course, you need to understand C++.

I should mention it uses a raised-cosine window function to restrict the number of zero crossings; it defaults to 50, which gives, I think, over 90 dB of dynamic range over the artefacts.
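Not the actual C++ interpolator described above, but the idea can be sketched in a few lines of Python (names and structure are my own): taper a sinc with a raised-cosine (Hann) window so it reaches zero after a fixed number of zero crossings, 50 here to match the default mentioned.

```python
import math

def windowed_sinc(x, zero_crossings=50):
    """sinc(x) tapered by a raised-cosine (Hann) window that reaches
    zero at +/-zero_crossings, limiting the kernel's extent."""
    if abs(x) >= zero_crossings:
        return 0.0
    s = 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)
    window = 0.5 * (1.0 + math.cos(math.pi * x / zero_crossings))
    return s * window

def interpolate(samples, t, zero_crossings=50):
    """Windowed-sinc interpolation at fractional position t."""
    lo = max(0, int(math.ceil(t - zero_crossings)))
    hi = min(len(samples) - 1, int(math.floor(t + zero_crossings)))
    return sum(samples[n] * windowed_sinc(t - n, zero_crossings)
               for n in range(lo, hi + 1))

# Read a sampled tone back at a fractional position.
samples = [math.sin(2 * math.pi * 0.05 * n) for n in range(200)]
value = interpolate(samples, 100.25)
```

The window is what makes the kernel finite; the price is a slight deviation from the ideal brick-wall response, traded for a bounded amount of computation per output sample.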
 
I never cease to be amazed how elegant and useful this piece of mathematics is.

The 'hidden killer' is the way such a series can *accurately* specify an inter-sample peak *way* above the maximum possible sample level. Not all DACs have coped with this. And it's a hidden 'iceberg' in some cases when you 'upsample' or 'resample'. Even 44.1k <> 48k can fall over this if the input hides such a 'feature'. 8-}
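A concrete Python illustration of such a hidden peak (a textbook construction, not from any particular recording): a full-scale sine at fs/4 sampled at a 45-degree phase gives samples that never exceed ±1.0, yet the waveform the sinc sum reconstructs between them peaks at √2, about +3 dB over 'full scale'.

```python
import math

def sinc(x):
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

# Every stored sample is exactly +/-1.0 -- digital full scale.
samples = [1.0, 1.0, -1.0, -1.0] * 1000

# Reconstruct midway between two consecutive +1.0 samples using a
# wide sinc window; the underlying sine peaks there at sqrt(2).
t = 2000.5
peak = sum(samples[n] * sinc(t - n) for n in range(1600, 2401))
```

A DAC (or sample-rate converter) without headroom above digital full scale has nowhere to put that reconstructed value, which is how inter-sample overs cause clipping even from 'legal' data.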
 
It's kind of you to say that, thanks. Although I still think my deleted post was way off the mark.
So I had a read of "Information and Measurement" and the relevant part is Chapter 7 Sampling Theorem.

....
All I want is a general idea of what's happening and I'm sure it would help others who are in the same boat as me.
Ok. Others who understand this better than I do have already answered better than I can, but I will have a go myself, because I think I have an idea where this is confusing, and also an idea of how to loop this back to your original question. Those better qualified than I am will no doubt cringe at the imprecision and inaccuracy of what follows, but it is my best shot.
Is it the left-hand part of the formula, x(t) = ?

Ok - to answer your question: not exactly. The left-hand side of the equation simply means "any continuous-time function", e.g. here an "analogue" continuous voltage changing over time (like on an oscilloscope). The right-hand side means: ...will be equal to the sum (that's the big Σ - a capital sigma) of each of the time-spaced values of that function (i.e. the sample values, the voltages measured at each sampling instant) with the sinc function applied to each sample value.

So the answer to your question is that each sample is inserted into the sinc function on the right-hand side, one at a time, in order to recreate the left-hand side (i.e. x(t)) from a discrete set of sample values of x(t) (those samples are numbered 0 to K in the equation). And don't forget that having fed each of the K samples into the function one at a time, the results are then added up (that's what the sigma means) to get back to x(t).

So, back to the equation: each of those samples (i.e. values of x(t) at the sampling instants) will now generate a scaled sinc function, and each of those sinc functions is a continuous function in time, each one scaled and time-shifted (the central lobe is at the time of the sample in question). As @John Phillips explained, the beauty is that each such sinc function has its maximum at its own sample time, but is zero at the other sample instants. So when you add them up they don't "interfere" at the sampling instants. But of course what we are really interested in is the way this enables us to calculate the values in between the sample times (i.e. between the dots), where each of the sinc functions from each of the sample values will contribute.

However, this is a little bit confusing because, whilst it is dealing with sampling (digital music is just a set of sample values representing a voltage/time relationship), it assumes that the sinc function is continuous, i.e. "analogue", and, as @Jim Audiomisc points out, that both the sinc function and the set of samples start at the beginning of the universe and end at the end of the universe.

In practice we only have a limited window of samples to take into account (like the mere 65 samples shown in the picture), and we are not going to calculate the whole sinc function for each sample.

Last thing: the sinc function is in fact just another way of expressing a perfect "brick wall filter", which lets through all frequencies up to one point and completely cuts out all frequencies above that point.*

No such analogue filter exists, so instead we do the filtering mathematically, i.e. by approximately calculating the sinc function. And we don't do it by calculating the whole sinc function for each sample - we use a digital filter which calculates the values at other sampling instants (i.e. we fill in more dots in figure 7.3). When we talk about filter taps we are actually referring to a digital filter - one which does not calculate the whole sinc function but only its value at further sample times (i.e. more dots) in between the times of the original samples. And it does so by calculating those values not from all the (infinite number of) sample values but from a number of values equal to the number of filter taps.

So, to recap: at this stage we are not drawing the line in figure 7.3b, we are just filling in more dots. Equally, we don't have to calculate the whole sinc function for each sample value, because we don't need all of its influence on every point in time - only at the new sample instants (dots) we are calculating.

Let’s assume for now that the filter is a sinc function but with 64 taps (it’s a time-windowed sinc function). Then what the filter does is calculate the sample values at the points in time between each existing sample: for each new sample, you add up the value at that time of the sinc function of each of the 32 samples before that time and the 32 samples after it.
So each new sample value generated by the filter (a new sample value between the existing samples) is the sum of 64 numbers (some positive, some negative) generated from the preceding and succeeding 32 samples.
Two things to note: 1) we now have more sample values than before (like having 129 dots or so in fig 7.3b) - we have to, otherwise the digital interpolation filter can’t work. We have upsampled/oversampled.
2) we are still in the sampled-time domain, not the continuous-time domain, and at some point we need to apply an analogue filter to turn this into continuous time, i.e. interpolate all the continuous values between the samples. The more we upsample/oversample first, the easier this is, because the dots get closer.

*TBH this explains it all much better than I have: https://lavryengineering.com/pdfs/lavry-sampling-theory.pdf
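The "filling in more dots" step above can be sketched in Python (a simplified toy using a plain truncated sinc, not a properly windowed production filter): each new in-between dot is the sum of the 32 samples either side of it, weighted by the sinc, exactly as described.

```python
import math

def sinc(x):
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def oversample_2x(samples, taps=64):
    """Double the sample rate: keep every original dot and insert a
    new dot halfway between each pair, computed from the taps//2
    samples before and after it (a 64-tap interpolation filter)."""
    half = taps // 2
    out = []
    for i in range(len(samples) - 1):
        out.append(samples[i])          # existing dot, passed through
        t = i + 0.5                     # the new dot's position
        out.append(sum(samples[n] * sinc(t - n)
                       for n in range(max(0, i - half + 1),
                                      min(len(samples), i + half + 1))))
    out.append(samples[-1])
    return out

# A sampled tone at 0.05 cycles/sample, before and after doubling.
samples = [math.sin(2 * math.pi * 0.05 * n) for n in range(200)]
doubled = oversample_2x(samples)
```

Note that the original dots pass through unchanged; only new dots are added, which is why the output is still "sampled", just at twice the rate, and still needs a (now much gentler) analogue filter after it.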
 
My understanding is that but perfect can only be achieved by running the signal through a NOS non-SDM DAC.
 
'but perfect' requires hours in the gym.

Your comment is moot, IMO, as what comes out of the DAC is not bits anymore. You can present bit-perfect data to an oversampling DAC and it is bit perfect up to the point it is converted to analog.
 
🤣
In my view it’s not my comment that is moot, it’s the concept of bit perfect that is moot. Another one of those useless audiophile buzzwords…
 

