Noise Shaping

In case anyone cares, after some more listening, for now I've settled on a "25% of the way to minimum phase" filter (Archimago's is 10%).
 
Now using intermediate phase (i.e. "-p 25" in SoX)! That means half way between minimum and linear phase in terms of phase distortion.

Before, I was treating linear phase as good/home base. After more listening, I've now decided minimum phase is good/home base for me. Extrapolating from Archimago's blog, that gives "99% of minimum phase goodness" for half the phase distortion.

Will report back long-term.
 
Plotting spectra in Audacity is a bit misleading sometimes. Even when no energy at 20kHz shows up in the spectrum of the whole track, plotting just a perceived event, e.g. a cymbal hit, can suddenly show energy at 20kHz for that.

At the bass end, speakers phase-distort in the opposite direction, so there'll be some offsetting; I ignore that end.

My understanding is that a linear phase filter is "correct", i.e. you could keep looping a signal through a linear phase ADC/DAC and it would stay the same. But it seems I prefer a minimum phase filter at the end, before I listen! I've got some ideas why, but I'm not the most qualified to say, and neither do I have to know why in order to prefer it.

It may be for some recordings the "perfect" filter would be linear phase and for others minimum phase. I'm finding the intermediate phase "good" overall.

Similarly I use a filter flat to 19kHz and then a smooth roll-off to 22kHz, without any imaging. Again, what is "perfect" may depend on the recording, but I find this "good" overall.
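For anyone who wants to experiment along the same lines, this is roughly how I'd express such a filter in SoX (a sketch only - the -b value is my guess at "flat to 19kHz", since 19k is about 86% of the 22.05k Nyquist):

```
# Resample at 'very high' quality with intermediate phase (-p 25) and a
# passband flat to roughly 19 kHz (-b 86, i.e. ~86% of Nyquist at 44.1k).
# SoX's rate effect rejects images by default, so nothing extra is needed.
sox input.flac output.flac rate -v -p 25 -b 86 44100
```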
 
TBH it should depend both on how the recording was made and on the overall response of your replay system, room, etc. The responses throughout the chain combine, so you are only experimenting with one factor when you change the reconstruction filter.

In theory a 'sinc' (phase aligned) filter would be 'best' because it avoids altering either the relative phases or relative amplitudes so is 'blameless'. But if something else in the chain (inc the recording process) isn't flat and linear or your ears prefer something else...
 
In theory a 'sinc' (phase aligned) filter would be 'best' because it avoids altering either the relative phases or relative amplitudes so is 'blameless'. But if something else in the chain (inc the recording process) isn't flat and linear or your ears prefer something else...

I do wonder if there's often a subjective preference for minimum phase due to non-linearities elsewhere in systems, especially in speaker crossovers and the speakers' interaction with the room, manifesting itself as the so-called digital 'glare'.

This perceived 'glare' might also explain why some prefer the distortions caused by MQA, and why some like NOS DACs, stick valves in the signal path, or prefer vinyl replay.

Notwithstanding poor recordings, my own subjective preference has come out emphatically in favour of linear phase, the more phase linearity I've achieved in my speakers. Minimum phase breaks the timbral accuracy of instruments IMO, which is especially noticeable with drums.

That said, I might occasionally use a minimum phase filter when listening to the likes of 'Kind of Blue' at low levels late in the evening whilst sipping a sticky. It seems to add some atmosphere and take the sibilant edge off Miles's trumpet!
 
The distinction wrt MQA is choice! :)

As things stand, each user can choose the DAC they prefer, and in many cases then choose which of that DAC's filters they prefer - and change this from one recording or time to another as they please. That strikes me as ideal as each user can do as they prefer.
 
I agree, MQA seeks to remove choice for commercial gain whilst reducing the fidelity we can already achieve with regular CD and hi-res. However, as with the minimum phase filters that MQA also employs in compatible DACs, some people genuinely prefer it.

I'm simply throwing it out there that this may be because its distortion effects mask the non-linearities that digital exposes in our systems.
 
Just going back to the original topic - noise shaping - is there a possible overlooked issue with this - a fluctuating noise floor?

I remember ESS chief designer Martin Mallinson giving a presentation in which he spoke about DS modulators producing non-periodic steady-state noise artifacts - see here

Apparently, this DS modulator behaviour can be seen when noise vs. DC offset is plotted.

Is this modulator separate to the noise-shaping function or part of it?
 
The original post has a basic flaw that I'm not sure has been pointed out. Noise shaping can't shift the noise around on an existing digital recording, as there is no 'signal' and 'noise' which can be separated out and processed separately. It's similar to aliasing noise, which, once baked into a signal, is there for life. If I could shift the noise about, I could do better than that and just remove it.

Noise shaping works when you have a signal with more resolution than a target representation, so for example, if I have a 24 bit recording and I want to make a 16 bit wav out of it. When writing output samples, the error term (the difference between the 16 bit output value and the 24 bit input) is 'fed back in' to the next sample to be output. With the simplest method you just add the error term to the next sample, but much more complicated methods exist (2nd order, 3rd order etc., with the error affecting multiple samples in more complex ways).

However, for 16 bit input to 16 bit output or whatever, there is no error to propagate, as the input can fit in the output without error.
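To make that concrete, here's a toy sketch of the simplest method in Python/NumPy (my own illustration, nothing more): each output sample's quantisation error is simply added onto the next input sample.

```python
import numpy as np

def quantize_first_order(x, bits=16):
    """First-order noise-shaped quantisation (error feedback).

    x    : input samples as floats scaled to +/-1 (e.g. from a 24 bit source)
    bits : target word length
    """
    x = np.asarray(x, dtype=float)
    q = 2.0 ** -(bits - 1)          # one LSB at the target bit depth
    y = np.empty_like(x)
    err = 0.0
    for n in range(len(x)):
        v = x[n] + err              # add the previous sample's error back in
        y[n] = np.round(v / q) * q  # quantise to the 16 bit grid
        err = v - y[n]              # error to carry into the next sample
    return y
```

A real converter would also add TPDF dither before the rounding step so the error is decorrelated from the signal; I've left that out to keep the feedback mechanism visible.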
 
This article is a readable explanation of how DS works and what can go wrong:
http://www2.ing.unipi.it/~a008309/m...ofondimenti/Understanding_sigma_delta_CUT.pdf

and this paper talks about idle tones
https://www.eecs.qmul.ac.uk/~josh/documents/2007/PerezReiss-AES122.pdf

Just to add: in principle any 'resampling' process may give rise to idle tones or patterning if it uses a process of a higher order and isn't fully and 'sufficiently randomly' dithered. So this is a general problem, and one which can affect DSD in particular.
 
The original post has a basic flaw that I'm not sure has been pointed out. Noise shaping can't shift the noise around on an existing digital recording, as there is no 'signal' and 'noise' which can be separated out and processed separately.

Noise shaping works when you have a signal with more resolution than a target representation, so for example, if I have a 24 bit recording, and I want to make a 16 bit wav out of it.

However, for 16 bit input to 16 bit output or whatever, there is no error to propagate, as the input can fit in the output without error.

Agree that it has to treat recorded noise as a part of the signal. But it isn't a flaw in the intended context.

Reducing 24 bit 96k material to 16 bit was the context of the original posting. The aim is that the result has much the same audible noise floor level as the 24 bit original. With the simple 1st order example you get to around the 20 bit level. Higher order shaping should do a bit better in audible terms by shoving the '16 bit quantisation level' noise more efficiently up into the 'ultrasonic'.

The key point is that most of the 96k/24 recordings I've seen have a noise level much nearer the 20 bit level (or higher) than down at 24 bit. Thus the noise shaping 'does no harm': the shaped result at 16 bits has an audible noise floor that isn't higher, *because* the source noise level is audibly as high or higher.

Put it this way round: the noise shaping reduces the size of the file when it is FLACced because it discards over-specification of noise. The *total* noise level for 16 bit can't be below that set by dithered 16 bit, but given the high sample rate you can shove most of that process noise up into the 'ultrasonic' region, and thus *avoid* causing the audible noise floor to *rise* to the plain 16 bit level.


So yes, it will convey the noise floor in the source material, but not make it significantly higher. In these practical cases the 'flaw' is actually an advantage, because it tends to mean the audible noise isn't altered - the 'source' noise sticks up above the process noise. :) But it isn't a magic wand.
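If anyone wants to see the 'shove it into the ultrasonic' point numerically, here's a quick self-contained toy check (the same simple 1st order loop sketched earlier in the thread, run at 96k; figures illustrative only):

```python
import numpy as np

fs, bits = 96000, 16
q = 2.0 ** -(bits - 1)
t = np.arange(fs) / fs
x = 0.5 * np.sin(2 * np.pi * 1000 * t)     # one second of 1 kHz at -6 dBFS

# First-order error-feedback quantisation to 16 bits
y, err = np.empty_like(x), 0.0
for n in range(len(x)):
    v = x[n] + err
    y[n] = np.round(v / q) * q
    err = v - y[n]

e = y - x                                   # the shaped quantisation noise
E = np.abs(np.fft.rfft(e)) ** 2
f = np.fft.rfftfreq(len(e), 1 / fs)
frac = E[f <= 20000].sum() / E.sum()
print(f"noise power below 20 kHz: {10 * np.log10(frac):.1f} dB of the total")
```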
 
Jim, I now see what you were intending, and it now makes sense to me. So the intention is to apply noise shaping as it is done in, say, a DAW when you have 24 bit recordings and want to export to 16 bit WAV, with noise shaping applied to attempt to retain more of the 24 bits of the recording. This is well understood and certainly works, and gives, as you say, a number of bits more apparent resolution.

As you say, 24 bit recordings are not really 24 bits of signal, since physical limits mean that you'll just be recording thermal noise in the bottom bits, and something closer to 20 bits is likely to be useful signal. That is my understanding, although I've never actually tried it in practice (say, recording a sine at -100 dB and trying to pull it back out of the noise floor to see what's left). I don't have anything like cutting-edge ADCs to hand to see what happens, but it would be instructive to try, I guess.

So basically your idea is to requantise 24/96 as 16/96 with suitable noise-shaped dithering, then losslessly compress that, and hopefully get a give-or-take 20/96 recording into much less space. Seems a good pragmatic solution.
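For anyone without a DAW to hand, SoX should manage the conversion in one line. A sketch, with the caveat that I haven't checked which of its noise-shaping filters are actually defined at 96k (plain un-shaped 'dither' works regardless):

```
# 24/96 in, 16/96 out, with noise-shaped TPDF dither ('dither -s' picks a
# shaping filter suited to the sample rate); writing FLAC gives the
# lossless compression step for free.
sox input-24-96.flac -b 16 output-16-96.flac dither -s
```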

I'm reminded of NICAM, which was 14/32 but was additionally companded to 10 bits for transmission, and no one really used to complain about it.

My experience with lower bit counts comes from DSP-based synths; various models from the 80s sounded really clean but ran at low bit depths by today's standards. The Akai S900 was a 12 bit sampler, for example, whilst the DX7 ran 8 bits internally and applied a log scale to bring this back up to a whole 12 bits on the rather noisy outputs. Both sounded great in their day and appear all over the place on recordings from the era. I think worrying about 24 bit playback for recordings from that era is a bit of a waste.
 
I'm reminded of NICAM, which was 14/32 but was additionally companded to 10 bits for transmission, and no one really used to complain about it.

I've found it amusing when someone says they prefer FM radio "because it is analogue". Shows how well BBC NICAM has worked over the years, despite being 'sub Audio CD' in technical terms. BTW this outlines the early history of NICAM for BBC radio...

http://www.audiomisc.co.uk/BBC/PCMandNICAM/History.html

By luck, I actually managed to record the 'encore' from the China concert. It was one of the cheerful items of the era, with a title like "Bearing the Good News from the Hills"... i.e. Maoism. 8-] Jolly nice as a party piece, in more ways than one. :)
 
As you say, 24 bit recordings are not really 24 bits of signal, since physical limits mean that you'll just be recording thermal noise in the bottom bits, and something closer to 20 bits is likely to be useful signal. That is my understanding, although I've never actually tried it in practice (say, recording a sine at -100 dB and trying to pull it back out of the noise floor to see what's left). I don't have anything like cutting-edge ADCs to hand to see what happens, but it would be instructive to try, I guess.
The effect of noise shaping can be demonstrated, and I did this a few years ago as an exercise (in a computer program in C), at least to the extent of generating a low-level signal in double precision floating point (rather than recording it) and then quantizing the signal to 16 bits with high-order noise-shaped dither.

I found I could audibly reproduce a sine wave generated at -120 dB WRT full scale (nominally 20 bits) from the 16 bit noise-shaped signal burned onto a CD (with the volume control turned right up).

I think Sony's Super Bit Mapping and DG's Original Image Bit Processing are examples of high order noise shaping applied in practice.
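For anyone who'd like to repeat something along these lines without writing the C, here's a rough Python sketch of the same experiment. It uses a simple 2nd-order (1 - z^-1)^2 shaper with TPDF dither rather than the high-order perceptually weighted dither I used, so the result will be less dramatic, but the -120 dB tone should still poke up well clear of the shaped noise floor at low frequencies:

```python
import numpy as np

fs, bits = 44100, 16
q = 2.0 ** -(bits - 1)
rng = np.random.default_rng(0)

t = np.arange(10 * fs) / fs                            # ten seconds of signal
x = 10 ** (-120 / 20) * np.sin(2 * np.pi * 1000 * t)   # 1 kHz at -120 dBFS

# Error feedback for NTF (1 - z^-1)^2: feed back 2*e[n-1] - e[n-2],
# with +/-1 LSB TPDF dither added before rounding.
y = np.empty_like(x)
e1 = e2 = 0.0
for n in range(len(x)):
    v = x[n] + 2.0 * e1 - e2
    d = (rng.random() - rng.random()) * q
    y[n] = np.round((v + d) / q) * q
    e1, e2 = v - y[n], e1

# A long windowed FFT should show the 1 kHz line above the nearby noise
Y = np.abs(np.fft.rfft(y * np.hanning(len(y))))
f = np.fft.rfftfreq(len(y), 1 / fs)
k = np.argmin(np.abs(f - 1000))
floor = np.median(Y[(f > 800) & (f < 1200)])
print(f"1 kHz bin is {20 * np.log10(Y[k] / floor):.1f} dB above the local floor")
```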
 
The effect of noise shaping can be demonstrated, and I did this a few years ago as an exercise (in a computer program in C), at least to the extent of generating a low-level signal in double precision floating point (rather than recording it) and then quantizing the signal to 16 bits with high-order noise-shaped dither.

I found I could audibly reproduce a sine wave generated at -120 dB WRT full scale (nominally 20 bits) from the 16 bit noise-shaped signal burned onto a CD (with the volume control turned right up).

I think Sony's Super Bit Mapping and DG's Original Image Bit Processing are examples of high order noise shaping applied in practice.

Sony's SBM was, but I don't know about the DG system.

FWIW one of the postings here lists an NS set of coefficients that may be better than my simplistic 1st order,

https://hydrogenaud.io/index.php/topic,47589.0.html

but as yet I've not had a chance to try it.
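In case it's useful to anyone trying such a set, here's a generic error-feedback loop that takes an arbitrary FIR coefficient list (a sketch under my own sign convention - published sets are sometimes quoted as the NTF itself, so the signs may need flipping):

```python
import numpy as np

def shaped_quantize(x, coeffs, bits=16, seed=0):
    """Noise-shaped quantisation with arbitrary FIR error feedback.

    coeffs holds the feedback taps h[1..N]; the noise transfer function
    is then 1 - sum_k h[k] * z^-k, so coeffs=[1.0] reproduces the simple
    1st order case discussed earlier in the thread.
    """
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    q = 2.0 ** -(bits - 1)                      # one LSB at the target depth
    h = np.asarray(coeffs, dtype=float)
    hist = np.zeros(len(h))                     # e[n-1] ... e[n-N]
    y = np.empty_like(x)
    for n in range(len(x)):
        v = x[n] + np.dot(h, hist)              # filtered past error, fed back
        d = (rng.random() - rng.random()) * q   # TPDF dither, +/-1 LSB
        y[n] = np.round((v + d) / q) * q
        hist = np.roll(hist, 1)                 # age the error history
        hist[0] = v - y[n]
    return y
```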
 

