Plot Spectrum feature in Audacity - question about Function and FFT

ToTo Man

the band not the dog
I'm never quite sure which window and FFT size I should be using when examining the frequency spectra of audio samples in Audacity. I realise it's a technically complex topic of which I have very limited understanding, so detailed explanations are likely to be lost on me. All I really need to know is which Window and FFT size I should use in the following scenarios:

1) White Periodic Noise:
If I export a sample of White Periodic Noise with FFT of 65536 from REW, it looks smoothest when viewed with a Rectangular window and FFT of 65536; with these settings it displays as a perfectly flat line with zero scribbling.
(When I output this signal to my DAC and route it back into my Focusrite 4i4, the line becomes slightly scribbled and is no longer perfectly flat. The Welch window displays a slightly flatter line than the Rectangular, but I assume I should still use the Rectangular window?)

2) Pink Periodic Noise:
If I generate a sample of Pink Periodic Noise with FFT of 65536 in REW then analyse it in Audacity, it looks equally smooth when viewed with Rectangular or Welch windows with FFT of 65536. The other Window functions introduce scribbling.

3) Log and linear sine wave sweeps:
If I export a linear or log sine wave sweep from REW, no combination of Window and FFT in Audacity displays a smooth or flat line throughout the entire frequency range; some are 'sawtooth' in appearance and others have square 'steps'.

4) If I want to analyse the noise floor of an analogue input, what Window and FFT is most appropriate?

5) If I want to analyse the frequency content of music, what Window and FFT provides the most accurate representation?
 
According to the Audacity manual:

Spectrum:
Plots the FFT of the data as described above. The amplitudes are normalized such that a 0 dB sine (pure tone) will be (approximately) 0 dB on the graph.

(FFT) Size


This controls how many frequency divisions are used for the spectrum, or how many samples are used for autocorrelation. In Spectrum, a larger size gives more accurate frequency resolution (narrow bands), but averages the result over a longer period of time (because more samples are needed for the calculation).


Function:
Function offers choices like Rectangular, Hann, Hamming and others. We suggest you use the default Hann for most situations. The fundamental principle at work here is that the way we observe our data changes what we see. The "true spectrum" of your project would be computed over the entire project and would provide very detailed frequency resolution but essentially no time resolution at all. In other words, this "true spectrum" would offer an average frequency distribution over the entire project. If we select a short interval of audio, the short-time spectrum has frequency resolution limited by the observation window time AND the result is affected by the spectrum of the window itself. For general audio analysis, the Rectangular window is least desirable, and the other options offer slightly different effects.

So, the larger the FFT size the better the frequency resolution, and the Hann function should be used for general audio analysis, which I assume means music. It does not, however, answer my questions as to which Functions should be used for white periodic noise, pink periodic noise, and noise floor analyses. Paging @John Phillips and/or @Jim Audiomisc. :)
 
The larger the number of samples per FFT the higher the frequency resolution. But the loss is that you are 'averaging' over a longer time-span. So you trade those off as suits your situation.

When I'm doing an averaged spectrum of all the content of an audio file I tend to choose 8k or 16k samples per FFT and then average the spectra for successive chunks along the duration of the file.
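A minimal sketch of that chunk-and-average approach, using NumPy. The white-noise input, the 44.1 kHz sample rate and the Hann window are assumptions made for illustration (they stand in for a real audio file and for whatever window you prefer). It compares the bin-to-bin scatter of a single 8k-sample FFT with the scatter after averaging 64 successive chunks.

```python
import numpy as np

rng = np.random.default_rng(2)
fs = 44100        # sample rate in Hz (assumed)
n_fft = 8192      # samples per FFT, the "8k" case
n_chunks = 64     # number of successive chunks to average

bin_hz = fs / n_fft          # spacing between FFT bins
chunk_seconds = n_fft / fs   # time span covered by one FFT

# Stand-in for the audio file: white noise, cut into successive chunks
x = rng.standard_normal(n_fft * n_chunks)
window = np.hanning(n_fft)
chunks = x.reshape(n_chunks, n_fft) * window

# Power spectrum of every chunk (one row per chunk)
power = np.abs(np.fft.rfft(chunks, axis=1)) ** 2

# Compare one chunk with the average over all chunks (DC and Nyquist dropped)
single_db = 10 * np.log10(power[0, 1:-1])
averaged_db = 10 * np.log10(power.mean(axis=0)[1:-1])

single_scatter = single_db.std()
averaged_scatter = averaged_db.std()

print(f"bin spacing {bin_hz:.2f} Hz, each chunk spans {chunk_seconds * 1000:.0f} ms")
print(f"scatter of one chunk:      {single_scatter:.2f} dB")
print(f"scatter after averaging:   {averaged_scatter:.2f} dB")
```

Averaging the power spectra (rather than the dB values) is the usual choice here, and the 'scribble' on a noise trace shrinks roughly with the square root of the number of chunks averaged.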

The main thing to keep in mind when using noise as a probe is that the 'bin width' for the levels at 'each frequency' depends on the number of samples/FFT as well as the sample rate. So a given noise level will show up as being 'higher' if you use fewer samples per FFT as it is dividing the same total noise into fewer 'bins'. Hence having more samples also tends to drop the *measurement* noise floor. Hence to see something like distortion products at low level, choose a high number of samples per FFT to drop the noise which might otherwise cover them up.
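To put rough numbers on that, here is a small NumPy sketch. It assumes unit-variance white noise at 44.1 kHz, a Hann window, and an amplitude scale on which a sine of amplitude 1 reads about 0 dB (similar in spirit to the normalisation the Audacity manual describes, though not necessarily identical to it). The function name `mean_bin_level_db` is made up for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
fs = 44100                              # sample rate in Hz (assumed)
noise = rng.standard_normal(fs * 20)    # 20 s of white noise, unit variance

def mean_bin_level_db(x, n_fft):
    """Average per-bin noise level in dB, scaled so a unit sine reads ~0 dB."""
    window = np.hanning(n_fft)
    n_seg = len(x) // n_fft
    segs = x[:n_seg * n_fft].reshape(n_seg, n_fft) * window
    # Dividing by the window sum (times 2) makes a unit-amplitude sine read ~1
    mag = np.abs(np.fft.rfft(segs, axis=1)) * 2 / window.sum()
    power = (mag ** 2).mean(axis=0)       # average power over all segments
    return 10 * np.log10(power[1:-1].mean())

levels = {n: mean_bin_level_db(noise, n) for n in (1024, 4096, 16384, 65536)}
for n, db in levels.items():
    print(f"{n:6d} samples/FFT: bin width {fs / n:7.3f} Hz, noise per bin {db:6.1f} dB")
```

The same noise reads about 3 dB lower in each bin every time the number of samples per FFT is doubled, while a steady sinewave stays at the same height, which is why low-level distortion products emerge from the noise at larger FFT sizes.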

FWIW Personally I prefer Blackman-Harris windowing as it gives the neatest results for distinct sinewave components.
 
Re the window function element of the question, I am only somewhat familiar with choosing windowing functions for FIR filter coefficients (and that was learned on the job so probably my knowledge is not well grounded theoretically). However I will offer an opinion.

No window is perfect for all applications. For doing a FFT, window functions all offer their own particular trade-off between frequency resolution (separating frequency components that are very close to each other) and dynamic range (being able to see small signals at different frequencies from that of a large signal).

I suspect (without any real experience of this requirement, though) there will not be a vast difference in practice between common window functions: rectangular, Welch [1], Hamming, Hann and Blackman-Harris (but I might be wrong). AIUI, in that order you start off with good frequency resolution but with compromised dynamic range (rectangular), moving along to increasingly poorer frequency resolution but increasingly better dynamic range (Blackman-Harris). The Hann default suggestion in the manual looks good. But equally for this application the Welch and Blackman-Harris may work better. I don't know Audacity's FFT function at all so I am speculating.
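For anyone who wants to check that ordering numerically, the sketch below measures the main-lobe width (-3 dB, in FFT bins) and the highest side-lobe of each window from a zero-padded FFT. It uses NumPy only. The Welch window is assumed to be the usual parabolic shape and the Blackman-Harris is assumed to be the common 4-term version; Audacity's own definitions may differ slightly.

```python
import numpy as np

N = 256     # window length in samples
PAD = 64    # zero-padding factor, to sample the window's spectrum finely

n = np.arange(N)
centre = (N - 1) / 2
windows = {
    "Rectangular": np.ones(N),
    "Welch": 1 - ((n - centre) / centre) ** 2,     # parabolic (assumed)
    "Hamming": np.hamming(N),
    "Hann": np.hanning(N),
    "Blackman-Harris": (0.35875                     # 4-term version (assumed)
                        - 0.48829 * np.cos(2 * np.pi * n / (N - 1))
                        + 0.14128 * np.cos(4 * np.pi * n / (N - 1))
                        - 0.01168 * np.cos(6 * np.pi * n / (N - 1))),
}

def lobe_stats(w):
    """Return (-3 dB main-lobe width in bins, peak side-lobe level in dB)."""
    mag = np.abs(np.fft.rfft(w, N * PAD))
    db = 20 * np.log10(np.maximum(mag / mag[0], 1e-12))
    first_below = np.argmax(db < -3.0)         # first point 3 dB down
    width_bins = 2 * first_below / PAD         # two-sided width
    lobe_end = np.argmax(np.diff(db) > 0)      # first null = end of main lobe
    return width_bins, db[lobe_end:].max()

stats = {name: lobe_stats(w) for name, w in windows.items()}
for name, (width, sidelobe) in stats.items():
    print(f"{name:16s} main lobe {width:4.2f} bins, peak side-lobe {sidelobe:6.1f} dB")
```

A narrow main lobe corresponds to good frequency resolution and low side-lobes to good dynamic range. The main-lobe widths do increase in the order listed above; the peak side-lobe of the Hamming window is actually lower than the Hann's, although the Hann's side-lobes fall away faster further from the peak.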

If I were experimenting to find out which to use (and that's all I can offer in the absence of practical knowledge) I would try Welch, Hann and Blackman-Harris with some selected test signal and see how well the features I wanted to see were presented or not. Sorry I can't offer more.

[1] I didn't know this one but I looked up its characteristics.
 
One further thought. On the OP's q5, music does not have that much dynamic range when thinking relative to windowing for a FFT. So using a window with lots of dynamic range to analyse music's spectral content (or any signal with low dynamic range) will only sacrifice frequency resolution unnecessarily for dynamic range that isn't needed.

The simple rectangular window may well be good enough for what I think the OP wants to achieve. But the Welch window looks like a very good compromise, with some extra and maybe useful dynamic range for only a small sacrifice in frequency resolution. That end of the "spectrum" of window functions looks better matched to what I think the OP wants in q5 than Dolph-Chebyshev, which has excellent dynamic range for analysing unspecified audio signals but more than I think is necessary for music, and hence sacrifices frequency resolution. So that end of the FFT window "spectrum" suits the OP's q4 instead.

EDIT: On the windowing web page above, the "Fourier transform" images, I think you look for a window with a narrow peak if you want lots of frequency resolution and look for low side-lobes if you want lots of dynamic range. The collection of images gives some idea of how these are traded off.

But beware: this is only my opinion, and my theoretical grounding in this topic is limited.
 
Thanks all. Once Flickr restores my missing photo allowance I'll upload some graphs to show how the spectra change depending on the Function that's chosen.
 
First up is Audacity's analysis of a 60 second, 44.1kHz 24-bit 65536 FFT White Periodic Noise signal exported from REW, using a Linear axis:

51355890330_ef2666e527_o.jpg
 
Audacity's analysis of a 60 second, 44.1kHz 24-bit 65536 FFT Pink Periodic Noise signal exported from REW, using a Linear axis:

51354942121_a906520e32_o.jpg
 
I'll leave the sine wave sweeps for the moment and skip ahead to Audacity's analysis of a 10 second, 44.1kHz 24-bit sample of the noise floor from an analogue input (REC out of my Yamaha A-S3000 amplifier into the line in of a Focusrite 4i4 ADC). This time I've chosen a Log axis so that the bass frequencies are easier to distinguish.

Note the signal has been amplified by +75dB in Audacity to make the noise spectrum more visible:

51356212989_2ea486569c_o.jpg
 
The first 237.8 seconds of a 16-bit 44.1kHz music track in linear and log presentations:

Linear:
51356604711_24df6e2880_o.jpg


Log:
51357619200_200833a601_o.jpg
 
In general I think it's as I expected. It looks to me that there's not enough difference between different windows to worry too much. Except perhaps for the noise floor, which I can't explain, but may be less obvious if using a log frequency scale. However my knowledge of the detail of how FFTs behave is even more limited than my (fairly superficial) knowledge of windowing functions.
 
Yes, if I'm comparing samples with identical audio content but from different sources to see how the frequency response has changed, e.g. comparing the original digital master to one that's been recorded out to tape, then I suppose it doesn't matter what Window is chosen as long as it's the same for both files.

For White Periodic Noise, the Rectangular window provides the neatest, squiggle-free trace, closely followed by the Welch window. This makes visual comparisons to the untrained eye easier.

For Pink Periodic Noise, all Windows are pretty much of a muchness, the Blackman and Gaussian functions are slightly squigglier than the others.
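A likely reason the Rectangular window gives a squiggle-free trace on periodic noise is that the signal repeats exactly every 65536 samples, so any 65536-sample stretch is a whole period and every component lands exactly on an FFT bin. The NumPy sketch below illustrates this under some assumptions: the noise is built here as equal-magnitude, random-phase components (which may not be exactly how REW generates it), and only one FFT frame is analysed rather than Audacity's full averaging.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 65536        # period length in samples, matching the REW export
half = N // 2

# One period of "periodic white noise": unit magnitude in every bin, random phase
spectrum = np.zeros(half + 1, dtype=complex)
spectrum[1:half] = np.exp(1j * rng.uniform(0, 2 * np.pi, half - 1))
period = np.fft.irfft(spectrum, n=N)
signal = np.tile(period, 4)      # the signal simply repeats

def ripple_db(window, start=12345):
    """Peak-to-peak ripple in dB of one windowed N-sample frame."""
    frame = signal[start:start + N] * window   # arbitrary start point
    mag = np.abs(np.fft.rfft(frame))
    db = 20 * np.log10(mag[10:half - 10])      # skip the band edges
    return db.max() - db.min()

rect_ripple = ripple_db(np.ones(N))
hann_ripple = ripple_db(np.hanning(N))

print(f"Rectangular: {rect_ripple:.2e} dB peak-to-peak")
print(f"Hann:        {hann_ripple:.1f} dB peak-to-peak")
```

With the Rectangular window the frame is flat to within rounding error wherever it starts. A tapered window mixes each bin with its neighbours, whose phases are random, so the level varies from bin to bin. Once the signal has been through a DAC and ADC running from different clocks the period may no longer be exactly 65536 samples, which might explain why the looped-back trace is no longer perfectly flat.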

My OCD does make me want to know which of the Windows provides the truest info on low frequencies in the Log scales, as there is quite a bit of difference in the level of subsonics across the different Windows, this is most evident in the noise floor graphs in post #12.
 
I always plot points or lines when doing spectra. Not the coloured-in block that Audacity tends to show.

But then I'm not using Audacity. At LF the shapes will vary with the windowing because on the log plots you've expanded the scale at LF. If you want to see detail you'd need to replot just the LF part and also show/know the shape from a single sinewave spot-on one spectral frequency. I tend to go for BH as it seems to me to 'compartmentalise' the components fairly distinctly. But when in doubt go for more resolution (i.e. more points per FFT).

The basic point here is that a 'sinewave' with a specific 'frequency' is implicitly of infinite duration. By using finite durations and non-sinewave input some spread in response is inevitable from plain FFT methods.
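That spread is easy to see with a single sinewave. The NumPy sketch below (FFT size, bin numbers and the helper name `level_far_from_peak_db` are all chosen just for illustration) looks at the level about 50 bins away from a sinewave that either sits exactly on an FFT bin or falls halfway between two bins.

```python
import numpy as np

N = 4096
t = np.arange(N)

def level_far_from_peak_db(freq_bin, window, offset=50):
    """Level (dB re. the sine's amplitude) about `offset` bins above the sine."""
    x = np.sin(2 * np.pi * freq_bin * t / N)
    # Scale so that a unit sine exactly on a bin reads 0 dB
    mag = np.abs(np.fft.rfft(x * window)) * 2 / window.sum()
    db = 20 * np.log10(np.maximum(mag, 1e-15))   # clamp to avoid log of zero
    return db[int(freq_bin) + offset]

rect = np.ones(N)
hann = np.hanning(N)

on_bin_rect = level_far_from_peak_db(100.0, rect)    # whole number of cycles
off_bin_rect = level_far_from_peak_db(100.5, rect)   # halfway between bins
off_bin_hann = level_far_from_peak_db(100.5, hann)

print(f"Rectangular, sine on a bin:      {on_bin_rect:7.1f} dB")
print(f"Rectangular, sine between bins:  {off_bin_rect:7.1f} dB")
print(f"Hann, sine between bins:         {off_bin_hann:7.1f} dB")
```

With the Rectangular window the far-off bins are essentially empty only when the sinewave fits the FFT length exactly. For real signals that rarely happens, and a tapered window keeps the spread much closer to the peak at the cost of a wider peak.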
 

