namely that the files have been down-sampled, by a process we cannot know
I read the paper ages ago. It is not fully clear, but it suggests that they used Pyramix for the SRC. At that time, the sample rate converter of Pyramix was shown to be less than blameless.
At best this test compares 88.2 with a down-sampled version - it doesn't compare two sampling rates in isolation but also the sample rate reduction process.
Disagree. What they did is the only correct way, only the implementation leaves something to be desired.
All ADC chips sample at a high rate and perform internal downconversion to the target rate. If you were to compare a recording at 88.2kHz with a recording of the same source made with the same ADC set to 44.1kHz, you would in fact be comparing the potential effects of the sample rate and of the ADC's final downconversion stage.
Now ADC chips are built for economy, and just about all of them use half-band FIR anti-aliasing filters. This means that all of them violate the sampling theorem: a half-band filter is only about 6dB down at the new Nyquist frequency, and this, combined with the nature of music, ensures that the band between, say, 20kHz and 22kHz is filled with aliasing components, which upon playback through a standard half-band filtered DAC are guaranteed to excite that DAC's pre-ringing. The audibility of this can be argued, but what cannot be argued is that this is an unclean, undesirable starting point for comparing just the effects of sampling rate.
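To put a rough number on the half-band point, here is a minimal Python sketch. It assumes a 63-tap Hamming-windowed half-band design running at 88.2kHz; the tap count, the window and all function names are illustrative choices of mine, not taken from any actual ADC chip, whose filters are typically longer. The 6dB figure at 22.05kHz holds for any half-band filter by construction; how far the response has dropped at 23kHz depends on the particular design.

```python
import cmath
import math

FS_ADC = 88_200      # rate before the final decimate-by-2 stage (assumed)
FS_TARGET = 44_100   # target rate after decimation


def halfband_taps(half_len=31):
    """Hamming-windowed half-band low-pass; illustrative, not a real chip's filter."""
    taps = []
    for n in range(-half_len, half_len + 1):
        if n == 0:
            ideal = 0.5
        elif n % 2 == 0:
            ideal = 0.0  # half-band property: every other tap is zero
        else:
            ideal = math.sin(math.pi * n / 2) / (math.pi * n)
        window = 0.54 + 0.46 * math.cos(math.pi * n / half_len)
        taps.append(ideal * window)
    return taps


def gain(taps, freq_hz, fs):
    """Magnitude of the FIR frequency response at freq_hz."""
    w = 2 * math.pi * freq_hz / fs
    return abs(sum(h * cmath.exp(-1j * w * i) for i, h in enumerate(taps)))


def alias_frequency(freq_hz, fs):
    """Frequency at which a tone lands after sampling at fs."""
    f = freq_hz % fs
    return fs - f if f > fs / 2 else f


taps = halfband_taps()
for f in (20_000, 22_050, 23_000, 24_000):
    g = gain(taps, f, FS_ADC)
    print(f"{f} Hz: gain {20 * math.log10(g):.1f} dB, "
          f"lands at {alias_frequency(f, FS_TARGET)} Hz after decimation")
```

Anything the source contains just above 22.05kHz is therefore only partly attenuated before it folds back into the 20kHz to 22kHz region.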
Therefore one has to take the 88.2kHz output of the ADC (or better still, 176.4k) and then use best-known software-based practices for getting to 44.1k (88.2k) while totally avoiding any aliasing.
And for replay one then has to convert all back to 88.2kHz, again with best practices (but this is easy), so as to suppress any difference originating from the DAC running in two different modes.
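The round trip described above (88.2k down to 44.1k without aliasing, then back up to 88.2k for replay) can be sketched as follows. This is only an illustration under my own assumptions: a 401-tap Kaiser-windowed low-pass with its cutoff at 21kHz, a test signal of 1kHz plus 23kHz tones, and hypothetical function names throughout. It is not the procedure of the paper, of Pyramix, or of any particular SRC.

```python
import math

FS_HI = 88_200  # rate delivered by the ADC in this sketch


def bessel_i0(x):
    """Modified Bessel function I0 via its power series (for the Kaiser window)."""
    total = term = 1.0
    k = 1
    while term > 1e-12 * total:
        term *= (x / (2 * k)) ** 2
        total += term
        k += 1
    return total


def kaiser_lowpass(num_taps, cutoff_hz, fs, beta):
    """Kaiser-windowed sinc low-pass; num_taps must be odd."""
    mid = (num_taps - 1) // 2
    norm = 2 * cutoff_hz / fs
    i0_beta = bessel_i0(beta)
    taps = []
    for n in range(num_taps):
        t = n - mid
        ideal = norm if t == 0 else math.sin(math.pi * norm * t) / (math.pi * t)
        r = t / mid
        taps.append(ideal * bessel_i0(beta * math.sqrt(max(0.0, 1 - r * r))) / i0_beta)
    return taps


def decimate_by_2(x, taps):
    """Causal FIR low-pass, computing only every second output sample."""
    out = []
    for n in range(0, len(x), 2):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k < 0:
                break
            acc += h * x[n - k]
        out.append(acc)
    return out


def interpolate_by_2(x, taps):
    """Zero-stuff by 2 and low-pass; gain of 2 makes up for the inserted zeros."""
    out = []
    for n in range(2 * len(x)):
        acc = 0.0
        for k in range(n % 2, len(taps), 2):  # only taps that meet non-zero samples
            m = (n - k) // 2
            if m < 0:
                break
            acc += taps[k] * x[m]
        out.append(2.0 * acc)
    return out


def tone_amplitude(x, freq_hz, fs):
    """Amplitude of one sinusoidal component (single-bin DFT)."""
    w = 2 * math.pi * freq_hz / fs
    re = sum(v * math.cos(w * n) for n, v in enumerate(x))
    im = sum(v * math.sin(w * n) for n, v in enumerate(x))
    return 2 * math.hypot(re, im) / len(x)


taps = kaiser_lowpass(401, 21_000, FS_HI, beta=10.0)
source = [math.sin(2 * math.pi * 1_000 * n / FS_HI)
          + math.sin(2 * math.pi * 23_000 * n / FS_HI) for n in range(3 * 882)]
back = interpolate_by_2(decimate_by_2(source, taps), taps)
steady = back[-882:]  # 10 ms, past the filter transients, whole cycles of both tones
level_1k = tone_amplitude(steady, 1_000, FS_HI)
level_alias = tone_amplitude(steady, 21_100, FS_HI)  # where 23 kHz would fold to
print(f"1 kHz level: {level_1k:.6f}, residual at 21.1 kHz: {level_alias:.2e}")
```

The price of reaching full attenuation below 22.05kHz is a filter that starts rolling off a little above 20kHz, which is why the same low-pass is reused for both directions here so that both versions are replayed with the DAC in one mode.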
So what we have in this paper is a comparison between 88.2kHz native from the ADC, 44.1k native from the ADC, and 44.1k generated in a DAW, but now we know that:
1) just about all ADC chips are compromised when doing the final conversion to target rate.
2) the DAW they used was very likely compromised when doing the final conversion to target rate. (In fact most 'pro' DAW SRCs at that time ranged between suspect and utter crap.)
So the results of this paper can only be tagged 'inconclusive'. It needn't have been this way, if only the authors had had a deeper understanding of the mechanics of digital recording and the performance parameters of the gear they employed. Alas.