Reflections on High Resolution Audio
by dolomuse | Nov 11, 2017 | technology |
Many arguments over high-res audio ‘hype’ or ‘scams’ seem to ignore the larger picture: the inconsistencies that are inevitable in our current, disjointed transition period. We appear to be in a clunky phase in which partial hi-res ‘bolt-ons’ are being applied to a standard-res framework.
The issue looks much more expansive when these bolt-ons are seen in context, as isolated developments within an eventually integrated high-resolution audio paradigm. That process also leads to further refinements in our conventional understanding of auditory perception (and vice versa).
Our current understanding of sound and hearing is built upon a conceptual framework of discrete sine waves (the acoustic representation of a rudimentary trigonometric function). So, when conventional knowledge states that the range of human auditory perception is generally limited to 20 Hz – 20 kHz, this refers to testing based on the perception of measurable sine waves (isolated, single-frequency waveforms without harmonics). The influence of sonic and ultrasonic harmonic interactions in real-world sounds is not considered within this paradigm. On the basis of this type of perceptual research, most audio equipment is designed with a frequency response that excludes all ultrasonic harmonics, along with any (indirectly audible) interactions they may have with (directly audible) harmonics.
With the early development of ‘high-resolution’ sampling rates of 192 kHz and beyond, the upstream/downstream frequency-bandwidth limitations have generally been ignored.
Presently, the potential of high-resolution audio is implemented more in concept than content. The Nyquist-Shannon sampling theorem implies that a sampling rate of 192 kHz can accurately digitize acoustic frequencies (harmonics) up to just under 96 kHz, yet this theoretical capacity goes unused when the incoming signal is transduced at a frequency bandwidth of less than 20 kHz (or when the output is transduced by standard-resolution speaker systems).
The result of these upstream/downstream transduction limitations of current audio technology is that ‘hi-res’ sampling rates merely take more samples per second of limited-bandwidth (standard-resolution) signals.
These standard-resolution ‘links’ in an emerging high-resolution ‘chain’ seem to be the basis of A/B testing results where audible differences between standard and high-resolution audio are seen/heard as negligible to nonexistent.
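The ‘chain’ argument above can be put as simple arithmetic: the usable upper frequency of a record/playback chain is set by its narrowest link. The following is only a minimal sketch under stated assumptions. The function names and the example bandwidth figures are hypothetical, and each link is treated as an idealized hard cut-off, which real microphones, converters and speakers are not.

```python
def nyquist_limit_hz(sample_rate_hz: float) -> float:
    """Upper bound on the frequencies a given sampling rate can represent."""
    return sample_rate_hz / 2.0


def chain_bandwidth_hz(input_bandwidth_hz: float,
                       sample_rate_hz: float,
                       output_bandwidth_hz: float) -> float:
    """Effective upper frequency of the whole chain: the narrowest link wins.

    Assumption: every link is modeled as an ideal hard cut-off.
    """
    return float(min(input_bandwidth_hz,
                     nyquist_limit_hz(sample_rate_hz),
                     output_bandwidth_hz))


# Hypothetical example figures, chosen only for illustration.
standard = chain_bandwidth_hz(20_000, 44_100, 20_000)   # standard-res throughout
bolt_on = chain_bandwidth_hz(20_000, 192_000, 20_000)   # hi-res converter, standard-res transducers
wideband = chain_bandwidth_hz(96_000, 192_000, 96_000)  # wide-bandwidth transducers at both ends

print(standard, bolt_on, wideband)
```

Under these assumptions the 192 kHz ‘bolt-on’ chain comes out with the same 20 kHz ceiling as the 44.1 kHz chain, which is consistent with A/B tests finding little or no audible difference.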
One compelling acoustic example for me is the audible difference between the composite sound quality of a live gong performance and a standard-resolution (44.1 kHz, 16-bit) recording of it.
Thinking in terms of complex tones, the 2nd harmonic is an octave above the fundamental pitch. So, a frequency bandwidth of 20 kHz would only capture up to the 2nd harmonic (1st overtone) of a 10 kHz tone (and up to the 4th harmonic of a 5 kHz tone, etc.). This would seem to imply that many of the more distant harmonics (and their interactions) would be lost even at the proposed HRA frequency-bandwidth standard of 40 kHz, which could theoretically transduce only up to the 4th harmonic of a 10 kHz tone, and the 8th harmonic of a 5 kHz tone.
Further, the perceptual impact of sampling rate on the transduction of inharmonic partials (overtones that are not integer multiples of the fundamental, lying in the pitch-space between harmonics) is a compelling area for exploration.
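The harmonic counts above follow from dividing the bandwidth by the fundamental. Here is a minimal sketch of that arithmetic. The function name is hypothetical, the bandwidth is treated as an inclusive hard cut-off, and only exact integer-multiple harmonics are counted.

```python
def highest_harmonic(fundamental_hz: int, bandwidth_hz: int) -> int:
    """Number of the highest harmonic at or below the bandwidth limit.

    Harmonic 1 is the fundamental, harmonic 2 is one octave above it, and so on.
    Assumption: the bandwidth is an ideal, inclusive cut-off.
    """
    return bandwidth_hz // fundamental_hz


# The figures used in the text: 20 kHz and 40 kHz bandwidths.
for bandwidth_hz in (20_000, 40_000):
    for fundamental_hz in (10_000, 5_000):
        n = highest_harmonic(fundamental_hz, bandwidth_hz)
        print(f"{fundamental_hz} Hz tone within {bandwidth_hz} Hz: up to harmonic {n}")
```

For the figures in the text this gives harmonics 2 and 4 at a 20 kHz bandwidth, and 4 and 8 at 40 kHz.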
http://www.realhd-audio.com/?p=4583
https://medium.com/…/truth-lies-and-fraud-in-the-audiophile…
I would be interested in having a discussion on “higher resolution” audio. I was introduced to the idea of the disparity between electronic and acoustic frequency response by Rupert Neve 30+ years ago, and I have been experimenting with it for many years, actually implementing wide electronic bandwidth systems to try to retain the rising edges of transients and to explore how they relate to our perception of distance. You are completely correct that we are missing the completion of a different level of reproduction. There is a community of folks who try to capture and retain higher actual bandwidths. My goal has been to keep a minimum electronic bandwidth of 8 Hz to 60 kHz, but 100 kHz would be more appropriate. With more “immersive” formats coming out, the positional accuracy of sounds seems to depend on some of these cues. My focus has been more in the time domain than the pitch domain, but they are really just two different windows in the same room. Outside of audiophile thinking, there have been designers and manufacturers who understood this (Neve, Focusrite, B&K, Hafler, Crown), but many audio engineers and audiophiles will still argue with me that nothing beyond 20k matters, even in the electronic domain, where I know for a fact they are wrong.