The Real Measurements of Digital Audio | part one


In the digital era, everything comes down to the science of numbers.

You have surely heard about the bits and the mathematical processing behind our favourite toys: mobile phones, computers, digital TV sets, portable audio players…

Commercially speaking, it all began in August 1982 with the arrival of the Compact Disc. Officially, it came into the world thanks to a joint venture between Philips and Sony, which followed the withdrawal of Philips’ first partner, DuPont, over a "clashing interest".

The CD was made possible by a range of technological achievements, some of them dating back to the ’60s, but above all by the techniques of sampling and digitization.

An analogue recording consists of storing a signal that varies in analogy with a physical quantity. What our ear-brain system recognises as music corresponds, in physics, to mechanical waves that compress and decompress the air. When these waves move a diaphragm connected to a system that converts the movement into an electrical signal, we have a microphone. Similarly, a loudspeaker converts an electrical signal into the motion of a cone. The cone, in turn, recreates the mechanical waves that compress and decompress the air, and we obtain sound.

The variations of the signal can be recorded in analogue form, as happens with cassettes or vinyl grooves, or in digital form. Since the analogue signal varies continuously in time, it can be "sampled": its value is recorded at regular intervals of time, converting the analogue signal into a discrete signal. Assigning each sample one of a finite set of numerical values is called "quantization". The digits that define the signal can be recorded using the binary numeral system. In electronics it is simpler to build elements with two possible states (on/off; +5 volts/0 volts) than with ten different states, and two states also coincide with the logical values true and false.
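To make the idea concrete, here is a minimal sketch of sampling followed by a binary encoding. It is only an illustration under assumed values: the 1 kHz test tone, the 8 kHz sampling rate, the 4-bit words and the function names are all hypothetical and chosen to keep the numbers small.

```python
import math

def sample_signal(signal, sample_rate_hz, duration_s):
    """Read signal(t) at regular intervals of 1 / sample_rate_hz seconds."""
    count = round(sample_rate_hz * duration_s)
    return [signal(n / sample_rate_hz) for n in range(count)]

def to_binary(value, bits):
    """Write a non-negative integer as a fixed-width string of binary digits."""
    return format(value, f"0{bits}b")

# Hypothetical input: a 1 kHz sine wave standing in for the analogue signal.
def tone(t):
    return math.sin(2 * math.pi * 1000 * t)

# Assumed values for illustration: 8 kHz sampling over one millisecond.
samples = sample_signal(tone, 8000, 0.001)

# Map each sample from the range [-1, 1] to a whole step from 0 to 15 (4 bits).
codes = [round((s + 1) / 2 * 15) for s in samples]
words = [to_binary(c, 4) for c in codes]

for s, w in zip(samples, words):
    print(f"{s:+.4f} -> {w}")
```

Each printed line pairs the value read from the continuous signal with the binary word that would actually be stored. Everything that happens between two readings is simply not recorded.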

Besides the sampling frequency, where intuitively the precision grows with the frequency, we also have to pay attention to the "precision" of the individual samples. A binary digit defines a logical state, true or false, and this state can be represented by a physical state, for example that of a circuit. If we group bits into sequences, they can take values over wider intervals: a sample with 1 bit of "depth" can only take the values zero or one. With a depth of 8 bits there are 2^8 = 256 possible values (from 0 to 255).
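The same arithmetic can be checked for any depth. The short sketch below (the function name is just an illustrative choice) counts the values available with 1, 8 and 16 bits, the last being the depth of the CD.

```python
def levels(bit_depth):
    """Number of distinct values a sample of bit_depth bits can take."""
    return 2 ** bit_depth

for bits in (1, 8, 16):
    print(f"{bits:>2} bits: {levels(bits)} values, from 0 to {levels(bits) - 1}")
```

Every added bit doubles the number of available values, which is why going from 8 to 16 bits multiplies them by 256.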

The idea of using a laser beam to store analogue and digital data on optical media first came up in 1958.

The Laser Disc went beyond the prototype phase: it was a vinyl-sized disc on which it was possible to record analogue video tracks and/or analogue audio tracks, like those used by analogue TV sets, or digital audio tracks. But the market was asking for a new audio medium: reliable, easy to carry, robust, cheap, copy-protected and able to secure royalties. At that time, in fact, it was possible to copy LPs onto Compact Cassettes, another Philips invention.

The first problem that Philips and Sony faced was the small size required of the medium. The Walkman and car audio systems were already widespread, and the industry was looking ahead to portable CD players. Furthermore, digital had to guarantee higher audio quality without the flaws of vinyl: the new medium had to be scratch-proof and insensitive to dust.

Bell carried out pioneering studies on digital audio, concerned with the intelligibility of sound in telephony, where an acceptable comprehension of the human voice had to be assured. Applying digital recording techniques to the recording and reproduction of music made it necessary to work out the sampling frequency and the "necessary and sufficient" bit depth. The standard for digital music in the music industry turned out to be a sampling rate of 44.1 kHz and a depth of 16 bits. Some statements of the sampling theorem describe this process as a "perfect" recreation of the signal.

When Philips launched its first CD player, the legendary CD 101, I was twenty-one. I listened to it and was stunned: the sound was incredible, crisp and detailed, without the noise floor of vinyl, and in the silences I felt alone... Something was strange, though. The sound was weird, cold, uninvolving. Sometimes it was like listening to a magic box.

In 1986, I received the CD version of The Dark Side of the Moon by Pink Floyd: a version not for sale, intended only for the company’s internal use. I can still picture the setup: a Thorens TD-160 MKII turntable with the legendary Grado GT super cartridge, a Sony CDP-701 ES CD player, a Luxman M-05/C-05 preamp and power amp, and Bowers & Wilkins 802 loudspeakers. With all due caution (it might, after all, have been a remaster), I try a vinyl/CD comparison. I start with the vinyl: right after the simulated heartbeat come the instrumental attack and the cymbals of the drums. Here my own heartbeat starts, a great emotion… I switch to the CD, same levels on the VU meters, and the heartbeat starts… Not bad, I think… The instrumental part begins and… disaster! Here are the notorious metallic, screeching highs, no involvement, and such annoyance that I have to switch the system off.

All right: it took us years, but we have learned to make better converters than those of the ’80s… But what turned "perfect sound forever", the slogan of those years, into nothing but a lie?


In electronics, the Nyquist–Shannon sampling theorem lies at the basis of signal theory and relates the frequency content of the sampled signal to the sampling frequency. In this way it defines the minimum frequency required to sample an analogue signal without any loss of information. In essence, the theorem says that, on certain assumptions, a band-limited analogue signal can be perfectly reconstructed from an infinite sequence of its samples if the sampling rate is more than twice the maximum frequency of the signal being sampled. But what does "on certain assumptions" mean?

It means that the Nyquist–Shannon theorem holds only if the signals are sinusoidal or can be described as a combination of sinusoidal signals, all of them below half the sampling frequency.
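What happens when a sinusoid falls outside the band allowed by the sampling rate can be sketched numerically. The example below is an illustration, not a measurement: the 30,000 Hz test tone and the function names are arbitrary assumptions, while 44,100 Hz is the CD sampling rate. It shows that a tone above half the sampling rate produces exactly the same samples as a lower tone, so the two can no longer be told apart.

```python
import math

SAMPLE_RATE = 44_100  # Hz, the CD sampling rate

def sample_cosine(freq_hz, count):
    """Sample a unit cosine of freq_hz at SAMPLE_RATE, count times."""
    return [math.cos(2 * math.pi * freq_hz * n / SAMPLE_RATE) for n in range(count)]

def alias_frequency(freq_hz):
    """Frequency at or below SAMPLE_RATE / 2 that yields the same cosine samples."""
    folded = freq_hz % SAMPLE_RATE
    return min(folded, SAMPLE_RATE - folded)

# Hypothetical test tone at 30,000 Hz, above the 22,050 Hz limit.
above_limit = sample_cosine(30_000, 64)
alias = sample_cosine(alias_frequency(30_000), 64)

# Largest difference between the two sets of samples (rounding error only).
largest_gap = max(abs(a - b) for a, b in zip(above_limit, alias))
print(alias_frequency(30_000), largest_gap)
```

The 30,000 Hz tone and a 14,100 Hz tone give the same sequence of numbers, which is why the signal has to be limited to less than half the sampling rate before conversion.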


Analogue-to-digital systems transform analogue information into a sequence of bits. More precisely, the system reads the value of a signal over and over again, at predefined intervals of time. As an example, imagine a camera recording a ballet. The camera stores the images as a sequence of single frames. If we look at the sequence frame by frame, we realise that some time has passed between one image and the next, and that the position of the dancers has changed. If in the first image we see the dancer's right arm raised and in the second lowered, we can work out the movement. But if, between one image and the next, the dancer waves to his girlfriend, we cannot know it: a piece of information is missing. This happens because time passes between one image and the next without that piece of information being recorded.


Pure sound waves (sinusoids) are characterised by their frequency: the number of complete cycles, from a peak of pressure to a trough and back, that occur each second. Do you remember how, at school, we used to put a ruler on the edge of the desk and make it vibrate? Frequency is the number of occurrences of a repeated event per second, measured in hertz (Hz).

Officially, our hearing covers frequencies from 20 Hz to 20,000 Hz. This sensitivity, however, changes with age.

But there is another very important parameter: the minimum time difference that we can perceive between the right ear and the left ear, which is 20 microseconds, that is, 20 millionths of a second. This parameter is directly related to the localization of sounds.


Everything described so far can be used to check whether our audio system faithfully reproduces all the waves that overlap while we listen to music.


Musical signals are the sum of different signals (sinusoids, impulses, etc.) that add to or subtract from one another at any given moment. Imagine the sea and its waves. The waves are a good approximation of single low-frequency sinusoidal signals. If a motorboat runs across the waves, the waves become closer together. For us, this is like a high-frequency signal. Now the water carries new waves that are the sum of the sea waves and the impact of the motorboat on the water.

Therefore, we will have two kinds of waves: those farther from the motorboat are the sum of the sinusoid of the sea and that of the motorboat, while the closest ones will also have the crests formed by the impact of the motorboat's bow on the water. These crests will be very high, steep and very fast. The sounds we listen to are also made of several components. Specifically, the strike of the hammer in a piano, the stroke of the bow on a violin, the distortion of an electric guitar, the rap on the rim of a drum or the clash of colliding cymbals create an initial part of the sound with steep crests, followed by a sum of almost sinusoidal waves. Our ear perceives the initial part of these sounds with particular sensitivity, and this initial part of the sound is the "formant part".
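As a rough illustration of such a composite sound, here is a toy model: a steep, quickly fading burst followed by two steady sinusoids. Every number in it (the 440 Hz fundamental, the 6,000 Hz burst, the 2 ms decay, the 96,000 evaluation points per second) is an assumption made up for this sketch and does not describe any real instrument.

```python
import math

RATE = 96_000  # evaluation points per second, an arbitrary choice for this sketch

def note(t, fundamental_hz=440.0, attack_decay_s=0.002):
    """Toy note: a steep, fast-decaying attack plus two steady sinusoids."""
    attack = math.exp(-t / attack_decay_s) * math.sin(2 * math.pi * 6000 * t)
    body = (0.6 * math.sin(2 * math.pi * fundamental_hz * t)
            + 0.3 * math.sin(2 * math.pi * 2 * fundamental_hz * t))
    return attack + body

def steepest_step(start_s, end_s):
    """Largest change between neighbouring points of note() in a time window."""
    first = round(start_s * RATE)
    last = round(end_s * RATE)
    values = [note(n / RATE) for n in range(first, last + 1)]
    return max(abs(b - a) for a, b in zip(values, values[1:]))

attack_steepness = steepest_step(0.0, 0.002)    # the first two milliseconds
body_steepness = steepest_step(0.050, 0.052)    # two milliseconds, much later
print(attack_steepness, body_steepness)
```

In this toy model the signal changes many times faster during the first two milliseconds than it does once the burst has died away, which is the numerical counterpart of the steep crests described above.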

This is the most difficult part to reproduce, with both digital and analogue systems, and it is strictly connected to the frequency response. But whereas with analogue systems it is quite simple to obtain a wider frequency response (the most important parameter, though not the only one), with digital systems things get complicated. In fact, complex signals can be reproduced only by increasing the quantity of information at our disposal and by pushing it to the limits of our hearing.


Here is how the measurements are made:



Below, in green, the impulse generated by the signal generator; in yellow, the impulse produced by the DAC:


The following oscilloscope images show the signals at the generator output and at the output of the AD-DA system, with 44.1 kHz conversion and a 1 kHz sinusoidal signal at 0 dB. The signal generator is shown in green and the converter output in yellow.


The following oscilloscope images show the signals at the generator output and at the output of the AD-DA system, with 44.1 kHz conversion and a 1 kHz triangular signal at 600 mV, 0 dB.


The following oscilloscope images show the signals at the generator output and at the output of the AD-DA system, with 44.1 kHz conversion and a 1 kHz square wave signal at 600 mV, 0 dB.


Normally, this is enough for most audiophiles. The signals are very similar, except for a slight undulation of the square wave. It is a distortion with a great influence on the sound, but you can detect it only in some instruments. We will see this later.
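An undulation of this kind can be reproduced on paper with an idealised model: a square wave rebuilt from its odd harmonics, keeping only those below half the 44.1 kHz sampling rate. This is a mathematical sketch under that assumption, with illustrative function names; it is not a model of the converter measured here, whose filters may behave differently.

```python
import math

def band_limited_square(t, fundamental_hz, cutoff_hz):
    """Fourier-series square wave keeping only odd harmonics below cutoff_hz."""
    total = 0.0
    k = 1
    while k * fundamental_hz < cutoff_hz:
        total += math.sin(2 * math.pi * k * fundamental_hz * t) / k
        k += 2
    return 4 / math.pi * total

# One period of a 1 kHz square wave, evaluated every microsecond and
# limited to 22,050 Hz (harmonics at 1, 3, 5, ... 21 kHz).
points = [band_limited_square(n / 1_000_000, 1000, 22_050) for n in range(1000)]

# An ideal square wave of this amplitude would stay flat at 1.0.
peak = max(points)
print(round(peak, 3))
```

In this model the flat top of the wave is replaced by a ripple that overshoots the ideal level by roughly 18 percent near each edge, simply because the harmonics above the limit are missing.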


The following oscilloscope images show the signals at the generator output and at the output of the AD-DA system, with 44.1 kHz conversion and square, sawtooth, triangular and sinusoidal signals at a frequency of 20 Hz:





The images are clear, perfect. Every single wave is reproduced correctly and precisely. There are neither hesitations nor ugly undulations.

In reality, the truth is somewhat different. In the images of the next article you will see the paradoxical behaviour of a converter at 5,000 Hz, that is, well within the audible range…


To be continued

by Alberto Pepe