Intro to Digital Audio
Electronic Studio Methods + Composition — Louis Goldford (2023)
These slides will discuss:
What is Sound?
  • Vibration: Simple Harmonic Motion
  • The Mass-Spring System
  • Waveform: Displacement, Velocity, Acceleration
Introduction to Digital Audio
  • Analogue vs. Digital
  • Sample Rate + Bit Depth
  • Properties: Frequency + Amplitude
Digital Signal Processing (DSP)
  • Synthesis vs. “Found Sounds”
  • Time Domain vs. Frequency Domain

What is Sound?

What is Sound? — Vibration: Simple Harmonic Motion

loy.fig.1.2.png

When you set a tuning fork into motion (a), it vibrates the air molecules around it (b).

Zones of greater or lesser density (c) produce differences in air pressure.

We experience and perceive these differences in pressure as sound.

What is Sound? — Vibration: Simple Harmonic Motion

loy.fig.1.2.png

Look closely at the graph of density (c)...

It describes a back-and-forth motion between high pressure and low pressure zones.

This basic motion is predictable and periodic, and is known as Simple Harmonic Motion.

What is Sound? — Vibration: Simple Harmonic Motion

loy.fig.1.3.png

Imagine: we tie a pen to the vibrating tuning fork, and trace its position on a scrolling paper
(like a seismograph used to record ground motion during an earthquake)...

If we zoom in using a magnifying glass, we'll notice the same back-and-forth motion
oscillating up and down on the paper.

This time, however, the motion is measuring the displacement of the tuning fork in space.

What is Sound? — Vibration: Simple Harmonic Motion

loy.fig.1.3.png

This regular back-and-forth movement between 2 points is an oscillation

and can be described as sinusoidal, meaning it takes the shape of a sine wave.

A sine wave is a mathematical abstraction of simple harmonic motion.

What is Sound? — The Mass-Spring System

loy.fig.1.4.png

We can describe the physical process of simple harmonic motion using a mass-spring system.

In (a) there is no motion: the mass is still, and it rests at a point (B) of equilibrium.

When we set the spring into motion (b), the mass oscillates up (C) and down (A).

What is Sound? — The Mass-Spring System

If we slow down a video of a violin string several hundred times,

we can observe simple harmonic motion in the string’s vibration.

Waveform: Displacement, Velocity, Acceleration

loy.fig.1.7.png

Displacement measures the physical movement of something in space.

Waveform: Displacement, Velocity, Acceleration

loy.fig.1.7.png

If we graph the displacement of the mass in space, as we did the tuning fork (a),
we can describe (b) its speed (velocity) and its change in speed (acceleration).

When the mass crosses through its “rest” position (equilibrium), it is moving its fastest.

But when the mass reaches its highest and lowest positions, its change in speed
(acceleration) has the greatest magnitude (positive + or negative -).

Waveform: Displacement, Velocity, Acceleration

loy.fig.1.5.png

All 3 properties (Displacement, Velocity, and Acceleration) can be described
using sine waves that reveal simple harmonic motion.
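
As a rough sketch of the three curves, assuming an ideal, frictionless oscillator (the function name and default values below are illustrative, not from the figures), velocity and acceleration can be written as the first and second derivatives of a sine-shaped displacement:

```python
import math

def shm_state(t, amplitude=1.0, frequency=1.0):
    """Return (displacement, velocity, acceleration) of an ideal oscillator at time t (seconds)."""
    omega = 2 * math.pi * frequency                                # angular frequency, radians per second
    displacement = amplitude * math.sin(omega * t)
    velocity = amplitude * omega * math.cos(omega * t)             # first derivative of displacement
    acceleration = -amplitude * omega ** 2 * math.sin(omega * t)   # second derivative of displacement
    return displacement, velocity, acceleration

# At t = 0 the mass crosses equilibrium: zero displacement, greatest speed.
print(shm_state(0.0))
# A quarter of a period later it is at its highest point: zero speed, greatest acceleration magnitude.
print(shm_state(0.25))
```

All three values follow the same sinusoidal shape; they are only shifted in time relative to one another.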

Waveform: Displacement, Velocity, Acceleration

loy.fig.1.8.png

A sound wave is very similar...

Here, displacement corresponds to amplitude (loudness)
on the Y-axis, and its change over time on the X-axis.

This sound wave is a single note on a plucked string instrument:

It begins with high amplitude, when the string vibrates its loudest, with lots of energy,

but then gets quieter as time goes on, until silence arrives: when the string stops vibrating.

The string loses energy. Excited by just 1 pluck, the string’s energy dissipates
over time, as it returns to a state of equilibrium.
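
A simple way to imitate this shape in code is to multiply a sine wave by a shrinking envelope. The sketch below assumes an exponential decay, which is an idealisation and not a measurement of the plucked string in the figure; the names and default values are hypothetical:

```python
import math

def envelope(t, decay_time=0.5):
    """Amplitude envelope: starts at 1.0 and shrinks as the string loses energy."""
    return math.exp(-t / decay_time)

def plucked_string(t, frequency=220.0, decay_time=0.5):
    """One sample of a decaying sine wave at time t (seconds)."""
    return envelope(t, decay_time) * math.sin(2 * math.pi * frequency * t)

# The envelope gets smaller as time goes on, approaching silence.
for t in (0.0, 0.5, 1.0, 2.0):
    print(t, round(envelope(t), 3))
```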

Introduction to Digital Audio

Digital Audio — Analogue vs. Digital

loy.fig.1.8.png

We must remember that this waveform, with its smooth curves, exists only in nature.

When we produce sound through physical means,
we obtain “ideal” smooth and continuous analogue waveforms.

Digital Audio — Analogue vs. Digital

cipriani.fig.1.10.png

We can measure the instantaneous amplitude (loudness) of a signal at any point in time.

Yet this analogue waveform remains “ideally” smooth and continuous,

with an infinite number of points along the X-axis we could measure!

Digital Audio — Analogue vs. Digital

https://www.stereophile.com/images/522kev.attape.jpg

Here, an audio engineer works with magnetic tape and a vinyl record lathe.

These machines record analogue sound waves from electrical signals

which capture continuous physical vibrations picked up by microphones.

Digital Audio — Analogue vs. Digital

Things are very different using computers...

Digital Audio — Analogue vs. Digital

cipriani.fig.1.11.png

In digital audio, there is no smooth curve but only discrete points that represent curves.

We obtain these points by taking periodic measurements of a waveform’s instantaneous amplitude.

Each point represents the value of a signal’s amplitude at an exact point in time.

These periodic measurements of a waveform are called samples.
The original waveform above was sampled every 5 milliseconds (msec).
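
A minimal sketch of this measuring process, assuming the input is a pure sine wave (the 10 Hz frequency and the function name are made up for illustration):

```python
import math

def sample_sine(frequency_hz, sample_period_s, duration_s):
    """Measure a sine wave's instantaneous amplitude at regular intervals."""
    count = round(duration_s / sample_period_s)
    return [math.sin(2 * math.pi * frequency_hz * n * sample_period_s)
            for n in range(count)]

sample_period = 0.005              # 5 msec between samples, as in the figure
sample_rate = 1 / sample_period    # about 200 samples per second
samples = sample_sine(frequency_hz=10, sample_period_s=sample_period, duration_s=0.1)
print(sample_rate, len(samples))
```

Each entry in the list is one sample: a single number describing the amplitude at one instant.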

Digital Audio — Analogue vs. Digital

cipriani.fig.1.11a.interp.png

To reconstruct the analogue signal, a computer can insert values between the sampled values.

This process of inserting new data between points is called interpolation.

Each black square point represents a sampled value from a signal,

and each red line segment represents additional points
inserted between the samples to improve the signal’s smoothness.
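
The straight red segments correspond to linear interpolation. Here is one possible sketch of it; real audio systems generally use more refined reconstruction methods, so treat this only as an illustration of the idea:

```python
def linear_interpolate(samples, points_between):
    """Insert `points_between` evenly spaced values between each pair of samples."""
    if not samples:
        return []
    result = []
    for left, right in zip(samples, samples[1:]):
        result.append(left)
        for k in range(1, points_between + 1):
            fraction = k / (points_between + 1)          # how far along the segment we are
            result.append(left + fraction * (right - left))
    result.append(samples[-1])                           # keep the final sample
    return result

print(linear_interpolate([0.0, 1.0, 0.0], points_between=1))
```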

Digital Audio — Analogue vs. Digital

cipriani.fig.1.11a.interp.png

The red line segments look and sound much like the original analogue signal...

...but crucially, they will never match it exactly...

A digital signal can only approximate an original analogue sound wave.

Digital Audio — Analogue vs. Digital

https://digitalsoundandmusic.com/wp-content/uploads/2014/05/Figure-5.20-Signal-path-in-digital-audio-recording-1024x402.png

To obtain a digital representation of an audio signal, for example, on your computer,

you will need something to convert between analogue and digital formats.

This is accomplished using an Audio Interface, which connects microphones,
speakers, and other analogue equipment producing a physical vibration

to a computer and to other digital signal processing equipment.

Digital Audio — Analogue vs. Digital

https://digitalsoundandmusic.com/wp-content/uploads/2014/05/Figure-5.20-Signal-path-in-digital-audio-recording-1024x402.png

Input signals are converted: analogue-to-digital conversion (ADC),

in order to record, edit, synthesize, and process sound on a computer.

Digital Audio — Analogue vs. Digital

If we disassemble a microphone, we will find a diaphragm...

...which physically vibrates in response to sound,
much like the eardrums in our ears.

Microphones convert physical energy into electrical signals,
and your ADC converts this into a digital signal.

Digital Audio — Analogue vs. Digital

https://digitalsoundandmusic.com/wp-content/uploads/2014/05/Figure-5.20-Signal-path-in-digital-audio-recording-1024x402.png

Output signals are also converted: digital-to-analogue conversion (DAC),

in order to amplify, broadcast, and send sound and audio data from a computer.

Digital Audio — Analogue vs. Digital

If we slow down a video of a speaker several hundred times,

we will see the speaker cone’s physical vibration.

Your DAC converts digital signals into electrical signals,
and your speakers convert this into physical energy.

Digital Audio — Sample Rate + Bit Depth

https://www.izotope.com/en/learn/digital-audio-basics-sample-rate-and-bit-depth.html

In digital audio, the sound quality largely depends
on how often we take samples of an input signal.

How often we sample an audio signal is called
the sampling rate — or sampling frequency.

Digital Audio — Sample Rate + Bit Depth

https://www.izotope.com/en/learn/digital-audio-basics-sample-rate-and-bit-depth.html

As a rate, the sampling rate measures how many samples are taken per second.

Digital Audio — Sample Rate + Bit Depth

https://www.izotope.com/en/learn/digital-audio-basics-sample-rate-and-bit-depth.html

In the example above, A. utilizes a very low sampling rate.
There is more time taken between each sample value.

Digital Audio — Sample Rate + Bit Depth

https://www.izotope.com/en/learn/digital-audio-basics-sample-rate-and-bit-depth.html

B. and C. utilize higher sampling rates, which encode increasingly
higher frequencies from the original signal into digital format.
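
To put a number on "higher frequencies": a sampling rate can represent frequencies only up to half its own value (often called the Nyquist limit, a term not used elsewhere in these slides). A small sketch, with a hypothetical function name:

```python
def highest_representable_frequency(sample_rate_hz):
    """Upper frequency limit for a given sampling rate: half the rate."""
    return sample_rate_hz / 2

for rate in (8000, 44100, 48000, 96000):
    print(rate, "Hz sampling ->", highest_representable_frequency(rate), "Hz maximum")
```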

Digital Audio — Sample Rate + Bit Depth

https://www.izotope.com/en/learn/digital-audio-basics-sample-rate-and-bit-depth.html

Lower sampling rates decrease the quality of the audio
because they retain fewer components from the original sound.

However, they take up less memory in our computers.

Digital Audio — Sample Rate + Bit Depth

https://www.izotope.com/en/learn/digital-audio-basics-sample-rate-and-bit-depth.html

Higher sampling rates increase the quality of the audio
because they retain more components from the original sound.

However, they take up more memory in our computers.

Digital Audio — Sample Rate + Bit Depth

https://www.izotope.com/en/learn/digital-audio-basics-sample-rate-and-bit-depth.html

So, in choosing a sampling rate, there is a tradeoff
between audio quality and computational memory.
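
The memory side of this tradeoff can be estimated with simple arithmetic. The sketch below assumes uncompressed audio and ignores file headers; it also uses bit depth, which these slides introduce a little later. The function name is made up for illustration:

```python
def pcm_size_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Approximate size of uncompressed audio: samples x bytes per sample x channels."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds

# One minute of stereo audio at two common settings.
print(pcm_size_bytes(44100, 16, 2, 60))   # 10584000 bytes
print(pcm_size_bytes(96000, 24, 2, 60))   # 34560000 bytes
```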

Digital Audio — Sample Rate + Bit Depth

https://www.izotope.com/en/learn/digital-audio-basics-sample-rate-and-bit-depth.html

In digital audio, the standard sampling rate is 44.1 kHz (kilohertz),

meaning amplitudes are sampled 44,100 times per second!

Digital Audio — Sample Rate + Bit Depth

https://www.izotope.com/en/learn/digital-audio-basics-sample-rate-and-bit-depth.html

44.1 kHz is the sampling rate you will find on compact discs (CDs)

and is appropriate for listening, playback, and consumer audio applications.

Digital Audio — Sample Rate + Bit Depth

https://www.matc.edu/course-catalog/creative-arts-design-media/audio-production.html

But for audio recording, editing, and production,

a higher sampling rate is necessary in order to hear detail.

Production sampling rates of 48 kHz and 96 kHz are common.

In our MHL Digitale Kreation classes, 48 kHz is our standard
sampling rate for production and final playback formats.

Digital Audio — Sample Rate + Bit Depth

https://www.masteringbox.com/bit-depth/

Another measure of audio quality is its bit depth

which measures the dynamic range of an audio signal
(its total loudness and its resolution of loudness).

Digital Audio — Sample Rate + Bit Depth

https://www.masteringbox.com/bit-depth/

We want our musical recordings to have a wide dynamic range.

The greater the contrast between the loudest and softest moments of our recordings,

the more expressive and dynamic our music will be!

Digital Audio — Sample Rate + Bit Depth

https://www.masteringbox.com/bit-depth/

Notice the difference in the number of steps, along the Y-axis,
on the left and right sides of the image above...

The left side has a lower bit depth, while the right side has a higher bit depth,

meaning the right side has more points of resolution
to represent the original amplitude of an audio signal.

Digital Audio — Sample Rate + Bit Depth

https://www.soundguys.com/audio-bit-depth-explained-23706/

Here, the analogue input signal (from a microphone) is in blue,
and its digital representation is in orange.

The sample values are quantized at 3 different bit depths:
2 bits, 4 bits, and 8 bits (left to right).

Notice that the shape of the original signal improves as bit depth increases...

Digital Audio — Sample Rate + Bit Depth

https://www.soundguys.com/audio-bit-depth-explained-23706/

Also, notice the quantization error / noise graphs below (in blue)

representing the difference between the digital signal and the analogue input signal.

This difference is the error between the original and digital signals.

Notice there is less noise as bit depth increases.
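
The same effect can be reproduced numerically. This sketch assumes a simple uniform quantizer with evenly spaced levels from -1 to +1, which may differ from how the figure was produced; the function names are illustrative:

```python
import math

def quantize(value, bits):
    """Round a sample in [-1, 1] to the nearest of 2**bits evenly spaced levels."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)              # spacing between adjacent levels
    index = round((value + 1.0) / step)    # index of the nearest level
    return -1.0 + index * step

def max_quantization_error(bits, count=1000):
    """Largest |quantized - original| over one cycle of a full-scale sine wave."""
    worst = 0.0
    for n in range(count):
        original = math.sin(2 * math.pi * n / count)
        worst = max(worst, abs(quantize(original, bits) - original))
    return worst

for bits in (2, 4, 8):
    print(bits, "bits -> max error", round(max_quantization_error(bits), 4))
```

The error can never exceed half the spacing between levels, so it shrinks as the number of levels grows.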

Digital Audio — Sample Rate + Bit Depth

Lindos10.svg

Dynamic range is the difference between the loudest and softest sounds in our signal.

When we record and edit audio using higher bit depths,
we have more digital numbers available
to represent the full dynamic range of our music.

At lower bit depths, quieter sounds are masked by error noise,
and louder sounds clip and distort.

Digital Audio — Sample Rate + Bit Depth

Lindos10.svg

In this image, the green bars represent the dynamic range.

The dynamic range is wider for higher bit depths (like 24 instead of 16):
an increased range of numbers to represent the loud and quiet parts of our music.

Notice the dark green bars too: these represent the noise floor,
meaning the quietest sounds which become digital error noise.
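
A common rule of thumb is that each extra bit adds roughly 6 dB of dynamic range. The sketch below computes this theoretical figure as the ratio between the largest and smallest representable steps; real equipment usually achieves somewhat less. The function name is hypothetical:

```python
import math

def dynamic_range_db(bit_depth):
    """Theoretical dynamic range: ratio of 2**bit_depth levels, expressed in decibels."""
    return 20 * math.log10(2 ** bit_depth)

for bits in (8, 16, 24):
    print(bits, "bits ->", round(dynamic_range_db(bits), 1), "dB")
```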

Digital Audio — Sample Rate + Bit Depth

cipriani.fig.1.12.png

Here are 2 waveforms.

The upper waveform is softer or quieter:

It only reaches an amplitude of +0.5 / -0.5

The lower waveform is louder:

It reaches a maximum amplitude of +1 / -1

Digital Audio — Sample Rate + Bit Depth

cipriani.fig.1.13.png

Sounds that are too loud for an audio system to handle are clipped.

Notice how the top and bottom of this waveform become flat
at +1 and -1, creating hard edges instead of smooth curves.

Above +1 and below -1, the waveform should continue making smooth, rounded curves.

But digital audio can only represent sampled amplitude values between +1 and -1.

So, sounds that are louder than this will clip and sound distorted.
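
Clipping is easy to imitate: any value outside the allowed range is forced to the nearest limit. A minimal sketch, with made-up sample values:

```python
def clip(samples, limit=1.0):
    """Force every sample into the range -limit .. +limit."""
    return [max(-limit, min(limit, s)) for s in samples]

too_loud = [0.0, 0.9, 1.5, 0.9, 0.0, -0.9, -1.5, -0.9]
print(clip(too_loud))   # the 1.5 and -1.5 peaks are flattened to 1.0 and -1.0
```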

Digital Audio — Sample Rate + Bit Depth

Here are some examples of the same sound recorded at different bit depths.

At lower bit depths, listen for distorted audio signals.

Digital Audio — Sample Rate + Bit Depth

Ideal Bit Depth

CD quality audio utilizes a bit depth of 16 bits.

16 bits is a reasonable value for listening
and for consumer audio,

but not for audio editing and recording.

In our MHL Digitale Kreation classes, 24 bits is our standard
bit depth for production and final playback formats.

Digital Audio — Sample Rate + Bit Depth

https://www.izotope.com/en/learn/digital-audio-basics-sample-rate-and-bit-depth.html

This table illustrates the difference between sample rate and bit depth.

Sample rate corresponds to resolution on the X-axis (time),

while bit depth corresponds to resolution on the Y-axis (amplitude).

Higher values for both result in a better reproduction of a waveform.

Digital Audio — Sample Rate + Bit Depth

Review — Sample Rate + Bit Depth

The sample rate (or sampling frequency) measures the
number of times per second that an analogue input signal is sampled.

When a signal is sampled, its instantaneous amplitude is measured.

Computers can interpolate between samples by inserting new values
between sample points, making the waveform smoother.

Higher sample rates correspond to more high frequency sounds
represented from the original signal.

Lower sample rates correspond to fewer high frequency sounds,
and a less realistic reproduction of an original signal.

Digital Audio — Sample Rate + Bit Depth

Review — Sample Rate + Bit Depth

Bit Depth measures the dynamic range of recorded sound.

Higher bit depths correspond to a wider dynamic range,
representing the louder and softer components of a signal.

Lower bit depths mask quieter sounds with error noise,
and louder sounds clip and distort.

CDs are issued with a sample rate of 44.1 kHz and a bit depth of 16 bits.

Remember: these values are for listening and consumer audio,
but they are not suitable for audio production.

Digital Audio — Sample Rate + Bit Depth

Review — Sample Rate + Bit Depth

In our MHL Digitale Kreation classes,
a sampling frequency of 48 kHz and a bit depth of 24 bits
is our standard for production and final playback formats.

In your projects and assignments,
export audio files using these values.

Introduction to Digital Audio

Properties: Frequency + Amplitude

Digital Audio — Properties: Frequency + Amplitude

How do we describe sound?

In digital audio, we can easily describe
a sound’s basic physical measures:

  • Frequency
  • Amplitude

Digital Audio — Properties: Frequency + Amplitude

Frequency

Frequency is a physical measure of a sound’s
perceived quality of highness or lowness.

Frequency is measured in units of hertz (Hz), or cycles per second,

which is comparable to units like RPM (revolutions per minute),
used to describe the rotation of wheels on a vehicle.

Digital Audio — Properties: Frequency + Amplitude

Frequency

Frequency corresponds to vibration,

and describes the compression and rarefaction
of vibrating air molecules, which carry sound waves.

We humans experience, perceive, and describe frequency as pitch.

Digital Audio — Properties: Frequency + Amplitude

Frequency
wave.crest.png

Frequency corresponds to vibration,

and is related to the distance between
the crests or troughs of a waveform.

This distance is called the wavelength (λ)
for regular and repeating (periodic) waveforms.

Digital Audio — Properties: Frequency + Amplitude

cipriani.fig.1.9.png

As frequency increases and is described using a higher value in hertz,

that is, when there are more cycles per second,

the distance between crests in the waveform decreases.

Higher frequencies therefore correspond to shorter wavelengths.
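
For sound travelling through air, this relationship is wavelength = speed of sound / frequency. The sketch below assumes a speed of sound of about 343 metres per second (air at roughly 20 °C), a value that does not come from these slides:

```python
SPEED_OF_SOUND_M_PER_S = 343.0   # assumed: air at roughly 20 degrees Celsius

def wavelength_m(frequency_hz):
    """Distance between successive crests, in metres."""
    return SPEED_OF_SOUND_M_PER_S / frequency_hz

for f in (110, 440, 1760):
    print(f, "Hz ->", round(wavelength_m(f), 3), "m")
```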

Digital Audio — Properties: Frequency + Amplitude

https://www.tutorix.com/physics/radio-waves

It turns out that all of the frequencies we hear and that our ears
are sensitive to occupy just a small range of vibration.

The frequencies we can hear (roughly 20 Hz to 20 kHz) sit at the very bottom of this chart, in and below the zone VLF (“very low frequencies”).

Above this range are radio transmission frequencies,

the infrared, visible light, and ultraviolet light spectra,

and the high frequencies of radiation.

Digital Audio — Properties: Frequency + Amplitude

cipriani.fig.1.9.png

We humans experience, perceive, and describe
this change in frequency as higher in pitch.

Digital Audio — Properties: Frequency + Amplitude

http://acousticslab.org/psychoacoustics/PMFiles/Module05.htm

Frequency and pitch are related, but different concepts...

Frequency is the physical measure of vibration.

Pitch is a perceptual measure of our experience of frequency.

Digital Audio — Properties: Frequency + Amplitude

http://acousticslab.org/psychoacoustics/PMFiles/Module05.htm

Pitch describes how our ears respond to changes in frequency.

The graph above illustrates our logarithmic perception of pitch:

Each time we double a frequency value (Hz), pitch is raised by an octave.

Digital Audio — Properties: Frequency + Amplitude

http://acousticslab.org/psychoacoustics/PMFiles/Module05.htm

We perceive that the distance between each octave is the same.

We therefore experience pitch on a linear scale.

But this does not correspond to the physical reality of vibration:

The distance between each octave, in frequency, changes depending on the register...

Digital Audio — Properties: Frequency + Amplitude

http://acousticslab.org/psychoacoustics/PMFiles/Module05.htm

For example, the distance between A0 and A1 is only 27.5 Hz,

but the distance between A5 and A6 is 880 Hz.
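
These numbers follow from doubling. A short sketch, starting from A0 = 27.5 Hz (the function name is made up for illustration):

```python
import math

A0_HZ = 27.5

def a_frequency(octave):
    """Frequency of the note A in a given octave: each octave doubles the one below."""
    return A0_HZ * 2 ** octave

for octave in range(6):
    low, high = a_frequency(octave), a_frequency(octave + 1)
    # The gap in Hz keeps growing, but the ratio (one octave, log2 = 1) stays the same.
    print(f"A{octave} to A{octave + 1}: {high - low} Hz apart, {math.log2(high / low)} octave")
```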

Digital Audio — Properties: Frequency + Amplitude

http://acousticslab.org/psychoacoustics/PMFiles/Module05.htm

This means that changes in higher frequencies
require greater changes in the number of cycles per second,

while changes in lower frequencies
require smaller changes in cycles per second.

Digital Audio — Properties: Frequency + Amplitude

http://acousticslab.org/psychoacoustics/PMFiles/Module05.htm

In other words, changes in higher frequencies
require more physical energy

than changes in lower frequencies,
which require less physical energy.

Digital Audio — Properties: Frequency + Amplitude

http://acousticslab.org/psychoacoustics/PMFiles/Module05.htm

Different sound sources, including acoustic instruments,
occupy different frequency ranges.

Can you imagine what this graph would look like
if it were plotted in frequency rather than pitch?

Digital Audio — Properties: Frequency + Amplitude

cipriani.fig.1.12.png

We have already mentioned amplitude...

Amplitude is a measure of the intensity of sound:

its perceived quality of loudness or softness.

Digital Audio — Properties: Frequency + Amplitude

cipriani.fig.1.12.png

When a waveform is louder, the crests do not occur more often, or more frequently...

Instead, the intensity of sound corresponds to the height of each wave;

that is, the maximum of each crest,

and the minimum of each trough.

Digital Audio — Properties: Frequency + Amplitude

cipriani.fig.1.12.png

In this image, the upper waveform is softer or quieter than the lower waveform,

which is louder or more intense than the upper waveform.

Digital Audio — Properties: Frequency + Amplitude

Let’s consider how this results in the analogue reproduction of a digital sound wave;

that is, when we send a waveform to our computer’s DAC to be amplified over loudspeakers...

Digital Audio — Properties: Frequency + Amplitude

A speaker vibrating faster or slower corresponds to a change in frequency,

but the force of its vibration, that is, whether the speaker cone is displaced
or moves over a greater distance in the same amount of time,
corresponds to the loudness or intensity of sound.

Digital Audio — Properties: Frequency + Amplitude

If the cone moves fast and is displaced by a small distance or force,
the result will be a high and quiet sound.

If it moves slowly and is displaced by a larger distance or force,
the result will be a low and loud sound.

Digital Audio — Properties: Frequency + Amplitude

Linear Amplitude    dBSPL
1                   0
0.5                 -6
0.25                -12
0.125               -18
0.1                 -20
0.01                -40
0.001               -60
0.0001              -80
0                   -inf

Like frequency, amplitude is represented on a number of possible scales.

And like frequency, amplitude is perceived on a linear scale
while its physical properties are measured on a logarithmic scale.

In music, we use a relative scale with dynamic markings
to express loudness, using symbols like ff and ppp.

Digital Audio — Properties: Frequency + Amplitude

Linear amplitude is one kind of digital scale.

Audio sample values are encoded into a sound file using linear amplitude.

0 is equivalent to silence and 1 is equivalent to the loudest possible sound.

On this scale, all possible levels of loudness are represented between 0 and 1.

Digital Audio — Properties: Frequency + Amplitude

dBSPL means sound pressure level (SPL) measured in decibels (dB)
and is a measure of pressure within the air carrying sound waves.

It is mapped onto a scale (dB) whose values represent a ratio
between a sound and a reference value, like the pressure value of silence.

Digital Audio — Properties: Frequency + Amplitude

Here, the reference value 0 represents the
loudest possible sound a digital audio system can handle.

All other values are increasingly negative as they become softer.
Silence is represented as infinitely quiet (-inf).
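
The two columns of the table are linked by the formula dB = 20 × log10(amplitude), with an amplitude of 1 as the reference. A minimal sketch (function names are illustrative); the table rounds the results to whole numbers:

```python
import math

def amplitude_to_db(amplitude):
    """Convert linear amplitude (0 to 1) to decibels relative to an amplitude of 1."""
    if amplitude <= 0:
        return float("-inf")        # silence is infinitely quiet
    return 20 * math.log10(amplitude)

def db_to_amplitude(db):
    """Convert decibels back to linear amplitude."""
    return 10 ** (db / 20)

for a in (1, 0.5, 0.25, 0.125, 0.1, 0.01, 0):
    print(a, "->", round(amplitude_to_db(a), 2), "dB")
```

Halving the amplitude always lowers the level by about 6 dB, and dividing it by 10 lowers it by 20 dB.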

Digital Audio — Properties: Frequency + Amplitude

http://acousticslab.org/psychoacoustics/PMFiles/Module05.htm

You will find a related scale dBFS printed on the faders of mixing consoles
(“decibels relative to full scale”).

0 is full scale and represents no change to the incoming signal.

Values above 0 represent amplification of the signal,

while values below 0 represent attenuation of the signal.

Digital Audio — Properties: Frequency + Amplitude

dBSPL    Common Sounds
140      threshold of pain
130      jet taking off
120      rock concert
110      symphony orchestra fortissimo
100      truck engine
90       heavy traffic
80       retail store
70       office
60       normal conversation
50       silent house
30       leaves rustling
20       wind
0        weakest perceptible sound

Here is a decibel scale that reverses the reference point,
where 0 is now the quietest sound possible.

All other common sounds are described as pressure levels above silence.

Digital Signal Processing (DSP)

DSP — Synthesis vs. “Found Sounds”

from Sound Synthesis and Sampling by Martin Russ

What is Synthesis?

Synthesis is defined in the Chambers 21st Century Dictionary as:
“building up; putting together; making a whole out of parts.”

The process of synthesis is thus a bringing together.

The word synthesis is frequently used in just two major contexts:

the creation of chemical compounds

and production of electronic sounds.

DSP — Synthesis vs. “Found Sounds”

from Sound Synthesis and Sampling by Martin Russ

Sound Synthesis

Sound synthesis is the process of producing sound.

It can reuse existing sounds by processing (or transforming) them,

or it can generate sound electronically or mechanically.

DSP — Synthesis vs. “Found Sounds”

Sound Synthesis vs. “Found Sounds”

In our class, sound synthesis is used to designate the latter, meaning:

Sound synthesis generates sound electronically or mechanically.

Found sounds will be our term for reusing existing sounds.

DSP — Synthesis vs. “Found Sounds”

Sound Synthesis vs. “Found Sounds”

For example, when we used a tape machine
to sample and transform existing sound,

that is, sound recorded with a microphone,

this process is separate from the generation of new electronic sound (synthesis).

DSP — Synthesis vs. “Found Sounds”

Sound Synthesis

In synthesis, basic waveforms are often used:

For example, sine, sawtooth, triangle, and square waves.

These waveforms are “built up” and “put together” to form a “whole.”

Often, this “whole” is more complex than its individual, “pure” waveforms.
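
As a sketch of both ideas, here are the four basic shapes written as functions of phase (0 to 1 is one cycle), followed by one way of "putting together" sine waves: adding odd harmonics to approximate a square wave. The function names are hypothetical, and the phase conventions are just one of several possible choices:

```python
import math

def sine(phase):
    return math.sin(2 * math.pi * phase)

def square(phase):
    return 1.0 if phase % 1.0 < 0.5 else -1.0

def sawtooth(phase):
    return 2.0 * (phase % 1.0) - 1.0                   # ramps from -1 up to +1

def triangle(phase):
    return 1.0 - 4.0 * abs((phase % 1.0) - 0.5)        # -1 at phase 0, +1 at phase 0.5

def additive_square(phase, harmonics):
    """Approximate a square wave by summing odd sine harmonics (1, 3, 5, ...)."""
    total = 0.0
    for k in range(1, harmonics + 1):
        n = 2 * k - 1                                  # odd harmonic number
        total += math.sin(2 * math.pi * n * phase) / n
    return 4 / math.pi * total

# More harmonics bring the sum closer to the square wave's value of 1.0.
for count in (1, 5, 50):
    print(count, "harmonics ->", round(additive_square(0.25, count), 3))
```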

DSP — Synthesis vs. “Found Sounds”

Sound Synthesis

Synthesis is produced using a synthesizer:

a machine that is used to electronically or mechanically produce sound.

DSP — Synthesis vs. “Found Sounds”

For example, this Buchla synthesizer passes electronic signals through many colored patch cables,
and is controlled by many knobs, buttons, and sliders that Suzanne Ciani performs with.
We hear the waveforms produced by the synthesizer through our speakers.

DSP — Synthesis vs. “Found Sounds”

Or this Yamaha DX-7 synthesizer, which uses a familiar keyboard as its user interface,
and is controlled in a much more conventional and musically-intuitive way.

DSP — Synthesis vs. “Found Sounds”

Synthesizers consist of many parts that produce or modify electronic signals,
such as oscillators and filters. An oscillator generates periodic waveforms,
such as sine waves. A filter allows some frequencies to pass through it,
while blocking others. Synths may use additional low frequency oscillators (LFOs)
to modulate their sounds; that is, to produce “change over time.”
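
As one small example of an LFO at work, the sketch below uses a slow 5 Hz oscillator to vary the loudness of a 440 Hz oscillator (an effect usually called tremolo). All names and values are illustrative and do not describe any particular synthesizer:

```python
import math

def tremolo_sample(t, carrier_hz=440.0, lfo_hz=5.0, depth=0.5):
    """One sample of a sine oscillator whose loudness is modulated by an LFO."""
    carrier = math.sin(2 * math.pi * carrier_hz * t)   # the audible oscillator
    lfo = math.sin(2 * math.pi * lfo_hz * t)           # slow oscillator, -1 to +1
    gain = 1.0 - depth * (0.5 + 0.5 * lfo)             # gain swings between 1 - depth and 1
    return gain * carrier

sample_rate = 48000
one_second = [tremolo_sample(n / sample_rate) for n in range(sample_rate)]
print(len(one_second), max(one_second))
```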

DSP — Time Domain vs. Frequency Domain


Works Cited

These slides and especially the images herein are from a number of well-known texts on acoustics, music, and musical technology below:

Coming Soon!