Audio Essentials

Poor quality audio is a plague. An audience is often enormously forgiving of picture quality, but not so with the sound, making bad audio a curse of amateur filmmaking. This site intentionally avoids tackling the issue of audio since others have done it better already.

Production

Rules of Production Audio

Mic Proximity

A cheap mic near the audio source is better than an expensive mic far away.

Environment

Look at where you’re recording and listen to your environment. Watch out for small rooms with hard, parallel reflective surfaces. Treat the walls with acoustic blankets. Turn off a humming fridge and put your car keys inside so you remember to turn it back on before you leave.

Digital Audio Anatomy

Analog audio works by converting sound pressure waves into positive and negative voltages. In an analog-to-digital conversion, that voltage is measured at regular intervals over time and each measurement is stored as a digital value.

Sample Rate

This is how frequently a digital sample is “taken” of your analog waveform. This directly determines the frequency response of your audio capture. The sample rate must be at least double the highest frequency you’re capturing. Humans generally hear from around 20 Hz to 20 kHz, so theoretically 40 kHz would be sufficient. Common sample rates:
44.1 kHz
48 kHz
96 kHz
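The "double the frequency" rule can be checked with a few lines of arithmetic. This is only a minimal sketch; the function names are made up for illustration and are not part of any audio library.

```python
def highest_capturable_hz(sample_rate_hz: float) -> float:
    """Highest frequency a sample rate can represent: half the sample rate."""
    return sample_rate_hz / 2.0


def minimum_sample_rate_hz(highest_frequency_hz: float) -> float:
    """Lowest sample rate that can capture a given frequency: double it."""
    return 2.0 * highest_frequency_hz


if __name__ == "__main__":
    # The three sample rates listed above.
    for rate in (44_100, 48_000, 96_000):
        limit = highest_capturable_hz(rate)
        print(f"{rate} Hz sample rate captures up to {limit:.0f} Hz")

    # The upper edge of human hearing, roughly 20 kHz.
    print(f"20 kHz needs at least {minimum_sample_rate_hz(20_000):.0f} Hz")
```

All three common rates sit comfortably above the theoretical 40 kHz minimum.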

Bit Depth

This controls the number of discrete levels your analog waveform is divided into. Sort of like dividing a $100 bill into 2 $50 bills or 10 $10 bills. This directly affects the dynamic range (difference between softest and loudest sounds) you can capture.
16 bit is a common audio bit depth.
24 bits is worth the increase in file size because it does significantly better at separating low signal from noise. If you have the option to record 24 bits then use it and record at a lower level, giving yourself headroom before clipping.

In this image the bit depth is four, so we have 16 possible values (2^4).
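To put numbers on this, the sketch below computes how many levels a bit depth gives and the theoretical dynamic range that follows from it. It assumes the usual idealized figure of about 6 dB per bit; real converters deliver somewhat less, and the function names are illustrative only.

```python
import math


def quantization_levels(bit_depth: int) -> int:
    """Number of discrete values available at a given bit depth (2^bits)."""
    return 2 ** bit_depth


def theoretical_dynamic_range_db(bit_depth: int) -> float:
    """Idealized dynamic range: ratio of the largest to the smallest step, in dB."""
    return 20.0 * math.log10(quantization_levels(bit_depth))


if __name__ == "__main__":
    for bits in (4, 16, 24):
        levels = quantization_levels(bits)
        dr = theoretical_dynamic_range_db(bits)
        print(f"{bits}-bit: {levels} levels, about {dr:.1f} dB of dynamic range")
```

The extra range at 24 bits is what makes it safe to record at a lower level and still stay well clear of the noise floor.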

Input Levels

Digital audio is measured in dBFS, or “decibels relative to full scale”. A digital audio signal peaks at 0 dBFS, the maximum value it can record, so every level is measured as a value below that maximum. Digital levels are therefore always negative. You’ll always want to keep your peaks below -3 dBFS. About -12 dBFS is a very general level to aim for when recording dialog. Record at 24 bits and record on the lower end to keep headroom.
dB (decibels) is a relative scale and has no absolute meaning on its own. A sound isn’t simply 50 decibels, though it could be 50 decibels louder than another sound.
Analog audio is measured in “Volume Units” (VU), where 0 VU is 1.228 volts if you really want to know.
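Because decibels are relative, a dBFS value is just a ratio against full scale. The sketch below shows the conversion in both directions, assuming samples are normalized so that full scale is an amplitude of 1.0 (an assumption made here for illustration; the function names are hypothetical).

```python
import math


def amplitude_to_dbfs(amplitude: float) -> float:
    """Convert a normalized sample amplitude (full scale = 1.0) to dBFS."""
    magnitude = abs(amplitude)
    if magnitude == 0.0:
        return float("-inf")  # digital silence has no finite dB value
    return 20.0 * math.log10(magnitude)


def dbfs_to_amplitude(dbfs: float) -> float:
    """Convert a dBFS level back to a normalized amplitude."""
    return 10.0 ** (dbfs / 20.0)


if __name__ == "__main__":
    print(f"Full scale:     {amplitude_to_dbfs(1.0):.1f} dBFS")
    print(f"Half amplitude: {amplitude_to_dbfs(0.5):.1f} dBFS")
    print(f"-12 dBFS uses {dbfs_to_amplitude(-12.0):.0%} of the available amplitude")
```

A dialog peak at -12 dBFS uses only about a quarter of the available amplitude, which is the headroom the text recommends.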

Microphones

The most common microphone types include:

Dynamic Mics

Generally speaking, more durable.

Condenser Mics

Require power. Potentially more fragile.

Ribbon Mics

Not often used in video production. More delicate.

Lavalier mics or “lav mics” are small microphones, usually condensers, that clip onto the talent and can oftentimes be hidden.
Shotgun mics or “boom mics” are directional mics that are usually positioned just outside the frame.
A receiver (Rx) and transmitter (Tx) can be used to send audio signals wirelessly, though it’s never as reliable as a wired connection. It’s important to monitor a wireless connection as interference is a very common occurrence.

Standalone Recorders

These are small units featuring both microphone and recorder in a single package. Though you sometimes can plug an external mic into them, that’s not their primary use. Here are two popular units:
Zoom H1
Tascam DR-05

Phone Recorders

You can buy microphones that plug into your phone, making the phone the recording device.

Polar Patterns

A microphone’s polar pattern describes the area around it where it is most sensitive.
Omni
Cardioid
Supercardioid
A shotgun mic is often a hypercardioid pattern with an interference tube added to reject off-axis and rear sounds, which makes the shotgun mic more directional. The interference tube relies on phase cancellation, which creates problems with early reflections. Hypercardioid pencil mics don’t use interference tubes and are better for small indoor spaces.

Gain Staging

An audio signal exists at different levels which must be managed at every point in the chain.

Mic Level

The level generated by most microphones.
1.5 mV–70 mV (1 millivolt is a thousandth of a volt)

Line Level

Level used for most audio mixers.
Pro line level is +4 dBu (about 1.23 volts, which is 4 decibels above 0 dBu, or roughly 775 mV)
Consumer line level is -10 dBV, about 0.316 volts (the “V” here means relative to 1 volt)
(Keyboards and guitars are generally between mic and line level).
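The two line-level standards use different reference voltages, which is why the numbers are not directly comparable. The sketch below converts each to volts, assuming the standard references of about 0.7746 volts for 0 dBu and exactly 1 volt for 0 dBV; the function names are illustrative only.

```python
import math

DBU_REFERENCE_VOLTS = math.sqrt(0.6)  # about 0.7746 V, the 0 dBu reference
DBV_REFERENCE_VOLTS = 1.0             # 0 dBV is defined as 1 V


def dbu_to_volts(dbu: float) -> float:
    """Convert a level in dBu to RMS volts."""
    return DBU_REFERENCE_VOLTS * 10.0 ** (dbu / 20.0)


def dbv_to_volts(dbv: float) -> float:
    """Convert a level in dBV to RMS volts."""
    return DBV_REFERENCE_VOLTS * 10.0 ** (dbv / 20.0)


if __name__ == "__main__":
    pro = dbu_to_volts(4.0)
    consumer = dbv_to_volts(-10.0)
    print(f"Pro line level (+4 dBu):       {pro:.3f} V")
    print(f"Consumer line level (-10 dBV): {consumer:.3f} V")
    print(f"Difference: {20.0 * math.log10(pro / consumer):.1f} dB")
```

The gap between the two works out to roughly 12 dB, so consumer gear plugged into a pro input will usually need extra gain.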

Speaker Level

10 volts
This is an output level designed for monitoring, at the very end of the chain.

Preamp

Oftentimes the mic isn’t the issue. It’s the preamplifier (“preamp”) that reduces your sound quality. This is the amp that boosts the weak mic-level voltage from your microphone. The preamps included inside prosumer DSLRs are notoriously mediocre. You can plug a very nice microphone into a DSLR but you’re still limited by the preamp. One option is to use a “hot mic” with a strong enough level that the preamp isn’t actually amplifying much. Another option, which adds bulk but increases control, is to use an external preamp. This usually comes with additional benefits: physical gain controls, phantom power, XLR inputs.

Phantom Power

Most professional mixers will have +48 volt “phantom power” for powering condenser microphones.

RMS is the average sound level, versus “peak” levels, which represent the loudest levels. A limiter lets you bring the RMS average up without clipping or distorting the peaks.
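The difference between RMS and peak can be measured directly on a block of samples. The sketch below generates a full-scale 1 kHz test tone and reports both, assuming samples are normalized so that full scale is 1.0 and that both values are referenced to that same amplitude (some meters use a different RMS reference). It only measures levels and does not implement a limiter.

```python
import math


def peak_level(samples: list[float]) -> float:
    """Largest absolute sample value in the block."""
    return max(abs(s) for s in samples)


def rms_level(samples: list[float]) -> float:
    """Root mean square: square every sample, average, take the square root."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def to_db(level: float) -> float:
    """Convert a normalized level (full scale = 1.0) to decibels."""
    return 20.0 * math.log10(level) if level > 0 else float("-inf")


SAMPLE_RATE_HZ = 48_000
TONE_HZ = 1_000

# One second of a full-scale sine wave.
tone = [
    math.sin(2.0 * math.pi * TONE_HZ * n / SAMPLE_RATE_HZ)
    for n in range(SAMPLE_RATE_HZ)
]

peak_db = to_db(peak_level(tone))
rms_db = to_db(rms_level(tone))

if __name__ == "__main__":
    print(f"Peak: {peak_db:.2f} dB, RMS: {rms_db:.2f} dB")
    print(f"Gap between peak and RMS: {peak_db - rms_db:.2f} dB")
```

For a pure sine wave the gap is about 3 dB. Dialog and music have a much larger gap, and narrowing it is what a limiter does when it raises the average without letting the peaks clip.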

Physical Connectors

TRS
Tip Ring Sleeve
1/4″ and 1/8″ varieties
XLR
The most common professional audio connector you’ll see used with microphones. It’s a balanced connector, meaning the positive, negative and ground conductors each have their own pin.

Post Production

Post Levels
LUFS
Loudness Units Relative to Full Scale
Spotify will normalize to -14 LUFS. Master no louder than -9 LUFS, with true peaks no higher than -1 dBTP.
Aim to keep your individual tracks around -18 dBFS (= 0 VU) with peaks around -10 dBFS for mixing. This puts them at the level plugins expect them to be at. You can then balance them out relative to each other in the mix. Make sure to have headroom on all your faders. The loudest part of your combined audio should peak around -4 to -6 dB on the stereo output.
Mastering aims to make the audio sound “louder” and more present, and tunes it for different types of output scenarios. Automated services like LANDR and Aria actually work pretty well.
Normalization
Right-click on a YouTube video and choose “Stats for nerds” to see how the platform has normalized the video’s volume.
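Loudness normalization amounts to a single gain offset: the platform's target minus the measured loudness of your file. The sketch below shows that arithmetic using the -14 LUFS Spotify figure mentioned above. It is a simplification with made-up function names, and it assumes the loudness has already been measured elsewhere; platforms differ in whether they turn quiet material up.

```python
def normalization_gain_db(measured_lufs: float, target_lufs: float = -14.0) -> float:
    """Gain in dB a platform would apply to bring a file to its target loudness."""
    return target_lufs - measured_lufs


def true_peak_after_gain(true_peak_dbtp: float, gain_db: float) -> float:
    """True peak after the normalization gain is applied."""
    return true_peak_dbtp + gain_db


if __name__ == "__main__":
    # A loud master at -9 LUFS with true peaks at -1 dBTP.
    gain = normalization_gain_db(-9.0)
    print(f"Gain applied: {gain:+.1f} dB")
    print(f"True peak after normalization: {true_peak_after_gain(-1.0, gain):+.1f} dBTP")
```

A master at -9 LUFS is simply turned down by 5 dB on playback, so pushing it louder than that buys no extra volume on a normalized platform.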

Plugins

EQ
Limiting
This one is the opposite of what it sounds like. Use it to “boost volume” without clipping.
Compression
Other Plugins
Noise reduction, stereo imaging, etc.
Saturation
A hot signal recorded to tape oversaturates it, giving it additional harmonics, character and “grit”. This effect can be emulated with digital plugins.