🎬 Sound Sync

You’ll recall that you have the option of recording your sound directly into camera or to an external device (dual system). The latter approach means you need to sync sound and picture in post-production.

I’m still of the opinion that modern pre-amps are so good that you often should just record from an external pre-amp into camera on smaller shoots or when the camera is in easy proximity to do so. However, in the case where you can’t, you have three options for syncing sound to picture.

Timecode Sync

If you have to sync, this is the ideal solution. Every recording device runs a matching clock, and the clock’s value is written to metadata in each file. All the software has to do in post is match up the clocks.
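As a rough illustration of what “matching up the clocks” means, here is a minimal Python sketch that turns two start timecodes into a frame offset. The function names and the example timecodes are hypothetical, and it assumes non-drop-frame timecode at a whole-number frame rate; real software reads these values from file metadata and also handles drop-frame rates.

```python
def timecode_to_frames(tc: str, fps: int = 24) -> int:
    """Convert non-drop-frame 'HH:MM:SS:FF' timecode to an absolute frame count."""
    hours, minutes, seconds, frames = (int(part) for part in tc.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def sync_offset_frames(video_start: str, audio_start: str, fps: int = 24) -> int:
    """Frames between the start of the video clip and the start of the audio clip.

    Positive: the audio recorder started after the camera.
    Negative: the audio recorder was already rolling when the camera started.
    """
    return timecode_to_frames(audio_start, fps) - timecode_to_frames(video_start, fps)


# Hypothetical start timecodes for one take.
print(sync_offset_frames("14:22:10:05", "14:22:08:17"))  # -36
```

Here the result is -36, meaning the recorder started 36 frames (1.5 seconds at 24 fps) before the camera, so the audio clip gets slid that far ahead of the picture on the timeline.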

Typically the “Master Clock” is the definitive source of what this timecode clock should be. The sound recordist will have a field mixer with a built-in crystal sync clock that keeps extremely accurate time. All other devices will be “jammed” from this source. You’ll need to make sure the receiving device accepts timecode in a compatible format (can be jammed) and that you have the correct cable to do so.

Sound Devices Mixer

Even when a camera uses timecode as a feature, it will generally not have a clock with the accuracy of a dedicated audio device. Usually the field mixer will jam a small external box containing an accurate clock and then that clock will be plugged in to ‘jam’ the camera.

Ambient Recording NanoLockit Wireless Timecode Generator, ACN Compatible

I love the recent adoption of affordable devices like the Tentacle Sync, which records timecode as an analog audio signal. This means you can use a cheap camera with a 3.5 mm mic input rather than one with a dedicated timecode input/output; a timecode BNC port on a camera usually means it costs much more money. Devices like the Tentacle are simple to use: just jam sync the two Tentacles together, plug one into your sound recorder and one into your camera’s mic input, and you’re good to go.

Waveform Sync

Feel free to download and use the two files above to test both waveform and sync at home.

If you don’t have timecode, your next best option is usually to tell the software to analyze the contents of the audio tracks and see if there are any matching audio “scratch tracks” in the video files. This can be on-camera audio acquired by low quality internal camera mics, or a downmix sent from the sound department to the camera. However it happens, the camera’s recorded clips must have some sort of audio for this to work. It’s not a perfect solution, often ending in false positives and requiring some manual sync, but it can work well. Every modern NLE now has this feature, but PluralEyes still does it best, and if you’re syncing a lot it’s worth buying a license.

Syncing sound based on waveform is simple: just select any bins containing picture or sound (the CMD or Ctrl key will help you here), then right-click and choose “Auto Sync Audio” > “Based on Waveform”.
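If you’re curious what the software is doing when it analyzes the audio, the usual idea is cross-correlation: slide one waveform against the other and keep the shift where they agree most. The Python sketch below is a toy version with made-up sample values and a hypothetical `best_lag` function. It is not how any particular NLE or PluralEyes implements the feature, and real tools work on far longer, noisier signals.

```python
from typing import Sequence


def best_lag(scratch: Sequence[float], clean: Sequence[float], max_lag: int) -> int:
    """Return the shift (in samples) that best lines `clean` up with `scratch`.

    A result of +n means the matching sound sits n samples later in the
    camera's scratch track than in the recorder's clean track.
    """
    def score(lag: int) -> float:
        # Sum of products where the two signals overlap at this shift.
        total = 0.0
        for i, sample in enumerate(clean):
            j = i + lag
            if 0 <= j < len(scratch):
                total += sample * scratch[j]
        return total

    return max(range(-max_lag, max_lag + 1), key=score)


# Toy data: the scratch track is a quieter copy of the clean track, 3 samples late.
clean = [0.0, 0.0, 1.0, -1.0, 0.5, 0.0, 0.0, 0.0]
scratch = [0.0, 0.0, 0.0, 0.0, 0.0, 0.5, -0.5, 0.25, 0.0, 0.0, 0.0]
print(best_lag(scratch, clean, max_lag=5))  # 3
```

This also hints at where false positives come from: repetitive or very quiet audio can score nearly as well at the wrong shift as at the right one.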

Manual Sync

The most tedious option, by far, is manual sync. Painful as it can be, it’s often necessary, even on professional shoots. If a clapper or slate was used during production, this becomes easier: all you have to do is match the frame where the sticks close with the accompanying spike in the audio waveform. This is one reason a 2nd assistant camera will often call out the word ‘marker’. It makes a nice aural cue that the correct waveform spike (sometimes there’s more than one) is the one following the ‘marker’. It’s also why an AC should be careful not to double slate. Watch for “second sticks” too, meaning the shot was slated twice and you’ll need to sync to the correct waveform spike.

In the metadata section, we’ll look at how syncing this audio has added important video metadata as well.

In addition to syncing production sound, it’s nice to make sure all the audio in the project is converted to 48kHz and either 16-bit or 24-bit. Anything less than these can cause issues, and .wav and .aiff files will play nicer than .mp3 so convert the latter if you can’t get higher quality source material.
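If you have a folder full of WAV files and want to spot the ones that need converting, a small script can check them for you. This is a minimal Python sketch using the standard library’s `wave` module; `needs_conversion` and `write_silence` are hypothetical helper names, and it assumes plain PCM WAV files (other formats, including .aiff and .mp3, need a different tool).

```python
import os
import tempfile
import wave


def needs_conversion(path: str) -> bool:
    """True if a PCM WAV file is not 48 kHz at 16-bit or 24-bit."""
    with wave.open(path, "rb") as wav:
        rate_ok = wav.getframerate() == 48000
        depth_ok = wav.getsampwidth() in (2, 3)  # bytes per sample: 16-bit or 24-bit
    return not (rate_ok and depth_ok)


def write_silence(path: str, sample_rate: int, sample_width: int) -> None:
    """Write a short mono file of silence, only so the check has something to read."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00" * sample_width * 100)


with tempfile.TemporaryDirectory() as folder:
    cd_quality = os.path.join(folder, "music_44k.wav")
    production = os.path.join(folder, "boom_48k.wav")
    write_silence(cd_quality, 44100, 2)
    write_silence(production, 48000, 3)
    print(needs_conversion(cd_quality))  # True
    print(needs_conversion(production))  # False
```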

Understand the Workflow

On something like a feature film, the above-mentioned field recorder is capturing more than just one channel of audio. There could be five actors wearing lav mics, a boom mic above them, and several plant mics around the set.

When offline files are created for editorial, the assistant editor or DIT is syncing them to a mix track from the sound recorder. This is a mixdown of all the individual “iso” tracks. This means your various boom mics, lav mics, plant mics, etc. are all on one audio track and you can’t individually separate them. For the environment dailies are intended to suit, this isn’t a problem. For the progressive editor desiring maximum control however, it’s useful to sync to the full sound files with all of their individual channels. In some environments this simply isn’t feasible, but where possible it can also mitigate conform issues later on.

When we talk about syncing, we should note that a time will come when you’ll need to conform back to the original audio files. If you’re working in a small team, or by yourself, this will be another necessary part of the job after the edit is complete. This is covered in decent detail in the conforming section.

Realize that this production audio is only a portion of the final audio in a completed product. Keeping things well organized, and sync as accurate as possible, will make life easier for the sound editor (or potentially yourself) later on.

The 2-pop

We can’t conclude our sync discussion without mentioning the 2-pop. Though it’s being forgotten more and more in modern workflows, the famous 2-pop is still a useful tool for verifying sync before a reel of a motion picture or a broadcast event begins. It’s a 1 kHz tone, played for a single frame, two seconds before the program’s first frame; its short duration makes it sound like a blip or ‘pop’. Visually, it’s usually accompanied by the SMPTE countdown leader (that analog clock countdown you’ve seen before).
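To make “a 1 kHz tone for a single frame” concrete, here is a minimal Python sketch that generates the samples for one pop. The function name, the 24 fps / 48 kHz defaults, and the half-scale level are my assumptions for illustration; use whatever level your delivery spec calls for.

```python
import math


def two_pop_samples(fps: int = 24, sample_rate: int = 48000,
                    frequency: float = 1000.0, level: float = 0.5) -> list:
    """Return one frame's worth of a sine tone as signed 16-bit sample values."""
    samples_per_frame = sample_rate // fps  # 2000 samples at 24 fps and 48 kHz
    peak = level * 32767                    # 32767 is full scale for 16-bit audio
    return [
        int(peak * math.sin(2 * math.pi * frequency * n / sample_rate))
        for n in range(samples_per_frame)
    ]


pop = two_pop_samples()
print(len(pop))  # 2000
```

At 24 fps the pop lasts about 42 milliseconds, which is why it reads as a blip rather than a tone.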

A typical TV program starts at 1 hour, or 01:00:00:00. You’ll hear this called the “First Frame Of Action” (FFOA). This means the 2-pop will happen precisely at 00:59:58:00. In the film world, FFOA is usually 01:00:08:00 (due to the duration of the SMPTE countdown leader), so the 2-pop occurs at 01:00:06:00.
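The arithmetic is simply FFOA minus two seconds. Here is a minimal Python sketch of that calculation, assuming non-drop-frame timecode at a whole-number frame rate; `two_pop_timecode` is a hypothetical helper name.

```python
def two_pop_timecode(ffoa: str, fps: int = 24) -> str:
    """Return the timecode two seconds before the given first frame of action."""
    hours, minutes, seconds, frames = (int(part) for part in ffoa.split(":"))
    total_frames = ((hours * 60 + minutes) * 60 + seconds) * fps + frames
    total_frames -= 2 * fps  # back up exactly two seconds

    # Convert the frame count back to HH:MM:SS:FF.
    frames = total_frames % fps
    total_seconds = total_frames // fps
    return (f"{total_seconds // 3600:02d}:{total_seconds // 60 % 60:02d}:"
            f"{total_seconds % 60:02d}:{frames:02d}")


print(two_pop_timecode("01:00:00:00"))  # 00:59:58:00 (TV)
print(two_pop_timecode("01:00:08:00"))  # 01:00:06:00 (film)
```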

It’s good practice to place a 2-pop at the end of a sequence as well. If sync has drifted it’s an easy way to verify it.