Yep, on a bigger film set there may be a dedicated ‘data wrangler’ whose job is primarily the safe offloading and duplicating of media files. When you consider the number of people involved on a higher end film, not to mention any stunts, effects, or performances that are not easy to repeat, the cost of the media captured is extremely high. For this reason, professional workflows and positions exist to safeguard the data. But there’s more to the ever-evolving role of a modern DIT. A DIT or “Digital Imaging Technician” is responsible for media backup, possibly sync and transcode, applying a base grade called a “one-light” for the creation of viewable ‘dailies’, and occasionally the more technical job of camera consultation for digital productions. Because they take the “exposed” camera “mags” and apply a look for the crew to be able to evaluate, they are in some ways like a modern day film processing lab. They can play a role in helping communicate the intention of the cinematographer from set to post.
Here’s the general workflow: Camera operators will eject the “exposed” camera media, mark it with red tape and write an incrementing number to identify the “mag” (AKA “memory card”) and the camera it came from. For example, if it’s the seventh mag of the day from camera A it might get labeled A007. The camera crew will put red tape over the contacts of the memory card. The DIT would receive the card at a dedicated place on their cart, usually on the left side, put the red tape with the mag name on top of the card reader, put the ‘mag’ into the reader, copy the footage to a local hard drive array using special software that verifies the success of the copy (dump the entire contents of the card to the hard drive array) and then make more copies for safety as well as for the members of production and post production who will need the footage. This is the part of the job known as “Data Wrangling”, which, again, on bigger shows is a standalone job.
It’s also wise to keep a log of the cards you’ve processed so you can account for them. A simple Excel spreadsheet is sufficient. Accurate tracking of data and information is important in this job. Attention to detail, trustworthiness, focus and a clear head in high-pressure situations are all part of being a good DIT.
Using your operating system’s file browser (Mac’s “Finder” or Windows “Explorer”) to perform a file copy is generally frowned upon. Though the likelihood of anything going wrong is very slim, and the OS does a lot of file verification on its own, the industry standard, mandated by insurance companies, is to perform a checksum. A checksum tool reads every byte of the data being copied and computes an alphanumeric hash that acts as a compact fingerprint of that data, once for the source and once for the destination. If the hashes at the source and destination match, the data is assumed to be identical. This process makes the copy take longer, but adds a bit of security to the process of file transfer.
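As a rough illustration of what a checksum-verified copy involves, here is a minimal Python sketch. It is not how any particular offload tool works internally; the function names are made up for this example, and MD5 from the standard library simply stands in for whatever hash your delivery spec asks for.

```python
import hashlib
import shutil
from pathlib import Path


def file_hash(path: Path, algorithm: str = "md5", chunk_size: int = 1024 * 1024) -> str:
    """Hash every byte of a file, reading in chunks so memory use stays flat."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_copy(source: Path, destination: Path) -> str:
    """Copy one file, then hash both sides and raise if they differ.

    Hypothetical helper for illustration only.
    """
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)
    source_hash = file_hash(source)
    destination_hash = file_hash(destination)
    if source_hash != destination_hash:
        raise IOError(f"Checksum mismatch for {source.name}")
    return source_hash


# Example call with hypothetical paths:
# verified_copy(Path("/Volumes/A007/clip.mov"), Path("/Volumes/RAID/Day01/A007/clip.mov"))
```

Note that this reads the source twice (once to copy, once to hash), which is exactly the extra time cost described above.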
We cover sound sync later on. Sometimes that’s the domain of the DIT but it’s often not the case in modern productions. Sometimes the production isn’t even shooting dual system: the sound dept. is down-mixing to camera, so you don’t need to sync the slightly higher quality audio for dailies. In other cases the post team (assistant editor) will handle sync for the show.
There are some limitations to Resolve for DIT work, but you can’t beat the price (free). Most of the limitations are speed-related. When you get an application dedicated to doing one thing it’s inherently more efficient. Resolve’s “Clone” feature on the first “Media” page can be used to generate a verified duplicate of the files on your camera media without purchasing any software. It may be slower than alternatives, but this whole idea of checksumming is debatable as your operating system does it on any file copy anyway.
Incidentally, if you’re transcoding files, there’s no background rendering in Resolve, so you may encounter a similar speed barrier. If you’re on a bigger production where you need to transcode a lot of material, having two computers available to you and using Resolve’s remote rendering feature will likely be required.
Bob Trim: “The biggest issue comes with exporting for AVID. As of version 10.1 of DaVinci, has had issues with the export of proper .mxf, .aaf, and .ale files needed for a feasible workflow within AVID. The issue is with both AVID and Resolve on the metadata side of things. You will need to find another tool for DIT work when asked for AVID deliverables. If the assets will be edited by you, or by a smaller posthouse where extensive VFX or audio work is not needed, DaVinci is a wonderful tool to use.”
Excellent software, slightly less intuitive than the alternatives, but it offers cataloging features as well.
This is an easy-to-use and very popular alternative.
Hedge is a bit of a unique solution in that it prioritizes both speed and file verification. Normally, running a checksum can make your copy take over 50% longer. That makes a huge difference when you’re dealing with lots of data.
When your choice is between getting a copy 4-5x faster (or even 1.5-2x faster) and running checksums, you’re better off getting the copy faster and doing the visual inspection right away. Most applications that run checksums need additional time to prepare an MHL (media hash list) or other set of hashes to act as checksums. The exception is programs like Hedge, which cache the data in RAM so that checksum calculation doesn’t add separate read operations on the original media. A lot can happen in just an additional 10 minutes when it comes to copying footage, and if you want to reduce the actual risk of data loss, minimizing the time it takes to get that second copy is essential. The trade-off is that if you cache to RAM and it’s not ECC RAM, you’re actually introducing more risk that the copy will go wrong.
Hedge reads the data from the camera media, briefly caches it to RAM, and then writes it to multiple output locations simultaneously. At the same time it generates hashes from the RAM-cached data. RAM caching means that it’s capable of creating multiple copies and a media hash list faster than Finder or Windows Explorer can make a single copy. That’s a program with real value.
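To make the read-once idea concrete, here is a small Python sketch of the general pattern: each chunk is read from the source a single time, hashed from memory, and written to every destination. This is only an illustration of the concept, not Hedge’s actual implementation; the function name is hypothetical, MD5 is an arbitrary choice, and the writes here happen one after another per chunk rather than truly in parallel.

```python
import hashlib
from contextlib import ExitStack
from pathlib import Path
from typing import Iterable


def fan_out_copy(source: Path, destinations: Iterable[Path],
                 chunk_size: int = 4 * 1024 * 1024) -> str:
    """Read the source once; hash each in-memory chunk and write it to every destination.

    Hypothetical helper: returns the hash of the data as it was read from the source.
    """
    digest = hashlib.md5()
    targets = [Path(d) for d in destinations]
    for target in targets:
        target.parent.mkdir(parents=True, exist_ok=True)

    with ExitStack() as stack:
        reader = stack.enter_context(source.open("rb"))
        writers = [stack.enter_context(t.open("wb")) for t in targets]
        for chunk in iter(lambda: reader.read(chunk_size), b""):
            digest.update(chunk)       # hash comes from the chunk held in RAM
            for writer in writers:
                writer.write(chunk)    # same chunk goes to every output
    return digest.hexdigest()
```

Because the hash is computed from the chunk in memory, the source media is never read a second time. That is where the speed comes from, and also why the quality of the RAM matters.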
The caveat here is that by using a RAM cache, it actually increases the odds of a copy error, unless you’re running it on workstation-class hardware with error-correcting RAM (ECC RAM). Not by a small amount either: the probabilities of error go up by several orders of magnitude, to the point where they can’t be ignored. Which means if your copy application is using a RAM cache, suddenly checksums add real value in verifying the copy went well!
Paul, one of the developers from Hedge:
“Other apps in the space all bypass the provisions the OS offers for copying and use a brute force method to determine if there’s an issue. We do it the other way around: we use the OS-built in issue reporting (that happens about once in 100m times), but cross-verify the results. That results in about one in 1m files with an issue, which we then fix. Hedge is vastly more complicated under the hood than it seems. That we don’t have the interface tell you every bit and piece is deliberate: it’s what makes the UX good.”
All that being said, digital files are still subject to corruption. The most frequent culprits are the media itself or the camera that writes it. If your camera writes faulty data to a card, all a checksum does is verify that the corrupt data is corrupt on the source and destination. It’s always a good idea to perform as much of a visual inspection of copied data as time allows. The DIT is often the first person with the chance to really see what’s being shot on a good display and in a controlled environment. You can act as a sort of QC, advising the production if issues arise.
Formatting a card will set your heart racing on big shoots, but mismanaged duplicate data can be more dangerous than anything.
Quickly, before we talk of clearing the card: it’s good practice not to fill your media 100%. Try to swap for a new mag when the old one is around 80% full.
SD cards above 32GB will automatically be formatted with a file system called exFAT, which is both Mac and PC compatible. Develop a good habit now of managing your files from camera card to computer. Formatting offloaded cards from the computer means that if you or someone else puts an offloaded card in a camera they’ll often get a message from the camera suggesting a format. If they get a card with media still on it, there’s always the question of whether or not it’s been offloaded. Again, it’s easier for a camera operator to receive the formatted mag, check that it’s clear, and then perform a camera re-format. The operator will likely contact you if there is still media on the card. Formatting on the computer can be done from “Disk Utility” on a Mac, but my favorite way is via a simple, free app called ParaShoot.
Once footage is safely offloaded, it’s not a bad idea to mark the camera media with a piece of green tape and send it back to the camera it came from.
ParaShoot is a desktop application that erases the card by flipping every bit of the first 2MB (2 * 1024 * 1024 bytes) of the card. Every 0 becomes a 1 and every 1 a 0. This destroys all file system information in a controlled way and can be easily reversed. What this means practically is that you can ‘erase’ your card but still recover its data if it hasn’t yet been written over. Because the card’s file system is corrupt, putting the card in a camera will trigger the camera to request a card format. Again, if the camera operator gets a card that does not request a format, they’ll radio in and verify you got the card.
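The reason this kind of erase is reversible is that inverting bits is its own undo: flip them a second time and you are back where you started. The sketch below shows the idea in Python on an ordinary file, such as a disk image. It is an illustration of the principle only, not the app’s actual code, and the function name is hypothetical. Do not point it at a real card.

```python
from pathlib import Path

FLIP_BYTES = 2 * 1024 * 1024  # the first 2MB, as described above

# Lookup table mapping every byte value to its bitwise inverse (0x00 -> 0xFF, etc.)
INVERT_TABLE = bytes(value ^ 0xFF for value in range(256))


def flip_leading_bits(image_path: Path, length: int = FLIP_BYTES) -> int:
    """Invert every bit in the first `length` bytes of a file, in place.

    Hypothetical helper. Returns the number of bytes that were flipped.
    Running it twice restores the original contents.
    """
    with image_path.open("r+b") as handle:
        block = handle.read(length)
        handle.seek(0)
        handle.write(block.translate(INVERT_TABLE))
    return len(block)
```

Everything after the first 2MB is left untouched, which is why the footage itself is still recoverable until the camera writes over it.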
The main array that you’ll put all your footage on needs to be sizable and speedy. Only after the footage is transferred to the main array is it transferred to the various redundant shuttle drives. Backup to the main array always comes first if you have the time for it. You can copy from that to your backup/shuttle drives simultaneously throughout the day. At day’s end, if you get a mag when everything is shutting down, you can copy from one card to all three locations (your shuttle drives and main array) all at once, but expect a performance hit of around 10%. Ultimately, one of those shuttle drives will go to the post house and be transferred to some sort of shared storage for use by the editorial team. Because of this, some projects will use a NAS with direct-attached-storage functionality to offload files on set and then plug it into the network when they arrive at the post facility. This is less common but one approach.
Pegasus and OWC Thunderbay are popular arrays. A QNAP NAS is pretty comparable price-wise but packs a slew of additional features, which makes it a valid DAS and NAS all-in-one. Its Thunderbolt 3 connection means the 8-bay units can read/write at speeds of over 1500MB/sec. Opt for the QNAP as soon as you want multiple machines connected to the same storage, on set or in post. That’s why it’s “Network” Attached Storage. In the studio these other machines would be editorial workstations. In the field, another computer dedicated to generating proxies is the biggest reason to do this, especially if you’re using Resolve to transcode.
Offloading the card to a RAID array is not what we mean by creating backups. Making backups is essential for critical work. Redundancy is a big deal, and keeping a copy offsite (via physical drive or in the cloud) is an even more secure ideal. In the normal DIT workflow the RAID array has some redundancy but wouldn’t be considered a backup. The shuttle drives that are going to their various destinations are a great backup.
The positions of DIT and assistant editor can also overlap. Sometimes a DIT will also be asked to make dailies/rushes/transcodes. These terms are often used interchangeably so we won’t get too picky here, though I prefer dailies/rushes referring to files destined for above-the-line creatives and proxies/transcodes for files destined for editorial. That said, you’ll hear editors refer all the time to making an assembly cut off dailies. In short, the idea with all of them is the camera shoots to a high quality format that isn’t going to play, or at least not play smoothly, on a lot of different machines. The idea with all of these is converting or ‘transcoding’ from the camera original format to a format that either plays well for the editors in their edit bays, or streams well for creatives (potentially on their cell phones).
“Dailies” (UK “rushes”) are simply easy-to-share, smaller video files that the creative team can receive daily to check the results of the previous day’s shoot. This term comes from the overnight processing of the film shot one day resulting in the lab returning developed film for preview the next day. Most modern productions will upload these “securely” to the ‘cloud’ so various creatives can view them at their own convenience. They’re lightweight enough to be shared online, and they’ll often contain burn-ins: text overlays with useful information. They are transcoded to something ubiquitous like H.264. Frame.io makes this pretty simple.
“Transcodes/Proxies” can be a more broad term for files destined for editorial. These are files created in a way that guarantees they will edit smoothly in a shared workgroup in the Non-Linear Editor. These transcodes are a part of those above-mentioned files that will be put on a drive and sent to the post-production house. You may make one set of files used as both dailies for creative review and transcodes for editorial, but they could go to different places so make sure you know what’s expected.
“Burn-ins” may occasionally be requested. This is just metadata shown as text and ‘burnt in’ to your footage. It’s most often source timecode, file or clip name and possibly/rarely even scene and take info. Workspace>Data Burn In is the panel where you configure this in Resolve. “Project” applies to the whole project and “Clip” to just that one clip.
Sometimes exposure or color balance issues can affect the way a creative or an editor might handle the footage. Some cameras shoot a flat-looking ‘log’ image. For these reasons, the DIT often employs some very primitive color correction to balance the shots. They’ll have a “one-light” grade applied to balance the footage into something that’s pleasant to view, but nothing artistically drastic. Again, this is why the DIT can sort of perform a similar function to the old film processing labs as they are first to ‘develop’ the footage.
Camera reports, script supervisor’s notes, and sound notes head to post with the DIT’s drives containing original camera media and one-light transcodes. Pomfort has good material on their website detailing this if you want more information.
You’ll get a confusing array of answers on what the “industry standard” folder structure is for moving camera media to a hard drive but there are some common ideas. I would first advocate that you find a way that makes sense for your workflow. Personally-tailored experience is far more valuable than other people’s opinions on the internet. If you’re working on a larger production, the best approach is to simply ask whoever you are delivering to what they want. If you’re the one creating dailies, backups and archival, find out from the post house how they want the files organized. The jobs of DIT, Data Wrangler and Dailies operator could be three separate positions, depending on the scale of the shoot, so again, knowing your place and what’s expected is paramount. There should be someone in the post house (post supervisor, editor or assistant editor) with an opinion on how things are organized. Here are some guidelines:
The placement of footage into daily (or half-day) folders and scene classifications helps clear up any misplaced scene or take labeling. It’s another form of QC and data verification provided by a DIT. Separating footage by day makes it easier to keep footage organized and ensure nothing goes missing.
Using CamelCase or _underscore in place of spaces is generally best practice. CamelCase is shorter, still quite readable, and it makes the hyphens and underscores a more apparent organizational tool. E.g. 20190808-TheProject-v01. I used to work more with underscores and hyphens but found it easier to solely use hyphens as separators. Again, it seems like minutiae, but it doesn’t hurt to pick something and stay consistent.
Production Name
    Footage
        Day01-20190913
            A_Cam
                A001
                A002
                Proxies (if camera-generated, I’d put proxies here so they stay by their online counterpart)
            B_Cam
                B001
                B002
            Sound (dump of sound department’s cards, possibly only at lunch or end of day.)
    Looks
    Reports
Similar folder structure as it would look inside Finder:
The filename is Camera, Mag Number, Clip Number, Month and Day, Random 2-letter Hash. It tries to obviate file name repetition. Reel names are one way to avoid conforming issues. For example, this was a shot I created from camera A, mag 54, clip 38, December 14th. Add reel names in Resolve if they aren’t there already.
Canon’s XF-AVC update to C200 is similar:
I did a firmware update mostly because of customizable file-naming so not every first clip on a disk is named clip 0001.mp4. That’s a consumer-oriented compression scheme with a weak file naming protocol. Using MXF instead, I get the same compression, but in a different wrapper and with more robust support for things like useful filenames: A007C065_180712FR_NAME1.MXF
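If you ever need to sort or log clips by these fields, a structured filename like the one above can be pulled apart with a regular expression. The Python sketch below is just one way to do it. The field meanings are my reading of the example (camera letter, three-digit mag, three-digit clip, six-digit date, two-character random suffix, optional user field) and are assumptions, not a published spec; the function name is hypothetical.

```python
import re
from typing import Optional

# Assumed layout, based on the example A007C065_180712FR_NAME1.MXF
CLIP_NAME = re.compile(
    r"^(?P<camera>[A-Z])"          # camera letter, e.g. A
    r"(?P<mag>\d{3})"              # mag / reel number, e.g. 007
    r"C(?P<clip>\d{3})"            # clip number, e.g. 065
    r"_(?P<date>\d{6})"            # date digits, kept as written, e.g. 180712
    r"(?P<suffix>[A-Z0-9]{2})"     # two-character random suffix, e.g. FR
    r"(?:_(?P<user_field>[^.]+))?" # optional user-defined field, e.g. NAME1
    r"\.(?P<extension>\w+)$"       # file extension, e.g. MXF
)


def parse_clip_name(filename: str) -> Optional[dict]:
    """Split a structured clip filename into its fields, or return None if it doesn't fit."""
    match = CLIP_NAME.match(filename)
    if match is None:
        return None
    fields = match.groupdict()
    fields["mag"] = int(fields["mag"])
    fields["clip"] = int(fields["clip"])
    return fields


info = parse_clip_name("A007C065_180712FR_NAME1.MXF")
print(info["camera"], info["mag"], info["clip"])  # prints: A 7 65
```

A generic name like clip0001.mp4 returns None here, which is a quick way to flag footage that came off a camera without a proper naming scheme set up.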
“If you’re working on a show like Vinyl that might have like, say a $200 million dollar budget for 9 episodes. That’s $25 million per episode and it’s 15 [filming] days. So that works out to roughly 2 or 3 million dollars a day and you’re the keeper of the footage. So if you screw up you might have cost production 2 to 3 million dollars in footage.”
“In pre-pro camera tests: That will be me, the DP, the colorist and maybe their dailies colorist is also there. We’ll sit there and tweak and tweak a deLOG to get it to where everybody is happy and then the colorist will just export the deLOG and send it to me.”
“I have a 16 terabyte RAID 0 that I use for my on set storage. Everything is either Thunderbolt 2 or USB 3 for connectivity when offloading”
“The thing that you have to realise when doing large transfers is that whatever is the slowest part of your chain will dictate how fast things go. So if all your hard drives are plugged directly into your computer but then your card reader is like the third in a chain on the USB hub, then your transfer speed is going to be dictated by your card reader.”
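The point in that quote is simple arithmetic: the whole chain runs at the speed of its slowest link. Here is a tiny Python sketch for estimating an offload time that way. The function name and all the speeds in the example are made-up illustrative figures, not measurements of any real hardware.

```python
from typing import Dict, Tuple


def estimated_transfer_minutes(size_gb: float,
                               link_speeds_mb_s: Dict[str, float]) -> Tuple[str, float]:
    """Return the name of the slowest link and the transfer time it implies, in minutes.

    Assumes 1 GB = 1000 MB and that the slowest link alone sets the pace.
    """
    slowest = min(link_speeds_mb_s, key=link_speeds_mb_s.get)
    seconds = size_gb * 1000 / link_speeds_mb_s[slowest]
    return slowest, seconds / 60


# Hypothetical chain: a card reader hanging off a USB hub, feeding a fast RAID.
chain = {
    "card_reader_on_hub": 90.0,   # MB/s, illustrative only
    "raid_array": 1500.0,         # MB/s, illustrative only
    "shuttle_drive": 400.0,       # MB/s, illustrative only
}
bottleneck, minutes = estimated_transfer_minutes(256, chain)
print(f"{bottleneck}: about {minutes:.0f} minutes for a 256 GB mag")
```

In this made-up example the fast RAID makes no difference at all: the card reader on the hub decides how long you wait.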
A ‘headless’ server has no input devices attached to it. These can be essential as “render nodes” for transcode.
“If you use a Windows PC as a transcoding workstation you just have to hook it up via thunderbolt to the Mac and setup a homegroup with file access. Thunderbolt is able to emulate a 10GbE connection giving the Mac access to whatever storage is attached to the PC. Of course, that’s in theory (I haven’t tested it yet).”
“We are running the TVS-882ST3 as on-location storage for our DITs. Works great in the field with fast TB3 transfers and back in the office super fast transfers with the built in 10GbE to our main servers. Only thing that isn’t working as expected is the T2E (thunderbolt to Ethernet) functionality where you can use the nas as 10GbE interface for your TB3 equipped Mac.”
“MysteryBox: Regardless of what we’re putting the footage on for a client later, we always copy to a local RAID first: a RAID 5 Promise Pegasus2 R8 at the office, or a G-Technology G-SPEED Shuttle XL for on location transfers.”
Organization is important