Storing multiple copies is the most important strategy to ensure that your videos are not lost. You should make at least two copies of your originals, and keep them in different storage locations. Having copies allows you to recover content that has been accidentally deleted, tampered with, or corrupted. Keep one copy onsite with your originals so you can access it quickly if needed, and one copy offsite in case something happens at your physical space, such as theft or flooding.
Use a Backup Tool
For the parts of your storage that will be updated or changed (e.g. directories in which you add new videos or documentation, or that you plan to reorganize later), use a backup tool to make copies. Backup tools allow you to schedule regular backups to keep up with new content and updates. They can be more efficient than simple copying, because backups can be done incrementally, copying only what has changed since the last backup. Backup tools also facilitate the restoration process when damaged or lost files need to be recovered.
Your computer likely comes with backup software installed, such as Time Machine (Mac), or Backup and Restore or File History (PC). There is also various backup software that you can purchase (e.g. Backup Exec) or download for free (e.g. Bacula).
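To make the idea of an incremental backup concrete, here is a minimal Python sketch. It is an illustration only, not how any of the tools named above work internally. The function name `incremental_backup` and the rule for deciding that a file has changed (a different size or modification time) are assumptions made for this example.

```python
import shutil
from pathlib import Path


def incremental_backup(source: Path, backup: Path) -> list[Path]:
    """Copy only files that are new or changed since the last run.

    A file counts as changed when its size or modification time differs
    from the copy already in the backup. Returns the files that were copied.
    """
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        dst = backup / src.relative_to(source)
        if dst.exists():
            s, d = src.stat(), dst.stat()
            if s.st_size == d.st_size and int(s.st_mtime) == int(d.st_mtime):
                continue  # unchanged since the last backup, so skip it
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves the modification time
        copied.append(src)
    return copied
```

Running the function a second time without touching the source copies nothing, which is where the efficiency comes from. Note that this sketch overwrites the previous copy of a changed file and never removes anything from the backup; real backup tools typically also keep earlier versions so that you can restore them.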
What About Synchronization?
Some services like Dropbox and Google Drive offer synchronization, also known as replication or mirroring. Synchronization mirrors the files in one location (e.g. a directory on your computer) to another (e.g. your online account). Its purpose is to allow you to access the same content in multiple locations. Synchronization is an easy way to make copies, but it is important to note that, unlike regular copies or backups, synced locations are constantly updated to be identical to one another.
If you accidentally change or delete a file in one location, that change or deletion is also made in the synced location; you therefore cannot restore from a synced copy as you would from a regular copy or backup. Some services like Dropbox additionally offer backup for synced files, allowing you to restore deleted files or previous versions of changed files.
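The difference can be shown with a small simulation. This is a hedged sketch: plain Python dictionaries stand in for directories, and the names `mirror` and `back_up` are invented for the example rather than taken from any real service.

```python
def mirror(source: dict, replica: dict) -> None:
    """Synchronization: make the replica identical to the source."""
    replica.clear()
    replica.update(source)


def back_up(source: dict, snapshots: list) -> None:
    """Backup: store a point-in-time copy and keep the earlier ones."""
    snapshots.append(dict(source))


# File names mapped to contents stand in for a real directory.
working = {"interview.mov": "original footage"}
synced, snapshots = {}, []

mirror(working, synced)
back_up(working, snapshots)

del working["interview.mov"]   # accidental deletion
mirror(working, synced)        # the next sync repeats the deletion
back_up(working, snapshots)    # the next backup records the new state

print("interview.mov" in synced)        # False: the synced copy is gone too
print("interview.mov" in snapshots[0])  # True: the earlier backup still has it
```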
Separate Your Copies
Having lots of copies will not always protect you from loss if all your copies are in the same place. Imagine, for example, that a natural or man-made disaster strikes, that your property is seized, or that you are barred from entering the place where your collection is stored. Keeping copies in different places is one of the most important things you can do to safeguard your collection.
Separate your copies geographically in case something catastrophic happens at one site, such as a flood or bombing. The appropriate location for your secondary copies depends on the threats you face. For example, if you are in a politically unstable region, keep a copy outside of the region; if you are in an environmentally vulnerable area, keep a copy outside of the area.
If your organization is under threat, whether politically, financially, or otherwise, consider keeping a separate copy with another organization. Choose an organization that you trust with your collection, and that would not be vulnerable to the same threats as you. See the “Preserve” section in the workflow to learn more about finding a long-term archive.
Storage Media Separation
If possible, store your copies on 2 different types of storage media (e.g. networked storage, external hard drives, offline data tape, etc.) so that you are protected against the particular vulnerabilities of each one. For example, if your networked storage is compromised by hackers, it is good to have copies on offline external hard drives.
An important way to safeguard your collection is to control who has physical and electronic access to your storage devices.
- Keep your storage devices in a physically secure place, accessible only to those who need to handle the hardware.
- Store your video files on a volume separate from your other files to limit the number of people who need to access the storage location.
- Set your file sharing and permissions so that only certain people have write-access to the volumes where your collection is stored.
- When you need to provide access to particular files to someone, copy the file and make it accessible in a separate location.
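As one possible way to apply the write-access advice above at the file level, the following Python sketch removes write permission from every file in a folder. The function name `make_read_only` is an assumption for this example. It relies on POSIX-style permission bits; on Windows, `os.chmod` can only toggle a file's read-only flag. It does not replace account-level sharing settings, since the file's owner or an administrator can restore write permission.

```python
import os
import stat
from pathlib import Path


def make_read_only(root: Path) -> int:
    """Remove write permission from every file under root; return the count."""
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    count = 0
    for path in root.rglob("*"):
        if path.is_file():
            mode = stat.S_IMODE(path.stat().st_mode)
            os.chmod(path, mode & ~write_bits)  # keep read bits, drop write bits
            count += 1
    return count
```

Making stored originals read-only guards mainly against accidental edits and deletions by people who otherwise have legitimate access to the volume.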
Ensure File Fixity
Part of storage is making sure your stored files remain intact and unchanged over time. Fixity checks help with this: you compute a file’s hash value (also commonly called a checksum) and compare it with a previously computed hash value. See “Keeping Files Intact (and Proving It)” for more on hashes.
Compute (and make a record of) a hash when you first receive a file, and again when you want to check it. As long as a file remains exactly the same, its hash value will always be the same. If the file is altered in any way, its hash value will be different. You can compare hashes to confirm that files have not become corrupted, that you have copied a file properly from one location to another, or to see if two files in different storage locations are the same as one another.
If a fixity check shows that a file has been altered or corrupted, restore (i.e. make a new copy of) the original file from one of your backup copies.
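The routine described above (record a hash on receipt, recompute later, compare) can be sketched in Python. This is a minimal illustration under stated assumptions: the function names, the JSON manifest format, and the choice of SHA-256 are all choices made for this example, not a prescribed method.

```python
import hashlib
import json
from pathlib import Path


def file_hash(path: Path, algorithm: str = "sha256") -> str:
    """Return the hex hash of a file, read in chunks to handle large videos."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_hashes(root: Path, manifest: Path) -> None:
    """Compute a hash for every file under root and save them as JSON."""
    hashes = {
        str(p.relative_to(root)): file_hash(p)
        for p in sorted(root.rglob("*")) if p.is_file()
    }
    manifest.write_text(json.dumps(hashes, indent=2))


def check_fixity(root: Path, manifest: Path) -> list[str]:
    """Return the recorded files that are now missing or have a different hash."""
    recorded = json.loads(manifest.read_text())
    failed = []
    for name, old_hash in recorded.items():
        path = root / name
        if not path.is_file() or file_hash(path) != old_hash:
            failed.append(name)
    return failed
```

Keep the manifest file outside the folder being checked, and ideally store a copy of it with each backup. Any file name returned by `check_fixity` is a candidate for restoring from an intact copy.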
Refresh Your Storage Media
More than likely, you have experienced the frustration of a failed hard drive, a jammed optical disc, or an odd-sized memory card for which you have no card reader. No matter what kind of media or device you use, none is designed to last beyond the short term. The actual lifespan of a piece of media or hardware depends on many factors, such as its environment and use, but you should anticipate needing to replace your storage media and hardware every few years (approximately 3-5).
Any information about a video, from technical information embedded in the file that allows the video to function, such as format and duration, to descriptive information about the content that helps you understand or find it, such as keywords, security restrictions, geographic locations, and so on. Metadata is critical to any future use, and is important throughout the archiving process.
Despite what is sometimes said, images almost never speak for themselves. They require context and description to make sense, to corroborate their factuality, and to be accessible beyond one person’s memory or desktop.
Metadata can be automatically generated and embedded in the file, such as with technical metadata, or it can be manually recorded on an external medium, such as with descriptions, security flags, and keywords in a database. Metadata capture sometimes needs to be manually enabled on your device, such as with GPS or location services.
The quality of a record having all of the information it contained when it was created, with its original context maintained. Incomplete records are not as reliable as complete ones, since one might not know what information is missing and why. Transcoding a video to another format can reduce the image quality and discard metadata, making the video less complete and therefore less reliable. Keeping original video files, documenting context, and organizing videos in a way that maintains the original order of video files all contribute to the completeness of the video records.
The quality of being genuine, not fake or counterfeit, and free from tampering. Authenticity means that an object was actually created by the person represented as its creator, and that it was actually created at the time and place that is represented as its time and place of creation. Video footage that has been manipulated or altered but is represented as if it had not been, for example, is not authentic.
To authenticate a video means to verify the relationship between it and its creator and point of creation. Documentation about who created something, when and where it was created, and the chain of custody can provide a starting point for this authentication process.
In the digital realm, the “original file” is any copy of a file that is exactly the same (i.e. bit-for-bit) as the file in question when it was created. This means that there are no accidental or deliberate alterations to any aspect of the file, including its format and technical specifications.
A term derived from “malicious software” that refers to computer viruses, worms, Trojans, spyware, and other harmful programs.
The process of identifying materials to be acquired, or to be preserved, because of their enduring value. Having selection criteria, or a selection policy, helps ensure you acquire and save only what is most important.
Chain of Custody
Chronological documentation that shows who has held or controlled a video file from the moment it was created. The ability to show an unbroken chain of custody is one important indicator of the authenticity of a video, and therefore a factor in using video as evidence.
To receive data from a remote computer system and save it in a local computer system. The inverse of download is “upload.”
The earliest generation or highest-quality output of a video from which duplicates are made.
The archival principle of maintaining files in the same order they were created. Original order is important to preserve context and the relationship between individual files, so that you can make sense of each file and of the whole. Keeping files in their original context makes them more complete and reliable.
The quality of being whole, unaltered, and uncorrupted. A file that is not intact may not be usable or may have decreased informational and evidential value. Video files can lose their integrity if they are accidentally mishandled, deliberately tampered with, or if data corruption occurs in transfer or storage due to hardware or software malfunction. The best way to ensure integrity is to establish a system to check file fixity regularly (e.g. by computing hashes and checking them against a registry of previously computed hashes) and to restore any corrupted files from an intact copy.
A legal protection intended to give the creator of original work exclusive rights to their work for a designated length of time. It gives the creator the exclusive right to copy, use, adapt, show, and distribute their own work, and the right to determine who else can copy, use, adapt, show, and distribute the work.
To re-encode a digital file to a different encoding scheme, such as converting an H.264/MPEG-4 AVC video to Apple ProRes. Transcoding is usually done when a video’s encoding is not supported by the system that needs to use it. Transcoding fundamentally alters the file, although lossless methods can allow the original data to be reconstructed from the transcoded data.
A copy of a video generated from a master that is usually in a different format and of lower quality than the master. Derivatives can be made for various uses, such as web upload or DVD.
The specification by which a digital file is encoded. Some file formats are designed to store particular kinds of data while others are more like containers that can hold many kinds of data. Common video file formats like QuickTime (.mov), AVI, and MP4 are container formats that contain video and audio streams, metadata, subtitle tracks, etc.
A way of interacting with a computer program which involves typing lines of text in a command-line shell. Some programs are only available with command-line interfaces, which facilitate their automation and use in programming scripts. However, command-line interfaces can be harder for casual computer users to interact with than graphical user interfaces (GUI), which use windows, icons, menus, and pointers.
Metadata that is stored within the digital object it describes. Some embedded metadata, such as file size, is essential to the functioning of the file, and is always written to the file by the device or software system. Other embedded metadata is non-essential and can be optionally added (e.g. rights information). Embedded metadata is not guaranteed to be accurate: for example, a camera set to the wrong date will embed the wrong date. Embedded metadata stays with the digital object as long as the object is intact, but can be intentionally stripped or altered. Embedded metadata can be lost if a file is transcoded to another format.
Graphical User Interface
A way of interacting with a computer program that involves using windows, icons, menus, and pointers. Most computer users are familiar with graphical user interfaces. GUIs can be easier for casual users to interact with than command-line interfaces (CLI), which require commands to be typed as lines of text.
An algorithm that computes a hash value or checksum from any set of data, like a file. Common hash functions include MD5 and SHA-1. Hash functions are used to check file integrity and for security purposes.
The string of alphanumeric characters that results from running a hash function algorithm on data or a file. The hash value of a file will always remain the same as long as the file is unchanged, so it can be used to identify altered, corrupted, and duplicated files.
Creating and organizing descriptive information in a structured way so that resources can be found, used, and understood. Cataloging expands on basic metadata, and enables users to access content in multiple ways.
The process of keeping track of media, such as the video files in your collection, and overseeing any actions performed on your media, such as backup, refreshment or migration. Media management can be performed manually, or with the aid of a software system (e.g. a media asset management (MAM) system).
To export a completed video at the end of the post-production process. It is important to always output a master.
Edit Decision List (EDL)
A document used in video post-production that contains a list of clips used in an edited video. EDLs originate from older film and video workflows when editing was a two stage process. Today, they can be used to move editing projects from one software or system to another. EDLs also provide useful documentation, showing what source files were used to create an edited video.
A number, word, or symbol for unambiguously identifying and distinguishing an object from other objects in a set. Common everyday unique identifiers include computer logins, credit card numbers, tax ID numbers, and so on. Applying unique identifiers to video files makes it easier to identify, distinguish, and organize videos and related documents.
A self-describing container – usually a clearly named folder or directory – used to keep media and its related documentation or metadata together.
The ability of a user to easily find what they are looking for.
A copy of data, stored in a secondary location, which is used to restore data in the primary storage location that is corrupted or lost. Restoring involves copying data from the backup to the primary storage to replace the corrupted or lost files. Backing up is a storage strategy that allows you to recover from data loss.
In an information technology (IT) system, the quality of being able to exchange information with another system and being able to use that information. Using widely adopted formats, metadata standards, and controlled vocabularies enhances interoperability.
A system that acquires, stores, monitors, preserves, and provides access to its resources, run by an organization committed to providing long-term access to authenticated content to its users. A repository requires significant infrastructure to build and maintain.
Redundant Array of Independent Disks (RAID)
A storage technology that combines multiple hard drives together to provide fault tolerance and better performance. Data is spread out across the drives, along with additional calculated data, so that data can be re-generated if part of the array fails. RAID protects you against data loss in the case of hardware failure. Unlike backup, RAID does not offer protection against file corruption or deletion, or data loss to malware, theft, or natural disaster.
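The “additional calculated data” mentioned above can be illustrated with XOR parity, the idea behind parity-based RAID levels such as RAID 5. The Python sketch below is a simplified illustration, not a model of a real array: actual arrays work on stripes and spread parity across the drives, and mirrored levels such as RAID 1 use no parity at all. The helper name `xor_blocks` and the sample data are invented for the example.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three equal-sized data blocks, each imagined to sit on its own drive.
data = [b"VIDEO-01", b"VIDEO-02", b"VIDEO-03"]

# The "additional calculated data": a parity block kept on a fourth drive.
parity = xor_blocks(data)

# Suppose the drive holding data[1] fails. XOR-ing the surviving blocks
# with the parity block regenerates the lost block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```

The sketch also shows the limit described above: parity is computed from whatever is currently on the drives, so a corrupted or deleted file is faithfully "protected" in its corrupted or deleted state.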
The process of ensuring that computer files in one location are copied to one or more other locations on a regular basis. Synchronization is also referred to as mirroring or replication. Unlike backup, synchronization does not allow you to go “back in time” to recover lost or altered files.
The process of copying data from one storage medium to another to ensure continued access to the information as the storage medium becomes obsolete or degrades over time. It is one strategy for avoiding loss of digital information.
Related to integrity, the quality of being unchanged over a given period of time. Fixity maintains the authenticity of an object over time, and is key to the concept of preservation. Long-term fixity requires good policies and handling practices, sustainable infrastructure, and strong security. Regular fixity checks (e.g. computing and comparing checksums) are used to detect changes.
An interface standard for transferring data between digital devices, especially audio and video equipment. Developed by Apple in the 1990s, FireWire is becoming obsolete.
The process of encoding your files using a cryptographic algorithm so that only authorized parties with a “key” (e.g. a password) can decrypt them. The two main types of encryption are symmetric-key and public-key. Symmetric-key encryption uses the same key, or password, to encrypt and decrypt information. Public-key encryption uses one key to encrypt and a different one to decrypt, so the decryption key never has to be shared.
- Make (at least) 2 backup copies of your originals. Keep one backup copy onsite for quick recovery, and one offsite in case of major disaster.
- For the parts of your storage that are frequently updated or changed, use backup software that can perform incremental backups.
- Synchronization (also known as replication or mirroring) is not the same as backup. Synchronization does not allow you to go “back in time” to recover lost or changed files.
- Separate your copies in different geographic locations, on different media, and even with other organizations.
- Control physical and electronic access to your collection to prevent accidental or deliberate tampering and deletion.
- Use hashes (also known as checksums) to periodically check your files for errors to ensure data integrity.
- Consider your available IT support, the nature and size of your collection, and your access requirements when choosing storage media and devices.
- Different storage media and devices are ideal for different situations. Choose the ones that suit you.
- Fault tolerant storage (i.e. RAID) can protect your files when hardware fails, but it is not the same as copying or backup.
- Anticipate the need to refresh (i.e. replace) your storage media and devices every few (approximately 3-5) years.