Most individuals and organizations cannot do long-term preservation on their own. Rather, they partner with an institution that has a specific mandate for preservation, such as an archive, historical society, museum, or library. You may also look to institutions engaged in gathering evidence, like human rights organizations, documentation centers, and courts and tribunals that have archives.
Working with an Archive
An archive that is potentially interested in acquiring your collection will likely want to first assess whether the collection has value and fits the archive's interests, and what the usage restrictions will be. The archive will also want to do an initial survey of your collection to understand its size, scope, and formats. Having an inventory or catalog of your collection can facilitate this process.
Working with an archive does not mean that you must give up your collection. With digital collections, you can easily deposit an exact copy of all your videos and documentation to an archive, while holding on to your own copy. Because of their investment, most archives will want rights to provide access to your collection to their users, and many will want you to eventually donate your collection. Some archives, however, are open to deposit relationships where you do not have to give up ownership of your collection.
Video as Evidence TIP
Depositing a video with a trustworthy archive that performs regular repository audits can simplify your work of maintaining an unbroken chain of custody.
Choosing an Archive
There may be one or many institutions or organizations interested in acquiring your collection. When choosing a potential archive, there are several factors you should consider:
- Do you trust the archive (and the institution it may belong to) to take care of your collection and abide by its agreements with you (e.g. regarding security restrictions, access, preservation)?
- Does the archive have the staffing and infrastructure to meet the processing, storage, preservation, and access needs of your collection?
- Does the archive have a real interest in your collection, and experience and expertise in dealing with collections similar to yours?
- Can the archive accommodate your expectations for security and privacy restrictions?
- Do you want to retain ownership of your collection, or are you willing to transfer ownership to the archive? Some archives will accept collections they do not own, but some will not.
- Do you own the copyright or have rights to the content in your collection? If not, can you provide the archive with information about third-party rightsholders? Archives need to understand the rights restrictions in order to provide access.
- Are you able to get your collection to the archive?
If you work with an archive, draft a written agreement that outlines their acquisition of your collection and the terms of your relationship. This ensures that both sides clearly understand their rights and obligations, which ultimately protects the collection.
The main areas that the written agreement should address are:
- What exactly is being acquired by the archive? What is not being acquired?
- Ownership and rights: Who owns the collection, and what rights are being transferred to the archive?
- How will materials with restrictions be handled? When, if ever, will the restrictions expire?
- What is the archive responsible for? What are you responsible for?
Metadata: Any information about a video, from technical information embedded in the file that allows the video to function, such as format and duration, to descriptive information about the content that helps you understand or find it, such as keywords, security restrictions, and geographic locations. Metadata is critical to any future use, and is important throughout the archiving process. Despite what is sometimes said, images almost never speak for themselves. They require context and description to make sense, to corroborate their factuality, and to be accessible beyond one person’s memory or desktop. Metadata can be automatically generated and embedded in the file, such as with technical metadata, or it can be manually recorded on an external medium, such as with descriptions, security flags, and keywords in a database. Metadata capture sometimes needs to be manually enabled on your device, such as with GPS or location services.
Metadata Standard: A published document that describes how to create, use, and interpret metadata in a specific domain or for a specific purpose, which is intended to establish a common understanding among its users. A metadata standard defines the structure and meaning of its acceptable data elements, rules, and values. Many communities, including broadcasters, social scientists, and art museums, publish metadata standards to meet their descriptive needs.
Completeness: The quality of a record having all of the information it contained when it was created, with its original context maintained. Incomplete records are not as reliable as complete ones, since one might not know what information is missing and why. Transcoding a video to another format can reduce the image quality and discard metadata, making the video less complete and therefore less reliable. Keeping original video files, documenting context, and organizing videos in a way that maintains the original order of video files contributes to the completeness of the video records.
Authenticity: The quality of being genuine, not fake or counterfeit, and free from tampering. Authenticity means that an object was actually created by the person represented as its creator, and that it was actually created at the time and place that is represented as its time and place of creation. Video footage that has been manipulated or altered but is represented as if it had not been, for example, is not authentic. To authenticate a video means to verify the relationship between it and its creator and point of creation. Documentation about who created something, when and where it was created, and the chain of custody can provide a starting point for this authentication process.
Original Files: In the digital realm, the “original file” is any copy of a file that is exactly the same (i.e. bit-for-bit) as the file in question when it was created. This means that there are no accidental or deliberate alterations to any aspect of the file, including its format and technical specifications.
Malware: A term derived from "malicious software" that refers to computer viruses, worms, Trojans, spyware, and other harmful programs.
Selection: The process of identifying materials to be acquired, or to be preserved, because of their enduring value. Having selection criteria, or a selection policy, helps ensure you acquire and save only what is most important.
Chain of Custody: Chronological documentation that shows who has held or controlled a video file from the moment it was created. The ability to show an unbroken chain of custody is one important indicator of the authenticity of a video, and therefore a factor in using video as evidence.
Download: To receive data from a remote computer system and save it in a local computer system. The inverse of download is "upload."
Upload: To send data from a local computer system to a remote one. The inverse of upload is "download."
Master: The earliest generation or highest-quality output of a video from which duplicates are made.
Original Order: The archival principle of maintaining files in the same order they were created. Original order is important to preserve context and the relationship between individual files, so that you can make sense of each file and of the whole. Keeping files in their original context makes them more complete and reliable.
Integrity: The quality of being whole, unaltered, and uncorrupted. A file that is not intact may not be usable or may have decreased informational and evidential value. Video files can lose their integrity if they are accidentally mishandled or deliberately tampered with, or if data corruption occurs in transfer or storage due to hardware or software malfunction. The best way to ensure integrity is to establish a system to check file fixity regularly (e.g. by computing hashes and checking them against a registry of previously computed hashes) and to restore any corrupted files from an intact copy.
Copyright: A legal protection intended to give the creator of original work exclusive rights to their work for a designated length of time. It gives the creator the exclusive right to copy, use, adapt, show, and distribute their own work, and the right to determine who else can copy, use, adapt, show, and distribute the work.
Transcode: To re-encode a digital file to a different encoding scheme, such as converting an H.264/MPEG-4 AVC video to Apple ProRes. Transcoding is usually done when a video’s encoding is not supported by the system that needs to use it. Transcoding fundamentally alters the file, although lossless methods can allow the original data to be reconstructed from the transcoded data.
Derivative: A copy of a video generated from a master that is usually in a different format and of lower quality than the master. Derivatives can be made for various uses, such as web upload or DVD.
File Format: The specification by which a digital file is encoded. Some file formats are designed to store particular kinds of data while others are more like containers that can hold many kinds of data. Common video file formats like QuickTime (.mov), AVI, and MP4 are container formats that contain video and audio streams, metadata, subtitle tracks, etc.
Command-Line Interface: A way of interacting with a computer program which involves typing lines of text in a command-line shell. Some programs are only available with command-line interfaces, which facilitate their automation and use in programming scripts. However, command-line interfaces can be harder for casual computer users to interact with than graphical user interfaces (GUI), which use windows, icons, menus, and pointers.
Embedded Metadata: Metadata that is stored within the digital object it describes. Some embedded metadata, such as file size, are essential to the functioning of the file, and are always written to the file by the device or software system. Other embedded metadata are non-essential and can be optionally added (e.g. rights information). Embedded metadata is not guaranteed to be accurate, for example if your camera is set to the wrong date. Embedded metadata stay with the digital object as long as the object is intact, but can be intentionally stripped or altered. Embedded metadata can be lost if a file is transcoded to another format.
Graphical User Interface: A way of interacting with a computer program that involves using windows, icons, menus, and pointers. Most computer users are familiar with graphical user interfaces. GUIs can be easier for casual users to interact with than command-line interfaces (CLI), which require commands to be typed as lines of text.
Hash Function: An algorithm that computes a hash value or checksum from any set of data, like a file. Common hash functions include MD5 and SHA-1. Hash functions are used to check file integrity and for security purposes.
Hash Value: The string of alphanumeric characters that results from running a hash function algorithm on data or a file. The hash value of a file will always remain the same as long as the file is unchanged, so it can be used to identify altered, corrupted, and duplicated files.
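As a minimal sketch of how a hash value behaves, the Python snippet below uses the standard library's hashlib module with SHA-256. The choice of SHA-256, the function name hash_value, and the sample bytes are assumptions made for illustration. It shows that an exact copy produces the same hash value and that changing a single byte produces a different one.

```python
import hashlib

def hash_value(data: bytes) -> str:
    """Return the SHA-256 hash value of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

original = b"example video bytes"    # stand-in for a file's contents
duplicate = b"example video bytes"   # exact, bit-for-bit copy
altered = b"example video bytez"     # one byte changed

print(hash_value(original) == hash_value(duplicate))  # True
print(hash_value(original) == hash_value(altered))    # False
```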
Cataloging: Creating and organizing descriptive information in a structured way so that resources can be found, used, and understood. Cataloging expands on basic metadata, and enables users to access content in multiple ways.
Media Management: The process of keeping track of media, such as the video files in your collection, and overseeing any actions performed on your media, such as backup, refreshment or migration. Media management can be performed manually, or with the aid of a software system (e.g. a media asset management (MAM) system).
Output: To export a completed video at the end of the post-production process. It is important to always output a master.
Edit Decision List (EDL): A document used in video post-production that contains a list of clips used in an edited video. EDLs originate from older film and video workflows when editing was a two-stage process. Today, they can be used to move editing projects from one software or system to another. EDLs also provide useful documentation, showing what source files were used to create an edited video.
Unique Identifier: A number, word, or symbol for unambiguously identifying and distinguishing an object from other objects in a set. Common everyday unique identifiers include computer logins, credit card numbers, tax ID numbers, and so on. Applying unique identifiers to video files makes it easier to identify, distinguish, and organize videos and related documents.
Information Package: A self-describing container (usually a clearly named folder or directory) used to keep media and its related documentation or metadata together.
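One way to picture an information package is a folder named after a video's unique identifier, holding the video and a metadata file side by side. The sketch below is only an illustration: the function name build_package, the metadata.json file name, and the folder layout are all hypothetical, not a prescribed structure.

```python
import json
import shutil
from pathlib import Path

def build_package(video: Path, metadata: dict, packages_root: Path, identifier: str) -> Path:
    """Copy a video into a folder named after its identifier, with its metadata beside it."""
    package_dir = packages_root / identifier
    # Refuse to overwrite a package that already exists.
    package_dir.mkdir(parents=True, exist_ok=False)
    # copy2 copies the file contents and tries to keep timestamps as well.
    shutil.copy2(video, package_dir / video.name)
    # Store the descriptive metadata next to the media it describes.
    (package_dir / "metadata.json").write_text(
        json.dumps(metadata, indent=2), encoding="utf-8"
    )
    return package_dir
```

Calling build_package(Path("clip.mov"), {"title": "Interview"}, Path("packages"), "VID-0001") would, under these assumptions, produce a packages/VID-0001 folder containing clip.mov and metadata.json.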
Findability: The ability of a user to easily find what they are looking for.
Backup: A copy of data, stored in a secondary location, which is used to restore data in the primary storage location that is corrupted or lost. Restoring involves copying data from the backup to the primary storage to replace the corrupted or lost files. Backing up is a storage strategy that allows you to recover from data loss.
Interoperability: In an information technology (IT) system, the quality of being able to exchange information with another system and being able to use that information. Using widely adopted formats, metadata standards, and controlled vocabularies enhances interoperability.
Repository: A system that acquires, stores, monitors, preserves, and provides access to its resources, run by an organization committed to providing long-term access to authenticated content to its users. A repository requires significant infrastructure to build and maintain.
Redundant Array of Independent Disks (RAID): A storage technology that combines multiple hard drives together to provide fault tolerance and better performance. Data is spread out across the drives, along with additional calculated data, so that data can be re-generated if part of the array fails. RAID protects you against data loss in the case of hardware failure. Unlike backup, RAID does not offer protection against file corruption or deletion, or data loss to malware, theft, or natural disaster.
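The "additional calculated data" in a RAID array can be illustrated with XOR parity, the idea behind parity-based RAID levels such as RAID 5. The sketch below is a simplified assumption rather than how any particular RAID product works: it treats three short byte strings as blocks on three drives, computes a parity block, and rebuilds a lost block from the survivors. The helper name xor_blocks and the sample data are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three equal-sized data blocks, each imagined as living on its own drive.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)  # the additional calculated data

# Simulate losing the second drive, then rebuild its block from the rest.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data_blocks[1])  # True
```

Note that the parity block only helps when a drive fails. If a file is deleted or overwritten, the parity is recalculated to match, which is why RAID is not a substitute for backup.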
Synchronization: The process of ensuring that computer files in one location are copied to one or more other locations on a regular basis. Synchronization is also referred to as mirroring or replication. Unlike backup, synchronization does not allow you to go "back in time" to recover lost or altered files.
Refreshing: The process of copying data from one storage medium to another to ensure continued access to the information as the storage medium becomes obsolete or degrades over time. It is one strategy for avoiding loss of digital information.
Fixity: Related to integrity, the quality of being unchanged over a given period of time. Fixity maintains the authenticity of an object over time, and is key to the concept of preservation. Long-term fixity requires good policies and handling practices, sustainable infrastructure, and strong security. Regular fixity checks (e.g. computing and comparing checksums) are used to detect changes.
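A fixity check of the kind described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a complete preservation tool: the registry is assumed to be a simple dictionary mapping file names to previously computed SHA-256 hashes, and the names file_hash and check_fixity are hypothetical.

```python
import hashlib
from pathlib import Path

def file_hash(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hash of a file, reading it in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def check_fixity(registry: dict[str, str], root: Path) -> dict[str, str]:
    """Compare current file hashes with registered ones.

    Returns a status per file: "ok", "changed", or "missing".
    """
    report = {}
    for relative_name, expected in registry.items():
        path = root / relative_name
        if not path.is_file():
            report[relative_name] = "missing"
        elif file_hash(path) == expected:
            report[relative_name] = "ok"
        else:
            report[relative_name] = "changed"
    return report
```

Any file reported as "changed" or "missing" would then be restored from an intact copy, and the check repeated on a regular schedule.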
FireWire: An interface standard for transferring data between digital devices, especially audio and video equipment. Developed by Apple in the 1990s, FireWire is becoming obsolete.
Encryption: The process of encoding your files using a cryptographic algorithm so that only authorized parties with a “key” (e.g. a password) can decrypt them. The two main types of encryption are symmetric-key and public-key. Symmetric-key encryption uses the same key, or password, to encrypt and decrypt information. Public-key encryption uses one key to encrypt and a different one to decrypt, so the key needed for decryption never has to be shared.
Network Attached Storage (NAS): Computer data storage that is accessed through a network. A NAS appliance is a computer that is specially designed to store and serve files over a network.
Storage Area Network (SAN): A dedicated network of storage devices shared among multiple servers, designed for fast access and large data transfers.
Data Model: A description of the way that data is structured in a database. It can define what types of things the data describe, what types of data are included in the descriptions, and how different types of things relate to each other.
Controlled Vocabulary: A predefined list of terms used to ensure consistency in cataloging. Since there is usually more than one way to describe or refer to a concept, choosing one term eliminates guesswork and circumvents the normal ambiguities of language (and spelling). Imagine searching for “Doctors” only to later learn that some records use the term “Physicians”. Consistent vocabularies increase the findability of records.
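The “Doctors” and “Physicians” example can be sketched as a small lookup that maps variant terms to one preferred term before a keyword is saved. The mapping, the choice of “Physicians” as the preferred term, and the function name normalize_term are hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical mapping from variant terms to the single preferred term.
PREFERRED_TERMS = {
    "doctor": "Physicians",
    "doctors": "Physicians",
    "physician": "Physicians",
    "physicians": "Physicians",
}

def normalize_term(term: str) -> str:
    """Return the preferred term for a keyword, or raise if it is not in the vocabulary."""
    key = term.strip().lower()
    if key not in PREFERRED_TERMS:
        raise ValueError(f"{term!r} is not in the controlled vocabulary")
    return PREFERRED_TERMS[key]

print(normalize_term("Doctors"))      # Physicians
print(normalize_term(" physician "))  # Physicians
```

Rejecting unknown terms, rather than silently accepting them, is one possible design choice: it forces a cataloger to either pick an existing term or deliberately extend the vocabulary.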
Access Points: The names, terms, keywords, phrases, or codes by which a record can be searched for, identified, and retrieved.
Entity: In data models, an entity is any "thing" that is identified and described with data. For example, if a database keeps track of the model, year, and license plates of all the cars in a sales lot, each car is an entity.
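The car example can be sketched as a small Python data class, where each instance is one entity and each field is a piece of data describing it. The class name Car and the sample values are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Car:
    """One entity: a single car on the sales lot."""
    model: str
    year: int
    license_plate: str

# Two entities of the same type, each described by the same kinds of data.
lot = [
    Car(model="Sedan X", year=2012, license_plate="ABC-123"),
    Car(model="Hatchback Y", year=2015, license_plate="XYZ-789"),
]

plates = [car.license_plate for car in lot]
print(plates)  # ['ABC-123', 'XYZ-789']
```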
Preservation: The process of ensuring the long-term accessibility of authenticated content. Digital preservation involves preventing loss or damage to digital objects, and extending their existence beyond the lifespan of their storage media or technology. Preservation requires ongoing resources, commitment and actions.
Obsolescence: The process of becoming out-of-date and unsupported by available technology. Video cameras, video formats, storage media, and storage devices can all become obsolete over time. Obsolete technology may still be functional, but it is unusable because the other technologies it depends on no longer support it. An old video camera, for example, may not be able to plug into new computers, or an old video format might not be playable on new desktop video players.
Migration: The process of re-encoding or transferring data from one digital or physical format to another to ensure long-term accessibility of the information as the format becomes obsolete and unusable over time.
Archive: An organization made up of people and systems responsible for preserving records and documents of enduring value and making them available to a designated community. Archives are sometimes parts of larger organizations, such as universities, public libraries, media centers, or museums.
Petabyte: A unit of measure for data. One petabyte (PB) is equivalent to 1,000 terabytes (TB), or 1,000,000 gigabytes (GB).
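To get a feel for the scale, the conversion above can be written out as simple arithmetic. The sketch below uses the decimal units given in the definition; the function names and the 4 GB example file size are assumptions for illustration.

```python
TB_PER_PB = 1_000  # terabytes in one petabyte
GB_PER_TB = 1_000  # gigabytes in one terabyte

def petabytes_to_gigabytes(pb: int) -> int:
    """Convert petabytes to gigabytes using decimal units."""
    return pb * TB_PER_PB * GB_PER_TB

def files_per_petabyte(file_size_gb: int) -> int:
    """How many whole files of the given size (in GB) fit in one petabyte."""
    return petabytes_to_gigabytes(1) // file_size_gb

print(petabytes_to_gigabytes(1))  # 1000000
print(files_per_petabyte(4))      # 250000
```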
Application Programming Interface: A protocol that specifies a way for a software application to communicate and integrate with a program that provides a service. Google provides APIs, for example, so that people can use its data, such as location data from Google Maps or video data from YouTube, in their applications.
- Digital preservation is never-ending and requires an ongoing commitment of resources.
- Preserving videos requires regular refreshing on new storage media and migration to new usable formats.
- Prioritize videos for preservation based on their archival value, uniqueness, contextualizing information, and whether you have rights to use them.
- Most small organizations cannot do preservation on their own. Consider partnering with an archival institution.
Key Concept: Obsolescence
Over time, file formats and storage media become unusable because the technology they rely on is unavailable.