Managing Digital Content


This guide was last revised 23 November 2009

Acquiring a basic knowledge and practice of records or collection management is a prerequisite for properly organising and managing any volume of digital content beyond a few dozen items. Over time you can expect that items will need to be added, removed or updated. A workflow process and good content organisation will be essential to protect your content from loss or corruption. This can be greatly assisted by using a repository, content management, library or database system that supports open standards, and by adopting a few good planning practices.

Make it Digital has one detailed Managing Digital Content guide:

  1. Management Resources

Digital lifecycle management

To successfully manage the whole lifecycle of digital content, some attention needs to be paid to designing your workflow process from acquisition or creation through to storing your content for future access and retrieval. Digital content can be reused without limit or wear. If you follow good practice for managing your digital content, chances are that in fifty or a hundred years items digitised or created today will still be viewable and usable. A consequence is that some content will never be 'archived' in the traditional information management sense of items no longer having an active use. Much of it may outlast the original creator, the copyright term, and even the organisation responsible for keeping it.

Archives, libraries and museums have developed different professional practices for dealing with content, whether digital or not. Archives largely deal with unpublished records and may have clear expectations of disposal or destruction of some records after a fixed period of time (e.g. records kept for tax purposes). Lending libraries may retain publications only while they are in current or regular use, while research libraries may keep and preserve everything that fits within their collection scope. Museums have a primary focus on research and preservation and will also tend to keep everything that fits within their collection scope. Many smaller organisations have to cover all three practices, keeping records and publications as well as protecting items of historical value.

It is important that your digital workflow process takes into account the nature of your content and why it is being collected or kept. Good practice across all the collecting and recordkeeping professions includes the following:

  • having a written policy for managing digital content, covering items such as who can make decisions about accession and disposal, what can be collected, what standards and formats are used, and how the content and ownership rights (including copyright) will be dealt with if the organisation winds up
  • inventorising and describing the history of your digital content and any changes made through use of administrative metadata. This is important for tracking the source of an item and whether it has changed or been tampered with, something that is much harder to detect than with physical items
  • having someone responsible for entering information and doing so in a consistent way. Tied in with this is having an authentication or security feature that controls access and alterations to prevent accidental loss or misuse
  • using a hierarchical classification and filenaming system that shows relationships between groups of content and ensures that all content can be retrieved and accounted for in a consistent way
  • identifying the steps needed to back up, archive and migrate your content for long term preservation
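
As an illustration of the administrative metadata point above, the following is a minimal sketch of one common way to detect whether a digital item has changed: recording a checksum when the item is entered and comparing it later. The record fields (`source`, `entered_by`, `recorded_at`) and the function names are assumptions made for this example, not part of any particular standard or system.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_admin_record(path: Path, source: str, entered_by: str) -> dict:
    """Build an illustrative administrative metadata record for one item.

    The field names here are hypothetical; use those of your chosen
    metadata scheme in practice.
    """
    return {
        "filename": path.name,
        "source": source,
        "entered_by": entered_by,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "sha256": sha256_of(path),
    }


def is_unchanged(path: Path, record: dict) -> bool:
    """True if the file still matches the checksum stored in its record."""
    return sha256_of(path) == record["sha256"]
```

A record made when an item is accessioned can be kept alongside your inventory. Running `is_unchanged` at intervals, or after a migration, then shows whether the file still matches what was originally entered.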

Revisiting why you have content

Anyone interested in collecting or organising physical documents, media and objects can readily find guidance on how to manage them. There are decades of established practice and experience to draw on. But successfully working out how to translate this practice to digital items means going back to first principles.

Traditional collection and records management principles require attention to how materials are acquired, described, grouped, and if necessary disposed of. Items are given unique identifiers and organised logically to enable their discovery or retrieval. Processes for access to items and how they are tracked over time and place are created where materials are issued or distributed. Copies or duplicates are either excluded from the system or given their own unique identities.
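
The idea of unique identifiers organised logically carries over directly to file and folder names. The sketch below shows one possible convention, assuming a three-level hierarchy of collection, series and item number. The hierarchy, the separator characters and the function names are illustrative choices only, not a recommended standard.

```python
import re
from pathlib import PurePosixPath


def slug(text: str) -> str:
    """Lowercase text and replace runs of other characters with a hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def item_path(collection: str, series: str, item_number: int, extension: str) -> PurePosixPath:
    """Build a hierarchical path whose filename repeats the full identifier.

    Repeating the collection and series in the filename means an item can
    still be identified if it is copied out of its folder.
    """
    identifier = f"{slug(collection)}_{slug(series)}_{item_number:04d}"
    suffix = extension.lower().lstrip(".")
    return PurePosixPath(slug(collection), slug(series), f"{identifier}.{suffix}")


# Hypothetical example item
print(item_path("Smith Family Papers", "Letters 1920s", 7, ".TIF"))
# prints smith-family-papers/letters-1920s/smith-family-papers_letters-1920s_0007.tif
```

Zero-padded item numbers keep files in order when sorted by name, and restricting names to lowercase letters, digits, hyphens and underscores avoids problems when content moves between operating systems.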

Once digital items become part of this management system, things begin to change. So long as the digital copy remains on a physical item such as a CD-ROM, everything is fine. But bring computer networks into the equation and now copies are everywhere - on local hard drives, backed up on a server, cached in a web browser, attached in an email - each copy is infinitely copyable and susceptible to alteration.

Digital copies risk being unmanageable, as traditional record and information management systems are designed to manage original documents, not multiple copies. A possible response is to manage digitised copies like photocopies - as cheap low quality stand-ins for the real thing with no long-term value, to be kept outside the system. Born-digital materials may be version controlled to prevent unwanted duplication or changes and managed in their own separate system. Another response might be to digitise everything, turning copies and digital only versions into read-only records, while the originals are destroyed. In this case versions of everything may be controlled within one system. Finally, digitised copies may be managed as new resources, distinct from but complementary to original content.
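
Whichever response is chosen, it helps to know where unmanaged copies already exist. The following is a minimal sketch, assuming all the copies sit under one folder you can read, that groups files with identical content by checksum. The function name is hypothetical, and the approach only finds exact copies, not edited or re-saved versions.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_copies(root: Path) -> dict[str, list[Path]]:
    """Group files under root by SHA-256 checksum.

    Only groups containing two or more files are returned. Each file is
    read fully into memory, so this suits modest file sizes.
    """
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Each group returned is a set of identical copies. You can then decide which one is the managed record and whether the others are excluded from the system or given their own identifiers.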

Whichever approach you might take, if you are establishing a digital content management system for the first time it will require a change in practice. Your decisions will lead to different consequences and risks for the content being managed. This makes revisiting the question of why you have content an important preparation task, as the answer will shape your choice of software to support the complexity of systems required.

Are you collecting, recordkeeping or curating?

In the digital era the different fields of information and collection management are gradually converging, driven in part by the similarity of skills and issues in dealing with digital technology and software. Although the language used by libraries, archives and museums is still different, the standards and practices used in organising content digitally are being shared across disciplines. This is directly aided by standardisation of computer hardware and software, and the open standards of the internet.

Convergence can make it harder to determine the technology needs of your organisation. For instance, are you creating a library or catalogue, where users can search for and use the resources you collect? Are you building a records system or archive to serve as the memory of your organisation or community? Or do you want to arrange and digitally display or share some of the more interesting or beautiful objects that you collect? With digital delivery the boundaries between these areas are no longer firmly fixed. It is very likely, however, that you will want to invest more in one of these areas than the others, making it important to know what your primary focus is.

Will your digital content have long term value?

The reasons for managing digital content may be very different from those for your physical collections. The growth in the availability of cheap high quality scanners, cameras and recorders is also shifting a lot of digital content from one-time use and disposal to multiple reuse and long-term retention. Digital technology is now largely a routine part of the content management workflow, providing opportunities as well as challenges for the way you do things. It is important to assess how long your digital content will have value for.

From the Google Books initiative to Hollywood's discovery that back catalogue movies can make money again on DVD, digitisation has been a convenient means of making content more widely available and accessible. In some areas such as audio preservation, digitisation is recognised as the preferred and most viable means of ensuring content remains available in the long term. Coupled with the amount of new digital content being made, we are witnessing the maturing of a form that will permanently sit alongside and in some cases replace non-digital content. With a growing acceptance and use of open and documented standards and formats, long term storage and retention of digital content is readily possible. As a consequence, treating digital content as solely the equivalent of disposable photocopies or ephemera is likely to be unwise.

The value of digital content is in the enabling of its sustained use and usefulness as resources and building blocks for creating new knowledge, services or content. Over time your investment in managing digital content will likely drive changes to your collection and records management practices rather than be driven by them. Embedded metadata, interoperable formats, data integrity and persistent references will be the norm in 21st century content management.

Digital management solutions

The convergence of information and collection management has meant that the software systems used for managing digital content have become harder to classify. Although there is some overlap and mixed terminology, there are basically three kinds of software systems used for managing digital content: databases, content management systems and repositories. Each has a different focus and function. You need to be clear about what your needs are before investing time and money into a new system, as some may not be suitable for your content, organisation or budget.

Databases

Database systems are typically used to describe and track content collections, whether digital or not. They are primarily metadata systems and do not hold or store content objects beyond basic elements such as sample content (e.g. extracts, thumbnails, short audio clips) that form part of the metadata.

Catalogue and record index databases are commonly used in libraries, archives and businesses as part of library and basic record management and tend to cover non-digital items or mixed items. These databases range from simple spreadsheets to server-based applications designed to point to and integrate with other information systems.

Inventory, library and collection management software also tends to be a database system, but with more extensive metadata (i.e. including technical and administrative metadata as well as descriptive and discovery metadata). For the differences between these types of metadata see our Describing Digital Content guide. These systems are widely used in museums and libraries where there is a need to inventorise and track collections, usage history, access restrictions, valuations and other aspects of a collection. They are distinct from content management systems because, like catalogues and record indexes, they do not store content objects beyond basic elements.

A database solution might suit your needs if one or more of the following applies:

  • your digital content is primarily metadata or a finding aid for non-digital items
  • your content is part of a series or combined grouping of digital and non-digital items
  • your content management needs are relatively simple

Content management systems

Content management systems include photo libraries, digital asset management systems, electronic records and document management systems (ERDMS) and web-oriented content management systems. These systems store or organise the content itself, hold collection and record metadata, and can track content and control versions.

Photo libraries and digital asset management systems are most often used in environments that produce or publish new materials using the stored digital content. Stock or commissioned images, graphical items, audio-visual materials, branding and re-usable text items are common examples. Professional photographers, publishers, broadcasters, audiovisual studios, web or graphic designers and others who deal with creation, editing and manipulation of digital content tend to require these kinds of systems.

Electronic records and document management systems are usually found in government, business or office environments and some publishers. They are used for managing digital document creation and storage along with digitised records in a controlled number of formats. They can vary from simple database systems containing client or customer information through to end-to-end document production and archiving systems. These systems are focused on version control and formal record creation, allowing parts of the collection to be made read-only or given restricted access while others can be freely edited. They should also allow digital archiving over time as records with no active business use get migrated to long-term digital storage.

Web content management systems are generally hosted on a local or remote web server, enabling the upload of a variety of content for use in developing website content or for making available as downloadable media from websites. Apart from administrative restrictions (such as limits to editing rights), web content management systems frequently have only minimal collection, version or metadata control. Some metadata may be uploaded with the content, while discovery metadata may be separately added to the website content as tags and keywords. Generally these systems are not suitable for long-term content or collection management, storage or archiving.

A content management solution might suit your needs if one or more of the following applies:

  • your digital content was originally created digitally
  • the digital copy will or does permanently replace the original
  • digital content items need to be able to be changed or added to
  • the digital content is being primarily managed and accessed on an intranet or the internet

Repositories

A digital repository is a software system for centralised storage, access and management of digital content generally in a networked environment. Some repositories focus primarily on storage and access, while others function as a digital equivalent of a traditional library system. These digital libraries are repositories that accept content and then classify, catalogue, preserve, curate and enable retrieval of the content throughout its lifecycle.

Repositories are commonly used by libraries, archives, universities, research institutes and other institutions to manage and store digital content over the long term. Many are capable of dealing with structured and unstructured content and a variety of formats. The software system may provide a different form of access to a repository depending on whether the user is submitting content, curating content, or retrieving content.

Repositories are suited to long term storage and archiving of digital content, and are the best means of ensuring preservation of a digital collection over the long term. However, they are also generally more complex to manage than databases and content management systems, and may require multiple software applications or modules (for instance to submit content, view content, or archive content) to achieve full functionality.

Figure: DSpace repository. An example of how a repository works. Image (c) Dynamic Diagrams www.DynamicDiagrams.com


A repository solution might suit your needs if one or more of the following applies:

  • digital copies will be permanently accessed as a substitute for the original
  • you are dealing with a variety of different and unrelated formats and collections
  • you mainly need permanent storage for user access and archiving

Assessing software system requirements

To ensure the longevity of your digital content, we recommend that you choose a system that at minimum:

  • manages the original file format of the content without alteration
  • supports recognised open standards and schemes for all metadata
  • has regular software updates and service support to maintain usability
  • allows transparent importing and exporting of all your content and metadata to assist migration


Ideally your system should also be one that:

  • maintains your content in open containers (e.g. folders) that can be accessed by any suitable software
  • maintains your metadata in open formats readable by any suitable software
  • embeds all metadata into the original file where the file format supports it
  • is built using open source software that allows different vendors or developers to provide service support over time
  • includes a built-in archiving or backup function to protect the content from accidental loss or destruction
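
To show what keeping metadata in an open format can look like in practice, here is a minimal sketch that writes an item's metadata to a plain JSON "sidecar" file stored beside the content file. The sidecar naming convention and the function names are assumptions made for this example. The keys used in the test data borrow element names from Dublin Core (title, creator, date, format, rights), but any documented scheme could be used in the same way.

```python
import json
from pathlib import Path


def sidecar_path(content_file: Path) -> Path:
    """Return the sidecar location, e.g. photo.tif -> photo.tif.json."""
    return content_file.with_name(content_file.name + ".json")


def write_sidecar(content_file: Path, metadata: dict) -> Path:
    """Write metadata as UTF-8 JSON next to the content file."""
    target = sidecar_path(content_file)
    text = json.dumps(metadata, indent=2, sort_keys=True, ensure_ascii=False)
    target.write_text(text, encoding="utf-8")
    return target


def read_sidecar(content_file: Path) -> dict:
    """Read the metadata back from the sidecar file."""
    return json.loads(sidecar_path(content_file).read_text(encoding="utf-8"))
```

Because the sidecar is plain text in an ordinary folder, it can be opened by any suitable software and travels with the content file during export or migration, without altering the original file.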


If you are maintaining public records in digital format, Archives New Zealand has issued specific standards that you must meet to comply with the Public Records Act, including restrictions on document disposal and destruction. Information and advice on these are available from http://www.archives.govt.nz/advice.