
Data Management: Documenting Your Data

a guide to best practices for curating your research data

Metadata Standards

Using established metadata standards helps ensure that your data is properly represented in databases and search results, and that it remains interoperable with future data tools.  If you are interested in delving into the standards themselves, consult the following:

  • Dublin Core: a general purpose metadata standard for describing networked resources.
  • DDI: The Data Documentation Initiative is an effort to establish an international XML-based standard for the content, presentation, transport, and preservation of documentation (i.e., metadata) for datasets in the social and behavioral sciences. 
  • MODS: Metadata Object Description Schema, a common library standard. METS (Metadata Encoding and Transmission Standard) is a related standard for packaging descriptive, administrative, and structural metadata together.
  • There are also discipline-specific standards, such as the Content Standard for Digital Geospatial Metadata (CSDGM) and ABCD (Access to Biological Collection Data).
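
To give a sense of what a standard looks like in practice, here is a minimal sketch of a Dublin Core record written out as XML with Python's standard library. The namespace is the one used by the Dublin Core element set; the wrapper element name, the function name, and the dataset values are assumptions made up for illustration, and a particular repository may expect a different wrapper or profile.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace


def dublin_core_xml(fields):
    """Serialize (element, value) pairs as a simple Dublin Core XML record."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical dataset, for illustration only.
record = dublin_core_xml([
    ("title", "Example Survey of Household Spending"),
    ("creator", "Doe, Jane"),
    ("date", "2020-01-15"),
    ("format", "text/csv"),
])
print(record)
```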

How to Document Your Data

Documenting your data is simply providing sufficient descriptive information about your data so that it will be identifiable, understandable, and usable in the future.  

This is much easier to accomplish if you document your data at each stage of the research process, rather than attempting to recreate information at a later stage.  Since documentation is data about data, it is commonly known as metadata.

A minimal set of metadata for any dataset should describe the Title, Creator (and any additional Contributors), Date, Format, Subject, Unique Identifier, and Description of the specific data resource.  In addition, it is helpful to describe the Coverage of the data (spatial or temporal), the Publishing Organization, the Type of Resource, the Language, the Rights associated with the resource, any Related or Source material that the data derives from, and any associated Funding or Grant information.  If the data is updated over time, versioning information should be provided.
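
One way to keep track of the minimal set as a project evolves is to store the metadata in a small structured file and check it for gaps. The sketch below is only an illustration: the key spellings, the `missing_fields` helper, and the sample values are assumptions, not part of any standard.

```python
import json

# Elements listed above as the minimal set (key spellings are assumptions).
REQUIRED = ["title", "creator", "date", "format", "subject", "identifier", "description"]


def missing_fields(record):
    """Return the required elements that are absent or empty in the record."""
    missing = []
    for key in REQUIRED:
        value = record.get(key)
        if value is None or not str(value).strip():
            missing.append(key)
    return missing


# Hypothetical record, for illustration only.
record = {
    "title": "Example Survey of Household Spending",
    "creator": "Doe, Jane",
    "date": "2020-01-15",
    "format": "text/csv",
    "subject": "Economics",
    "identifier": "",  # not yet assigned
    "description": "Responses to a hypothetical household spending survey.",
    "version": "1.0",
}

print(missing_fields(record))        # elements still to be filled in
print(json.dumps(record, indent=2))  # plain-text form that can sit beside the data
```

Running a check like this at each stage of the project makes it easier to notice when an element such as the identifier has not been filled in yet.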

Providing Guidance for Future Users

An equally important part of documentation is the provision of the information necessary to fully understand and interpret the data.  At a minimum this should include a file manifest and a short text describing the dataset, being sure to include any information that is not adequately represented in the structured metadata. It can also include codebooks or variable descriptions, documentation of experimental methods, provision of software code used in analysis, discussion of the file structure and relationships, and so forth.  Again, it is easier to collect this as the data is created rather than after the fact.
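
A file manifest can be generated rather than typed by hand. The following is a minimal sketch, assuming the dataset lives in a single directory; the function names and the tab-separated layout are choices made for this example, and the SHA-256 checksum is an optional extra that lets future users confirm a file is unchanged.

```python
import hashlib
from pathlib import Path


def build_manifest(data_dir):
    """Return (relative path, size in bytes, SHA-256 hex digest) for each file."""
    root = Path(data_dir)
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Reads each file fully into memory; very large files may need chunked reads.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            rows.append((path.relative_to(root).as_posix(), path.stat().st_size, digest))
    return rows


def write_manifest(data_dir, out_path):
    """Write the manifest as tab-separated text, one file per line."""
    lines = ["path\tbytes\tsha256"]
    lines += [f"{p}\t{size}\t{digest}" for p, size, digest in build_manifest(data_dir)]
    Path(out_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
```

For example, `write_manifest("my_project/data", "my_project/MANIFEST.tsv")` (hypothetical paths) would list every file under the data directory. Writing the manifest outside the directory being listed keeps it from appearing in its own listing on later runs.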

Most data repositories and archives (see Publishing Your Data) allow the submission of supporting documentation.  And even if you have no plans to publish or distribute your data, keeping good records of the data in your project as it evolves will pay dividends by helping you and your research team work easily with the data over time.

Data Librarian

Ryan Womack
Alexander Library

169 College Avenue

New Brunswick, NJ 08901 USA

Subjects: Data, Economics