Metadata is simply data about data. Metadata describes your data so that others can find it, understand it, and possibly use it. The answers to the 20 questions below are also the kinds of metadata you may need to describe your data.
Metadata standards exist for different disciplines. A collection of some of these is available from the UK's Digital Curation Centre: http://www.dcc.ac.uk/resources/metadata-standards
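As a rough illustration of what "data about data" can look like in practice, the sketch below builds a small descriptive record and prints it as JSON, which could be saved next to the data file. The field names and example values are assumptions chosen for illustration; they are not taken from any particular metadata standard, so check the standard for your discipline before settling on a set of fields.

```python
import json

# Field names are illustrative assumptions, not taken from a specific standard.
REQUIRED_FIELDS = (
    "title", "creator", "date_created", "description", "file_format", "license",
)

def make_metadata_record(**fields: str) -> dict:
    """Build a descriptive record; raise ValueError if a field is missing or blank."""
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name, "").strip()]
    if missing:
        raise ValueError(f"missing metadata fields: {', '.join(missing)}")
    # Keep only the expected fields, in a fixed order, with whitespace trimmed.
    return {name: fields[name].strip() for name in REQUIRED_FIELDS}

# Hypothetical example values, for illustration only.
record = make_metadata_record(
    title="Stream temperature readings, summer field season",
    creator="Example Researcher",
    date_created="2024-06-30",
    description="Hourly water temperature in degrees Celsius from three logger sites.",
    file_format="CSV",
    license="CC BY 4.0",
)
print(json.dumps(record, indent=2))
```

Refusing to build an incomplete record is one way to make sure descriptions are captured when the data are created, not reconstructed later.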
From the blog post 20 Questions for Research Data Management – see https://datamanagementplanning.wordpress.com/2012/03/07/twenty-questions-for-research-data-management/
These twenty questions are designed to prompt and assist your thinking, as a research student, a postdoc or an academic researcher at the beginning of a research project, and to form the basis of a workable research data management plan that can both guide your on-going data management activities and inform others about the nature and availability of your research data.
They will help you determine how best to safeguard your data from loss, how to describe your datasets in ways that assist both yourself when returning to them in the future and others in their subsequent interpretation, and how to publish your data in ways that maximize their usefulness to others and bring maximum scholarly credit to yourself, to reward your efforts in acquiring, analyzing, describing, interpreting and publishing them in the first place.
You may not have immediate answers to all these questions. But, by seeking advice from your research supervisor, colleagues and others in your institution with responsibilities for data management, you should endeavor to discover them. Then, once in a while, you should revisit these questions and see whether your data management practices can be improved, updating your answers.
The nature of your data
1 What is the general subject discipline (domain, field) to which your research data relates?
2 What is the exact nature (range, scope) of your research data?
3 Who will own the data arising from your research, and the intellectual property rights relating to them?
4 If you know at this stage, in what format(s) will you store your data in the short term after acquisition?
Data descriptions, so that someone else can understand what the data are about (i.e. metadata, “data about data”)
5 When and where will you describe each of your research datasets, so that someone else can understand them?
6 How will descriptive metadata be created or captured?
Data sharing and publication
7 With whom will you share your research data in the short term, before publication of any papers arising from their interpretation?
8 For how long will you embargo your research data before it is published for others to see and use?
9 Why is public access to your research data to be restricted (if indeed it is)?
10 Under what data-sharing license will you publish your research data?
11 What persistent identifier will be used to permit correct citation of your datasets?
12 What metadata will be published with the data to make them interpretable and reusable?
Data storage, backup and archiving
13 Where will you store your data in the short term, after acquisition?
14 Who is responsible for the immediate day-to-day management, storage and backup of the data arising from your research?
15 How frequently will your research data be backed up for short-term data security?
16 Where will your research data be archived for long-term preservation?
17 When will your research data be moved to a secure archive for long-term preservation and publication?
18 Who will decide which of your research data are worth preserving?
19 How (i.e. by what physical or electronic method) will you transfer your research datasets to their long-term archive, under the curatorial care of a separate third party, e.g. a data repository?
20 Who will be responsible for your data, once you have left your present research group?
Start off your research with good file naming practices. These include:
All versions of data must be clearly identified.
Be consistent. Documentation is key!
from http://ucblibraries.colorado.edu/systems/digitalinitiatives/docs/filenameguidelines.pdf
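One way to stay consistent and keep versions clearly identified is to generate file names from the same parts every time. The sketch below is only an example convention: the order of the parts (project, description, date, version), the underscore separators, and the two-digit version number are assumptions, not rules from the guidelines cited above.

```python
import re
from datetime import date

def slug(text: str) -> str:
    """Lowercase the text and replace runs of non-alphanumeric characters with a hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

def build_filename(project: str, description: str, when: date, version: int, ext: str) -> str:
    """Return a name of the form project_description_YYYYMMDD_vNN.ext (assumed convention)."""
    extension = ext.lstrip(".").lower()
    return f"{slug(project)}_{slug(description)}_{when:%Y%m%d}_v{version:02d}.{extension}"

# Hypothetical project and description, for illustration only.
name = build_filename("Soil Survey", "pH readings", date(2024, 3, 15), 2, ".CSV")
print(name)
```

Whatever convention you choose, write it down in your project documentation so that collaborators can follow it too.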
Some file formats are better for the eventual preservation of your data. (See Share/Preserve for more about preservation.) Below are some preferred file formats for preservation, from the Georgia Tech Libraries website.
Examples of preferred format choices:
Where will you keep your data?
How will you back it up?
None of these are terrible, unless they are your only copy!
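Use the 3-2-1 Rule: keep at least three copies of your data, in at least two different formats or storage media, with at least one copy stored off-site.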
From http://blog.trendmicro.com/trendlabs-security-intelligence/world-backup-day-the-3-2-1-rule/
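A backup plan can be checked against the 3-2-1 rule with a few lines of code. The sketch below takes the rule in its usual form (three copies, two kinds of storage, one off-site); the `Copy` class, its field names, and the example locations are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Copy:
    location: str   # where the copy lives, e.g. "laptop"
    medium: str     # kind of storage, e.g. "internal disk", "external drive", "cloud"
    offsite: bool   # True if kept away from where the other copies are stored

def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """Check the three conditions of the 3-2-1 rule for a list of copies."""
    at_least_three_copies = len(copies) >= 3
    at_least_two_media = len({c.medium for c in copies}) >= 2
    at_least_one_offsite = any(c.offsite for c in copies)
    return at_least_three_copies and at_least_two_media and at_least_one_offsite

# Hypothetical backup plan, for illustration only.
plan = [
    Copy("laptop", "internal disk", offsite=False),
    Copy("lab external drive", "external drive", offsite=False),
    Copy("institutional cloud storage", "cloud", offsite=True),
]
print(satisfies_3_2_1(plan))
```

A plan with only the laptop copy, or with three copies that all sit on the same kind of drive in the same office, would fail the check.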
Why Cite Data? From DataCite:
"Why is it so important to cite data? Books and journal articles have long benefited from an infrastructure that makes them easy to cite, a key element in the process of research and academic discourse. We believe that you should cite data in just the same way that you can cite other sources of information, such as articles and books. Data citation can help by:
Good data citation includes a persistent identifier, such as a DOI (Digital Object Identifier), URN (Uniform Resource Name), or Handle (ICPSR).
How do you get a DOI? Often when you deposit your work in a repository, a DOI is assigned to each item for you. Repositories use a DOI Registration Agency, such as DataCite or CrossRef.
Basic recommended format from DataCite:
Creator (PublicationYear): Title. Publisher. Identifier.
Or, slightly expanded format:
Creator (PublicationYear): Title. Version. Publisher. ResourceType. Identifier.
Example:
Irino, T; Tada, R (2009): Chemical and mineral compositions of sediments from ODP Site 127‐797. Geological Institute, University of Tokyo. http://dx.doi.org/10.1594/PANGAEA.726855
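If you keep citation details in a spreadsheet or script, the DataCite pattern is easy to assemble automatically. The sketch below is a hypothetical helper, not a DataCite tool: the function name and parameter names are assumptions, and it simply joins the elements in the order shown above, adding Version and ResourceType only when they are supplied.

```python
from typing import Optional

def datacite_citation(
    creator: str,
    year: int,
    title: str,
    publisher: str,
    identifier: str,
    version: Optional[str] = None,
    resource_type: Optional[str] = None,
) -> str:
    """Join citation elements as: Creator (Year): Title. [Version.] Publisher. [ResourceType.] Identifier"""
    parts = [f"{creator} ({year}): {title}."]
    if version:
        parts.append(f"{version}.")
    parts.append(f"{publisher}.")
    if resource_type:
        parts.append(f"{resource_type}.")
    parts.append(identifier)
    return " ".join(parts)

# Values taken from the example citation above.
citation = datacite_citation(
    creator="Irino, T; Tada, R",
    year=2009,
    title="Chemical and mineral compositions of sediments from ODP Site 127-797",
    publisher="Geological Institute, University of Tokyo",
    identifier="http://dx.doi.org/10.1594/PANGAEA.726855",
)
print(citation)
```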
Recommended format from ICPSR:
Author. (Date). Title. Version, Persistent identifier (such as the Digital Object Identifier (DOI), Uniform Resource Name (URN), or Handle System)
Example:
Sidlauskas B (2007) Data from: Testing for unequal rates of morphological diversification in the absence of a detailed phylogeny: a case study from characiform fishes. Dryad Digital Repository. doi:10.5061/dryad.20