Research Data Management

Information and guidance for researchers about managing data and writing data management plans.

What is Metadata?

Metadata is a type of documentation that describes data. It allows users to understand what the data is, how it was created, who created it, and how it may be used. It also facilitates search and retrieval of the data when it is deposited in a data repository. Metadata can be embedded directly into data files (for example, as XML headers) or stored separately as a readme file, codebook, or lab notebook. Specific disciplines, repositories, or data centers may guide or dictate the content and format of metadata using a formal standard. Metadata may also be required by your funding agency when preparing data for preservation or sharing.

Metadata standards make similar items easier to find by describing them with the same terms and constructs. A metadata record consists of all the metadata elements that describe an object. Metadata records are often expressed in XML or another machine-readable format so that they can be integrated easily with other systems.
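To illustrate what a machine-readable metadata record can look like, here is a minimal Python sketch that serializes a handful of elements to XML using the standard library. The root element name `record`, the function name `build_record`, and the sample values are all made up for illustration; a real record would follow the element names and structure of whichever standard you adopt.

```python
import xml.etree.ElementTree as ET


def build_record(elements):
    """Serialize a mapping of element names to values as a small XML record.

    The root name "record" is a hypothetical choice, not part of any standard.
    """
    root = ET.Element("record")
    for name, value in elements.items():
        child = ET.SubElement(root, name)
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative values only.
record_xml = build_record({
    "title": "Stream temperature survey",
    "creator": "Example Researcher",
    "date": "2020-06-01",
})
print(record_xml)
```

Because each element is a named, separately tagged field, a repository or search system can index the record without having to interpret free text.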

Metadata Standards

Your field of study may already have developed a metadata standard; if so, it is advisable to adhere to the practices of your discipline. Some examples of domain-specific metadata standards are:

  • DDI (Data Documentation Initiative) - standard for social, behavioral, and economic sciences
  • EML (Ecological Metadata Language) - standard for ecology disciplines
  • CSDGM (Content Standard for Digital Geospatial Metadata) - standard for describing geospatial data
  • AVM (Astronomy Visualization Metadata) - standard for astronomical imagery
  • Darwin Core - standard for biodiversity informatics

The Digital Curation Centre provides a disciplinary metadata guide that lists metadata standards by discipline.

If your discipline doesn't have an established metadata standard, it is recommended that you use a general-purpose standard such as Dublin Core, which is basic, domain-agnostic, and widely used. Dublin Core is a simple set of 15 elements that was developed to describe all types of networked resources.
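As a rough sketch of how a simple Dublin Core record might be produced, the Python example below writes elements in the Dublin Core element namespace (`http://purl.org/dc/elements/1.1/`) and rejects names that are not among the 15 elements. The wrapper element `metadata` and the function name `dublin_core_record` are assumptions made for this example; repositories usually define their own wrapper and may require a specific schema.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15-element Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"

DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def dublin_core_record(values):
    """Build an XML string from a mapping of Dublin Core element names to text.

    The wrapper element "metadata" is a hypothetical choice for this sketch.
    """
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")
    for name, value in values.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative values only.
print(dublin_core_record({
    "title": "Stream temperature survey",
    "creator": "Example Researcher",
    "language": "en",
}))
```

All Dublin Core elements are optional and repeatable, so a record can start with the few elements you know and grow over time.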

Recommended Metadata Elements

The following are the minimum metadata elements that are suggested for describing your data (source: University of Arizona Libraries, http://data.library.arizona.edu/data-management-tips/data-documentation-and-metadata).

  • Title - Name of the project or collection of datasets
  • Creator - Names and institutions of the people who created the data
  • Date - Key dates associated with the data, such as dates covered by the data or date of creation
  • Description - Description of the resource
  • Keywords or Subjects - Keywords or subjects describing the content of the data
  • Identifier - Unique number or alphanumeric string used to identify the data
  • Coverage (if applicable) - Geographic coverage
  • Language - Language of the resource
  • Publisher - Entity responsible for making the dataset available
  • Funding Agencies - Organization or agency that funded the research
  • Access Restrictions - Where and how your data can be accessed by other researchers
  • Rights - Any known intellectual property rights held for the data
  • Format - What format your data is in
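
The elements listed above can be turned into a quick completeness check before you deposit data. The Python sketch below is one possible approach, not a prescribed tool: the key names (such as `funding_agencies`), the function names, and the plain-text readme layout are assumptions chosen for this example. Coverage is treated as optional because the list marks it "if applicable".

```python
# Key names are hypothetical spellings of the recommended elements.
REQUIRED = (
    "title", "creator", "date", "description", "keywords", "identifier",
    "language", "publisher", "funding_agencies", "access_restrictions",
    "rights", "format",
)
OPTIONAL = ("coverage",)


def _is_blank(value):
    """Treat a missing value, None, or whitespace-only text as blank."""
    return value is None or not str(value).strip()


def missing_elements(record):
    """Return the required element names that are absent or blank."""
    return [name for name in REQUIRED if _is_blank(record.get(name))]


def to_readme(record):
    """Format the filled-in elements as "name: value" lines for a readme file."""
    names = REQUIRED + OPTIONAL
    lines = [f"{n}: {record[n]}" for n in names if not _is_blank(record.get(n))]
    return "\n".join(lines) + "\n"


# Illustrative, deliberately incomplete record.
draft = {"title": "Stream temperature survey", "creator": "Example Researcher"}
print(missing_elements(draft))
print(to_readme(draft))
```

Running the check on a draft record shows which elements still need to be filled in, and the readme text can be saved alongside the data files.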

More Information

The following links provide additional information that can help you in creating metadata for your project:

 

*All graphics on this page were created by Jørgen Stamp and published under a Creative Commons Attribution 2.5 Denmark License (www.digitalbevaring.dk).