Data Citation
Books and journal articles have a well-established infrastructure that makes them easy to cite and reference. Data should be considered legitimate, citable products of research. Data citations have the same importance in the scholarly record as citations of other research objects, such as articles and books.
When reusing the data of others, it’s critical to give proper attribution to the work of the original creator.
However, because citing data is a relatively new practice, the standards to follow are often unclear. Reference management software such as EndNote does have a template for datasets, but other requirements may mean the generated references need to be modified.
The elements that would make up a complete citation are a matter of some debate.
The Digital Curation Centre identified a superset list of potential elements to include in the citation:
Element | Description
Author | The creator of the dataset
Publication date | The year when the dataset was published (not the collection or coverage date)
Title | As well as the name of the cited resource itself, this may also include the name of a facility and the titles of the top collection and main parent sub-collection (if any) of which the dataset is a part
Edition | The level or stage of processing of the data, indicating how raw or refined the dataset is
Version | A number increased when the data changes, as the result of adding more data points or re-running a derivation process, for example
Feature name and URI | The name of an ISO 19101:2002 ‘feature’ (e.g. GridSeries, ProfileSeries) and the URI identifying its standard definition, used to pick out a subset of the data
Resource type | Examples: ‘database’, ‘dataset’
Publisher | The organisation either hosting the data or performing quality assurance
Unique numeric fingerprint (UNF) | A cryptographic hash of the data, used to ensure no changes have occurred since the citation
Identifier | An identifier for the data, according to a persistent scheme
Location | A persistent URL from which the dataset is available. Some identifier schemes provide these via an identifier resolver service
The most important of these elements – the ones that should be present in any citation:
When deciding what citation style and elements to use, you should consider the following:
If these requirements are unclear or informal, DataCite recommends including the following elements:
Creator (Publication Year). Title. Version. Publisher. Resource Type. Identifier
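As an illustration, the pattern above can be assembled programmatically. The following is a minimal Python sketch, not an official DataCite tool: the class name `DatasetCitation`, its field names, and the choice to treat Version and Resource Type as optional are assumptions made for this example, and the sample values (including the DOI `10.1234/example`) are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetCitation:
    """Hypothetical container for the elements in the pattern above."""
    creator: str
    publication_year: int
    title: str
    publisher: str
    identifier: str
    version: Optional[str] = None        # assumed optional in this sketch
    resource_type: Optional[str] = None  # assumed optional in this sketch

    def format(self) -> str:
        # Order follows: Creator (Publication Year). Title. Version.
        # Publisher. Resource Type. Identifier
        parts = [
            f"{self.creator} ({self.publication_year})",
            self.title,
            f"Version {self.version}" if self.version else None,
            self.publisher,
            self.resource_type,
        ]
        # Skip missing elements and avoid doubled full stops.
        body = ". ".join(p.rstrip(".") for p in parts if p)
        return f"{body}. {self.identifier}"


# Placeholder values for illustration only; not a real dataset.
citation = DatasetCitation(
    creator="Doe, J.",
    publication_year=2020,
    title="Example Rainfall Measurements",
    publisher="Example Data Archive",
    identifier="https://doi.org/10.1234/example",
    version="1.2",
    resource_type="Dataset",
)
print(citation.format())
```

Writing the identifier as a full resolver URL, as in the placeholder above, lets the same string serve as both the identifier and the location of the dataset.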