Ag Data Commons DOI Guidelines
This document provides management guidelines for assigning and maintaining Digital Object Identifiers (DOIs) for Ag Data Commons records.
2. DIGITAL OBJECT IDENTIFIERS
A DOI, or Digital Object Identifier, is a persistent and unique identifier assigned to an object. This DOI permanently identifies content and related metadata for an object over the course of its lifecycle. DOI strings, in combination with a URL prefix, resolve to internet locations. Information about a digital object may change over time, including where to find it and who owns it, but its DOI will not change. The benefits of a DOI include greater discoverability and access to uniquely identified content, accessibility for long-term use, and citation of publications and research data for impact analysis.
DOIs require a commitment from the provider to maintain the URL associated with the DOI.
The Ag Data Commons DOI, formatted as 10.15482/USDA.ADC/XXXXXXX, resolves as a URL when prefixed with https://doi.org/, giving https://doi.org/10.15482/USDA.ADC/XXXXXXX. This address resolves to the Ag Data Commons metadata record page for the given data resource, not directly to the resource itself.
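As an illustration of the resolution rule above, the following minimal Python sketch builds the resolver URL from an Ag Data Commons DOI string. The function names are hypothetical, and the pattern assumes the XXXXXXX placeholder stands for seven digits, which these guidelines do not state; adjust the pattern if the suffix takes another form.

```python
import re

# Resolver prefix named in the guidelines.
RESOLVER_PREFIX = "https://doi.org/"

# Assumption: the "XXXXXXX" placeholder is seven digits. The guidelines only
# show the placeholder, so this pattern is illustrative, not authoritative.
ADC_DOI_PATTERN = re.compile(r"10\.15482/USDA\.ADC/\d{7}")


def is_adc_doi(doi: str) -> bool:
    """Return True if the string matches the assumed Ag Data Commons DOI form."""
    return ADC_DOI_PATTERN.fullmatch(doi) is not None


def doi_to_url(doi: str) -> str:
    """Prefix a DOI with the resolver address to obtain a resolvable URL."""
    if not is_adc_doi(doi):
        raise ValueError(f"not an Ag Data Commons DOI: {doi!r}")
    return RESOLVER_PREFIX + doi
```

For example, doi_to_url("10.15482/USDA.ADC/1234567") yields https://doi.org/10.15482/USDA.ADC/1234567, which would resolve to the metadata record page rather than to the data itself.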
3. AG DATA COMMONS DOIs
The Ag Data Commons reserves its DOIs through an agreement with the Interagency Data ID Service of the U.S. Department of Energy Office of Scientific and Technical Information.
3.1. DOIs FOR LOCALLY PUBLISHED RESOURCES
The Ag Data Commons can mint a DOI for any record with data resources uploaded directly to our database, provided the data never received a DOI previously. Because the Ag Data Commons manages this data directly, data management follows policies and guidelines outlined in the Ag Data Commons policy section. Digital objects with a DOI assigned from another source should use the existing DOI instead. The Ag Data Commons mints DOIs for all locally held data records without existing DOIs unless otherwise specified. We may also mint DOIs for data hosted in other repositories managed directly by the National Agricultural Library.
3.2. DOIs FOR EXTERNALLY PUBLISHED RESOURCES
The Ag Data Commons can mint a DOI for data resources originally published outside of the National Agricultural Library where the digital object meets the following criteria:
- The data resource does not already have a DOI or other recognized global persistent identifier assigned to it.
- The data resource is a citable contribution to the scholarly record.
- As a contribution to the scholarly record, the resource will be stored and made accessible in the long term. To this end, an exact copy of the complete data resource matching the scope of the Ag Data Commons metadata record resides permanently in a repository managed by the National Agricultural Library.
  - For ongoing or continually collected/updated data, the data source and Ag Data Commons must agree on a versioning system and copy schedule prior to DOI assignment. See the Ag Data Commons Versioning Policy for more details regarding versioning options.
  - The Ag Data Commons metadata record will link to any locally stored copies of the data, as well as the externally published data. The Ag Data Commons publishes locally stored copies of the data by default, unless specified otherwise in the agreement with the original publisher.
- High-quality metadata is available that supports the provision of mandatory elements required for compliance with DOI registration and Ag Data Commons record guidelines.
If your data does not meet these criteria but you are interested in obtaining a DOI for your dataset or resource, please contact the Ag Data Commons team (https://www.nal.usda.gov/ask-question).
3.2.1. Information required
If externally managed data meets the above criteria, the Ag Data Commons requires the following when arranging a DOI assignment agreement:
- Contact name, position, email, and phone number of the dataset/database manager
- The source repository’s organization name and relevant information
- A formal agreement outlining requirements for any versioning, updates, or maintenance to the metadata or data
- Workflow for obtaining initial copy and any subsequent copies of the datasets and metadata
  - e.g., update/version schedule, end date (if applicable), endpoint for accessing data and metadata, access instructions/passwords, and applicable filters, among other details
All data originating from the same repository and following the same update and maintenance schedule may follow the same guidelines with a single agreement. Any variations in basic information, update method, or schedule from record group to record group require separate documentation. The Ag Data Commons will keep documents on file prior to issuing a DOI, and for follow up with data managers when necessary. The Ag Data Commons may request updated formal agreements on a schedule in keeping with industry standards.
4. METADATA REQUIREMENTS
4.1. METADATA RECORD PAGE
Rather than point directly to the data or other resources, all DOIs issued by the Ag Data Commons resolve to a metadata record page in the Ag Data Commons catalog that describes the associated resources. Users can create metadata records manually using the Ag Data Commons submission form interface (see the Data Submission Manual for further details). The Ag Data Commons may also harvest existing metadata published elsewhere (see the Harvest Policy for more information), or we may use a custom solution to create records depending on the metadata source.
4.2. METADATA CONTENT
Both locally and externally published data must meet the same metadata requirements to obtain DOIs for data and other digital products.
The following metadata is mandatory to create a record in the Ag Data Commons and mint a DOI:
- Description
  - See our Description Field Pointers page for more information
  - Indicate version schedule (if any) in the description
- Site URL
  - This will be the Ag Data Commons metadata record page URL
- Author(s) or Primary Researcher(s)
  - Ag Data Commons formats as Last Name, First Name
  - DOI requires separated first name and last name
- Publisher
  - Ag Data Commons, or the original publisher of the data
- Original publication date
  - Year, month, and day, formatted as YYYY-MM-DD for the DOI
- Product type
  - The Ag Data Commons sets this value to “Dataset” by default (acceptable values include: Audiovisual; Collection; Dataset; Event; Image; InteractiveResource; Model; PhysicalObject; Service; Software; Sound; Text; Workflow)
- Title and URL (if externally hosted) for each data resource linked to the record
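Several of the mandatory fields above carry format rules that can be checked mechanically. The following Python sketch illustrates three of them: splitting an author name from the "Last Name, First Name" form into separate parts, checking that a publication date is a real calendar date in YYYY-MM-DD form, and defaulting the product type to "Dataset". All function names are hypothetical, and this is not a description of how the Ag Data Commons actually validates records.

```python
import re
from datetime import date
from typing import Optional, Tuple

# Acceptable product types, copied from the list in the guidelines.
PRODUCT_TYPES = frozenset({
    "Audiovisual", "Collection", "Dataset", "Event", "Image",
    "InteractiveResource", "Model", "PhysicalObject", "Service",
    "Software", "Sound", "Text", "Workflow",
})

_DATE_PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def split_author(name: str) -> Tuple[str, str]:
    """Split 'Last Name, First Name' into (first_name, last_name)."""
    last, comma, first = name.partition(",")
    first, last = first.strip(), last.strip()
    if not comma or not first or not last:
        raise ValueError(f"expected 'Last Name, First Name', got {name!r}")
    return first, last


def is_valid_publication_date(value: str) -> bool:
    """Return True for a zero-padded YYYY-MM-DD string naming a real date."""
    match = _DATE_PATTERN.fullmatch(value)
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups())
    try:
        date(year, month, day)  # rejects impossible dates such as 02-30
    except ValueError:
        return False
    return True


def resolve_product_type(value: Optional[str] = None) -> str:
    """Return the product type, defaulting to 'Dataset' when none is given."""
    if value is None:
        return "Dataset"
    if value not in PRODUCT_TYPES:
        raise ValueError(f"unrecognized product type: {value!r}")
    return value
```

For instance, split_author("Doe, Jane") separates the name into the first and last parts that DOI registration requires, and is_valid_publication_date rejects both unpadded values such as 2015-9-30 and impossible dates such as 2015-02-30.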
The Ag Data Commons requests additional information to create a complete, robust metadata record for the represented digital product that users may search and cite. The Ag Data Commons uses the Project Open Data metadata schema. Other highly desired catalog information includes:
- License
  - U.S. Public Domain or Creative Commons CCZero preferred
  - In the absence of license information, this field will default to “License not Specified”
- Temporal coverage (if applicable)
- Geographic coverage (if applicable)
- Funding source
  - See the CrossRef Funder Registry for acceptable values
  - The Ag Data Commons catalog represents research supported in part or full by the USDA
These fields represent the minimum information required by the Ag Data Commons. Other metadata may be required upon review of the data resources in question. See the Data Submission Manual for more information on available metadata fields and best practices. An Ag Data Commons curator will review records for thoroughness and accuracy prior to DOI assignment and publication.
4.3. GRANULARITY
Granularity refers to the grouping and dividing of data and subsequent metadata. Data stewards should group their information according to the most likely citation needs. Consider the users and their citation expectations when forming record groups. Will the average user citing this data need a record with extremely detailed information about smaller, more distinct portions of the data, or will a more general data citation for the database as a whole be sufficient for the majority of data users? Do different subsets of the data have different authors or other information? Can you divide data into distinct citable sections, or does every database visitor tend to query, use, or cite unique subsets of data?
The data owner may decide metadata record granularity and subsequent DOI assignment – this can include individual records, groups of records, or the entire database as a whole. However, when creating a record for a subset of data existing on an external database, a landing page URL for that distinct dataset, ideally with no conflicting overlap of other data or information receiving a DOI, must be included for clarity. For example, the Ag Data Commons will not ordinarily issue a DOI to the database as a whole in addition to subset groupings, as the data represented in these record groups would be duplicated. An exception would be if some users cite the database as a whole, in addition to users who cite distinct subsets of data.
Three primary examples of data organization and DOI assignment strategy:
4.3.1. DOI for overall database
Example: USDA Database for the Proanthocyanidin Content of Selected Foods, Release 2 (2015)
DOI Assignment Level:
- Database (DOI)
- Database not easily or logically divided into distinct datasets
- Metadata for individual groupings does not exist
- Users would filter their own specific queries that tend to be unique subsets in every instance
4.3.2. DOIs for subsets of data
DOI Assignment Level:
- Subset of data 1 (DOI 1)
- Subset of data 2 (DOI 2)
- Subset of data 3 (DOI 3)…
- Logical groupings of data by type, location, date, source, network, etc., or clear parent-child relationships
- No overlap between subsets
- Distinct landing page URLs exist for the data subsets
- Metadata for each subset is distinct
- Users would only need to cite 1 or 2 of these subsets at a time
4.3.3. DOIs for individual records
DOI Assignment Level:
- Record 1 (DOI 1)
- Record 2 (DOI 2)
- Record 3 (DOI 3)…
- Metadata (e.g., authors, geography, dates) for each dataset is distinct
- Metadata for each individual record exists or can be readily created
- Users would cite content for each dataset individually
If metadata records with parsed information exist on the source database, the Ag Data Commons can harvest the specified records and group them according to the structure outlined by the data owner. Ag Data Commons curators can discuss these data organization scenarios with data managers to arrive at the best solution for each particular dataset or database.
5. RECORD UPDATES AND RESPONSIBILITIES
A significant percentage of data submitted to the Ag Data Commons change within the first year of publication, or on a regularly occurring basis as new data are processed and updated. Please see the Ag Data Commons Versioning Policy for more details regarding when to update an existing record and when to create a new record with a new DOI.
5.2. PROACTIVE LINK MAINTENANCE
The source database manager bears responsibility to alert the Ag Data Commons prior to any changes in URL or data location for any records bearing an Ag Data Commons issued DOI. This action mitigates the sudden discovery of broken data links. The Ag Data Commons also checks links on a regular basis to ensure they resolve.
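These guidelines do not describe how the Ag Data Commons performs its regular link checks. As a minimal sketch only, a periodic check could request each link and report those that fail to resolve. The function names below are hypothetical, and treating any status outside 200-399 as broken is an assumption made for illustration. Some servers reject HEAD requests, so a real check might need to fall back to GET.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status of a HEAD request, or 0 if no response arrives."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError, ValueError):
        return 0  # unreachable host, timeout, or malformed URL


def find_broken_links(
    urls: Iterable[str],
    get_status: Callable[[str], int] = fetch_status,
) -> List[str]:
    """Return the URLs whose status falls outside the 200-399 range."""
    return [url for url in urls if not 200 <= get_status(url) < 400]
```

Passing the status lookup as a parameter keeps the reporting logic testable without network access; a curator-facing script would call find_broken_links with the default fetch_status and follow up on each URL returned.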
5.3. WHEN LINKS FAIL TO RESOLVE
If a link to a data resource fails to resolve, the Ag Data Commons will contact the designated person on file for that data to rectify the issue. If communication with the designated database manager or data contact does not remedy a broken link in a timely manner, curators will remove the broken link from the resource and treat any locally stored data copies as the only available data resource. If the data manager provides a more specific reason for the data or web site’s removal, the record owner or Ag Data Commons curators may add that information to the description field of the metadata record for clarity.
5.4. RECORD AVAILABILITY
The Ag Data Commons never deletes or un-publishes metadata records containing DOIs we issue. Under exceptional circumstances, a data resource may be un-published due to legal obligations on behalf of the original publisher, owner, copyright holder or author(s); or on moral or ethical grounds if data with an error or sensitive information could be potentially damaging. The metadata record is intended to provide accurate information as of the last data publish date, and updates occur as needed. Therefore, the DOI will always resolve to the Ag Data Commons metadata record page regardless of the state of the source data and/or external links.
The Ag Data Commons takes no responsibility for the timeliness, accuracy, completeness or quality of the information provided by data submitters, and cannot guarantee the fixity or permanence of externally hosted data or other resources. While the Ag Data Commons takes steps to capture complete local copies of externally published data prior to issuing DOIs, we cannot control the actions of the source repository with regard to altering or removing data at the source. The Ag Data Commons will make every effort to update metadata records to reflect the most accurate and current state of the data, but holds no liability for the actions of external repositories. Concerned researchers should contact the individual listed as the contact on the metadata record to inquire about repository data management plans and protocol.
If you have questions about specific issues regarding the Ag Data Commons DOI Policy, please contact the Ag Data Commons at https://www.nal.usda.gov/ask-question.