Ag Data Commons - Collection Development Policy
The Ag Data Commons includes resources from a range of research domains that relate to agriculture. These include subjects such as agronomy, genomics, hydrology, soils, agro-ecosystems, sustainability science, and economic statistics. Data included in the Ag Data Commons received partial or full USDA funding or support.
Ag Data Commons data materials support agricultural policy, research, scholarship, or teaching. Data accepted into the Ag Data Commons includes subject content as identified by the scope of the general USDA National Agricultural Library Collection Development Policy.
The Ag Data Commons collection focuses on United States agriculture and related topics. A review may justify inclusion of worldwide data displaying application to U.S. agricultural issues.
The Ag Data Commons focuses on current research data. A review may justify inclusion of historic data displaying application to current agricultural issues.
The Ag Data Commons currently supports and accepts English language data submission and retrieval.
Preferred file formats
Users may submit data in a standard format accessible through current file viewers. The Ag Data Commons strongly encourages data submitted in machine-readable formats such as CSV. The Ag Data Commons discourages non-machine-readable formats such as PDF. The Ag Data Commons reserves the right to re-format any submitted data. Ag Data Commons curators may reformat data at their discretion for preservation purposes, or to make data reusable by a diverse set of users.
List of acceptable file formats
The Ag Data Commons accepts the following file formats:
7z avi csv dat doc docb docx dot dotx fasta fastq gif gz hdf htm html jpeg jpg json kml kmz mov m4a m4v mp3 mp4 mpeg oga ogg ogv par pdf png pot potx pps ppsx ppt pptx rtf sldx stc sti stw svg sxc sxg sxi sxw tar tsv txt weba webm webp xhtml xls xlsx xlt xltx xml zip
The Ag Data Commons reserves the right to accept additional formats, or to discontinue acceptance of others. Users may submit data in an unsupported format for direct upload in a zipped file, with a list of contents included in the file description. The Ag Data Commons does not permit executable files, whether uploaded directly or zipped, including .exe, .bat, .ipa, .reg, etc.
Types of Materials Collected
All dataset records created on the Ag Data Commons must include at least one of the following: Dataset, Database, or Software tool.
Datasets comprise the majority of the Ag Data Commons collection, and represent many types of data. For instance, a dataset may contain tabular data corresponding to a particular experiment or event, genomic sequences and assemblies, or multimedia materials including photo, video, design, and three-dimensional digital renderings.
Database materials include organized collections of data, typically structured to support processes requiring information search and retrieval. Submitters can directly upload databases to the Ag Data Commons, but the Ag Data Commons entries of this nature typically link to an outside source or distributor.
Software tools include materials created to help users process, model, navigate, and generally get the most out of datasets or databases. Tools included in the Ag Data Commons relate to datasets or databases embodying the Agricultural Research Service mission. Software tools usually link to externally hosted open source code.
The Ag Data Commons reserves the right to include digital objects not explicitly mentioned in this document. Curators may deem that objects that do not meet the basic criteria for inclusion add value to the agricultural research community, or choose to include materials for preservation purposes. The Ag Data Commons curators reserve the right to re-evaluate the collection and user needs and amend the list of acceptable formats and types of materials included in the collection.
Types of Materials Excluded
The Ag Data Commons may contain historic datasets of various origins, but only if data holders convert data to a current digital format. Digitization standards for materials should reflect the best practices of that material type.
Research products without data
The Ag Data Commons welcomes previously unpublished research products generated from data (e.g. tables, charts, figures), but only in conjunction with the full set of supporting data files.
Excluded materials also include digital slide decks and recorded conference or meeting presentations from agriculture-related seminars and events. The Related Content metadata section may link to presentations related to the data.
Permissions and Copyright
Data holders who submit research materials to the Ag Data Commons certify they possess adequate permissions to share this data publicly. Submitters must specify the data author(s) / creator(s) and copyright status at the time they submit the data. The Ag Data Commons encourages the Creative Commons CCZero public domain dedication for submitted data, but accepts data that complies with US Federal Open Data policies. The Ag Data Commons functions as a site for data sharing and re-use, and excludes data with burdensome restrictions.
Users may submit data files but request a hold on publication for specific data files. The Ag Data Commons supports data embargo for up to 30 months. The metadata record (descriptive information) published ahead of the data describes the dataset and alerts users to the data files’ expected publish date.
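The 30-month limit can be checked with simple date arithmetic. The sketch below is illustrative only and assumes the embargo period starts on the submission date, which this policy does not specify. The helper names add_months and embargo_is_allowed are hypothetical.

```python
import calendar
from datetime import date

MAX_EMBARGO_MONTHS = 30  # limit stated in the policy


def add_months(start: date, months: int) -> date:
    """Shift a date by whole months, clamping to the last day of the month."""
    month_index = start.year * 12 + (start.month - 1) + months
    year, month_zero_based = divmod(month_index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def embargo_is_allowed(submitted: date, release: date) -> bool:
    """True if the requested release date falls within the embargo limit.

    Assumption: the embargo is counted from the submission date.
    """
    return submitted <= release <= add_months(submitted, MAX_EMBARGO_MONTHS)
```

For example, under this assumption a submission dated 15 January 2024 could request a release date no later than 15 July 2026.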
Accessioning and Retention
Essential metadata fields
All records ingested into the Ag Data Commons contain a core set of essential metadata fields. The Ag Data Commons publishes current expectations for metadata in the Submission Manual. These fields allow curation staff to complete workflow for a variety of services available through the Knowledge Services Division. Curation staff encourage additional metadata information, if available, to make submitted data more easily findable and reusable.
Manual data submission
Registered users may submit dataset records to the Ag Data Commons through the online submission form. Ag Data Commons curators review all manual submissions prior to publication. Curators may follow up with data submitters to complete missing information in order to meet the standards set for inclusion in the catalog.
Harvested records and data
As a generalist ag repository, the Ag Data Commons recommends data producers deposit their data in subject-specific repositories whenever possible to produce data and metadata in keeping with each research community’s standards. However, the Ag Data Commons may also choose to harvest dataset records and accompanying data files from external catalogs or repositories. The top criterion for developing a harvest is a source catalog that contains a significant number of USDA-funded dataset records. If the source catalog contains records from a variety of agencies or funders and/or a variety of product types, the source must have or be willing to develop a mechanism to filter appropriate records to include only datasets supported by the USDA.
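A source catalog's filtering mechanism might look something like the following Python sketch. The record layout is purely an assumption for illustration: the field names "funders" and "product_type", and the values "USDA" and "dataset", are hypothetical and would differ from catalog to catalog.

```python
from typing import Iterable


def usda_datasets(records: Iterable[dict]) -> list[dict]:
    """Keep only dataset records that list USDA among their funders.

    Assumes each record is a dict with a hypothetical "funders" list and a
    hypothetical "product_type" string.
    """
    kept = []
    for record in records:
        funders = {name.strip().upper() for name in record.get("funders", [])}
        if record.get("product_type") == "dataset" and "USDA" in funders:
            kept.append(record)
    return kept


# Illustrative records only, not real catalog entries.
sample = [
    {"id": 1, "product_type": "dataset", "funders": ["USDA", "NSF"]},
    {"id": 2, "product_type": "publication", "funders": ["USDA"]},
    {"id": 3, "product_type": "dataset", "funders": ["NSF"]},
]
selected_ids = [record["id"] for record in usda_datasets(sample)]
```

In practice a real catalog would more likely identify funders by agency identifiers or award numbers than by a plain name match.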
Prior to initiating a harvest, the source repository and Ag Data Commons must plan for an ongoing working relationship. All parties must agree in advance on services to be performed or delivered. The source catalog must provide the Ag Data Commons with their full contact information including name, email, phone number, and any other relevant information for the person(s) responsible for coordinating and developing the harvest.
The source catalog must have or be willing to build a machine-readable endpoint such as an API or web service which allows automated access from the Ag Data Commons platform. This provides a mechanism for transferring metadata and data as needed. The Ag Data Commons prefers that source catalogs push their information whenever possible.
The designated essential metadata fields pertain to harvests as well as manually submitted datasets. Source catalogs lacking essential information must add the necessary information to the records at the source, or agree on a hard-coded value that applies to a specific field for all records at the time of ingest (e.g. hard code License for all ingested records to U.S. Public Domain). Ag Data Commons metadata librarians will work in conjunction with the source representatives to map information accurately between the two catalogs.
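The mapping and hard-coding described above can be sketched as follows. This is an illustration under stated assumptions, not the actual Ag Data Commons ingest code: the source field names, the target field names, and the ESSENTIAL tuple are hypothetical placeholders (the real essential fields are listed in the Submission Manual). Only the License value "U.S. Public Domain" comes from this policy's example.

```python
# Hypothetical mapping from source catalog fields to target fields.
FIELD_MAP = {"name": "title", "summary": "description"}

# Agreed hard-coded values, applied to every record at ingest.
# The License example is taken from the policy text.
HARD_CODED = {"license": "U.S. Public Domain"}

# Hypothetical stand-in for the essential metadata fields.
ESSENTIAL = ("title", "description", "license")


def map_record(source: dict) -> dict:
    """Translate one source record and reject it if essential fields are missing."""
    target = {
        target_field: source[source_field]
        for source_field, target_field in FIELD_MAP.items()
        if source.get(source_field)
    }
    # Hard-coded values apply to all ingested records.
    target.update(HARD_CODED)
    missing = [field for field in ESSENTIAL if not target.get(field)]
    if missing:
        raise ValueError(f"missing essential fields: {', '.join(missing)}")
    return target
```

Raising an error for incomplete records mirrors the requirement that missing information be added at the source rather than silently skipped.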
The Ag Data Commons and source repository must agree in advance on a schedule for harvesting metadata records, as well as for harvesting scheduled copies of data files in the case where datasets receive a DOI.
The Ag Data Commons retains all objects accepted into the collection in accordance with the USDA National Agricultural Library's Collection Policy. The Ag Data Commons does not serve as a temporary storage facility for digital items. Once a curator approves a dataset for publication, they assign a DOI to each dataset record that contains locally stored resources. A dataset DOI assigned by the Ag Data Commons covers the metadata record and all associated resource files directly controlled by the National Agricultural Library. The Ag Data Commons reserves the right to maintain a backup copy of externally hosted data for archival and/or versioning purposes, but does not guarantee the persistence and preservation of data and other resources hosted elsewhere.
The National Agricultural Library reserves the right to remove items from the collection for any reason, including but not limited to copyright or other legal restrictions, outdated or older versions of materials, decreased relevance to the Agricultural Research Service mission, and changes in user information needs.