Data from: A Community Resource for Exploring and Utilizing Genetic Diversity in the USDA Pea Single Plant Plus Collection

This dataset includes SNP and fasta data for the Pea Single Plant Plus Collection (PSPPC) and for the PSPPC augmented with 25 P. fulvum accessions.

The six datasets fall into two groups. Group 1 consists of three datasets labeled PSPPC, which contain SNP data for the USDA Pea Single Plant Plus Collection. Group 2 consists of three datasets labeled PSPPC + P. fulvum, which contain SNP data for the USDA PSPPC with 25 accessions of Pisum fulvum added. SNPs were called independently for each group; therefore, SNP names that are shared between the PSPPC and PSPPC + P. fulvum groups should NOT be assumed to refer to the same locus.
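As a minimal sketch of how to avoid mixing up identically named SNPs when both groups are loaded in the same analysis, markers can be keyed by group label and SNP name together. The function name `namespace_markers`, the marker names, and the allele strings below are invented for illustration and do not come from the dataset.

```python
def namespace_markers(group_label, markers):
    """Key each marker by (group_label, snp_name) instead of snp_name alone."""
    return {(group_label, name): value for name, value in markers.items()}


# Invented marker names and allele strings, for illustration only.
psppc = {"snp_1": "A/G", "snp_2": "C/T"}
psppc_fulvum = {"snp_1": "C/G", "snp_3": "A/T"}

combined = {}
combined.update(namespace_markers("PSPPC", psppc))
combined.update(namespace_markers("PSPPC + P. fulvum", psppc_fulvum))

# "snp_1" appears in both groups but is stored under two separate keys,
# so neither entry overwrites the other.
for key in sorted(combined):
    print(key, combined[key])
```

Keying on the pair rather than on the bare name keeps the two independent SNP calls apart without renaming anything in the source files.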

For analysis, SNP data are available in two widely used formats: hapmap and VCF. Both formats load successfully in TASSEL v. 5.2.25 (http://www.maizegenetics.net/tassel). The fields (columns) of the VCF files are explained in the commented (##) rows at the top of each file.
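The commented rows can be read without any genomics library. The following is a rough sketch, assuming a plain-text, tab-delimited VCF; the function name `read_vcf_header` and the example content are made up for illustration and are not taken from the dataset files.

```python
import io


def read_vcf_header(handle):
    """Return (meta_lines, column_names) from the top of a VCF stream.

    meta_lines holds the text of each '##' row without the leading '##';
    column_names holds the fields of the single '#CHROM ...' row.
    """
    meta_lines = []
    column_names = []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("##"):
            meta_lines.append(line[2:])
        elif line.startswith("#"):
            column_names = line[1:].split("\t")
            break  # data rows follow; the header is complete
        else:
            break  # no column row found before the data
    return meta_lines, column_names


# Made-up VCF content, for illustration only.
example = io.StringIO(
    "##fileformat=VCFv4.0\n"
    '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample_1\n"
    "0\t0\tsnp_1\tA\tG\t.\tPASS\t.\tGT\t0/1\n"
)
meta, columns = read_vcf_header(example)
for entry in meta:
    print(entry)
print(columns)
```

With a real file, the `io.StringIO` object would be replaced by `open(path)`, where `path` points to one of the downloaded VCF files.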

Descriptions of the first 11 columns in the hapmap file are as follows:

  • rs#- Name of the locus (i.e. the SNP name).
  • alleles- The alleles observed at the SNP locus.
  • chrom- Irrelevant for these datasets, since markers are unordered.
  • pos- Irrelevant for these datasets, since markers are unordered.
  • strand- Irrelevant for these datasets, since markers are unordered.
  • assembly#- Required field for hapmap format. NA for these datasets.
  • center- Required field for hapmap format. NA for these datasets.
  • protLSID- Required field for hapmap format. NA for these datasets.
  • assayLSID- Required field for hapmap format. NA for these datasets.
  • panel- Required field for hapmap format. NA for these datasets.
  • QCcode- Required field for hapmap format. NA for these datasets.
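Because the first 11 columns are descriptive, the genotype calls start at column 12. The sketch below shows one way to separate the two when reading a hapmap file, assuming a tab-delimited file with a single header row; the function name `read_hapmap`, the accession names, and the genotype codes are invented for illustration.

```python
import io

HAPMAP_FIXED_COLUMNS = 11  # rs# through QCcode, as listed above


def read_hapmap(handle):
    """Return (sample_names, genotypes) from a tab-delimited hapmap stream.

    genotypes maps each SNP name (rs# column) to a dict of
    sample name -> genotype call.
    """
    header = next(handle).rstrip("\n").split("\t")
    samples = header[HAPMAP_FIXED_COLUMNS:]
    genotypes = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= HAPMAP_FIXED_COLUMNS:
            continue  # skip blank or truncated rows
        snp_name = fields[0]
        calls = fields[HAPMAP_FIXED_COLUMNS:]
        genotypes[snp_name] = dict(zip(samples, calls))
    return samples, genotypes


# Made-up hapmap content, for illustration only.
example = io.StringIO(
    "rs#\talleles\tchrom\tpos\tstrand\tassembly#\tcenter\tprotLSID\t"
    "assayLSID\tpanel\tQCcode\tacc_1\tacc_2\n"
    "snp_1\tA/G\t0\t0\t+\tNA\tNA\tNA\tNA\tNA\tNA\tA\tG\n"
    "snp_2\tC/T\t0\t0\t+\tNA\tNA\tNA\tNA\tNA\tNA\tC\tN\n"
)
samples, genotypes = read_hapmap(example)
print(samples)
print(genotypes["snp_1"])
```

The chrom, pos, and strand values are ignored here on purpose, since the markers in these datasets are unordered.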

The fasta sequences containing the SNPs are also available for downstream applications such as developing primers for platform-specific markers.
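As a starting point for such downstream work, the sequences can be loaded into a dictionary keyed by record name. This is a minimal sketch using only the standard library; the function name `read_fasta` and the record names and sequences are invented, and whether the record names in the real files match the SNP names is an assumption to check against the data.

```python
import io


def read_fasta(handle):
    """Return a dict mapping record name -> sequence from a fasta stream.

    The record name is the first whitespace-delimited token after '>'.
    Sequences wrapped over several lines are joined into one string.
    """
    sequences = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                sequences[name] = "".join(chunks)
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        sequences[name] = "".join(chunks)
    return sequences


# Made-up fasta content, for illustration only.
example = io.StringIO(">snp_1 made-up record\nACGTAC\nGTTA\n>snp_2\nTTGCA\n")
sequences = read_fasta(example)
for record_name, sequence in sequences.items():
    print(record_name, len(sequence), sequence)
```

The resulting dictionary can then be passed to whichever primer design tool is in use.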

For more information about this dataset, contact Clarice Coyne at Clarice.Coyne@usda.gov or coynec@wsu.edu.

Field                                   Value
Tags
Modified                                2023-09-19
Release Date                            2017-03-17
Identifier                              b99c7cf3-a7c2-46c1-b02f-4f0d4ea60ffd
Spatial / Geographical Coverage Area    POLYGON ((-166.640625 -59.987997631212, -166.640625 83.254516804633, 194.765625 83.254516804633, 194.765625 -59.987997631212))
Publisher                               Ag Data Commons
Temporal Coverage                       January 1, 2013 to December 31, 2014
License
Data Dictionary
Contact Name                            Coyne, Clarice
Contact Email
Public Access Level                     Public
Program Code                            005:040 - Department of Agriculture - National Research
Bureau Code                             005:18 - Agricultural Research Service