Data from: Phased Genotyping-by-Sequencing Enhances Analysis of Genetic Diversity and Reveals Divergent Copy Number Variants in Maize

dataset

Posted on 2024-02-13, 16:06. Authored by Heather Manching, Subhajit Sengupta, Keith R. Hopper, Shawn W. Polson, Yuan Ji, Randall J. Wisser

High-throughput sequencing (HTS) of reduced representation genomic libraries has ushered in an era of genotyping-by-sequencing (GBS), where genome-wide genotype data can be obtained for nearly any species. However, there remains a need for imputation-free GBS methods for genotyping large samples taken from heterogeneous populations of heterozygous individuals. This requires that a number of issues encountered with GBS be considered, including the sequencing of nonoverlapping sets of loci across multiple GBS libraries, a common missing data problem that results in low call rates for markers per individual, and a tendency for applicability only in inbred line samples with sufficient linkage disequilibrium for accurate imputation. We addressed these issues while developing and validating a new, comprehensive platform for GBS. This study supports the notion that GBS can be tailored to particular aims, and using Zea mays our results indicate that large samples of unknown pedigree can be genotyped to obtain complete and accurate GBS data. Optimizing size selection to sequence a high proportion of shared loci among individuals in different libraries and using simple in silico filters, a GBS procedure was established that produces high call rates per marker (>85%) with accuracy exceeding 99.4%. Furthermore, by capitalizing on the sequence-read structure of GBS data (stacks of reads), a new tool for resolving local haplotypes and scoring phased genotypes was developed, a feature that is not available in many GBS pipelines. Using local haplotypes reduces the marker dimensionality of the genotype matrix while increasing the informativeness of the data. Phased GBS in maize also revealed the existence of reproducibly inaccurate (apparent accuracy) genotypes that were due to divergent copy number variants (CNVs) unobservable in the underlying single nucleotide polymorphism (SNP) data.
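The per-marker call rate mentioned above can be illustrated with a minimal sketch. This is not the authors' pipeline: the genotype matrix layout, the function names, and the use of 0.85 as a filter threshold are all assumptions made for illustration (the abstract reports call rates above 85% as a result, not necessarily as a filter setting).

```python
from typing import List, Optional

# Hypothetical layout: rows are individuals, columns are markers.
# A call is a string such as "A/G"; None marks a missing call.
GenotypeMatrix = List[List[Optional[str]]]


def call_rate_per_marker(matrix: GenotypeMatrix) -> List[float]:
    """Return, for each marker, the fraction of individuals with a non-missing call."""
    n_individuals = len(matrix)
    n_markers = len(matrix[0])
    return [
        sum(1 for row in matrix if row[j] is not None) / n_individuals
        for j in range(n_markers)
    ]


def markers_passing(matrix: GenotypeMatrix, min_call_rate: float = 0.85) -> List[int]:
    """Return column indices of markers whose call rate is at least min_call_rate.

    The 0.85 default is an illustrative assumption, not a documented setting.
    """
    rates = call_rate_per_marker(matrix)
    return [j for j, rate in enumerate(rates) if rate >= min_call_rate]


# Toy data: 4 individuals, 3 markers.
example: GenotypeMatrix = [
    ["A/A", "A/G", None],
    ["A/G", "G/G", "C/T"],
    ["G/G", None, None],
    ["A/A", "A/G", "C/C"],
]

rates = call_rate_per_marker(example)
kept = markers_passing(example)
```

In this toy matrix the first marker is called in every individual, the second in three of four, and the third in two of four, so only the first marker passes the assumed 0.85 threshold.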

Resources in this dataset:

Resource Title: Supplementary Data.

File Name: Web Page, url: https://academic.oup.com/g3journal/article/7/7/2161/6053605#supplementary-data

Funding

USDA-NIFA: 2011-67003-30342

National Institutes of Health: 2R01 CA132897

National Institutes of Health: P20 GM103446

Data contact name

Wisser, Randall J.

Data contact email

rjw@udel.edu

Publisher

G3: Genes, Genomes, Genetics

Intended use

By capitalizing on the sequence-read structure of GBS data (stacks of reads), a new tool for resolving local haplotypes and scoring phased genotypes was developed, a feature that is not available in many GBS pipelines.

Temporal Extent Start Date

2017-07-01

Theme

Not specified

Geographic Coverage

{"type":"FeatureCollection","features":[{"geometry":{"type":"Polygon","coordinates":[[[-531.5625,-83.164278290951],[-531.5625,85.287916121237],[-161.71875,85.287916121237],[-161.71875,-83.164278290951],[-531.5625,-83.164278290951]]]},"type":"Feature","properties":{}}]}

ISO Topic Category

biota
farming

National Agricultural Library Thesaurus terms

genotyping by sequencing; genetic variation; corn; high-throughput nucleotide sequencing; genomic libraries; heterozygosity; loci; inbred lines; linkage disequilibrium; Zea mays; pedigree; filters; haplotypes; single nucleotide polymorphism; pipelines

OMB Bureau Code

005:18 - Agricultural Research Service

OMB Program Code

005:040 - National Research

Public Access Level

Public

Preferred dataset citation

Manching, Heather; Sengupta, Subhajit; Hopper, Keith R.; Polson, Shawn W.; Ji, Yuan; Wisser, Randall J. (2022). Data from: Phased Genotyping-by-Sequencing Enhances Analysis of Genetic Diversity and Reveals Divergent Copy Number Variants in Maize. G3: Genes, Genomes, Genetics. https://doi.org/10.1534/g3.117.042036

Keywords

genome; high-throughput sequencing; genotypes; Zea mays; genotyping-by-sequencing; data.gov; ARS

Licence

CC BY 4.0
