Ag Data Commons

De novo transcriptome assembly and annotations for wheat curl mite (Aceria tosichella)

dataset
posted on 2024-02-15, 17:52 authored by Erin Scully, Adarsh K. Gupta, Nathan A. Palmer, Scott M. Geib, Gautam Sarath, Gary L. Hein, Satyanarayana Tatineni

To study the impact of wheat streak mosaic virus on global gene expression in wheat curl mite, we generated a de novo transcriptome assembly from 50 x 50 paired-end reads sequenced on the Illumina HiSeq 2500. Reads were assembled using Trinity (version 2.0.6), and contigs greater than 200 nt were retained. All assembled transcripts were annotated with the Trinotate pipeline, which included blastp and blastx searches against the Swiss-Prot/UniProt database, HMM searches against the Pfam-A database, blastp searches against the non-redundant protein database, and SignalP and TMHMM predictions. To reduce noise from low-abundance transcripts not well supported by the data, we filtered the assembly to retain only transcripts with TPM values >= 0.5.
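
The length and abundance filters described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names, the toy sequences, and the TPM lookup table are hypothetical, and the sketch treats a transcript of exactly 200 nt as retained, which the description does not settle.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_transcripts(records, tpm_by_id, min_length=200, min_tpm=0.5):
    """Keep transcripts that pass both the length and the TPM threshold.

    Assumption: transcripts missing from tpm_by_id are treated as TPM 0.
    """
    kept = []
    for header, seq in records:
        transcript_id = header.split()[0]  # ID is the first token of the header
        tpm = tpm_by_id.get(transcript_id, 0.0)
        if len(seq) >= min_length and tpm >= min_tpm:
            kept.append((transcript_id, seq))
    return kept


# Hypothetical toy data: three transcripts and made-up TPM values.
toy_fasta = StringIO("\n".join([
    ">tx1", "A" * 250,   # long enough, well expressed
    ">tx2", "C" * 150,   # too short
    ">tx3", "G" * 300,   # long enough, but TPM below 0.5
]))
toy_tpm = {"tx1": 1.2, "tx2": 3.0, "tx3": 0.1}

kept = filter_transcripts(read_fasta(toy_fasta), toy_tpm)
kept_ids = [transcript_id for transcript_id, _ in kept]
print(kept_ids)
```

With real data, the FASTA handle would be an opened assembly file such as Trinity.fasta.txt, and the TPM table would come from whatever abundance estimation tool was used, which the description does not name.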


Resources in this dataset:

  • Resource Title: Raw Trinity Assembly.

    File Name: Trinity.fasta.txt

    Resource Description: Raw Trinity assembly obtained from wheat curl mite using 50 x 50 Illumina paired-end reads from the HiSeq 2500.

    Resource Software Recommended: Notepad++,url: https://notepad-plus-plus.org/


  • Resource Title: Raw Trinity Assembly.

    File Name: Trinity.fasta.txt

    Resource Description: Raw Trinity assembly obtained from wheat curl mite using 50 x 50 Illumina paired-end reads from the HiSeq 2500.

    Resource Software Recommended: TextWrangler,url: https://itunes.apple.com/us/app/textwrangler/id404010395?mt=12


  • Resource Title: Trinotate annotations for raw Trinity assembly.

    File Name: trinotate_annotations_report.xls

    Resource Description: Trinotate results for raw wheat curl mite transcriptome assembly

    Resource Software Recommended: Excel,url: https://products.office.com/en-us/excel


  • Resource Title: Trinotate annotations for raw Trinity assembly.

    File Name: trinotate_annotations_report.xls

    Resource Description: Trinotate results for raw wheat curl mite transcriptome assembly

    Resource Software Recommended: LibreOffice Calc,url: https://www.libreoffice.org/discover/calc/


  • Resource Title: Blastp results versus non-redundant protein database.

    File Name: wheat_curl_mite_blastp_nr.txt

    Resource Description: Blastp results for protein coding unigenes from raw Trinity transcriptome assembly (wheat curl mite). Output format is default.

    Resource Software Recommended: Notepad++,url: https://notepad-plus-plus.org/


  • Resource Title: Blastp results versus non-redundant protein database.

    File Name: wheat_curl_mite_blastpnr.txt

    Resource Description: Blastp results for protein coding unigenes from raw Trinity transcriptome assembly (wheat curl mite). Output format is default.

    Resource Software Recommended: TextWrangler,url: https://itunes.apple.com/us/app/textwrangler/id404010395?mt=12


  • Resource Title: Protein predictions for raw Trinity transcriptome assembly (wheat curl mite).

    File Name: transcriptome.all.cds.pep.fasta.txt

    Resource Description: Putative coding regions were predicted using TransDecoder. Default parameters were used in conjunction with Pfam-A searches to identify putative open reading frames (ORFs).


  • Resource Title: Protein predictions for final transcriptome assembly (wheat curl mite).

    File Name: transcriptome.all.cds.pep.fasta.txt

    Resource Description: Protein-coding regions were predicted using TransDecoder. ORFs were identified using default parameters in conjunction with Pfam-A searches.

    Resource Software Recommended: Notepad++,url: https://notepad-plus-plus.org/


  • Resource Title: Protein predictions for final transcriptome assembly (wheat curl mite).

    File Name: transcriptome.all.cds.pep.fasta.txt

    Resource Description: Protein-coding regions were predicted using TransDecoder. ORFs were identified using default parameters in conjunction with Pfam-A searches.

    Resource Software Recommended: TextWrangler,url: https://itunes.apple.com/us/app/textwrangler/id404010395?mt=12


  • Resource Title: Final Trinity transcriptome assembly for wheat curl mite.

    File Name: Trinity.mite.fasta.txt

    Resource Description: Transcripts shorter than 200 nt and transcripts with TPM values less than 0.5 were removed from the assembly. In addition, transcripts whose coding sequences had highest-scoring blastp matches to microbes were removed.


  • Resource Title: Nucleotide coding regions for final transcriptome assembly for wheat curl mite.

    File Name: transcriptome.mite.cds.fasta.txt

    Resource Description: Nucleotide sequences corresponding to coding regions from the final transcriptome assembly for wheat curl mite. Open reading frames (ORFs) were predicted using TransDecoder with default parameters, supplemented by the identification of Pfam-A domains.


  • Resource Title: Trinotate annotations for final Trinity assembly (wheat curl mite).

    File Name: trinotate.mite.xls

    Resource Description: Trinotate results for final wheat curl mite transcriptome assembly. Blastp and blastx searches against Swiss-Prot/UniProt were performed along with Pfam-A searches using HMMER. Signal peptides and transmembrane domains were also identified.

    Resource Software Recommended: Excel,url: https://products.office.com/en-us/excel


  • Resource Title: Trinotate annotations for final Trinity assembly (wheat curl mite).

    File Name: trinotate.mite.xls

    Resource Description: Trinotate results for final wheat curl mite transcriptome assembly. Blastp and blastx searches against Swiss-Prot/UniProt were performed along with Pfam-A searches using HMMER. Signal peptides and transmembrane domains were also identified.

    Resource Software Recommended: LibreOffice Calc,url: https://www.libreoffice.org/discover/calc/
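
As an alternative to opening the Trinotate reports in a spreadsheet program, the annotations can be summarized programmatically. The sketch below assumes the report is tab-delimited text, that unannotated fields are marked with ".", and that it contains columns named transcript_id, sprot_Top_BLASTX_hit, and Pfam; none of these details are confirmed by this record, so check the header of the actual file first. The function name and the toy rows are hypothetical.

```python
import csv
from io import StringIO


def count_annotated(handle, column, missing=(".", "")):
    """Return (total_rows, rows_with_annotation) for one report column.

    Assumption: the report is tab-delimited and uses "." for empty fields.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    total = annotated = 0
    for row in reader:
        total += 1
        value = row.get(column)
        if value is not None and value not in missing:
            annotated += 1
    return total, annotated


# Hypothetical toy report with made-up hits; a real report has more columns.
toy_report = "\n".join([
    "transcript_id\tsprot_Top_BLASTX_hit\tPfam",
    "tx1\tHIT_A\tPF00001",
    "tx2\t.\t.",
    "tx3\t.\tPF00002",
])

blastx_counts = count_annotated(StringIO(toy_report), "sprot_Top_BLASTX_hit")
pfam_counts = count_annotated(StringIO(toy_report), "Pfam")
print("blastx:", blastx_counts)
print("Pfam:", pfam_counts)
```

For a real file, replace the StringIO object with an opened file handle, for example open("trinotate.mite.xls", newline=""), provided the file is in fact plain tab-delimited text rather than a binary spreadsheet.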

Funding

USDA-ARS: 5440-21000-033-00D

USDA-NIFA: 2013-68004-20358

Data contact name

Scully, Erin

Data contact email

Erin.Scully@ars.usda.gov

Publisher

Ag Data Commons

Intended use

The dataset was generated to compare the transcriptional responses of wheat curl mites exposed to wheat streak mosaic virus with those of control mites. The purpose of this study was to gain a better understanding of how wheat streak mosaic virus alters reproduction and development of wheat curl mite.

Use limitations

The transcriptome assembly was generated from pools of mites feeding on virus-infected plants or on control plants that had been inoculated with water. Thus, this assembly contains only genes that are expressed by mites under these conditions and may not represent a complete inventory of the genes encoded by wheat curl mite.

Theme

  • Not specified

ISO Topic Category

  • biota

National Agricultural Library Thesaurus terms

transcriptome; Aceria tosichella; Wheat streak mosaic virus; gene expression; databases; prediction; wheat; open reading frames; unigenes; data collection; transcription (genetics); reproduction; mites; plant diseases and disorders; plant viruses; viruses; inventories; sequence analysis; Eriophyidae

OMB Bureau Code

  • 005:18 - Agricultural Research Service

OMB Program Code

  • 005:040 - National Research

ARS National Program Number

  • 304
  • 301

Pending citation

  • No

Public Access Level

  • Public

Preferred dataset citation

Scully, Erin D.; Gupta, Adarsh K.; Palmer, Nathan A.; Geib, Scott M.; Sarath, Gautam; Hein, Gary L.; Tatineni, Satyanarayana (2018). De novo transcriptome assembly and annotations for wheat curl mite (Aceria tosichella). Ag Data Commons. https://doi.org/10.15482/USDA.ADC/1471685
