Ag Data Commons

Agricultural Research Word Vectors

Download all (4.8 GB)
model
posted on 2024-02-15, 20:16 authored by Russell Brown, Siyi Huang, Cynthia Parr

This model was originally trained for a recommendation system in the Ag Data Commons that will automatically link viewers of one dataset to other directly relevant datasets and research papers that may interest them. It was also used to identify similarities and differences between projects within ARS’ National Programs and to build a visualization layer that lets leaders explore and manage their programs easily.

This model was generated with the Word2Vec algorithm, starting from a set of word vectors trained on Google News articles and further training them on the titles and abstracts from PubAg and the titles and descriptions from Ag Data Commons. Training used a vector length of 300 and the Continuous Bag of Words (CBOW) variant of the algorithm with negative sampling.
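
As a rough illustration of what "Continuous Bag of Words with negative sampling" means, the sketch below performs single update steps in plain Python. It is not the authors' training code: the vocabulary size, the 8-dimensional vectors (the published model uses 300), the learning rate, the word ids, and the function name cbow_neg_step are all assumptions chosen to keep the example small.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cbow_neg_step(w_in, w_out, context_ids, target_id, negative_ids, lr=0.05):
    """One CBOW update with negative sampling (hypothetical helper).

    Returns the model's probability for the true target word as it was
    before this update. The caller must keep target_id out of negative_ids.
    """
    dim = len(w_in[0])
    # Hidden layer: mean of the context words' input vectors.
    h = [sum(w_in[c][d] for c in context_ids) / len(context_ids) for d in range(dim)]
    p_target = sigmoid(dot(h, w_out[target_id]))
    err = [0.0] * dim  # gradient to pass back to the context vectors
    # The true target gets label 1; each sampled "noise" word gets label 0.
    for word_id, label in [(target_id, 1.0)] + [(n, 0.0) for n in negative_ids]:
        g = lr * (label - sigmoid(dot(h, w_out[word_id])))
        for d in range(dim):
            err[d] += g * w_out[word_id][d]
            w_out[word_id][d] += g * h[d]
    # Every context word receives the same accumulated correction.
    for c in context_ids:
        for d in range(dim):
            w_in[c][d] += err[d]
    return p_target

# Toy setup: 6 word ids, 8 dimensions, output weights initialised to zero.
random.seed(0)
DIM, VOCAB = 8, 6
w_in = [[random.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(VOCAB)]
w_out = [[0.0] * DIM for _ in range(VOCAB)]

context_ids, target_id = [0, 2], 1
history = []
for _ in range(50):
    negatives = random.sample([3, 4, 5], 2)  # two noise words per step
    history.append(cbow_neg_step(w_in, w_out, context_ids, target_id, negatives, lr=0.1))

print(f"P(target | context): first step {history[0]:.3f}, last step {history[-1]:.3f}")
```

Repeating the step on the same context raises the probability assigned to the true target word and lowers it for the sampled noise words. That is the training signal; a full Word2Vec implementation adds corpus iteration, frequency-based noise sampling, and learning-rate decay.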

This word vector model could be used for any natural language processing (NLP) application involving text with a large amount of agricultural research vocabulary.
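
One common way to apply word vectors to a task such as suggesting related datasets is to average the vectors of the words in each title or description and rank candidates by cosine similarity. The sketch below shows that idea under stated assumptions: the three-dimensional toy vectors, the document titles, and the helper names are invented for illustration, and this is not necessarily the method used in the Ag Data Commons recommendation system.

```python
import math

def mean_vector(tokens, vectors):
    """Average the vectors of the tokens found in the vocabulary; None if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[d] for v in known) / len(known) for d in range(dim)]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)

def rank_related(query_tokens, documents, vectors):
    """Return (similarity, title) pairs, most similar first."""
    query = mean_vector(query_tokens, vectors)
    if query is None:
        return []
    scored = []
    for title, tokens in documents.items():
        doc = mean_vector(tokens, vectors)
        if doc is not None:
            scored.append((cosine(query, doc), title))
    return sorted(scored, reverse=True)

# Hypothetical toy vectors; the published model has 300 dimensions per word.
toy_vectors = {
    "soil": [1.0, 0.0, 0.0],
    "nitrogen": [0.9, 0.1, 0.0],
    "cattle": [0.0, 1.0, 0.0],
    "dairy": [0.1, 0.9, 0.0],
}

# Hypothetical dataset titles with pre-tokenized text.
documents = {
    "Soil nitrogen survey": ["soil", "nitrogen"],
    "Dairy herd records": ["dairy", "cattle"],
}

# "fertilizer" is not in the toy vocabulary and is skipped.
ranked = rank_related(["nitrogen", "fertilizer"], documents, toy_vectors)
for score, title in ranked:
    print(f"{score:.3f}  {title}")
```

Out-of-vocabulary words are skipped here instead of raising an error, which matters for specialized text where some terms may be missing even from a domain-trained vocabulary.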


Resources in this dataset:

  • Resource Title: Agricultural Word Vectors.

    File Name: AgWordVectors-300.zip

    Resource Description: Word vectors trained on the full titles/abstracts in PubAg and titles/abstracts in Ag Data Commons. (Part A)


  • Resource Title: Agricultural Word Vectors Trainables.

    File Name: AgWordVectors-300.model.trainables.syn1neg.zip

    Resource Description: Word vectors trained on the full titles/abstracts in PubAg and titles/abstracts in Ag Data Commons. (Part B)


  • Resource Title: Agricultural Word Vector Model.

    File Name: AgWordVectors-300.model.wv_.vectors.zip

    Resource Description: Word vectors trained on the full titles/abstracts in PubAg and titles/abstracts in Ag Data Commons. (Part C)

Funding

USDA-ARS

Data contact name

Parr, Cynthia

Data contact email

cynthia.parr@usda.gov

Publisher

Ag Data Commons

Intended use

These word vectors can be used for NLP applications involving text with a large amount of agricultural vocabulary.

Theme

  • Not specified

ISO Topic Category

  • biota
  • environment
  • farming

National Agricultural Library Thesaurus terms

models; learning; artificial intelligence; computer simulation

OMB Bureau Code

  • 005:18 - Agricultural Research Service

OMB Program Code

  • 005:040 - National Research

Pending citation

  • No

Public Access Level

  • Public

Preferred dataset citation

Brown, Russell; Huang, Siyi; Parr, Cynthia (2019). Agricultural Research Word Vectors. Ag Data Commons. https://doi.org/10.15482/USDA.ADC/1506066
