Video tutorial: Data Dictionaries on the Ag Data Commons
Data dictionaries provide detailed information about the contents of a dataset or database, such as the names of measured variables, their data types or formats, and text descriptions. A data dictionary is a concise guide to understanding and using the data. Ideally, all Ag Data Commons (ADC) records for datasets and databases should include or point to a data dictionary. Data dictionaries should preferably be machine-readable, in CSV format.
If your data are managed in a standard relational database, you will likely be able to generate a data dictionary through your database software. The result is a consistently formatted document that contains what others need to understand your data. See the following section for more information.
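As an illustration of generating a data dictionary from a database, here is a minimal sketch that uses Python's built-in sqlite3 module. It assumes a SQLite database; other database systems expose the same kind of metadata through their own catalogs or tools. The `plots` table, the function names, and the output column names (`table`, `variable`, `data_type`, and so on) are hypothetical and are not an ADC template. Descriptions still have to be written by hand.

```python
import csv
import io
import sqlite3

# Hypothetical output columns, chosen for illustration only.
FIELDS = ["table", "variable", "data_type", "required", "primary_key", "description"]


def sqlite_data_dictionary(conn):
    """Return one dict per column for every user table in a SQLite database."""
    rows = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # Quote the table name so unusual names do not break the statement.
        quoted = '"' + table.replace('"', '""') + '"'
        # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
        for _cid, name, col_type, notnull, _default, pk in conn.execute(
            f"PRAGMA table_info({quoted})"
        ):
            rows.append({
                "table": table,
                "variable": name,
                "data_type": col_type,
                "required": "yes" if notnull else "no",
                "primary_key": "yes" if pk else "no",
                "description": "",  # left blank, to be filled in by hand
            })
    return rows


def to_csv(rows):
    """Serialize the dictionary rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Example with a small in-memory database (hypothetical table).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE plots ("
    "plot_id INTEGER PRIMARY KEY, crop TEXT NOT NULL, yield_kg_ha REAL)"
)
dictionary_rows = sqlite_data_dictionary(conn)
dictionary_csv = to_csv(dictionary_rows)
print(dictionary_csv)
```

The CSV text can be written to a file and completed with a description for each variable before it is attached to a record.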
If your data are managed in spreadsheets, text files, or comma-separated values (CSV) files, you will need to prepare a data dictionary manually. To support machine-readability, we recommend preparing your data dictionary as a spreadsheet. If you prefer to prepare it as a .doc or .pdf, we recommend embedding a data dictionary table in your document that can be easily extracted. A data dictionary template and examples can be found toward the end of this document.
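Manual preparation can start from a generated skeleton. The sketch below reads CSV text, lists each column, and makes a rough guess at its type, leaving the description blank for you to complete. The sample data, the function names, and the output column names are hypothetical and are not the ADC template. The type guess is only a starting point and should be reviewed by hand.

```python
import csv
import io


def guess_type(values):
    """Roughly classify a column as integer, decimal, text, or unknown."""
    non_empty = [v for v in values if v.strip() != ""]
    if not non_empty:
        return "unknown"
    for label, cast in (("integer", int), ("decimal", float)):
        try:
            for value in non_empty:
                cast(value)
            return label
        except ValueError:
            continue
    return "text"


def draft_dictionary(csv_text):
    """Build one draft data dictionary row per column of the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = list(reader)
    draft = []
    for column in reader.fieldnames or []:
        # Short rows give None for missing cells, so treat those as empty.
        values = [(record[column] or "") for record in records]
        draft.append({
            "variable": column,
            "data_type": guess_type(values),
            "missing_values": sum(1 for v in values if v.strip() == ""),
            "description": "",  # left blank, to be written by hand
        })
    return draft


# Hypothetical sample data for illustration.
sample = (
    "site,date,yield_kg_ha,replicate\n"
    "A,2020-05-01,1520.5,1\n"
    "B,2020-05-02,,2\n"
)
draft = draft_dictionary(sample)
for row in draft:
    print(row)
```

Dates are reported as text here because the sketch only tests for numbers. Record the actual date format in the description or in a dedicated format column.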
- If your data are stored in a relational database, the database software may be able to generate a data dictionary for you.
- If your data are stored in a spreadsheet, you will need to create a data dictionary manually.
The following are recommended guidelines for data dictionaries, not requirements. These guidelines are subject to change as best practices evolve.