Data and code from: Synergistic soil, land use, and climate influences on wind erosion on the Colorado Plateau: Implications for management

[ 2023-03-06 - Superseded by version 2, https://doi.org/10.15482/USDA.ADC/1528719 ]

This dataset includes the code and data needed to recreate the analysis in the manuscript:

Nauman, T. W., Munson, S. M., Dhital, S., Webb, N. P., & Duniway, M. C. (2023). Synergistic soil, land use, and climate influences on wind erosion on the Colorado Plateau: Implications for management. Science of The Total Environment (p. 164605). https://doi.org/10.1016/j.scitotenv.2023.164605.

The dataset includes R statistical code; aeolian monitoring data with associated soil, land use, and climate explanatory data for each site; and a raster map showing areas modeled to have higher sediment transport.

Monitoring Data

Aeolian sediment horizontal mass flux (q, a proxy for potential wind erosion activity) was measured at 81 sites, with samples collected three times per year (Feb-March, June-July, and Oct-Nov). Data for each collection are summarized in the BSNE_Samples_RegrMatrix.* files (.txt is tab delimited, and .rds is a serialized R data file). These tables also include the associated land use descriptions determined from field visits and local land policy, along with values from all spatial datasets for each site. Static maps of topography and soils are simply extracted at each site and attached to every collection taken at that site. Spatial data that vary through time are summarized over the period matching each q collection period (e.g. mean wind speed at the site during that period). Several statistical summaries are used for the time-varying variables; these are documented in the BSNE_Samples_RegrMatrix_ColumnDescriptions.xlsx file.
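As a rough illustration of how a time-varying variable can be summarized over a collection period, the Python sketch below filters dated values to a window and computes a few summaries. It is not the authors' code (the published analysis is in R); the function name, the wind speed values, and the dates are all hypothetical.

```python
from datetime import date
from statistics import mean


def summarize_period(records, start, end):
    """Summarize (date, value) records that fall within [start, end].

    Returns None when no record falls inside the window.
    """
    values = [value for day, value in records if start <= day <= end]
    if not values:
        return None
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "n": len(values),
    }


# Hypothetical wind speed observations (m/s) at a single site.
wind = [
    (date(2018, 3, 10), 4.0),
    (date(2018, 4, 2), 6.0),
    (date(2018, 5, 20), 8.0),
    (date(2018, 7, 15), 3.0),
]

# Hypothetical collection window: only the April and May values fall inside it.
summary = summarize_period(wind, date(2018, 3, 15), date(2018, 6, 30))
print(summary)
```

The summaries actually used for each variable are the ones listed in BSNE_Samples_RegrMatrix_ColumnDescriptions.xlsx; the mean, min, and max here are only placeholders.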


Random Forest Data Reduction

A random forest data reduction strategy was used as the first step in narrowing down potential wind erosion drivers. The merge_rfe_figs.R file includes all steps used to reduce the number of variables considered for final model building, which is done with linear mixed models as described in the next section. Some of the figures in the paper showing relationships between q and explanatory variables are also produced in this script. The dataset also includes the caret recursive feature elimination object created in the script (rf.RFE_flux.rds) and two successive iterations of further pruned random forests created in the script (rf_pruned_flux.rds and rf_pruned2_flux.rds).
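The general shape of a recursive elimination loop can be sketched in a few lines of Python. This is only a conceptual sketch, not the caret workflow in merge_rfe_figs.R: the importance measure here is the absolute Pearson correlation with the response, used as a simple stand-in for random forest variable importance, and the variable names and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def abs_correlation_importance(predictors, response):
    """Stand-in importance: |correlation| of each predictor with the response."""
    return {name: abs(pearson(col, response)) for name, col in predictors.items()}


def eliminate(predictors, response, importance_fn, keep):
    """Repeatedly drop the least important predictor until `keep` remain.

    Importance is recomputed on the remaining predictors at every step,
    which matters for model-based measures such as random forest importance.
    """
    remaining = dict(predictors)
    dropped = []
    while len(remaining) > keep:
        scores = importance_fn(remaining, response)
        weakest = min(scores, key=scores.get)
        dropped.append(weakest)
        del remaining[weakest]
    return list(remaining), dropped


# Hypothetical response (q) and predictors for five collections.
q = [1, 2, 3, 4, 5]
predictors = {
    "wind": [2, 4, 6, 8, 10],
    "cover": [5, 4, 3, 2, 2],
    "noise": [3, 1, 3, 1, 3],
}

kept, dropped = eliminate(predictors, q, abs_correlation_importance, keep=2)
print(kept, dropped)
```

Passing the importance function as an argument keeps the loop independent of the model, so a model-based importance could be substituted without changing `eliminate`.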

Linear Mixed Models

Linear mixed models were trained and ranked using the Akaike information criterion corrected for small sample sizes (AICc). The LMMs_lme_flux.R file documents the training, ranking, and interpretation of the models. The highest-ranking models were interpreted by reporting slope estimates and effect size calculations. Interactions between explanatory variables were visualized using effect plots for the high-ranking models.
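For readers unfamiliar with the small-sample correction, the standard formula is AICc = 2k - 2 ln(L) + 2k(k + 1) / (n - k - 1), where k is the number of estimated parameters, n is the sample size, and L is the maximized likelihood. The Python sketch below applies that formula and ranks candidate models by it. The model names, log-likelihoods, and parameter counts are hypothetical and are not taken from LMMs_lme_flux.R.

```python
def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC for a model with k parameters fit to n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def rank_models(fits, n):
    """Rank models by AICc (lowest first).

    `fits` maps a model name to (log_likelihood, k).
    Returns a list of (name, aicc, delta_aicc) tuples.
    """
    scores = {name: aicc(ll, k, n) for name, (ll, k) in fits.items()}
    best = min(scores.values())
    rows = [(name, score, score - best) for name, score in scores.items()]
    return sorted(rows, key=lambda row: row[1])


# Hypothetical candidate models: (log-likelihood, number of parameters).
fits = {
    "wind_only": (-120.0, 3),
    "wind_x_cover": (-110.0, 5),
}

ranked = rank_models(fits, n=100)
for name, score, delta in ranked:
    print(f"{name}: AICc={score:.2f}, delta={delta:.2f}")
```

The delta column is the difference from the best-scoring model, which is the usual way such rankings are reported.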

Mapping Erosion Potential

After assessing model controls in the previous two sections, it was concluded that much of the variation in q could be represented by the spatial data sources collected for the study alone. A random forest model was therefore built using only the important spatial variables, so that predictions could be rendered for every 30-meter pixel in the study region. The rf_mapping_andFigs.R file documents the process of building the spatial model, rendering prediction maps, tabulating variable importance for the model, and plotting partial dependence plots to interpret model relationships. Also included from this script are the caret recursive feature elimination object (srf.RFE_flux.rds) and the final pruned random forest model object (srf.pruned_flux.rds) used to predict q. Raster layers for each explanatory variable are provided for the summer 2018 collection used for making the map. They are available in the finallayers_sum18.zip file, with each raster filename matching the column names documented in BSNE_Samples_RegrMatrix_ColumnDescriptions.xlsx.
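Because the raster filenames follow the column names, a quick consistency check before prediction is to pair each model predictor with its layer file and list any that are missing. The Python sketch below does this by comparing names to filenames without their extension, which is an assumption about how the match works; the paths and predictor names shown are hypothetical.

```python
from pathlib import Path


def match_layers(raster_paths, predictor_names):
    """Pair predictor names with raster files whose stem (name without extension) matches.

    Returns (matched, missing): a dict of name -> path, and a list of
    predictor names with no corresponding raster.
    """
    by_stem = {Path(path).stem: path for path in raster_paths}
    matched = {name: by_stem[name] for name in predictor_names if name in by_stem}
    missing = [name for name in predictor_names if name not in by_stem]
    return matched, missing


# Hypothetical layer files and model predictors.
rasters = ["layers/wind_mean.tif", "layers/clay_pct.tif"]
predictors = ["wind_mean", "clay_pct", "ndvi_max"]

matched, missing = match_layers(rasters, predictors)
print(matched)
print(missing)
```

An empty `missing` list would indicate that every predictor has a layer available for the prediction step.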

Erosion Prediction Map Data

100cm_flux_sum18.* : GeoTIFF file of predicted q values across the study region.

flux_map.qgz : QGIS project file with pre-formatted visualization of the predicted q values.

Release Date
Not Planned
Spatial / Geographical Coverage Area
POLYGON ((-111.4453125 40.787820187396, -109.248046875 40.647303562523, -107.1826171875 40.279525668813, -107.40234375 38.788345355086, -107.841796875 34.89381606312, -112.32421875 35.396886504016, -111.62109375 39.035186251066, -111.62109375 39.035186251066))
Ag Data Commons
Spatial / Geographical Coverage Location
Colorado Plateau
Temporal Coverage
July 1, 2017 to November 30, 2020
Contact Name
Nauman, Travis
Contact Email
Public Access Level
Program Code
005:040 - Department of Agriculture - National Research
Bureau Code
005:53 - Natural Resources Conservation Service