Nearest Neighbor Soil Water Retention Estimator

The k-nearest neighbor (k-NN) technique is a non-parametric method that can be used to predict discrete (class-type) as well as continuous variables. The k-NN technique and many of its derivatives belong to the group of "lazy learning" algorithms. The technique is lazy in that it passively stores the development data set until the time of application; all calculations are performed only when estimates need to be generated.
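
As an illustration of the lazy behavior described above, here is a minimal Python sketch of a generic k-NN estimator. It is not the code of the tool described on this page; the class name `LazyKNN`, the Euclidean distance measure, and the unweighted averaging are assumptions chosen for simplicity.

```python
import math
from collections import Counter


class LazyKNN:
    """Generic k-NN sketch (hypothetical class, not the tool's implementation)."""

    def __init__(self, k=3):
        self.k = k
        self.X = []
        self.y = []

    def fit(self, X, y):
        # "Lazy" step: only store the development data; no model is built here.
        self.X = [tuple(row) for row in X]
        self.y = list(y)
        return self

    def _neighbor_targets(self, query):
        # All distance calculations happen at estimation time.
        # Assumption: plain Euclidean distance on unscaled inputs.
        ranked = sorted(zip(self.X, self.y),
                        key=lambda pair: math.dist(pair[0], query))
        return [target for _, target in ranked[:self.k]]

    def predict_value(self, query):
        # Continuous variable: unweighted mean of the k nearest targets.
        targets = self._neighbor_targets(query)
        return sum(targets) / len(targets)

    def predict_class(self, query):
        # Discrete variable: majority vote among the k nearest targets.
        targets = self._neighbor_targets(query)
        return Counter(targets).most_common(1)[0][0]


# Toy data, invented for illustration only.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
reg = LazyKNN(k=3).fit(points, [1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
clf = LazyKNN(k=3).fit(points, ["a", "a", "a", "b", "b", "b"])
```

Because `fit` only stores the data, new records can be added at any time without re-deriving anything; the cost is paid when `predict_value` or `predict_class` is called.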

The ability of soil to retain and to transmit water has to be known for many engineering, meteorological, agronomic, and hydrological applications. Measurements of soil hydraulic properties are costly and impractical for large-scale projects. For such projects, soil hydraulic properties are estimated from publicly available basic soil data using statistical regression. Such regression equations have been developed in various regions of the world. A serious problem with these equations is that their applicability is unpredictable. We have shown empirically that information on soil properties can be used to predict hydraulic properties reasonably well. The nearest-neighbor algorithm appears well suited to evaluating the similarity between soils for this purpose. This algorithm has been coded in simple software to estimate the water contents that are commonly associated with the ability of soil to hold water and with the soil dryness that causes plants to wilt. A substantial advantage of our software is that available local information on soil hydraulic properties can be easily incorporated in the similarity evaluation. The more that is known about local soil hydraulic properties, the better the estimates that can be obtained.

Technical Abstract: A computer tool has been developed that uses a k-Nearest Neighbor (k-NN) lazy learning algorithm to estimate soil water retention at –33 and –1500 kPa matric potentials, together with the uncertainty of those estimates. The user can customize the provided source data collection to accommodate specific local needs. Because calculations are made ad hoc, this technique is a competitive alternative to published pedotransfer equations: such equations do not need to be re-developed when new data become available.
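
The workflow in the abstract can be sketched roughly as follows. This is a hedged illustration, not the tool's algorithm: the predictor variables (sand and clay percentages), every number in `REFERENCE`, the function name `estimate_retention`, and the use of the spread among neighbors as an uncertainty indicator are all assumptions made for this example.

```python
import math
import statistics

# Hypothetical reference records, invented for illustration only:
# (sand %, clay %) -> (water content at -33 kPa, water content at -1500 kPa)
REFERENCE = [
    ((80.0, 5.0), (0.12, 0.05)),
    ((75.0, 8.0), (0.14, 0.06)),
    ((40.0, 20.0), (0.27, 0.12)),
    ((35.0, 25.0), (0.29, 0.14)),
    ((15.0, 45.0), (0.38, 0.24)),
    ((10.0, 50.0), (0.40, 0.26)),
]


def estimate_retention(query, data, k=3):
    """Return {potential: (mean, spread)} from the k most similar records.

    Assumption: similarity is plain Euclidean distance, and the population
    standard deviation among neighbors stands in for the uncertainty.
    """
    nearest = sorted(data, key=lambda rec: math.dist(rec[0], query))[:k]
    result = {}
    for idx, label in enumerate(("-33 kPa", "-1500 kPa")):
        values = [rec[1][idx] for rec in nearest]
        result[label] = (statistics.mean(values), statistics.pstdev(values))
    return result


# Customizing the data collection: local measurements are simply appended.
# No equation has to be re-developed, since estimation happens at query time.
local_records = [((78.0, 6.0), (0.13, 0.055))]
customized = REFERENCE + local_records

estimate = estimate_retention((79.0, 6.0), customized, k=3)
```

In this sketch the appended local record becomes one of the three nearest neighbors of the query soil, so it influences the estimate immediately, which mirrors the point that new data can be used without refitting anything.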

Release Date
United States Department of Agriculture
Contact Name
Pachepsky, Yakov
Contact Email
Public Access Level
Program Code
005:040 - Department of Agriculture - National Research
Bureau Code
005:18 - Agricultural Research Service