The statistical analysis of training data representativeness for artificial neural networks: spatial distribution modelling of heavy metals in topsoil.

Authors :: Sergeev, Aleksandr
Baglaeva, Elena
Shichkin, Andrey
Buevich, Alexander
Source :: Earth Science Informatics. Aug2024, Vol. 17 Issue 4, p3493-3509. 17p.
Publication Year :: 2024
Abstract: A four-step algorithm for dividing sampling points is presented that selects a representative training subset for artificial neural networks modelling spatial distributions. Chromium and manganese contents in the topsoil of the cities of Tarko-Sale and Noyabrsk (Russian subarctic zone) were used as raw data. The spatial distributions of element content in the topsoil layer were modelled using a multilayer perceptron (MLP) with sigmoid and hyperbolic tangent activation functions. The root mean squared error (RMSE) was calculated for each element and area. The MLP with the hyperbolic tangent activation function was more accurate for both subarctic cities and model areas: for Noyabrsk it was about 10% better, and for Tarko-Sale the improvement in RMSE was around 200%. We have identified three classes of points: «elite», «middle», and «useless». Taking these classes into account when dividing the raw set is expected to increase the accuracy of the models. [ABSTRACT FROM AUTHOR]
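
The RMSE comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the percentage improvement is computed; the function `relative_improvement` below is a hypothetical helper that assumes the difference is expressed relative to the better model's RMSE, which is one reading that permits values above 100%. All numbers are made up for illustration and are not data from the paper.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(rmse_baseline, rmse_candidate):
    """Percent by which the baseline RMSE exceeds the candidate RMSE.

    Assumed definition (not taken from the paper): the difference is
    divided by the candidate (better) model's RMSE.
    """
    return 100.0 * (rmse_baseline - rmse_candidate) / rmse_candidate


# Illustrative values only, not measurements from the study.
observed = [10.0, 12.0, 14.0, 16.0]
pred_sigmoid = [13.0, 9.0, 17.0, 13.0]  # every prediction is off by 3
pred_tanh = [11.0, 11.0, 15.0, 15.0]    # every prediction is off by 1

rmse_sigmoid = rmse(observed, pred_sigmoid)
rmse_tanh = rmse(observed, pred_tanh)
improvement = relative_improvement(rmse_sigmoid, rmse_tanh)
print(rmse_sigmoid, rmse_tanh, improvement)
```

Under this assumed definition, an RMSE three times larger for the sigmoid model corresponds to an improvement of 200% for the hyperbolic tangent model.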

Subjects :: *ARTIFICIAL neural networks
*URBAN renewal
*CITIES & towns
*HEAVY metals
*TOPSOIL
*MULTILAYER perceptrons
