The Classification Performance and Mechanism of Machine Learning Algorithms in Winter Wheat Mapping Using Sentinel-2 10 m Resolution Imagery
- Source :
- Applied Sciences, Vol 10, Iss 5075, p 5075 (2020), Applied Sciences, Volume 10, Issue 15
- Publication Year :
- 2020
- Publisher :
- MDPI AG, 2020.
Abstract
- Machine learning algorithms are crucial for crop identification and mapping. However, many studies focus only on the identification results of these algorithms and pay less attention to their classification performance and mechanism. In this paper, Sentinel-2 10 m resolution images covering a specific phenological period of winter wheat were obtained through Google Earth Engine (GEE). Then, support vector machine (SVM), random forest (RF), and classification and regression tree (CART) machine learning algorithms were employed to identify and map winter wheat over a large-scale area. The hyperparameters of the three algorithms were tuned by grid search with 5-fold cross-validation. The classification performance of the three algorithms was compared, and the results demonstrate that SVM achieves the best performance in identifying winter wheat: its overall accuracy (OA), user's accuracy (UA), producer's accuracy (PA), and kappa coefficient (Kappa) are 0.94, 0.95, 0.95, and 0.92, respectively. Moreover, 50 different combinations of training and validation sets were used to analyze the generalization ability of the algorithms, and the results show that the average OA of SVM, RF, and CART is 0.93, 0.92, and 0.88, respectively, indicating that SVM and RF are more robust than CART. To further explore the sensitivity of SVM, RF, and CART to variations in their parameters, namely (C and gamma), (tree and split), and (maxD and minSP), we employed the grid search method to iterate over these parameters and analyzed their effect on the accuracy scores and classification residuals. When (C and gamma) vary within (0.01~1000), SVM's maximum variation in accuracy score is up to 0.63, and the maximum variation in residuals is 76,215 km².
We concluded that SVM is sensitive to the parameters (C and gamma) and exhibits a positive correlation with them. When the parameters (tree and split) vary within (100~600) and (1~6), respectively, RF's maximum variation in accuracy score is 0.08, and the maximum variation in residuals is 1157 km², indicating that RF has low sensitivity to the parameters (tree and split). When CART's parameters (maxD and minSP) vary within (10~60), the maximum variation in accuracy score is 0.06, and the maximum variation in residuals is 6943 km². Therefore, compared to RF, CART is sensitive to the parameters (maxD and minSP) and has poor robustness. In general, with tuned hyperparameters, SVM and RF exhibit the best classification performance, while CART performs relatively worse. Meanwhile, SVM, RF, and CART differ in their sensitivity to the algorithm parameters; that is, SVM and CART are more sensitive to the algorithm parameters, while RF has low sensitivity to changes in them. Different parameter values cause large changes in the accuracy scores and residuals, so it is necessary to determine the algorithm hyperparameters carefully. Default parameters can generally be used to achieve crop classification, but if the best classification result is expected, we recommend an enumeration method such as grid search as a practical way to improve the classification performance of the algorithm.
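The four accuracy measures named in the abstract (OA, UA, PA, and Kappa) are all derived from a confusion matrix. The following is a minimal sketch of their standard definitions, not the authors' code. The function name `accuracy_metrics`, the row/column convention (rows are reference classes, columns are mapped classes), and the sample counts are assumptions made purely for illustration and are not taken from the paper.

```python
def accuracy_metrics(cm, cls):
    """Return (OA, UA, PA, Kappa) for class index `cls`.

    Assumed convention: cm[i][j] is the number of validation samples
    whose reference class is i and whose mapped (predicted) class is j.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)                      # total samples
    correct = sum(cm[i][i] for i in range(k))            # diagonal sum
    ref_tot = [sum(cm[i]) for i in range(k)]             # row totals
    map_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # column totals

    oa = correct / n                                     # overall accuracy
    ua = cm[cls][cls] / map_tot[cls]                     # user's accuracy (1 - commission error)
    pa = cm[cls][cls] / ref_tot[cls]                     # producer's accuracy (1 - omission error)

    # Expected chance agreement, then Cohen's kappa.
    pe = sum(ref_tot[i] * map_tot[i] for i in range(k)) / (n * n)
    kappa = (oa - pe) / (1 - pe)
    return oa, ua, pa, kappa


# Hypothetical 2-class matrix: index 0 = winter wheat, index 1 = other.
example_cm = [
    [40, 10],
    [5, 45],
]
oa, ua, pa, kappa = accuracy_metrics(example_cm, cls=0)
print(round(oa, 3), round(ua, 3), round(pa, 3), round(kappa, 3))
```

With this convention, UA answers "of the pixels mapped as winter wheat, what fraction really is winter wheat", while PA answers "of the reference winter wheat pixels, what fraction was mapped correctly". Kappa discounts the agreement that would be expected by chance from the row and column totals.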
- Subjects :
- 010504 meteorology & atmospheric sciences
0211 other engineering and technologies
Decision tree
02 engineering and technology
Machine learning
computer.software_genre
01 natural sciences
lcsh:Technology
lcsh:Chemistry
winter wheat mapping
Robustness (computer science)
machine learning algorithms
General Materials Science
Sensitivity (control systems)
Instrumentation
lcsh:QH301-705.5
021101 geological & geomatics engineering
0105 earth and related environmental sciences
Mathematics
Fluid Flow and Transfer Processes
Hyperparameter
business.industry
lcsh:T
Process Chemistry and Technology
General Engineering
lcsh:QC1-999
Computer Science Applications
Random forest
Support vector machine
Tree (data structure)
lcsh:Biology (General)
lcsh:QD1-999
lcsh:TA1-2040
large-scale
Hyperparameter optimization
classification performance
Artificial intelligence
Sentinel-2
business
lcsh:Engineering (General). Civil engineering (General)
computer
Algorithm
lcsh:Physics
Details
- Language :
- English
- ISSN :
- 20763417
- Volume :
- 10
- Issue :
- 5075
- Database :
- OpenAIRE
- Journal :
- Applied Sciences
- Accession number :
- edsair.doi.dedup.....3cccd4c6c544caa584fdf47612cc1227