1. Machine Learning-Based Crop Yield Prediction in South India: Performance Analysis of Various Models.
- Author
- Nikhil, Uppugunduri Vijay, Pandiyan, Athiya M., Raja, S. P., and Stamenkovic, Zoran
- Subjects
- CROP yields, CROPS, AGRICULTURAL productivity, FARM produce, AGRICULTURE, RICE quality, CROP quality
- Abstract
- Agriculture is one of the most important human activities, producing the crops and food that are crucial for human sustenance. Today, agricultural products and crops not only meet local demand: globalization also allows produce to be exported to and imported from other countries. India is an agricultural nation and depends heavily on its agricultural activities. Predicting crop production and yield is a necessary activity that allows farmers to estimate storage, optimize resources, increase efficiency and decrease costs. However, farmers usually make these predictions from the region, soil, weather conditions and the crop itself, relying on experience and estimates that may not be very accurate, especially given the constantly changing and unpredictable climatic conditions of the present day. To solve this problem, we aim to predict the production and yield of various crops such as rice, sorghum, cotton, sugarcane and rabi using Machine Learning (ML) models. We train these models on weather, soil and crop data to predict the future production and yield of these crops. We have compiled a dataset of attributes that impact crop production and yield from specific states in India and performed a comprehensive study of the performance of various ML regression models in predicting crop production and yield. The results indicate that the Extra Trees Regressor achieved the highest performance among the models examined. It attained an R-squared score of 0.9615 and showed the lowest Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE), 21.06 and 33.99 respectively. Following closely behind are the Random Forest Regressor and the LGBM Regressor, with R-squared scores of 0.9437 and 0.9398 respectively. Moreover, additional analysis revealed that tree-based models, with an R-squared score of 0.9353, perform better than linear and neighbors-based models, which achieved R-squared scores of 0.8568 and 0.9002 respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
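The abstract compares models by R-squared, MAE and RMSE. As a reading aid, the sketch below shows how these three metrics are conventionally computed. It is a minimal illustration and not the authors' code: the function names and the `observed`/`predicted` values are hypothetical and are not taken from the paper's dataset.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the mean squared error."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical yields, invented purely for illustration.
observed = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]

print("MAE: ", mae(observed, predicted))
print("RMSE:", rmse(observed, predicted))
print("R^2: ", r_squared(observed, predicted))
```

MAE and RMSE are expressed in the units of the predicted quantity, so lower is better, while an R-squared closer to 1 means the model explains more of the variance in the observed values.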