Crop Yield Prediction Using Supervised Machine Learning Algorithms

Authors

  • Fouad Sleiby, Palestine Ahliya University (Palestine)
  • Mutaz Rasmi Abu Sara, Faculty of Engineering and Information Technology, Palestine Ahliya University (Palestine)
  • Mohammad Shakarnah, Palestine Ahliya University (Palestine)
  • Emma Qumsiyeh, Faculty of Engineering and Information Technology, Palestine Ahliya University (Palestine)
  • Murad Zeer, Faculty of Engineering and Information Technology, Palestine Ahliya University (Palestine)

DOI:

https://doi.org/10.59994/ajbtme.2025.1.43

Keywords:

Crop Yield Prediction, Supervised Machine Learning, Data Preprocessing

Abstract

This paper evaluates several supervised machine learning models for predicting crop yield from soil and environmental parameters. The dataset was obtained from Kaggle, where it was released as part of the Samrudha hackathon project, and is intended to support AI-based applications for smart and sustainable farming. We applied rigorous data preprocessing, including outlier removal and the handling of duplicate and missing data. Several regression algorithms were trained and tested: K-Nearest Neighbors (KNN), Linear Regression, Ridge and Lasso Regression, Support Vector Regression (SVR), Decision Trees, and Random Forests. The coefficient of determination (R²) and Mean Squared Error (MSE) were used as performance measures. The Random Forest Regressor performed best among the models tested, with a test R² of 0.9394 and a test MSE of 4.0840. This result indicates the robustness and generalization capability of ensemble approaches for agricultural yield forecasting. The originality of this study lies in its systematic and rigorous comparison of multiple supervised machine learning models for crop yield prediction using carefully preprocessed soil and environmental data. It further contributes by demonstrating the superior generalization capability of ensemble methods, particularly Random Forests, in supporting accurate and sustainable smart farming applications.
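The two performance measures named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions of MSE and R², not the authors' code, and the yield values below are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of the squared differences between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical observed and predicted yields (illustration only, not from the paper).
y_true = [10.0, 12.0, 14.0, 16.0]
y_pred = [11.0, 12.0, 13.0, 16.0]

mse = mean_squared_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)

# A baseline that always predicts the mean of the observed values scores R^2 = 0.
baseline = [sum(y_true) / len(y_true)] * len(y_true)
baseline_r2 = r2_score(y_true, baseline)

print(f"MSE = {mse:.4f}, R^2 = {r2:.4f}, baseline R^2 = {baseline_r2:.4f}")
```

Read this way, a test R² close to 1 means the model explains most of the variance in yield relative to a predict-the-mean baseline, while MSE is expressed in squared yield units and so depends on the scale of the target.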

Published

2025-05-31

How to Cite

Sleiby, F., Abu Sara, M. R., Shakarnah, M., Qumsiyeh, E., & Zeer, M. (2025). Crop Yield Prediction Using Supervised Machine Learning Algorithms. Ahliya Journal of Business Technology and MEAN Economies, 2(1), 43–52. https://doi.org/10.59994/ajbtme.2025.1.43

Section

Articles