Introduction
Farmers must optimize crop yield in modern agriculture to ensure profitability and sustainability. Assessing soil conditions, including nitrogen, phosphorus, and potassium levels and pH, is crucial in determining the ideal crop for a specific field.
However, this process can be costly and time-consuming, prompting farmers to prioritize specific soil metrics based on budget constraints. To address this challenge, a machine learning approach can be used to develop models that accurately predict the optimal crop from soil characteristics.
This project is submitted to the University of Essex as part of the MA336 (Artificial Intelligence and Machine Learning with Applications) coursework.
Problem Statement
This project aims to build and compare multiple classification algorithms that predict crop types from soil metrics, providing farmers with an efficient decision-making tool for crop selection.
Objectives:
- Build and compare five (5) classification algorithms and select the best-performing model
- Using the identified best model, build an application that helps farmers make effective and efficient crop selection decisions
Methods:
- Data Collection and Preparation: The dataset, obtained from DataCamp (https://www.datacamp.com/) and containing the soil metrics and crop types, was loaded and read.
- Data Preprocessing: Missing values were checked, and categorical variables were encoded using Label Encoder.
- Exploratory Data Analysis (EDA): Exploratory data analysis was performed to identify relevant patterns, features, correlations, and their importance in predicting crop types.
- Model Selection: Various classification algorithms, such as Logistic Regression, Random Forests, Decision Trees, Gradient Boosting, and Support Vector Machines (SVM), which are suitable for multi-class classification tasks, were used.
- Model Training and Evaluation: Each algorithm was trained on the training data, and its performance was evaluated using metrics such as accuracy, precision, recall, and F1 score.
- Hyperparameter Tuning: To optimize the models' performance, the parameters of the selected algorithms were fine-tuned using the GridSearch method.
- Model Comparison: The performance of the different algorithms was compared, and the best-performing model for predicting crop types based on the soil metrics in the dataset was identified.
- Build a Crop-Predicting App: Finally, a crop-predicting app was built using the best-performing model.
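
The steps listed under Methods can be sketched end to end with scikit-learn. This is a minimal illustration under stated assumptions, not the project's actual code: the data is synthetic, the column names (`N`, `P`, `K`, `ph`, `crop`), the three crop labels, the parameter grid, and the `recommend_crop` helper are all hypothetical, and macro averaging is one reasonable choice for multi-class precision, recall, and F1.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

FEATURES = ["N", "P", "K", "ph"]  # hypothetical column names

# Synthetic stand-in for the soil dataset (values are made up for illustration).
rng = np.random.default_rng(0)
centers = {"rice": [80, 45, 40, 6.5], "maize": [75, 50, 20, 6.2], "coffee": [100, 30, 30, 6.8]}
frames = []
for crop, center in centers.items():
    values = rng.normal(loc=center, scale=[8, 8, 8, 0.4], size=(60, 4))
    frame = pd.DataFrame(values, columns=FEATURES)
    frame["crop"] = crop
    frames.append(frame)
df = pd.concat(frames, ignore_index=True)

# Preprocessing: check for missing values and label-encode the target.
missing_total = int(df.isna().sum().sum())
encoder = LabelEncoder()
y = encoder.fit_transform(df["crop"])
X = df[FEATURES]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Model selection: the five classifier families named in the Methods list.
models = {
    "Logistic Regression": LogisticRegression(max_iter=5000),
    "Random Forest": RandomForestClassifier(random_state=42),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
    "SVM": SVC(),
}

def evaluate(model, X_eval, y_eval):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    pred = model.predict(X_eval)
    return {
        "accuracy": accuracy_score(y_eval, pred),
        "precision": precision_score(y_eval, pred, average="macro", zero_division=0),
        "recall": recall_score(y_eval, pred, average="macro", zero_division=0),
        "f1": f1_score(y_eval, pred, average="macro", zero_division=0),
    }

# Training and evaluation of each model on the same split.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = evaluate(model, X_test, y_test)

# Hyperparameter tuning: grid search on one model, with an assumed small grid.
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 5]},
    cv=3,
    scoring="f1_macro",
)
search.fit(X_train, y_train)

# Model comparison: rank by macro F1 and keep the top model.
comparison = pd.DataFrame(results).T.sort_values("f1", ascending=False)
best_name = comparison.index[0]
best_model = models[best_name]

def recommend_crop(n, p, k, ph):
    """Hypothetical helper an app could call: map soil readings to a crop name."""
    sample = pd.DataFrame([[n, p, k, ph]], columns=FEATURES)
    return encoder.inverse_transform(best_model.predict(sample))[0]

print(comparison.round(3))
print("Best model:", best_name, "| tuned RF params:", search.best_params_)
```

Holding the train/test split fixed across all five models keeps the comparison fair, and stratifying the split preserves the class balance in both partitions. In a fuller version, each model would typically get its own parameter grid rather than tuning only one.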