Implementing a Logistic Regression Crop Classification Model Using scikit‑learn
A multi‑class classification model that suggests the most suitable crop to sow in a field based on soil and environmental conditions.

| Item | Details |
|---|---|
| **Dataset** | Crop Recommendation Dataset (Kaggle) |
| **Samples** | 2,200 rows × 7 features |
| **Target** | 22 crop labels (multinomial) |
| **Model** | Logistic Regression (multinomial) |


| Column | Description |
|---|---|
| N, P, K | Soil macro‑nutrient values (kg/ha) |
| temperature | Temperature (°C) |
| humidity | Humidity (%) |
| ph | Soil pH |
| rainfall | Rainfall (mm) |
| label (target) | Crop name (22 classes) |

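
As a quick sanity check on the column layout described above, the following is a minimal sketch that verifies a DataFrame carries the seven feature columns and the `label` target before anything else runs. The helper `check_schema` and the two sample rows are hypothetical and made up for illustration; in practice the frame would come from the Kaggle CSV.

```python
import pandas as pd

# Column names taken from the feature table; the helper below is hypothetical.
FEATURE_COLUMNS = ["N", "P", "K", "temperature", "humidity", "ph", "rainfall"]
TARGET_COLUMN = "label"


def check_schema(df: pd.DataFrame) -> list:
    """Return the expected columns missing from df (empty list if none)."""
    expected = FEATURE_COLUMNS + [TARGET_COLUMN]
    return [col for col in expected if col not in df.columns]


# Tiny made-up frame standing in for the real dataset (illustrative values only).
sample = pd.DataFrame(
    {
        "N": [90, 20],
        "P": [42, 67],
        "K": [43, 20],
        "temperature": [20.9, 23.1],
        "humidity": [82.0, 21.5],
        "ph": [6.5, 5.8],
        "rainfall": [202.9, 65.2],
        "label": ["rice", "kidneybeans"],
    }
)

missing = check_schema(sample)   # columns absent from the frame
X = sample[FEATURE_COLUMNS]      # 7 feature columns
y = sample[TARGET_COLUMN]        # crop name to predict
```

If `missing` is non-empty, the column names in the file differ from the table and should be fixed before modelling.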
Key visualisations include:
- Rainfall vs. crop – box‑plots reveal rainfall requirements
- pH vs. crop – violin‑plots show acidic / alkaline preferences
- Pair‑plot – relationship between N, P, K & pH coloured by crop
- Class distribution – count‑plot confirms there is no severe class imbalance
(see notebook for charts)
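
The charts themselves live in the notebook, but two of the checks above can also be expressed numerically. The sketch below is an assumption about how one might do that with pandas: `imbalance_ratio` is a hypothetical helper (largest class count divided by smallest), and the toy frame uses made-up rainfall values rather than the real data.

```python
import pandas as pd


def imbalance_ratio(labels: pd.Series) -> float:
    """Largest class count divided by smallest; 1.0 means perfectly balanced."""
    counts = labels.value_counts()
    return float(counts.max() / counts.min())


# Made-up toy data: two crops with three rainfall readings each.
toy = pd.DataFrame(
    {
        "label": ["rice", "rice", "rice", "maize", "maize", "maize"],
        "rainfall": [200.0, 220.0, 240.0, 60.0, 80.0, 100.0],
    }
)

# Numeric counterpart of the count-plot.
ratio = imbalance_ratio(toy["label"])

# Numeric counterpart of the rainfall box-plot: spread of rainfall per crop.
rainfall_summary = toy.groupby("label")["rainfall"].agg(["min", "median", "max"])
```

A ratio close to 1 indicates the classes are roughly evenly represented, which is what the count‑plot is meant to show visually.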
- Encoded crop names → `label_encoded` using `LabelEncoder`
- Train‑test split 80 / 20 with `random_state = 42`
- No missing values → dataset ready for modelling
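
The preprocessing steps listed above can be sketched roughly as follows. This is a minimal illustration on a small synthetic frame (two features, three crops, random values), not the notebook's actual code; only `label_encoded`, `LabelEncoder`, the 80 / 20 split and `random_state=42` come from the description.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Synthetic stand-in for the real dataset (assumption: values are random).
rng = np.random.default_rng(0)
crops = ["coffee", "maize", "rice"]
df = pd.DataFrame(
    {
        "rainfall": rng.uniform(20, 300, size=30),
        "ph": rng.uniform(4, 9, size=30),
        "label": np.repeat(crops, 10),
    }
)

# Confirm there are no missing values before modelling.
has_missing = bool(df.isnull().values.any())

# Encode crop names as integers (classes are sorted alphabetically).
encoder = LabelEncoder()
df["label_encoded"] = encoder.fit_transform(df["label"])

X = df[["rainfall", "ph"]]
y = df["label_encoded"]

# 80 / 20 train-test split with a fixed seed for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```

Keeping the fitted `encoder` around matters later: `encoder.inverse_transform` maps predicted integers back to crop names.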
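```python
from sklearn.linear_model import LogisticRegression

# Multinomial (softmax) logistic regression over the 22 crop classes.
# Note: scikit-learn 1.5+ deprecates `multi_class`; with lbfgs, multinomial is already the default.
model = LogisticRegression(
    multi_class="multinomial",
    solver="lbfgs",
    max_iter=500,
)
model.fit(X_train, y_train)  # X_train, y_train come from the 80 / 20 split
```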
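
To show how a fitted model of this kind might be evaluated, here is a self-contained sketch on synthetic data. The three well-separated 2‑D clusters are an assumption standing in for crop classes, and `accuracy_score` is one possible metric rather than the notebook's reported one. The `multi_class` argument is omitted here because recent scikit-learn releases deprecate it and the `lbfgs` solver fits a multinomial model for multi-class targets by default.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic data: three well-separated clusters standing in for three crops.
rng = np.random.default_rng(42)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])
y = np.repeat([0, 1, 2], 50)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Same solver and iteration budget as in the snippet above.
model = LogisticRegression(solver="lbfgs", max_iter=500)
model.fit(X_train, y_train)

# Hold-out accuracy and per-class probabilities for one test sample.
accuracy = accuracy_score(y_test, model.predict(X_test))
probabilities = model.predict_proba(X_test[:1])
```

Each row of `predict_proba` sums to 1 across classes, so the highest entry can be read as the model's preferred crop for that sample.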