---
license: mit
datasets:
- custom
metrics:
- mean_squared_error
- mean_absolute_error
- r2_score
model_name: Fertilizer Recommendation System
tags:
- random-forest
- regression
- multioutput
- classification
- agriculture
- soil-nutrients
---
# Fertilizer Application Recommendation System
## Overview
This model predicts fertilizer requirements for a crop from its crop type, target yield, field size, and soil properties. It combines a Random Forest regressor, which predicts numerical values (e.g., nutrient needs), with a Random Forest classifier, which predicts categorical values (e.g., fertilizer application instructions).
## Training Data
The model was trained on a custom dataset containing the following features:
- Crop Name
- Target Yield
- Field Size
- pH (water)
- Organic Carbon
- Total Nitrogen
- Phosphorus (M3)
- Potassium (exch.)
- Soil moisture
The target variables include:
**Numerical Targets**:
- Nitrogen (N) Need
- Phosphorus (P2O5) Need
- Potassium (K2O) Need
- Organic Matter Need
- Lime Need
- Lime Application - Requirement
- Organic Matter Application - Requirement
- 1st Application - Requirement (1)
- 1st Application - Requirement (2)
- 2nd Application - Requirement (1)
**Categorical Targets**:
- Lime Application - Instruction
- Lime Application
- Organic Matter Application - Instruction
- Organic Matter Application
- 1st Application
- 1st Application - Type fertilizer (1)
- 1st Application - Type fertilizer (2)
- 2nd Application
- 2nd Application - Type fertilizer (1)
## Model Training
The model was trained using the following steps:
1. **Data Preprocessing**:
   - Handling missing values
   - Scaling numerical features using `StandardScaler`
   - One-hot encoding categorical features
2. **Modeling**:
   - Splitting the dataset into training and testing sets
   - Training a `RandomForestRegressor` for numerical targets using a `MultiOutputRegressor`
   - Training a `RandomForestClassifier` for categorical targets using a `MultiOutputClassifier`
3. **Evaluation**:
   - Evaluating the models using the test set with metrics like Mean Squared Error (MSE), Mean Absolute Error (MAE), and R-squared (R2) Score for regression, and accuracy for classification.
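
The steps above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the original training script: the toy values, the reduced set of columns, the median imputer, `handle_unknown="ignore"`, and the hyperparameters are all assumptions. For brevity the classifier is trained on string labels directly, whereas the published inference code decodes integer predictions with label encoders.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier, MultiOutputRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

rng = np.random.default_rng(0)
n = 60

# Synthetic stand-in for the custom dataset (all values are made up).
X = pd.DataFrame({
    "Crop Name": rng.choice(["maize(corn)", "beans"], size=n),
    "Target Yield": rng.uniform(1000, 5000, size=n),
    "pH (water)": rng.uniform(4.5, 7.5, size=n),
})
y_num = pd.DataFrame({
    "Nitrogen (N) Need": X["Target Yield"] * 0.02,
    "Lime Need": (7.5 - X["pH (water)"]) * 100,
})
y_cat = pd.DataFrame({
    "Lime Application": np.where(X["pH (water)"] < 5.5, "apply", "none"),
    "1st Application": np.where(X["Crop Name"] == "beans", "planting", "topdress"),
})

numeric_features = ["Target Yield", "pH (water)"]
categorical_features = ["Crop Name"]

# 1. Preprocessing: impute and scale numeric columns, one-hot encode the crop name.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
preprocessor = ColumnTransformer([
    ("num", numeric_steps, numeric_features),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_features),
])

# 2. Modeling: one split shared by both target groups, then one forest per target.
X_train, X_test, yn_train, yn_test, yc_train, yc_test = train_test_split(
    X, y_num, y_cat, test_size=0.25, random_state=0
)
X_train_t = preprocessor.fit_transform(X_train)
X_test_t = preprocessor.transform(X_test)

numerical_model = MultiOutputRegressor(
    RandomForestRegressor(n_estimators=20, random_state=0)
)
numerical_model.fit(X_train_t, yn_train)

categorical_model = MultiOutputClassifier(
    RandomForestClassifier(n_estimators=20, random_state=0)
)
categorical_model.fit(X_train_t, yc_train)

# 3. Predictions on the held-out set, ready for evaluation.
num_pred = numerical_model.predict(X_test_t)
cat_pred = categorical_model.predict(X_test_t)
print(num_pred.shape, cat_pred.shape)
```

Fitting the preprocessor on the training split only, then reusing it on the test split, keeps test-set statistics out of the scaler.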
## Evaluation Metrics
The model was evaluated using the following metrics:
- Mean Squared Error (MSE)
- Mean Absolute Error (MAE)
- R-squared (R2) Score
- Accuracy for categorical targets
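
As a rough sketch of how these metrics could be computed per target with scikit-learn: the arrays below are hypothetical values, not results from this model, and reporting accuracy separately for each categorical column is an assumption, since the card does not say how accuracy was aggregated.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    mean_absolute_error,
    mean_squared_error,
    r2_score,
)

# Hypothetical held-out values for two numerical targets (rows are samples).
numerical_targets = ["Nitrogen (N) Need", "Phosphorus (P2O5) Need"]
y_true = np.array([[50.0, 20.0], [60.0, 25.0], [70.0, 30.0], [80.0, 35.0]])
y_pred = np.array([[52.0, 19.0], [58.0, 26.0], [71.0, 30.0], [79.0, 33.0]])

# multioutput="raw_values" returns one score per target column.
mse = mean_squared_error(y_true, y_pred, multioutput="raw_values")
mae = mean_absolute_error(y_true, y_pred, multioutput="raw_values")
r2 = r2_score(y_true, y_pred, multioutput="raw_values")

regression_report = {
    name: {"MSE": float(mse[i]), "MAE": float(mae[i]), "R2": float(r2[i])}
    for i, name in enumerate(numerical_targets)
}

# Hypothetical labels for two categorical targets.
categorical_targets = ["Lime Application", "1st Application - Type fertilizer (1)"]
cat_true = np.array([["apply", "Urea"], ["none", "DAP"], ["apply", "Urea"], ["none", "Urea"]])
cat_pred = np.array([["apply", "Urea"], ["none", "Urea"], ["apply", "Urea"], ["apply", "Urea"]])

# Accuracy is computed column by column, one score per categorical target.
accuracy_report = {
    name: float(accuracy_score(cat_true[:, i], cat_pred[:, i]))
    for i, name in enumerate(categorical_targets)
}

print(regression_report)
print(accuracy_report)
```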
## How to Use
### Input Format
The model expects one record per prediction, given as a JSON object (a Python `dict` in the inference code) with the following fields:
- "Crop Name": String
- "Target Yield": Numeric
- "Field Size": Numeric
- "pH (water)": Numeric
- "Organic Carbon": Numeric
- "Total Nitrogen": Numeric
- "Phosphorus (M3)": Numeric
- "Potassium (exch.)": Numeric
- "Soil moisture": Numeric
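
A request in this format could be checked before it reaches the model. The helper below is a hypothetical sketch: `parse_input` and `REQUIRED_FIELDS` are illustrative names that are not part of the published repository, and only the field names and types are taken from the list above.

```python
import json

# Field names and types follow the list above; the validation itself is illustrative.
REQUIRED_FIELDS = {
    "Crop Name": str,
    "Target Yield": (int, float),
    "Field Size": (int, float),
    "pH (water)": (int, float),
    "Organic Carbon": (int, float),
    "Total Nitrogen": (int, float),
    "Phosphorus (M3)": (int, float),
    "Potassium (exch.)": (int, float),
    "Soil moisture": (int, float),
}


def parse_input(payload: str) -> dict:
    """Parse a JSON string and check that every expected field is present and well typed."""
    data = json.loads(payload)
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"Missing fields: {missing}")
    for field, expected in REQUIRED_FIELDS.items():
        value = data[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise TypeError(f"Field {field!r} has unexpected type {type(value).__name__}")
    return data


payload = json.dumps({
    "Crop Name": "maize(corn)",
    "Target Yield": 3600.0,
    "Field Size": 1.0,
    "pH (water)": 6.1,
    "Organic Carbon": 11.4,
    "Total Nitrogen": 1.1,
    "Phosphorus (M3)": 1.8,
    "Potassium (exch.)": 3.0,
    "Soil moisture": 20.0,
})
parsed = parse_input(payload)
print(parsed["Crop Name"], parsed["Target Yield"])
```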
### Preprocessing Steps
The inference script performs the following steps:
- Loading the models and preprocessor.
- Defining the categorical and numerical targets.
- Loading the label encoders.
- Creating a function `make_predictions` that processes the input data, makes predictions, and decodes the categorical predictions.
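
The label encoders were presumably fitted once per categorical target during training and saved next to the models. The sketch below shows how that could look, assuming scikit-learn's `LabelEncoder` and the `label_encoder_{col}.joblib` naming used by the inference code. The toy labels and the temporary output directory are made up for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd
from joblib import dump, load
from sklearn.preprocessing import LabelEncoder

# Made-up categorical targets standing in for the real training labels.
y_cat = pd.DataFrame({
    "Lime Application": ["apply", "none", "apply"],
    "1st Application": ["planting", "topdress", "planting"],
})

out_dir = Path(tempfile.mkdtemp())
encoded = pd.DataFrame(index=y_cat.index)

# Fit one encoder per target column and persist it under the column's name.
for col in y_cat.columns:
    le = LabelEncoder()
    encoded[col] = le.fit_transform(y_cat[col])
    dump(le, out_dir / f"label_encoder_{col}.joblib")

# At inference time the encoder is reloaded to map integer codes back to labels.
le = load(out_dir / "label_encoder_Lime Application.joblib")
decoded = le.inverse_transform(encoded["Lime Application"])
print(list(decoded))
```

`inverse_transform` raises a `ValueError` for a code the encoder never saw, which is the case an inference script has to guard against.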
### Inference Procedure
```python
import pandas as pd
from joblib import load
from huggingface_hub import hf_hub_download
from sklearn.preprocessing import LabelEncoder
# Load models and preprocessor
preprocessor_path = hf_hub_download(repo_id='DNgigi/FertiliserApplication', filename='preprocessor.joblib')
numerical_model_path = hf_hub_download(repo_id='DNgigi/FertiliserApplication', filename='numerical_model.joblib')
categorical_model_path = hf_hub_download(repo_id='DNgigi/FertiliserApplication', filename='categorical_model.joblib')
preprocessor = load(preprocessor_path)
numerical_model = load(numerical_model_path)
categorical_model = load(categorical_model_path)
# Define categorical targets
categorical_targets = [
    'Lime Application - Instruction',
    'Lime Application',
    'Organic Matter Application - Instruction',
    'Organic Matter Application',
    '1st Application',
    '1st Application - Type fertilizer (1)',
    '1st Application - Type fertilizer (2)',
    '2nd Application',
    '2nd Application - Type fertilizer (1)',
    '1st Application_1',
    '1st Application - Type fertilizer (1)_3',
    '1st Application - Type fertilizer (2)_5',
    '2nd Application_6',
    '1st Application_21',
    '1st Application - Type fertilizer (1)_23',
    '1st Application - Type fertilizer (2)_25',
    '2nd Application_26',
    '2nd Application - Type fertilizer (1)_28'
]
# Define numerical targets
numerical_targets = [
    'Nitrogen (N) Need',
    'Phosphorus (P2O5) Need',
    'Potassium (K2O) Need',
    'Organic Matter Need',
    'Lime Need',
    'Lime Application - Requirement',
    'Organic Matter Application - Requirement',
    '1st Application - Requirement (1)',
    '1st Application - Requirement (2)',
    '2nd Application - Requirement (1)'
]
# Load label encoders
label_encoders = {col: load(hf_hub_download(repo_id='DNgigi/FertiliserApplication', filename=f'label_encoder_{col}.joblib')) for col in categorical_targets}
def make_predictions(input_data):
    """Return a dict mapping each target name to its predicted value for one input record."""
    # Convert input data to DataFrame
    input_df = pd.DataFrame([input_data])
    # Preprocess the input data
    X_transformed = preprocessor.transform(input_df)
    # Predict with numerical model
    numerical_predictions = numerical_model.predict(X_transformed)
    # Predict with categorical model
    categorical_predictions_encoded = categorical_model.predict(X_transformed)
    # Decode categorical predictions
    categorical_predictions_decoded = {}
    for i, col in enumerate(categorical_targets):
        le = label_encoders[col]
        try:
            categorical_predictions_decoded[col] = le.inverse_transform(categorical_predictions_encoded[:, i])
        except ValueError:
            # The predicted code is not among the encoder's known classes
            categorical_predictions_decoded[col] = ["Unknown"] * len(categorical_predictions_encoded[:, i])
    # Combine numerical and categorical predictions into a dictionary
    predictions_combined = {col: numerical_predictions[0, i] for i, col in enumerate(numerical_targets)}
    predictions_combined.update({col: categorical_predictions_decoded[col][0] for col in categorical_targets})
    return predictions_combined
# Example usage
input_data = {
    'Crop Name': 'maize(corn)',
    'Target Yield': 3600.0,
    'Field Size': 1.0,
    'pH (water)': 6.1,
    'Organic Carbon': 11.4,
    'Total Nitrogen': 1.1,
    'Phosphorus (M3)': 1.8,
    'Potassium (exch.)': 3.0,
    'Soil moisture': 20.0
}
predictions = make_predictions(input_data)
print("Predicted Fertilizer Requirements:")
for col, pred_value in predictions.items():
    print(f"{col}: {pred_value}")