---
license: mit
datasets:
- dair-ai/emotion
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
library_name: transformers
tags:
- emotion-classification
---

# BERT-Base-Uncased Emotion Classification Model

## Model Architecture

- **Base Model**: `bert-base-uncased`
- **Architecture**: Transformer-based model (BERT)
- **Fine-Tuned Task**: Emotion classification
- **Number of Labels**: 6 (sadness, joy, love, anger, fear, surprise)

## Dataset Information

The model was fine-tuned on the `dair-ai/emotion` dataset, which consists of English tweets classified into six emotion categories.

- **Training Dataset Size**: 16,000 examples
- **Validation Dataset Size**: 2,000 examples
- **Test Dataset Size**: 2,000 examples
- **Features**:
  - `text`: The text of the tweet
  - `label`: The emotion label for the text (ClassLabel: `['sadness', 'joy', 'love', 'anger', 'fear', 'surprise']`)

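As a quick sanity check, the splits above can be loaded directly with the `datasets` library (this assumes `datasets` is installed and the Hugging Face Hub is reachable; it is not part of the original training code):

```python
from datasets import load_dataset

# Download the emotion dataset from the Hugging Face Hub
dataset = load_dataset("dair-ai/emotion")

# Split sizes: train 16,000 / validation 2,000 / test 2,000
print({split: dataset[split].num_rows for split in dataset})

# The label feature is a ClassLabel carrying the six emotion names
print(dataset["train"].features["label"].names)
```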
## Training Arguments

The model was trained using the following hyperparameters:

- **Learning Rate**: 2e-05
- **Batch Size**: 16
- **Number of Epochs**: 20 (stopped early after 7 epochs)
- **Gradient Accumulation Steps**: 2
- **Weight Decay**: 0.01
- **Mixed Precision (FP16)**: True
- **Early Stopping**: Enabled (see details below)
- **Logging**: Progress logged every 100 steps
- **Save Strategy**: Checkpoints saved at the end of each epoch, with the 3 most recent checkpoints retained

Early stopping was configured as follows:

- **Patience**: 3 evaluations (training stops if the F1 score does not improve for 3 consecutive evaluations)
- **Best Metric**: F1 score (greater is better)
- **Outcome**: Training stopped after 7 of the planned 20 epochs because the evaluation metrics stopped improving.

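As a rough sketch, these hyperparameters might map onto the `transformers` `TrainingArguments` as follows. This is a reconstruction rather than the actual training script: the output directory is a placeholder, and the per-device eval batch size is assumed to match the train batch size.

```python
from transformers import EarlyStoppingCallback, TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-base-uncased-emotion",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,           # assumed; not stated in the card
    num_train_epochs=20,
    gradient_accumulation_steps=2,
    weight_decay=0.01,
    fp16=True,
    logging_steps=100,
    eval_strategy="epoch",                   # `evaluation_strategy` in older versions
    save_strategy="epoch",
    save_total_limit=3,
    load_best_model_at_end=True,
    metric_for_best_model="f1",
    greater_is_better=True,
)

# Early stopping with patience 3 is supplied as a Trainer callback:
# Trainer(..., callbacks=[EarlyStoppingCallback(early_stopping_patience=3)])
```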
## Final Training Metrics (After 7 Epochs)

The model achieved the following results on the validation dataset in the final epoch:

- **Accuracy**: 0.9085
- **Precision**: 0.8736
- **Recall**: 0.8962
- **F1 Score**: 0.8824

## Test Set Evaluation

After training, the model was evaluated on a held-out test set, with the following results:

- **Test Accuracy**: 0.9180
- **Test Precision**: 0.8663
- **Test Recall**: 0.8757
- **Test F1 Score**: 0.8706

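The card does not state how precision, recall, and F1 were averaged over the six classes; the gap between accuracy and F1 suggests macro averaging. A hypothetical `compute_metrics` function for the `transformers` `Trainer`, assuming macro averaging, might look like:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support


def compute_metrics(eval_pred):
    """Compute accuracy and macro-averaged precision/recall/F1 from Trainer predictions."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="macro", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

A function with this signature can be passed as `Trainer(..., compute_metrics=compute_metrics)` so the metrics are reported at each evaluation.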
## Usage

You can load the model and tokenizer for inference using the Hugging Face `transformers` `pipeline`:

```python
from transformers import pipeline

# Load the emotion classification pipeline.
# return_all_scores=True returns a score for every label; newer
# transformers versions use top_k=None instead.
classifier = pipeline(
    "text-classification",
    model="Prikshit7766/bert-base-uncased-emotion",
    return_all_scores=True,
)

# Test the classifier with a sample sentence
prediction = classifier("I am feeling great and happy today!")

# Print the predictions
print(prediction)
```

**Output**

```
[[{'label': 'sadness', 'score': 0.00010687233589123935},
  {'label': 'joy', 'score': 0.9991187453269958},
  {'label': 'love', 'score': 0.00041500659426674247},
  {'label': 'anger', 'score': 7.090374856488779e-05},
  {'label': 'fear', 'score': 5.2315706852823496e-05},
  {'label': 'surprise', 'score': 0.0002362433006055653}]]
```