---
license: apache-2.0
datasets:
- poem_sentiment
language:
- en
metrics:
- accuracy
- f1
library_name: transformers
pipeline_tag: text-classification
tags:
- text-classification
- sentiment-analysis
- poem-sentiment-detection
- poem-sentiment
- poem-sentiment-classification
- sentiment-classification
widget:
- text: >-
    Rapidly, merrily, Life's sunny hours flit by, Gratefully, cheerily, Enjoy them as they fly!
  example_title: "Life"
- text: It so happens I am sick of my feet and my nails, and my hair and my shadow. It so happens I am sick of being a man.
  example_title: "Walking Around"
- text: >-
    No man is an island, Entire of itself, Every man is a piece of the continent, A part of the main.
  example_title: "No man is an island"
- text: >-
    Some have won a wild delight, By daring wilder sorrow; Could I gain thy love to-night, I'd hazard death to-morrow.
  example_title: "Passion"
---
## AiManatee/RoBERTa_poem_sentiment
This model is a fine-tuned version of the [FacebookAI/roberta-base](https://huggingface.co/FacebookAI/roberta-base) transformer for poem sentiment analysis. It classifies a given poem verse into one of four sentiment categories: negative, positive, no impact, or mixed (both positive and negative).
### Dataset
RoBERTa_poem_sentiment was trained on the [poem_sentiment](https://huggingface.co/datasets/poem_sentiment) dataset, which consists of poem verses annotated with four sentiment labels: negative, positive, no impact, and mixed. The validation and test subsets of the original dataset, however, contain no 'mixed' examples. To evaluate the model on every label it was trained on, both subsets were augmented with 32 'mixed' verses taken from different English poems: 16 were added to the validation subset and 16 to the test subset. The original train subset was left unchanged. Each added sample was checked for semantic consistency, diversity (cosine similarity), length variation, and novelty (that it introduces new, relevant vocabulary). The final model was evaluated on both the original dataset and the augmented dataset.
#### Labels
```
{0: 'negative', 1: 'positive', 2: 'no_impact', 3: 'mixed'}
```
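The mapping above can be turned into a small lookup helper. This is a minimal sketch, not part of the released code: `ID2LABEL`, `LABEL2ID`, and `readable_label` are hypothetical names, and the handling of `LABEL_<id>` strings assumes the checkpoint might report the generic label names that `transformers` falls back to when no `id2label` is stored in the config. The card does not say which form this model returns.

```python
# Label ids and names copied from the mapping in this card.
ID2LABEL = {0: "negative", 1: "positive", 2: "no_impact", 3: "mixed"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def readable_label(raw):
    """Return the sentiment name for an integer id, a 'LABEL_<id>' string, or a name."""
    if isinstance(raw, int):
        return ID2LABEL[raw]
    if raw.startswith("LABEL_"):
        # Assumption: generic labels follow the 'LABEL_<id>' pattern.
        return ID2LABEL[int(raw.split("_", 1)[1])]
    if raw in LABEL2ID:
        return raw
    raise ValueError(f"Unknown label: {raw!r}")


print(readable_label(3))
print(readable_label("LABEL_1"))
```

The helper accepts all three forms so that the same call works whether predictions arrive as class indices or as strings.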
### Training Hyperparameters
```
learning_rate: 2e-5
weight_decay: 0.01
batch_size: 16
num_epochs: 8
optimizer: AdamW (betas=(0.9, 0.999), eps=1e-08)
seed: 16
early_stopper: min_delta=0.001, patience=3
```
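```python
from torch.optim.lr_scheduler import ReduceLROnPlateau

# `optimizer` is the AdamW instance configured with the hyperparameters above.
scheduler = ReduceLROnPlateau(
    optimizer,
    mode="min",
    factor=0.5,
    patience=0,
    threshold=0.001,
    eps=1e-8,
)
```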
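The `early_stopper` setting (`min_delta=0.001`, `patience=3`) can be illustrated with a short, dependency-free sketch. The `EarlyStopper` class below is a hypothetical reconstruction of a common early-stopping pattern, not the author's training code. It assumes the monitored quantity is the validation loss and that an epoch counts as an improvement only when the loss drops by more than `min_delta`. The loss values are the per-epoch validation losses this card reports for the original dataset.

```python
class EarlyStopper:
    """Stop training after `patience` consecutive epochs without improvement."""

    def __init__(self, min_delta=0.001, patience=3):
        self.min_delta = min_delta
        self.patience = patience
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            # Improvement larger than min_delta: remember it and reset the counter.
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Validation losses per epoch, as reported for the original dataset.
val_losses = [1.010353, 0.810045, 0.637439, 0.699637,
              0.586395, 0.610439, 0.515130, 0.572643]

stopper = EarlyStopper(min_delta=0.001, patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break

print("stopped at epoch:", stopped_at)
print("best validation loss:", stopper.best_loss)
```

Under these assumptions the stopper never fires on the reported losses, because no run of three consecutive epochs fails to improve. That is consistent with all 8 epochs appearing in the results.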
### Model Performance
##### Validation results on the original dataset (class 3, 'mixed', is not evaluated here)
| Epoch | Training Loss | Validation Loss | Accuracy | F1 |
|-------|---------------|-----------------|----------|----------|
| 1 | 1.365169 | 1.010353 | 0.761905 | 0.771733 |
| 2 | 0.860945 | 0.810045 | 0.723810 | 0.740809 |
| 3 | 0.570005 | 0.637439 | 0.761905 | 0.802184 |
| 4 | 0.355776 | 0.699637 | 0.780952 | 0.797572 |
| 5 | 0.252919 | 0.586395 | 0.847619 | 0.860519 |
| 6 | 0.156633 | 0.610439 | 0.819048 | 0.834072 |
| 7 | 0.084868 | 0.515130 | 0.876190 | 0.884736 |
| 8 | 0.062830 | 0.572643 | 0.885714 | 0.902510 |
##### Validation results on the augmented dataset
| Epoch | Training Loss | Validation Loss | Accuracy | F1 |
|-------|---------------|-----------------|----------|----------|
| 1 | 1.365169 | 1.168057 | 0.661157 | 0.628737 |
| 2 | 0.860945 | 0.869521 | 0.694214 | 0.717916 |
| 3 | 0.570005 | 0.637439 | 0.776859 | 0.790842 |
| 4 | 0.355776 | 0.681563 | 0.768595 | 0.776540 |
| 5 | 0.252919 | 0.585692 | 0.834710 | 0.841590 |
| 6 | 0.156633 | 0.542949 | 0.809917 | 0.815361 |
| 7 | 0.092444 | 0.581075 | 0.826446 | 0.830607 |
| 8 | 0.049480 | 0.583749 | 0.884297 | 0.881360 |
### How to Use the Model
Here is how to predict the sentiment of a poem verse using this model:
```python
from transformers import pipeline
sentiment_classifier = pipeline(task='text-classification', model='AiManatee/RoBERTa_poem_sentiment')
verse1 = "Rapidly, merrily, Life's sunny hours flit by, Gratefully, cheerily, Enjoy them as they fly!"
verse2 = "It so happens I am sick of my feet and my nails, and my hair and my shadow. It so happens I am sick of being a man."
verse3 = "No man is an island, Entire of itself, Every man is a piece of the continent, A part of the main."
verse4 = "Some have won a wild delight, By daring wilder sorrow; Could I gain thy love to-night, I'd hazard death to-morrow."
print(sentiment_classifier(verse1))
print(sentiment_classifier(verse2))
print(sentiment_classifier(verse3))
print(sentiment_classifier(verse4))
```
### Evaluation
##### Original dataset
```
Loss: 0.5726433790155819
Accuracy: 0.8857142857142857
Precision: 0.9201298701298701
Recall: 0.8857142857142857
F1: 0.9025108225108224
```
##### Augmented dataset
```
Loss: 0.5837492472492158
Accuracy: 0.8842975206611571
Precision: 0.8810538160090016
Recall: 0.8842975206611571
F1: 0.8813606847697756
```
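The card does not state how precision, recall, and F1 were averaged across classes. In both result blocks recall equals accuracy while precision differs, which is consistent with support-weighted averaging, since weighted recall always reduces to accuracy. The sketch below shows that computation in plain Python. The function `weighted_metrics` and the toy labels are hypothetical and are not taken from the poem_sentiment dataset or from the author's evaluation script.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    support = Counter(y_true)
    total = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / total
    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        class_precision = tp / predicted if predicted else 0.0
        class_recall = tp / count
        denom = class_precision + class_recall
        class_f1 = 2 * class_precision * class_recall / denom if denom else 0.0
        # Each class contributes in proportion to its share of true labels.
        weight = count / total
        precision += weight * class_precision
        recall += weight * class_recall
        f1 += weight * class_f1
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy labels using the ids 0-3 from this card.
y_true = [0, 0, 1, 1, 2, 2, 2, 3]
y_pred = [0, 1, 1, 1, 2, 2, 0, 3]
scores = weighted_metrics(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

On the toy data, weighted recall and accuracy coincide, mirroring the pattern in the reported numbers.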
### Framework Versions
- **Transformers:** 4.35.2
- **PyTorch:** 2.1.0+cu118
- **Datasets:** 2.16.1
- **Tokenizers:** 0.15.1