AiManatee committed
Commit 7614c9c · verified · 1 Parent(s): e741d1c

Create README.md

  1. README.md +145 -0
README.md ADDED
---
license: apache-2.0
datasets:
- poem_sentiment
language:
- en
metrics:
- accuracy
- f1
library_name: transformers
pipeline_tag: text-classification
tags:
- text-classification
- sentiment-analysis
- poem-sentiment-detection
- poem-sentiment
- poem-sentiment-classification
- sentiment-classification
widget:
- text: >-
    Rapidly, merrily, Life's sunny hours flit by, Gratefully, cheerily, Enjoy them as they fly!
  example_title: "Life"
- text: It so happens I am sick of my feet and my nails, and my hair and my shadow. It so happens I am sick of being a man.
  example_title: "Walking Around"
- text: >-
    No man is an island, Entire of itself, Every man is a piece of the continent, A part of the main.
  example_title: "No man is an island"
- text: >-
    Some have won a wild delight, By daring wilder sorrow; Could I gain thy love to-night, I'd hazard death to-morrow.
  example_title: "Passion"
---
## AiManatee/RoBERTa_poem_sentiment
This model is a fine-tuned version of the [FacebookAI/roberta-base](https://huggingface.co/FacebookAI/roberta-base) transformer for poem sentiment analysis. It classifies a poem verse into one of four sentiment categories: negative, positive, no impact, or mixed (both positive and negative).

### Dataset
RoBERTa_poem_sentiment was trained on the [poem_sentiment](https://huggingface.co/datasets/poem_sentiment) dataset, which consists of poem verses annotated with four sentiment labels: negative, positive, no impact, and mixed. However, the Validation and Test subsets of the original dataset contain no 'mixed' examples. To evaluate all four labels, the evaluation data was augmented: 32 'mixed' verses from different English poems were added, 16 to the Validation subset and 16 to the Test subset; the original Train subset was left intact. All added samples were checked for semantic consistency, diversity (cosine similarity), length variation, and novelty (ensuring that they introduce new, relevant vocabulary). This allows the model's generalization to be evaluated across all trained labels. The final model was tested on both the original and the augmented dataset.

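The diversity check mentioned in the Dataset section can be illustrated with a small sketch. The card does not say how the verses were vectorized, so the bag-of-words vectors and the `max_pairwise_similarity` helper below are assumptions made for illustration, not the author's procedure. The sample verses are two of the widget examples from this card.

```python
import math
import re
from collections import Counter
from itertools import combinations


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def max_pairwise_similarity(verses):
    """Highest similarity among all verse pairs (hypothetical helper)."""
    vectors = [bag_of_words(v) for v in verses]
    return max(cosine_similarity(x, y) for x, y in combinations(vectors, 2))


candidates = [
    "Rapidly, merrily, Life's sunny hours flit by, Gratefully, cheerily, Enjoy them as they fly!",
    "Some have won a wild delight, By daring wilder sorrow; Could I gain thy love to-night, I'd hazard death to-morrow.",
]
print(max_pairwise_similarity(candidates))
```

A low maximum similarity suggests that the candidate verses are not near-duplicates of each other; an embedding-based similarity would be a natural substitute for the word counts used here.
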
#### Labels
```
{0: 'negative', 1: 'positive', 2: 'no_impact', 3: 'mixed'}
```

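Depending on how a model's configuration was saved, a `transformers` pipeline may report either label names or generic names such as `LABEL_2`; which form this model returns is not stated in this card. A small, hypothetical helper (`to_label_name` is not part of any library) that normalizes both forms using the mapping above might look like this:

```python
ID2LABEL = {0: "negative", 1: "positive", 2: "no_impact", 3: "mixed"}


def to_label_name(raw_label):
    """Map an int id, a 'LABEL_<id>' string, or a label name to the label name."""
    if isinstance(raw_label, int):
        return ID2LABEL[raw_label]
    if raw_label.startswith("LABEL_"):
        return ID2LABEL[int(raw_label.split("_", 1)[1])]
    if raw_label in ID2LABEL.values():
        return raw_label
    raise ValueError(f"Unknown label: {raw_label!r}")


# Made-up pipeline-style result; the score is illustrative only.
prediction = {"label": "LABEL_3", "score": 0.5}
readable = to_label_name(prediction["label"])
print(readable)
```
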
### Training Hyperparameters
```
learning_rate: 2e-5
weight_decay: 0.01
batch_size: 16
num_epochs: 8
optimizer: AdamW (betas=(0.9, 0.999), eps=1e-08)
seed: 16
early_stopper: min_delta=0.001, patience=3
```
```python
from torch.optim.lr_scheduler import ReduceLROnPlateau

# `optimizer` is the AdamW instance configured with the values above.
scheduler = ReduceLROnPlateau(
    optimizer,
    mode="min",
    factor=0.5,
    patience=0,
    threshold=0.001,
    eps=1e-8,
)
```

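To make the interaction between the plateau scheduler and the early stopper concrete, the sketch below replays the validation losses reported in this card for the original dataset through simplified pure-Python versions of both. It assumes that both monitor validation loss, which the card does not state explicitly. `EarlyStopper` and `PlateauHalver` are hypothetical stand-ins written for this illustration, not the author's classes, and `PlateauHalver` mirrors only the relative-threshold, zero-cooldown behaviour of PyTorch's `ReduceLROnPlateau`.

```python
class EarlyStopper:
    """Request a stop after `patience` consecutive epochs without an
    improvement of at least `min_delta` (assumed semantics)."""

    def __init__(self, min_delta=0.001, patience=3):
        self.min_delta = min_delta
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, loss):
        if loss < self.best - self.min_delta:
            self.best = loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


class PlateauHalver:
    """Simplified plateau scheduler: multiply the learning rate by `factor`
    when the loss fails to improve by a relative `threshold`."""

    def __init__(self, lr=2e-5, factor=0.5, patience=0, threshold=0.001):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.threshold = threshold
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, loss):
        if loss < self.best * (1 - self.threshold):
            self.best = loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        if self.bad_epochs > self.patience:
            self.lr *= self.factor
            self.bad_epochs = 0
        return self.lr


# Validation losses per epoch, copied from the original-dataset table.
val_losses = [1.010353, 0.810045, 0.637439, 0.699637,
              0.586395, 0.610439, 0.515130, 0.572643]

stopper = EarlyStopper()
scheduler = PlateauHalver()
lr_per_epoch = []
stopped_at = None
for epoch, loss in enumerate(val_losses, start=1):
    lr_per_epoch.append(scheduler.step(loss))
    if stopper.step(loss):
        stopped_at = epoch
        break

print(lr_per_epoch, stopped_at)
```

Under these assumptions the learning rate is halved after epochs 4, 6, and 8, and the early stopper never fires within the 8 epochs, because no run of three consecutive non-improving epochs occurs.
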
### Model Performance
##### Validation results on the original dataset (class 3, 'mixed', is not evaluated here)
| Epoch | Training Loss | Validation Loss | Accuracy | F1       |
|-------|---------------|-----------------|----------|----------|
| 1     | 1.365169      | 1.010353        | 0.761905 | 0.771733 |
| 2     | 0.860945      | 0.810045        | 0.723810 | 0.740809 |
| 3     | 0.570005      | 0.637439        | 0.761905 | 0.802184 |
| 4     | 0.355776      | 0.699637        | 0.780952 | 0.797572 |
| 5     | 0.252919      | 0.586395        | 0.847619 | 0.860519 |
| 6     | 0.156633      | 0.610439        | 0.819048 | 0.834072 |
| 7     | 0.084868      | 0.515130        | 0.876190 | 0.884736 |
| 8     | 0.062830      | 0.572643        | 0.885714 | 0.902510 |

##### Validation results on the augmented dataset
| Epoch | Training Loss | Validation Loss | Accuracy | F1       |
|-------|---------------|-----------------|----------|----------|
| 1     | 1.365169      | 1.168057        | 0.661157 | 0.628737 |
| 2     | 0.860945      | 0.869521        | 0.694214 | 0.717916 |
| 3     | 0.570005      | 0.643639        | 0.776859 | 0.790842 |
| 4     | 0.355776      | 0.681563        | 0.768595 | 0.776540 |
| 5     | 0.252919      | 0.585692        | 0.834710 | 0.841590 |
| 6     | 0.156633      | 0.542949        | 0.809917 | 0.815361 |
| 7     | 0.092444      | 0.581075        | 0.826446 | 0.830607 |
| 8     | 0.049480      | 0.583749        | 0.884297 | 0.881360 |

### How to Use the Model
Here is how to predict the sentiment of a poem verse using this model:

```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
sentiment_classifier = pipeline(task='text-classification', model='AiManatee/RoBERTa_poem_sentiment')

verse1 = "Rapidly, merrily, Life's sunny hours flit by, Gratefully, cheerily, Enjoy them as they fly!"
verse2 = "It so happens I am sick of my feet and my nails, and my hair and my shadow. It so happens I am sick of being a man."
verse3 = "No man is an island, Entire of itself, Every man is a piece of the continent, A part of the main."
verse4 = "Some have won a wild delight, By daring wilder sorrow; Could I gain thy love to-night, I'd hazard death to-morrow."

# Each call returns a list with one dict holding the predicted label and its score.
print(sentiment_classifier(verse1))
print(sentiment_classifier(verse2))
print(sentiment_classifier(verse3))
print(sentiment_classifier(verse4))
```

### Evaluation
The metrics below match the epoch 8 validation results reported above.

##### Original dataset
```
{
  Loss: 0.5726433790155819
  Accuracy: 0.8857142857142857
  Precision: 0.9201298701298701
  Recall: 0.8857142857142857
  F1: 0.9025108225108224
}
```

##### Augmented dataset
```
{
  Loss: 0.5837492472492158
  Accuracy: 0.8842975206611571
  Precision: 0.8810538160090016
  Recall: 0.8842975206611571
  F1: 0.8813606847697756
}
```

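In both evaluation reports Recall equals Accuracy, which is what support-weighted averaging over classes produces. This is an inference, since the card does not name the averaging mode. The sketch below computes weighted precision, recall, and F1 from scratch; `weighted_metrics` is a hypothetical helper, and `y_true` and `y_pred` are made-up labels, not the model's predictions.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    total = len(y_true)
    support = Counter(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / total
    precision = recall = f1 = 0.0
    for label, count in support.items():
        true_pos = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        class_precision = true_pos / predicted if predicted else 0.0
        class_recall = true_pos / count
        denom = class_precision + class_recall
        class_f1 = 2 * class_precision * class_recall / denom if denom else 0.0
        weight = count / total  # each class counts in proportion to its support
        precision += weight * class_precision
        recall += weight * class_recall
        f1 += weight * class_f1
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Illustrative labels only (0: negative, 1: positive, 2: no_impact, 3: mixed).
y_true = [0, 0, 1, 1, 2, 2, 2, 3]
y_pred = [0, 1, 1, 1, 2, 2, 0, 3]
scores = weighted_metrics(y_true, y_pred)
print(scores)
```

Weighted recall always equals accuracy, because each class's recall is weighted by that class's share of the samples; weighted precision and F1 generally differ from it.
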
### Citation

If you find this model useful in your research, please consider citing:

```bibtex
@misc{roberta_poem_sentiment,
  author = {Your Name},
  title = {RoBERTa for Poem Sentiment Analysis},
  year = {2024},
  publisher = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/AiManatee/RoBERTa_poem_sentiment}},
}
```

### Framework Versions
- **Transformers:** 4.35.2
- **PyTorch:** 2.1.0+cu118
- **Datasets:** 2.16.1
- **Tokenizers:** 0.15.1