Update README.md
---
language: "en"
tags:
- sentiment
- twitter
- reviews
---
```
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("siebert/sentiment-roberta-large-english")
model = AutoModelForSequenceClassification.from_pretrained("siebert/sentiment-roberta-large-english")
```
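For completeness, a minimal sketch of running the loaded model on a single sentence; the example text is ours, and the label names are read from the model config rather than hard-coded:

```python
import torch

# Tokenize an example sentence and run a forward pass (no gradients needed).
inputs = tokenizer("I love this!", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Map the highest-scoring logit back to its label name via the model config.
predicted_id = logits.argmax(dim=-1).item()
print(model.config.id2label[predicted_id])
```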
# Performance
To evaluate the performance of our general-purpose sentiment analysis model, we set aside an evaluation set from each data set, which was not used for training. On average, our model outperforms a [DistilBERT-based model](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english) (which is solely fine-tuned on the popular SST-2 data set) by more than 15 percentage points (78.1 vs. 93.2, see table below). As a robustness check, we evaluate the model in a leave-one-out manner (training on 14 data sets, evaluating on the one left out), which decreases model performance by only about 3 percentage points on average and underscores its generalizability. A schematic of this leave-one-out protocol is sketched after the table.

|Dataset|DistilBERT SST-2|This model|
|---|---|---|
|McAuley and Leskovec (2013) (Reviews)|84.7|98.0|
|McAuley and Leskovec (2013) (Review Titles)|65.5|87.0|
|Yelp Academic Dataset|84.8|96.5|
|Maas et al. (2011)|80.6|96.0|
|Kaggle|87.2|96.0|
|Pang and Lee (2005)|89.7|91.0|
|Nakov et al. (2013)|70.1|88.5|
|Shamma (2009)|76.0|87.0|
|Blitzer et al. (2007) (Books)|83.0|92.5|
|Blitzer et al. (2007) (DVDs)|84.5|92.5|
|Blitzer et al. (2007) (Electronics)|74.5|95.0|
|Blitzer et al. (2007) (Kitchen devices)|80.0|98.5|
|Pang et al. (2002)|73.5|95.5|
|Speriosu et al. (2011)|71.5|85.5|
|Hartmann et al. (2019)|65.5|98.0|
|**Average**|**78.1**|**93.2**|
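As referenced above, here is a schematic of the leave-one-out robustness check. The helpers `finetune` and `evaluate_accuracy` and the `datasets` mapping (name to train/eval splits) are hypothetical stand-ins, not part of this repository; the 15 data sets are those listed in the table:

```python
# Leave-one-out sketch: train on all data sets except one, evaluate on the
# held-out set's evaluation split, and repeat for each data set.
def leave_one_out(datasets, finetune, evaluate_accuracy):
    scores = {}
    for held_out in datasets:
        # Pool the training splits of the other 14 data sets.
        train_data = [train for name, (train, _) in datasets.items()
                      if name != held_out]
        model = finetune(train_data)  # hypothetical fine-tuning helper
        _, eval_split = datasets[held_out]
        scores[held_out] = evaluate_accuracy(model, eval_split)
    return scores
```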

# Fine-tuning hyperparameters
- learning_rate = 2e-5
- num_train_epochs = 3.0
- warmup_steps = 500
- weight_decay = 0.01

Other values were left at their defaults as listed [here](https://huggingface.co/transformers/main_classes/trainer.html#transformers.TrainingArguments).
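A minimal sketch of how these values map onto `transformers.TrainingArguments`; the `output_dir` is an illustrative placeholder, not a value reported in this card:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="finetune-output",  # placeholder; not specified in this card
    learning_rate=2e-5,
    num_train_epochs=3.0,
    warmup_steps=500,
    weight_decay=0.01,
)
```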

# Citation
Please cite [this paper](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3489963) when you use our model.