Commit dda52b6 · Update README.md
Parent(s): 1e97121

README.md CHANGED
@@ -49,17 +49,36 @@ python3 transformers/examples/pytorch/text-classification/run_glue.py \
```

We experimented with upsampling the train split of each round to improve performance, trying ratios of [1, 5, 10, 100] and taking the optimum forward to all subsequent rounds. The optimal upsampling ratios for R1-R4 (the text-only rounds from Vidgen et al.) are carried forward. This model is trained with upsampling ratios of `{'R0':1, 'R1':5, 'R2':100, 'R3':1, 'R4':1, 'R5':100, 'R6':1, 'R7':5}`.
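As an illustration only, a minimal sketch of this per-round upsampling, assuming the training data is a pandas DataFrame with a `round` column (this is not the released training code):

```python
import pandas as pd

# Upsampling ratios used for this model, as given above.
UPSAMPLE_RATIOS = {'R0': 1, 'R1': 5, 'R2': 100, 'R3': 1, 'R4': 1,
                   'R5': 100, 'R6': 1, 'R7': 5}

def upsample_train_split(train_df: pd.DataFrame) -> pd.DataFrame:
    """Repeat each round's training examples by its upsampling ratio."""
    parts = [
        pd.concat([round_df] * UPSAMPLE_RATIOS[round_name])
        for round_name, round_df in train_df.groupby('round')
    ]
    # Shuffle so repeated examples are spread through the split.
    return pd.concat(parts).sample(frac=1.0, random_state=42).reset_index(drop=True)
```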

## Variables and metrics

We wished to train a model that could effectively encode information about emoji-based hate without worsening performance on text-only hate. We therefore evaluate the model on the four test suites below (a loading sketch follows the list):
* [HatemojiCheck](https://huggingface.co/datasets/HannahRoseKirk/HatemojiCheck), an evaluation checklist covering 7 functionalities of emoji-based hate, with contrast sets
* [HateCheck](https://huggingface.co/datasets/Paul/hatecheck), an evaluation checklist containing 29 functional tests for hate speech, with contrast sets
* The held-out test sets from [HatemojiBuild](https://huggingface.co/datasets/HannahRoseKirk/HatemojiBuild), the three rounds of adversarially-generated data collection with emoji-containing examples (R5-R7), available on the Hugging Face Hub
* The held-out test sets from the four rounds of adversarially-generated data collection with text-only examples (R1-R4, from [Vidgen et al.](https://github.com/bvidgen/Dynamically-Generated-Hate-Speech-Dataset))
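The Hub-hosted checklists above can be pulled with the `datasets` library, e.g. (a sketch; the `test` split name is an assumption, so check each dataset card for the exact configuration):

```python
from datasets import load_dataset

# Evaluation checklists hosted on the Hugging Face Hub.
hatemoji_check = load_dataset("HannahRoseKirk/HatemojiCheck", split="test")
hatecheck = load_dataset("Paul/hatecheck", split="test")
print(hatemoji_check[0])  # inspect one checklist example
```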

For the round-specific test sets, we used a weighted F1-score across them to choose the final model in each round. For more details, see our [paper](https://arxiv.org/abs/2108.05921).
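A sketch of one way to compute such a score, weighting each round's F1 by its test-set size; the paper's exact weighting may differ:

```python
from sklearn.metrics import f1_score

def weighted_f1_across_rounds(per_round_preds):
    """per_round_preds: dict mapping round name -> (y_true, y_pred).

    Weights each round's binary F1 by the size of its test set;
    the size-weighting is an illustrative assumption.
    """
    total = sum(len(y_true) for y_true, _ in per_round_preds.values())
    return sum(
        (len(y_true) / total) * f1_score(y_true, y_pred)
        for y_true, y_pred in per_round_preds.values()
    )
```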

## Evaluation results

We compare our model in each iteration (R6-T, R7-T, R8-T) to:
* **P-IA**: the identity-attack attribute from Perspective API
* **P-TX**: the toxicity attribute from Perspective API
* **B-D**: a BERT model trained on the [Davidson et al. (2017)](https://github.com/t-davidson/hate-speech-and-offensive-language) dataset
* **B-F**: a BERT model trained on the [Founta et al. (2018)](https://github.com/ENCASEH2020/hatespeech-twitter) dataset

The emoji test sets are R5-R7 and HatemojiCheck; the text test sets are R1-R4 and HateCheck; R1-R7 covers all rounds. Best results are in bold.

|          | **R5-R7 Acc** | **R5-R7 F1** | **HatemojiCheck Acc** | **HatemojiCheck F1** | **R1-R4 Acc** | **R1-R4 F1** | **HateCheck Acc** | **HateCheck F1** | **R1-R7 Acc** | **R1-R7 F1** |
| :------- | :-----------: | :----------: | :-------------------: | :------------------: | :-----------: | :----------: | :---------------: | :---------------: | :-----------: | :----------: |
| **P-IA** | 0.508 | 0.394 | 0.689 | 0.754 | 0.679 | 0.720 | 0.765 | 0.839 | 0.658 | 0.689 |
| **P-TX** | 0.523 | 0.448 | 0.650 | 0.711 | 0.602 | 0.659 | 0.720 | 0.813 | 0.592 | 0.639 |
| **B-D**  | 0.489 | 0.270 | 0.578 | 0.636 | 0.589 | 0.607 | 0.632 | 0.738 | 0.591 | 0.586 |
| **B-F**  | 0.496 | 0.322 | 0.552 | 0.605 | 0.562 | 0.562 | 0.602 | 0.694 | 0.557 | 0.532 |
| **R6-T** | 0.757 | **0.769** | **0.879** | **0.910** | 0.823 | 0.837 | 0.961 | 0.971 | 0.813 | 0.825 |
| **R7-T** | **0.759** | 0.762 | 0.867 | 0.899 | 0.824 | 0.842 | 0.955 | 0.967 | 0.813 | 0.829 |
| **R8-T** | 0.744 | 0.755 | 0.871 | 0.904 | 0.827 | 0.844 | **0.966** | **0.975** | **0.814** | **0.829** |

For a full discussion of the model results, see our [paper](https://arxiv.org/abs/2108.05921).

A recent [paper](https://arxiv.org/pdf/2202.11176.pdf), _A New Generation of Perspective API: Efficient Multilingual Character-level Transformers_, beats this model on the HatemojiCheck benchmark.