Commit 1e97121 · Update README.md (parent: 1355716)

README.md (changed):
## Model description

This model is a fine-tuned version of the [DeBERTa base model](https://huggingface.co/microsoft/deberta-base). This model is cased. The model was trained on iterative rounds of adversarial data generation with human-and-model-in-the-loop. In each round, annotators are tasked with tricking the model-in-the-loop with emoji-containing statements that it will misclassify. Between each round, the model is retrained. This is the final model from the iterative process, referred to as R8-T in our paper. The intended task is to classify an emoji-containing statement as either non-hateful (LABEL 0.0) or hateful (LABEL 1.0).

- **Data Repository:** https://github.com/HannahKirk/Hatemoji
- **Paper:** https://arxiv.org/abs/2108.05921
- **Point of Contact:** hannah.kirk@oii.ox.ac.uk

```
python3 transformers/examples/pytorch/text-classification/run_glue.py \
    ...
```

We experimented with upsampling the train split of each round to improve performance, in increments of [1, 5, 10, 100], with the optimum upsampling taken forward to all subsequent rounds. The optimal upsampling ratios for R1-R4 (the text-only rounds from Vidgen et al.) are carried forward. This model is trained with upsampling ratios of `{'R0':1, 'R1':5, 'R2':100, 'R3':1, 'R4':1, 'R5':100, 'R6':1, 'R7':5}`.
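As a sketch of how such per-round upsampling could be applied before training (the round keys and ratios are the ones quoted above; the `train_examples` data and the `upsample` helper are hypothetical illustrations, not code from the repository):

```python
# Upsampling ratios quoted in the model card above.
UPSAMPLING = {'R0': 1, 'R1': 5, 'R2': 100, 'R3': 1, 'R4': 1,
              'R5': 100, 'R6': 1, 'R7': 5}

def upsample(train_examples):
    """Repeat each (round_id, text) example UPSAMPLING[round_id] times."""
    upsampled = []
    for round_id, text in train_examples:
        upsampled.extend([(round_id, text)] * UPSAMPLING[round_id])
    return upsampled

# Toy stand-in for the real per-round training splits.
toy = [('R0', 'example a'), ('R1', 'example b')]
print(len(upsample(toy)))  # → 6: 1 copy of the R0 example + 5 of the R1 example
```

The duplicated examples would then be shuffled together into the single train file passed to `run_glue.py`.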

## Variables and metrics

We evaluate the model on:

* [HatemojiCheck](https://huggingface.co/datasets/HannahRoseKirk/HatemojiCheck), an evaluation checklist with 7 functionalities of emoji-based hate, plus contrast sets
* [HateCheck](https://huggingface.co/datasets/Paul/hatecheck), an evaluation checklist containing 29 functional tests for hate speech, plus contrast sets
* The held-out test sets from the three rounds of adversarially-generated data collection with emoji-containing examples (R5-R7)
* The held-out test sets from the four rounds of adversarially-generated data collection with text-only examples (R1-R4, from Vidgen et al.)

For the round-specific test sets, we used a weighted F1-score across them to choose the final model in each round. For more details, see our [paper](https://arxiv.org/abs/2108.05921).
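One way to read "a weighted F1-score across them" is a size-weighted average of the per-test-set F1 scores. This is a minimal sketch under that assumption; the scores and test-set sizes below are made up for illustration, not results from the paper:

```python
def weighted_f1(scores, sizes):
    """Size-weighted average of per-test-set F1 scores."""
    total = sum(sizes[name] for name in scores)
    return sum(scores[name] * sizes[name] for name in scores) / total

# Hypothetical per-round F1 scores and test-set sizes.
scores = {'R5': 0.80, 'R6': 0.70, 'R7': 0.90}
sizes  = {'R5': 100,  'R6': 100,  'R7': 200}

print(round(weighted_f1(scores, sizes), 3))  # → 0.825
```

A single scalar like this makes model checkpoints from the same round directly comparable.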

## Evaluation results

For a full evaluation of the model, see our [paper](https://arxiv.org/abs/2108.05921).