Commit · 3c511aa
1 Parent(s): 7a291d7
Update README.md
README.md
CHANGED
@@ -1,3 +1,20 @@
---
license: cc-by-4.0
---
+
+# Hatemoji Model
+
+## Model description
+
+This model is a cased, fine-tuned version of the [DeBERTa base model](https://huggingface.co/microsoft/deberta-base). It was trained over iterative rounds of human-and-model-in-the-loop adversarial data generation. Each round of data contains emoji-containing statements labeled either non-hateful (LABEL 0.0) or hateful (LABEL 1.0).
+- **Data Repository:** https://github.com/HannahKirk/Hatemoji
+- **Paper:** https://arxiv.org/abs/2108.05921
+- **Point of Contact:** hannah.kirk@oii.ox.ac.uk
+
+## Intended uses & limitations
+The model is intended to classify short-form, English-language, emoji-containing text as a binary task: non-hateful vs hateful. It has shown strengths relative to commercial and academic models at classifying emoji-based hate, and is also a strong classifier of text-only hate. Because it was trained on synthetic, adversarially generated data, it may be weaker on naturally occurring emoji-based hate 'in the wild'.
+
+## How to use
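The README leaves this section empty, so the following is only a minimal sketch of one plausible way to run the classifier, not the authors' documented usage. It assumes the Hugging Face `transformers` `text-classification` pipeline. The model id `HannahRoseKirk/Hatemoji` and the helper names `label_to_name` and `classify` are assumptions introduced for illustration. The exact label strings the model emits are not confirmed by the README, so the helper accepts several spellings of "LABEL 0.0" / "LABEL 1.0".

```python
import re

# Assumption: hypothetical model id; replace with the repository this README belongs to.
MODEL_ID = "HannahRoseKirk/Hatemoji"

# Label meanings as stated in the model description: 0 = non-hateful, 1 = hateful.
NAMES = {0: "non-hateful", 1: "hateful"}


def label_to_name(raw_label: str) -> str:
    """Map a raw label such as 'LABEL_1' or 'LABEL 0.0' to a readable name."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*$", raw_label)
    if match is None:
        raise ValueError(f"unrecognised label: {raw_label!r}")
    index = int(float(match.group(1)))
    if index not in NAMES:
        raise ValueError(f"unexpected class index in label: {raw_label!r}")
    return NAMES[index]


def classify(texts, model_id=MODEL_ID):
    """Return (text, readable_label, score) tuples for a list of strings."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    predictions = classifier(list(texts))
    return [
        (text, label_to_name(pred["label"]), pred["score"])
        for text, pred in zip(texts, predictions)
    ]
```

Calling `classify(["some emoji-containing text"])` downloads the model on first use and returns one tuple per input. Given the limitations noted above, scores on naturally occurring text should be read with caution.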
+
+## Training data
+The model was trained on [HatemojiBuild](https://huggingface.co/datasets/HannahRoseKirk/HatemojiBuild), alongside the four rounds of text-only adversarial data from Vidgen, B., Thrush, T., Waseem, Z., & Kiela, D. (2020). Learning from the worst: Dynamically generated datasets to improve online hate detection. arXiv