---
language:
- en
pipeline_tag: text-classification
base_model: cardiffnlp/twitter-roberta-base-2022-154m
model-index:
- name: twitter-roberta-base-hate-multiclass-latest
  results: []
---


# cardiffnlp/twitter-roberta-base-hate-multiclass-latest

This model is a fine-tuned version of [cardiffnlp/twitter-roberta-base-2022-154m](https://huggingface.co/cardiffnlp/twitter-roberta-base-2022-154m) for multiclass hate-speech classification. It was fine-tuned on a combination of 13 English-language hate-speech datasets.

## Classes available
```json
{
  "sexism": 0,
  "racism": 1,
  "disability": 2,
  "sexual_orientation": 3,
  "religion": 4,
  "other": 5,
  "not_hate": 6
}
```
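
If you work with raw class indices rather than label names, the mapping above can be inverted. The following is a minimal sketch in plain Python; the `decode` helper and the example scores are hypothetical illustrations, not part of the model or of tweetnlp.

```python
# Label mapping copied from the "Classes available" block.
label2id = {
    "sexism": 0,
    "racism": 1,
    "disability": 2,
    "sexual_orientation": 3,
    "religion": 4,
    "other": 5,
    "not_hate": 6,
}

# Invert the mapping so a class index can be turned back into a name.
id2label = {idx: name for name, idx in label2id.items()}


def decode(scores):
    """Return the label whose index has the highest score (hypothetical helper)."""
    if len(scores) != len(id2label):
        raise ValueError("expected one score per class")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return id2label[best]


# Made-up scores for illustration only, not real model output.
example_scores = [0.1, 0.2, 3.4, 0.0, -1.0, 0.3, 1.2]
print(decode(example_scores))
```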

## Metrics achieved
* Accuracy: 0.9419
* Macro-F1: 0.5752
* Weighted-F1: 0.9390
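
Macro-F1 averages the per-class F1 scores equally, while Weighted-F1 weights each class by its number of true examples, so a large gap between the two usually points to weaker performance on rare classes. The sketch below illustrates the difference on a small made-up label set; the data and helper functions are assumptions for illustration and do not reproduce the figures above.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Compute F1 = 2*TP / (2*TP + FP + FN) for every label that occurs."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of the per-class F1 scores, weighted by true-label counts."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


# Toy, imbalanced data (made up): 8 not_hate, 1 sexism, 1 disability.
y_true = ["not_hate"] * 8 + ["sexism", "disability"]
# The single sexism example is misclassified as not_hate.
y_pred = ["not_hate"] * 8 + ["not_hate", "disability"]

print(f"macro-F1:    {macro_f1(y_true, y_pred):.3f}")
print(f"weighted-F1: {weighted_f1(y_true, y_pred):.3f}")
```

One missed example in a rare class pulls the macro average down sharply, while the weighted average stays close to the accuracy.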

### Usage
Install `tweetnlp` via pip:
```shell
pip install tweetnlp
```
Load the model in Python and classify a tweet:
```python
import tweetnlp

# Load the multiclass hate-speech classifier from the Hugging Face Hub.
model = tweetnlp.Classifier("cardiffnlp/twitter-roberta-base-hate-multiclass-latest")

model.predict('Women are trash 2.')
# {'label': 'sexism'}
model.predict('@user dear mongoloid respect sentiments & belief refrain totalitarianism. @user')
# {'label': 'disability'}
```



### Model based on:
```bibtex
@misc{antypas2023robust,
      title={Robust Hate Speech Detection in Social Media: A Cross-Dataset Empirical Evaluation},
      author={Dimosthenis Antypas and Jose Camacho-Collados},
      year={2023},
      eprint={2307.01680},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```