---
language:
- en
pipeline_tag: text-classification
base_model: cardiffnlp/twitter-roberta-base-2022-154m
model-index:
- name: twitter-roberta-base-hate-multiclass-latest
  results: []
---
# cardiffnlp/twitter-roberta-base-hate-multiclass-latest
This model is a fine-tuned version of [cardiffnlp/twitter-roberta-base-2022-154m](https://huggingface.co/cardiffnlp/twitter-roberta-base-2022-154m) for multiclass hate-speech classification. It was fine-tuned on a combination of 13 English-language hate-speech datasets.
## Classes available
```
{
    "sexism": 0,
    "racism": 1,
    "disability": 2,
    "sexual_orientation": 3,
    "religion": 4,
    "other": 5,
    "not_hate": 6
}
```
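The mapping above goes from class name to integer id. To turn a predicted id, or a list of per-class scores, back into a class name, the mapping can be inverted. The following is a minimal sketch, assuming the model's output ids follow the table above; `decode` and the example scores are hypothetical and not part of the model's API.

```python
# Label-to-id mapping copied from the table above.
LABEL2ID = {
    "sexism": 0,
    "racism": 1,
    "disability": 2,
    "sexual_orientation": 3,
    "religion": 4,
    "other": 5,
    "not_hate": 6,
}

# Inverted mapping: integer id -> class name.
ID2LABEL = {idx: label for label, idx in LABEL2ID.items()}


def decode(scores):
    """Return the class name with the highest score (hypothetical helper)."""
    if len(scores) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} scores, got {len(scores)}")
    best_id = max(range(len(scores)), key=lambda i: scores[i])
    return ID2LABEL[best_id]


# Made-up scores for illustration only; index 2 holds the largest value.
example_scores = [0.1, 0.3, 2.4, -1.0, 0.0, 0.2, 1.9]
print(decode(example_scores))
```

Keeping both directions of the mapping in one place avoids hard-coding ids when post-processing predictions.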
## The following metrics are achieved
* Accuracy: 0.9419
* Macro-F1: 0.5752
* Weighted-F1: 0.9390
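
Macro-F1 averages the per-class F1 scores with equal weight, while Weighted-F1 weights each class by its number of true examples, so the two can diverge when classes are imbalanced. The sketch below illustrates both averages on made-up labels; the data and helper names are hypothetical and are not the evaluation data or code behind the numbers above.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Compute F1 for every label that appears in y_true or y_pred."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of the per-class F1 scores, weighted by true-label counts."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


# Toy, imbalanced data for illustration only: 8 "not_hate", 2 "racism".
y_true = ["not_hate"] * 8 + ["racism"] * 2
y_pred = ["not_hate"] * 9 + ["racism"] * 1

print("macro-F1:   ", round(macro_f1(y_true, y_pred), 4))
print("weighted-F1:", round(weighted_f1(y_true, y_pred), 4))
```

In this toy case the minority class has the lower F1, which pulls the macro average below the weighted one.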
### Usage
Install tweetnlp via pip.
```shell
pip install tweetnlp
```
Load the model in Python.
```python
import tweetnlp

# Load the multiclass hate-speech classifier described in this card.
model = tweetnlp.Classifier("cardiffnlp/twitter-roberta-base-hate-multiclass-latest")

model.predict('Women are trash 2.')
# >> {'label': 'sexism'}

model.predict('@user dear mongoloid respect sentiments & belief refrain totalitarianism. @user')
# >> {'label': 'disability'}
```
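To label several texts and summarize the results, the single-text `predict` call shown above can be wrapped in a loop. This is a minimal sketch, assuming only that `predict` returns a dictionary with a `'label'` key as in the example; `count_labels` is a hypothetical helper and not part of tweetnlp, and `StubClassifier` is a stand-in so the snippet runs without downloading the model.

```python
from collections import Counter


def count_labels(model, texts):
    """Count how often each label is predicted (hypothetical helper)."""
    return Counter(model.predict(text)["label"] for text in texts)


class StubClassifier:
    """Stand-in with the same predict() return shape as the example above."""

    def predict(self, text):
        # Always returns the same label; a real model would classify the text.
        return {"label": "not_hate"}


counts = count_labels(StubClassifier(), ["good morning", "see you later"])
print(counts)

# With the real model (requires tweetnlp and a model download):
# import tweetnlp
# model = tweetnlp.Classifier("cardiffnlp/twitter-roberta-base-hate-multiclass-latest")
# counts = count_labels(model, ["good morning", "see you later"])
```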
### Model based on:
```
@misc{antypas2023robust,
      title={Robust Hate Speech Detection in Social Media: A Cross-Dataset Empirical Evaluation},
      author={Dimosthenis Antypas and Jose Camacho-Collados},
      year={2023},
      eprint={2307.01680},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```