---
license: mit
language:
- multilingual
- en
- it
- sl
metrics:
- f1
- accuracy
base_model: FacebookAI/xlm-roberta-large
pipeline_tag: text-classification
tags:
- hate-speech
- xlm-roberta
- Youtube
- Twitter
---

# Multilingual Hate Speech Classifier for Social Media with Disagreement-Aware Training

A multilingual [XLM-R-based (100 languages)](https://huggingface.co/FacebookAI/xlm-roberta-large) hate speech classification model, fine-tuned on English, Italian and Slovenian data with inter-annotator disagreement-aware training. The details of the model and the disagreement-aware training are described in our [paper](https://www.researchgate.net/publication/384628421_Multilingual_Hate_Speech_Modeling_by_Leveraging_Inter-Annotator_Disagreement):

```
@inproceedings{grigor2024multilingual,
  title     = {Multilingual Hate Speech Modeling by Leveraging Inter-Annotator Disagreement},
  author    = {Grigor, Patricia-Carla and Evkoski, Bojan and Kralj Novak, Petra},
  url       = {http://dx.doi.org/10.70314/is.2024.sikdd.7},
  doi       = {10.70314/is.2024.sikdd.7},
  booktitle = {Proceedings of Data Mining and Data Warehouses – SiKDD 2024},
  publisher = {Jožef Stefan Institute},
  year      = {2024}
}
```

Authors: Patricia-Carla Grigor, Bojan Evkoski, Petra Kralj Novak

Data available here: [English](https://www.clarin.si/repository/xmlui/handle/11356/1454); [Italian](https://www.clarin.si/repository/xmlui/handle/11356/1450); [Slovenian](https://www.clarin.si/repository/xmlui/handle/11356/1398)

**Model output**

The model classifies each input into one of four classes:

* 0 - appropriate
* 1 - inappropriate
* 2 - offensive
* 3 - violent

**Training data**\*

* 51k English YouTube comments
* 60k Italian YouTube comments
* 50k Slovenian Twitter comments

**Evaluation data**\*

* 10k English YouTube comments
* 10k Italian YouTube comments
* 10k Slovenian Twitter comments

\* each comment is manually labeled by two different annotators

**Fine-tuning hyperparameters**

`num_train_epochs=3`, `train_batch_size=8`, `learning_rate=6e-6`

**Evaluation results**

Model-annotator agreement (accuracy) vs. inter-annotator agreement (0 = no agreement; 100 = perfect agreement):

|           | Model-annotator agreement | Inter-annotator agreement |
|-----------|---------------------------|---------------------------|
| English   | 79.97                     | 82.91                     |
| Italian   | 82.00                     | 81.79                     |
| Slovenian | 78.84                     | 79.43                     |

Class-specific model F1-scores:

|           | Appropriate | Inappropriate | Offensive | Violent |
|-----------|-------------|---------------|-----------|---------|
| English   | 86.10       | 39.16         | 68.24     | 27.82   |
| Italian   | 89.77       | 58.45         | 60.42     | 44.97   |
| Slovenian | 84.30       | 45.22         | 69.69     | 24.79   |

**Usage**

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, TextClassificationPipeline

MODEL = "IMSyPP/hate_speech_multilingual"

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

pipe = TextClassificationPipeline(
    model=model,
    tokenizer=tokenizer,
    return_all_scores=True,    # return a score for each of the four classes
    device=0,                  # GPU device id; use device=-1 to run on CPU
    function_to_apply="none",  # return raw logits rather than softmax probabilities
)

pipe([
    "Thank you for using our model",
    "Grazie per aver utilizzato il nostro modello",
    "Hvala za uporabo našega modela",
])
```
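
Because the pipeline is configured with `function_to_apply="none"` and `return_all_scores=True`, it returns raw, unnormalised scores for all four classes. The snippet below is a minimal post-processing sketch (not part of the original card) that maps each prediction to the class names listed under **Model output**; it assumes the default `LABEL_0` to `LABEL_3` label names.

```python
# Minimal post-processing sketch: map raw pipeline scores to the four class
# names (0 - appropriate, 1 - inappropriate, 2 - offensive, 3 - violent).
# Assumption: labels come back as "LABEL_0" ... "LABEL_3"; if the model config
# defines custom id2label names, they are passed through unchanged.
CLASS_NAMES = ["appropriate", "inappropriate", "offensive", "violent"]
LABEL_TO_CLASS = {f"LABEL_{i}": name for i, name in enumerate(CLASS_NAMES)}

results = pipe(["Thank you for using our model"])
for scores in results:                                # one list of score dicts per input text
    best = max(scores, key=lambda s: s["score"])      # class with the highest logit
    print(LABEL_TO_CLASS.get(best["label"], best["label"]), best["score"])
```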
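
For reference, the sketch below shows how a plain fine-tuning run with the hyperparameters listed under **Fine-tuning hyperparameters** (3 epochs, batch size 8, learning rate 6e-6) could be set up with the Hugging Face `Trainer`. It is an illustrative assumption, not the authors' training script, and it does not implement the disagreement-aware training described in the paper; the `text`/`label` dataset columns are also assumed.

```python
# Hedged fine-tuning sketch (not the authors' script, and without the
# disagreement-aware component), using the reported hyperparameters.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

BASE_MODEL = "FacebookAI/xlm-roberta-large"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForSequenceClassification.from_pretrained(BASE_MODEL, num_labels=4)

def tokenize(batch):
    # Assumption: the data is a datasets.Dataset with a "text" column (the comment)
    # and a "label" column (class id 0-3).
    return tokenizer(batch["text"], truncation=True)

training_args = TrainingArguments(
    output_dir="xlmr-large-hate-speech",
    num_train_epochs=3,              # num_train_epochs=3
    per_device_train_batch_size=8,   # train_batch_size=8
    learning_rate=6e-6,              # learning_rate=6e-6
)

# With train_ds / eval_ds as datasets.Dataset objects holding the annotated comments:
# train_ds = train_ds.map(tokenize, batched=True)
# eval_ds = eval_ds.map(tokenize, batched=True)
# trainer = Trainer(model=model, args=training_args,
#                   train_dataset=train_ds, eval_dataset=eval_ds,
#                   tokenizer=tokenizer)
# trainer.train()
```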