Text2Text Generation
Transformers
Safetensors
mt5
Inference Endpoints
lmeribal commited on
Commit
2d449f4
·
verified ·
1 Parent(s): e02825a

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +1 -1
README.md CHANGED
@@ -25,7 +25,7 @@ pipeline_tag: text2text-generation
25
  * [GitHub with training scripts and data](https://github.com/s-nlp/multilingual-transformer-detoxification)
26
 
27
  ## Model Information
28
- This is a multilingual 3.7B text detoxification model for 9 languages built on [TextDetox 2024 shared task](https://pan.webis.de/clef24/pan24-web/text-detoxification.html) based on [mT0-xl](https://huggingface.co/bigscience/mt0-xl). The model was trained in a two-step setup: the first step is full fine-tuning on different parallel text detoxification datasets, and the second step is ORPO alignment on a self-annotated preference dataset collected using toxicity and similarity classifiers. See the paper for more details.
29
 
30
  In terms of human evaluation, the model is a second-best approach on the [TextDetox 2024 shared task](https://pan.webis.de/clef24/pan24-web/text-detoxification.html). More precisely, the model shows state-of-the-art performance for the Ukrainian language, top-2 scores for Arabic, and near state-of-the-art performance for other languages.
31
 
 
25
  * [GitHub with training scripts and data](https://github.com/s-nlp/multilingual-transformer-detoxification)
26
 
27
  ## Model Information
28
+ This is a multilingual 3.7B text detoxification model for 9 languages built on [TextDetox 2024 shared task](https://pan.webis.de/clef24/pan24-web/text-detoxification.html) based on [mT0-XL](https://huggingface.co/bigscience/mt0-xl). The model was trained in a two-step setup: the first step is full fine-tuning on different parallel text detoxification datasets, and the second step is ORPO alignment on a self-annotated preference dataset collected using toxicity and similarity classifiers. See the paper for more details.
29
 
30
  In terms of human evaluation, the model is a second-best approach on the [TextDetox 2024 shared task](https://pan.webis.de/clef24/pan24-web/text-detoxification.html). More precisely, the model shows state-of-the-art performance for the Ukrainian language, top-2 scores for Arabic, and near state-of-the-art performance for other languages.
31