---
language:
- en
tags:
- toxic text classification
license: apache-2.0
---
## Toxicity Classification Model
This model is fine-tuned for the toxicity classification task. The training data is the **Jigsaw** dataset ([Jigsaw 2020](https://www.kaggle.com/c/jigsaw-multilingual-toxic-comment-classification)). We split it into two parts and fine-tune a [DistilBERT base model (uncased)](https://huggingface.co/distilbert-base-uncased) on it. DistilBERT is a distilled version of the [BERT base model](https://huggingface.co/bert-base-uncased), introduced in this [paper](https://arxiv.org/abs/1910.01108).
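Below is a minimal sketch of what such a fine-tuning run could look like with the `transformers` Trainer. It is an illustrative assumption, not the exact training script used for this model: the CSV file names and the `comment_text`/`toxic` column names are hypothetical placeholders for the two Jigsaw splits.

```python
# Illustrative fine-tuning sketch (assumed setup, not the original training code).
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    TrainingArguments,
    Trainer,
)

# Hypothetical CSV exports of the two Jigsaw splits, with "comment_text" and "toxic" columns.
dataset = load_dataset(
    "csv",
    data_files={"train": "train_split.csv", "eval": "eval_split.csv"},
)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    # Tokenize comments to fixed-length inputs for DistilBERT.
    return tokenizer(batch["comment_text"], truncation=True, padding="max_length", max_length=256)

dataset = dataset.map(tokenize, batched=True)
dataset = dataset.rename_column("toxic", "labels")

# Binary classification head on top of the distilled BERT encoder.
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)

args = TrainingArguments(
    output_dir="distilbert-toxicity-classifier",
    num_train_epochs=2,
    per_device_train_batch_size=16,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["eval"],
)
trainer.train()
```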
## How to use
```python
from transformers import pipeline

text = "This was a masterpiece. Not completely faithful to the books, but enthralling from beginning to end. Might be my favorite of the three."

# Load the classifier from the Hugging Face Hub and score the example text.
classifier = pipeline("text-classification", model="tensor-trek/distilbert-toxicity-classifier")
classifier(text)
```
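The pipeline returns a list of dictionaries, one per input, each containing the predicted `label` and its confidence `score`.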
## License
[Apache 2.0](./LICENSE)