---
license: afl-3.0
language:
- en
- te
metrics:
- accuracy
pipeline_tag: text-classification
library_name: transformers
tags:
- toxic-comment-classification
- roberta
- text-classification
---

# Toxic Comment Classification Using RoBERTa

## Overview

This project provides a toxic comment classification model based on RoBERTa (Robustly Optimized BERT Pretraining Approach). The model classifies comments as toxic or non-toxic, helping moderate online discussions and improve community interactions.

## Model Details

- **Model Name**: RoBERTa for Toxic Comment Classification
- **Architecture**: RoBERTa
- **Fine-tuning Task**: Binary classification (toxic vs. non-toxic)
- **Evaluation Metrics**:
  - Accuracy
  - F1 Score
  - Precision
  - Recall

## Files

- `pytorch_model.bin`: The trained model weights.
- `config.json`: Model configuration file.
- `merges.txt`: BPE tokenizer merges file.
- `model.safetensors`: Model weights in safetensors format.
- `special_tokens_map.json`: Tokenizer special tokens mapping.
- `tokenizer_config.json`: Tokenizer configuration file.
- `vocab.json`: Tokenizer vocabulary file.
- `roberta-toxic-comment-classifier.pkl`: Serialized best model state dictionary (for PyTorch; see the loading sketch under Usage below).
- `README.md`: This documentation file.

## Model Performance

- **Accuracy**: 0.9599
- **F1 Score**: 0.9615
- **Precision**: 0.9646
- **Recall**: 0.9599

## Load the model

```python
from transformers import pipeline

# Load the model and tokenizer
model_name = "prabhaskenche/pk-toxic-comment-classification-using-RoBERTa"
classifier = pipeline("text-classification", model=model_name)

# Example usage
text = "You're the worst person I've ever met."
result = classifier(text)
print(result)
```

## Usage

### Installation

Install the required packages:

```bash
pip install torch transformers scikit-learn
```
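
### Loading without `pipeline`

If you prefer explicit control over tokenization and post-processing, the model can also be loaded with `AutoTokenizer` and `AutoModelForSequenceClassification`. A minimal sketch; the label names printed at the end are whatever `id2label` in the repo's `config.json` defines:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "prabhaskenche/pk-toxic-comment-classification-using-RoBERTa"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

text = "You're the worst person I've ever met."
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)

with torch.no_grad():
    logits = model(**inputs).logits

# Convert logits to class probabilities and pick the top class
probs = torch.softmax(logits, dim=-1)
predicted_id = int(probs.argmax(dim=-1))
print(model.config.id2label[predicted_id], probs.squeeze().tolist())
```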
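
### Loading the `.pkl` state dictionary

The repository also ships `roberta-toxic-comment-classifier.pkl`, described above as the serialized best model state dictionary. Its exact contents are not documented here, so the following is a hypothetical sketch assuming the file is a PyTorch `state_dict` saved from a two-label `RobertaForSequenceClassification` fine-tuned from `roberta-base`; if the pickle stores something else (for example, a whole model object), adjust accordingly:

```python
import torch
from transformers import RobertaForSequenceClassification

# Assumption: the pickle holds a state_dict for a 2-label RoBERTa classifier
model = RobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=2)
state_dict = torch.load("roberta-toxic-comment-classifier.pkl", map_location="cpu")
model.load_state_dict(state_dict)
model.eval()
```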
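
### Computing the evaluation metrics

scikit-learn appears in the install list and is the natural tool for reproducing the four reported metrics. The sketch below uses toy labels and assumes weighted averaging, which the card does not state explicitly but would explain why the reported recall equals the accuracy exactly (weighted recall reduces to accuracy):

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Toy example: replace with real labels and model predictions
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 1, 0, 0]

accuracy = accuracy_score(y_true, y_pred)
# With average="weighted", recall equals accuracy by construction
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted"
)
print(f"accuracy={accuracy:.4f} f1={f1:.4f} "
      f"precision={precision:.4f} recall={recall:.4f}")
```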