---
datasets:
- google/jigsaw_toxicity_pred
language:
- en
base_model:
- FacebookAI/roberta-base
metrics:
- accuracy
pipeline_tag: text-classification
library_name: transformers
---

# Model Card for Roberta-toxic

**RoBERTa-toxic: A Robust Toxicity Prediction Model**

RoBERTa-toxic builds on the RoBERTa (Robustly Optimized BERT Pretraining Approach) transformer to predict a set of toxicity categories for a text input. It is fine-tuned to identify toxic behaviors such as hate speech, harassment, profanity, and harmful stereotypes, taking the context of the text into account. The model is intended for applications such as content moderation, social media analysis, and safer online interactions, and it produces multi-label outputs so that a single text can be flagged for several categories at once.

## Model Details

### Model Description

- **Developed by:** ESIEA Students
- **Shared by [optional]:** ESIEA Students
- **Model type:** RoBERTa with an additional layer that predicts an array of booleans (multi-label classification)
- **Language(s) (NLP):** English
- **Finetuned from model [optional]:** RoBERTa (`FacebookAI/roberta-base`)

### Model Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

The model can be used to classify English text by toxicity category.

### Direct Use

[More Information Needed]

### Downstream Use [optional]

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.
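
The card does not yet include a loading snippet, so the following is only a minimal sketch and not the authors' code. It rests on several assumptions: that the checkpoint can be loaded with `AutoModelForSequenceClassification` (the card describes a custom additional layer, so this may not hold), that the model emits six logits in the order of the label columns of `google/jigsaw_toxicity_pred`, and that a threshold of 0.5 is appropriate. `MODEL_ID` is a placeholder, since the card lists no repository, and `decode_logits` and `predict` are hypothetical helper names.

```python
import math

# Label columns of google/jigsaw_toxicity_pred. That the model's outputs
# follow this order is an assumption, not something the card states.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Placeholder: the card does not list a repository id.
MODEL_ID = "your-namespace/roberta-toxic"


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def decode_logits(logits, threshold=0.5):
    """Map one row of raw logits to {label: bool}.

    Each label gets its own sigmoid, so several labels can be True at once.
    """
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    return {label: _sigmoid(x) >= threshold for label, x in zip(LABELS, logits)}


def predict(text, model_id=MODEL_ID, threshold=0.5):
    """Run the model on one text and return its decoded labels.

    Requires `transformers` and `torch`; imported here so that
    `decode_logits` stays usable without them.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return decode_logits(logits, threshold)
```

With a real repository id in place of the placeholder, `predict("some comment")` would return a dictionary with one boolean per label. The threshold can be raised to reduce false positives in moderation settings, at the cost of missing more borderline comments.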

[More Information Needed]

## Training Details

### Training Data

The model was trained on the Google Jigsaw toxicity dataset (`google/jigsaw_toxicity_pred`) listed in the metadata above, using about 150k comments.

### Training Procedure

The model was fine-tuned for 3 epochs using PyTorch.

#### Preprocessing [optional]

Only basic data cleaning was applied.

#### Training Hyperparameters

- **Training regime:** [More Information Needed]

#### Speeds, Sizes, Times [optional]

Training took 4 hours for 3 epochs on a GTX 1050 Ti GPU.

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

Accuracy: **90%**

### Results

[More Information Needed]

#### Summary

## Model Examination [optional]

[More Information Needed]

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** GTX 1050 Ti
- **Hours used:** 4

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

PyTorch (`torch`)

## Citation [optional]

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]