---
language: en
tags:
- sentiment-analysis
- transformers
- pytorch
license: apache-2.0
datasets:
- custom-dataset
metrics:
- accuracy
model_name: distilbert-base-uncased-finetuned-sentiment
---

# DistilBERT Base Uncased Fine-tuned for Sentiment Analysis

## Model Description

This model is a fine-tuned version of `distilbert-base-uncased` for binary sentiment analysis. It classifies text as either positive or negative.

## Training Details

The model was fine-tuned on a sentiment analysis dataset using the Hugging Face `transformers` library with the following settings:

- **Learning Rate**: 2e-5
- **Batch Size**: 32
- **Number of Epochs**: 4
- **Optimizer**: AdamW
- **Scheduler**: Linear with warmup
- **Device**: NVIDIA T4 GPU

## Training and Validation Metrics

| Step | Training Loss | Validation Loss | Accuracy |
|------|---------------|-----------------|----------|
| 400  | 0.389300      | 0.181316        | 93.25%   |
| 800  | 0.161900      | 0.166204        | 94.13%   |
| 1200 | 0.114600      | 0.200135        | 94.30%   |
| 1600 | 0.076300      | 0.211609        | 94.40%   |
| 2000 | 0.041600      | 0.225439        | 94.45%   |

Final training metrics:

- **Global Step**: 2000
- **Training Loss**: 0.156715 (the mean of the five logged training losses above)
- **Training Runtime**: 1257.5696 seconds
- **Training Samples per Second**: 50.892
- **Training Steps per Second**: 1.59
- **Total FLOPs**: 8477913513984000.0 (about 8.48e15 floating-point operations)
- **Epochs**: 4.0

## Model Performance

The model reaches 94.45% accuracy on the validation set at the final step (2000). Validation loss is lowest at step 800 (0.166204) and rises afterwards, while accuracy continues to improve slightly.

## Usage

To use this model for sentiment analysis, load it with the `transformers` library:

```python
import torch
from transformers import DistilBertTokenizerFast, DistilBertForSequenceClassification

model_name = 'luluw/distilbert-base-uncased-finetuned-sentiment'
tokenizer = DistilBertTokenizerFast.from_pretrained(model_name)
model = DistilBertForSequenceClassification.from_pretrained(model_name)

# Example usage
text = "I love this product!"
inputs = tokenizer(text, return_tensors='pt')

# Gradients are not needed for inference
with torch.no_grad():
    outputs = model(**inputs)

# Index of the highest-scoring class for each input
predictions = torch.argmax(outputs.logits, dim=-1)
```
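
The snippet above stops at a class index. As a minimal sketch of one way to turn the raw logits into a readable label and a confidence score, the helper below applies a softmax in plain Python. The function name `logits_to_prediction`, the example logits, and the `NEGATIVE`/`POSITIVE` mapping are hypothetical and chosen for illustration only; this card does not state which index corresponds to which sentiment.

```python
import math


def logits_to_prediction(logits, id2label):
    """Convert one example's raw logits into a (label, probability) pair.

    `logits` is a plain list of floats; `id2label` maps class index to name.
    """
    # Numerically stable softmax: subtract the largest logit before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Pick the class with the highest probability.
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Hypothetical logits and label mapping, for illustration only.
example_logits = [-2.0, 2.0]
example_id2label = {0: "NEGATIVE", 1: "POSITIVE"}

label, prob = logits_to_prediction(example_logits, example_id2label)
print(f"{label} ({prob:.3f})")
```

With the loaded model, the inputs would presumably be `outputs.logits[0].tolist()` and `model.config.id2label`. If the checkpoint was saved without custom label names, that mapping holds generic names such as `LABEL_0` and `LABEL_1`, so which index means positive should be confirmed against the training data before relying on it.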