library_name: transformers
datasets:
- stanfordnlp/imdb
metrics:
- accuracy
tags:
- PyTorch
model-index:
- name: distilbert-imdb
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: imdb
      type: imdb
      args: plain_text
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9316
pipeline_tag: text-classification
license: apache-2.0
language:
- en
distilbert-imdb
This model is a fine-tuned version of distilbert-base-uncased on the IMDb dataset for binary sentiment classification of movie reviews.
Performance
- Loss: 0.1958
- Accuracy: 0.932
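Accuracy is the fraction of reviews whose predicted label matches the reference label. The following is a minimal sketch of that calculation on toy data, not the evaluation run behind the numbers above; the `accuracy` function and its argument names are hypothetical.

```python
def accuracy(predictions, references):
    """Return the fraction of predictions that equal their reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Toy labels for illustration only (IMDb uses 0 = negative, 1 = positive).
toy_predictions = [1, 0, 1, 1]
toy_references = [1, 0, 0, 1]
print(accuracy(toy_predictions, toy_references))  # 3 of 4 correct -> 0.75
```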
How to Get Started with the Model
Use the code below to get started with the model:
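from transformers import DistilBertTokenizer, pipeline

# Load the tokenizer from the base checkpoint the model was fine-tuned from.
tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
classifier = pipeline("sentiment-analysis", model="3oclock/distilbert-imdb", tokenizer=tokenizer)

# The pipeline returns a list with one dict per input, each holding a "label" and a "score".
result = classifier("I love this movie!")
print(result)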
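The pipeline returns a list of dictionaries with `label` and `score` keys. As a minimal sketch, assuming the checkpoint keeps the default `LABEL_0`/`LABEL_1` names and follows the IMDb convention (0 = negative, 1 = positive), the helper below converts that output into readable sentiment names. The `to_sentiment` function and the mapping are illustrative and not part of the model; `classifier.model.config.id2label` shows the label names the checkpoint actually uses.

```python
# Assumption: the checkpoint uses the default label names LABEL_0 and LABEL_1,
# with the IMDb encoding of 0 = negative and 1 = positive.
LABEL_TO_SENTIMENT = {"LABEL_0": "negative", "LABEL_1": "positive"}


def to_sentiment(predictions):
    """Convert pipeline output dicts into (sentiment, score) tuples.

    Labels missing from the mapping are passed through unchanged.
    """
    return [
        (LABEL_TO_SENTIMENT.get(p["label"], p["label"]), p["score"])
        for p in predictions
    ]


# Stand-in for real pipeline output, so the sketch runs without downloading the model.
example_output = [{"label": "LABEL_1", "score": 0.98}]
print(to_sentiment(example_output))
```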
Model Details
Model Description
This is the model card for a 🤗 Transformers model fine-tuned on the IMDb dataset.
- Developed by: Ge Li
- Model type: DistilBERT for Sequence Classification
- Language(s) (NLP): English
- License: Apache 2.0
- Finetuned from model: distilbert-base-uncased
Uses
Direct Use
This model can be used directly for sentiment analysis on movie reviews. It is best suited for classifying English-language text that is similar in nature to movie reviews.
Downstream Use
This model can be fine-tuned on other sentiment analysis tasks or adapted for tasks like text classification in domains similar to IMDb movie reviews.
Out-of-Scope Use
The model may not perform well on non-English text or text that is significantly different in style and content from the IMDb dataset (e.g., technical documents, social media posts).
Bias, Risks, and Limitations
Bias
The IMDb dataset consists primarily of English-language movie reviews, so the model may not generalize well to other languages or to other types of reviews.
Risks
Misclassification in sentiment analysis can lead to incorrect conclusions in applications relying on this model.
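One possible way to limit the impact of misclassifications is to route low-confidence predictions to manual review. The sketch below assumes predictions in the pipeline's output format (dicts with `label` and `score`); the `split_by_confidence` helper and the 0.9 threshold are hypothetical and would need tuning on held-out data.

```python
def split_by_confidence(predictions, threshold=0.9):
    """Split pipeline-style predictions into confident and uncertain lists.

    The default threshold is an arbitrary illustrative value, not a
    recommendation derived from this model.
    """
    confident, uncertain = [], []
    for prediction in predictions:
        if prediction["score"] >= threshold:
            confident.append(prediction)
        else:
            uncertain.append(prediction)
    return confident, uncertain


# Stand-in predictions in the pipeline's output format.
sample = [
    {"label": "LABEL_1", "score": 0.97},
    {"label": "LABEL_0", "score": 0.55},
]
confident, uncertain = split_by_confidence(sample)
print(len(confident), "confident;", len(uncertain), "flagged for review")
```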
Limitations
The model was trained on a dataset of movie reviews, so it may not perform as well on other types of text data.
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model.