Model Card for Model ID
Sentiment analysis for Norwegian reviews.
Model Description
This model was trained on a combined dataset, built by concatenating the Norwegian Review Corpus (NoReC, https://github.com/ltgoslo/norec) with a Norwegian sentiment dataset from Hugging Face (https://huggingface.co/datasets/sepidmnorozy/Norwegian_sentiment). It is intended for testing purposes only.
- Developed by: Simen Aabol and Marcus Dragsten
- Finetuned from model: norbert2
Direct Use
Pass in Norwegian sentences to check their sentiment (negative to positive).
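A minimal sketch of how this could look with the transformers text-classification pipeline. The repository ID below is a placeholder, since this card does not state the actual one, and the LABEL_0/LABEL_1 to negative/positive mapping is an assumption that should be checked against the model's config.id2label.

```python
# Hypothetical repository ID: replace with the actual model repository.
MODEL_ID = "your-username/your-model-id"

# Assumed mapping for a binary classifier; verify against config.id2label.
LABEL_NAMES = {"LABEL_0": "negative", "LABEL_1": "positive"}


def readable_label(prediction):
    """Map a pipeline prediction dict to a readable sentiment name."""
    raw = prediction["label"]
    # Fall back to the raw label if the model already uses descriptive names.
    return LABEL_NAMES.get(raw, raw)


def classify(sentences, model_id=MODEL_ID):
    """Classify a list of Norwegian sentences; returns (text, label, score) tuples."""
    # Imported here so readable_label can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    predictions = classifier(sentences)
    return [(s, readable_label(p), p["score"]) for s, p in zip(sentences, predictions)]
```

With a real repository ID in place, calling classify(["Denne filmen var fantastisk!"]) would return one tuple holding the sentence, the predicted label, and its score.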
Training Details
Training and Testing Data
https://huggingface.co/datasets/marcuskd/reviews_binary_not4_concat
Preprocessing
Tokenized using:
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("ltgoslo/norbert2")
Training arguments for this model:
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir='./results',          # output directory
    num_train_epochs=10,             # total number of training epochs
    per_device_train_batch_size=16,  # batch size per device during training
    per_device_eval_batch_size=64,   # batch size for evaluation
    warmup_steps=500,                # number of warmup steps for learning rate scheduler
    weight_decay=0.01,               # strength of weight decay
    logging_dir='./logs',            # directory for storing logs
    logging_steps=10,                # log every 10 steps
)
Evaluation
The model was evaluated on the test split of the dataset.
{
    'accuracy': 0.8357214261912695,
    'recall': 0.886873508353222,
    'precision': 0.8789025543992431,
    'f1': 0.8828700403896412,
    'total_time_in_seconds': 94.33071640000003,
    'samples_per_second': 31.81360340013276,
    'latency_in_seconds': 0.03143309443518828
}
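As a quick sanity check, the reported numbers can be cross-checked against each other. The sketch below recomputes F1 as the harmonic mean of the reported precision and recall, and infers the test-set size from the timing fields. The helper name f1_score is illustrative and not taken from the original evaluation code.

```python
# Values copied from the evaluation output above.
precision = 0.8789025543992431
recall = 0.886873508353222
reported_f1 = 0.8828700403896412
total_time_in_seconds = 94.33071640000003
samples_per_second = 31.81360340013276
latency_in_seconds = 0.03143309443518828


def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(precision, recall)

# Throughput times total time gives the number of evaluated samples.
n_samples = round(total_time_in_seconds * samples_per_second)

print(f"recomputed f1: {f1:.6f} (reported: {reported_f1:.6f})")
print(f"inferred test samples: {n_samples}")
print(f"latency from throughput: {1 / samples_per_second:.6f} s")
```

The recomputed F1 agrees with the reported value, and the timing fields imply a test set of roughly 3,001 samples.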