---
datasets:
- marcuskd/reviews_binary_not4_concat
language:
- 'no'
- nb
- nn
metrics:
- accuracy
- recall
- precision
- f1
---

# Model Card for Model ID

Sentiment analysis for Norwegian reviews.

# Model Description

This model is trained on a concatenated dataset that combines the Norwegian Review Corpus (https://github.com/ltgoslo/norec) with a sentiment dataset from Hugging Face (https://huggingface.co/datasets/sepidmnorozy/Norwegian_sentiment). It is intended for testing only.

- **Developed by:** Simen Aabol and Marcus Dragsten
- **Finetuned from model:** norbert2

# Direct Use

Pass in Norwegian sentences to check their sentiment (negative to positive).

# Training Details

## Training and Testing Data

https://huggingface.co/datasets/marcuskd/reviews_binary_not4_concat

### Preprocessing

Tokenized using:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ltgoslo/norbert2")
```

Training arguments for this model:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir='./results',          # output directory
    num_train_epochs=10,             # total number of training epochs
    per_device_train_batch_size=16,  # batch size per device during training
    per_device_eval_batch_size=64,   # batch size for evaluation
    warmup_steps=500,                # number of warmup steps for learning rate scheduler
    weight_decay=0.01,               # strength of weight decay
    logging_dir='./logs',            # directory for storing logs
    logging_steps=10,                # log every 10 steps
)
```

# Evaluation

Evaluated on the test split of the dataset.

```python
{
    'accuracy': 0.8357214261912695,
    'recall': 0.886873508353222,
    'precision': 0.8789025543992431,
    'f1': 0.8828700403896412,
    'total_time_in_seconds': 94.33071640000003,
    'samples_per_second': 31.81360340013276,
    'latency_in_seconds': 0.03143309443518828
}
```
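
The reported metrics can be cross-checked against each other. The sketch below recomputes F1 as the harmonic mean of the reported precision and recall, and estimates the test-split size from the timing fields. It assumes that `samples_per_second` was measured as the number of samples divided by `total_time_in_seconds`; the card does not say this. The names `f1_from`, `recomputed_f1`, `estimated_samples`, and `mean_latency` are illustrative and do not come from the original evaluation code.

```python
# Values copied from the evaluation output above.
metrics = {
    "recall": 0.886873508353222,
    "precision": 0.8789025543992431,
    "f1": 0.8828700403896412,
    "total_time_in_seconds": 94.33071640000003,
    "samples_per_second": 31.81360340013276,
    "latency_in_seconds": 0.03143309443518828,
}


def f1_from(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# F1 recomputed from the reported precision and recall.
recomputed_f1 = f1_from(metrics["precision"], metrics["recall"])

# Assumption: throughput = samples / total time, so samples = throughput * time.
estimated_samples = round(
    metrics["samples_per_second"] * metrics["total_time_in_seconds"]
)

# Mean per-sample latency is the reciprocal of the throughput.
mean_latency = 1 / metrics["samples_per_second"]

print(f"recomputed F1:     {recomputed_f1:.6f} (reported {metrics['f1']:.6f})")
print(f"estimated samples: {estimated_samples}")
print(f"mean latency (s):  {mean_latency:.6f}")
```

Under that assumption, the recomputed F1 and latency agree with the reported values, and the timing fields point to a test split of roughly 3,001 samples.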