---
language:
- 'no'
- nb
- nn
license: cc-by-4.0
datasets:
- ltg/norec_sentence
pipeline_tag: text-classification
---

# Sentence-level Sentiment Analysis model for Norwegian text

This model is a fine-tuned version of [ltg/norbert3-base](https://huggingface.co/ltg/norbert3-base) for text classification.

## Training data

The model was fine-tuned on the `mixed` subset of [ltg/norec_sentence](https://huggingface.co/datasets/ltg/norec_sentence), which has four sentiment categories:

```
[0]: Negative,
[1]: Positive,
[2]: Neutral
[0,1]: Mixed
```

## Quick start

You can use this model for inference as follows:

```
>>> from transformers import AutoTokenizer, pipeline
>>> origin = "ltg/norbert3-base_sentence-sentiment"
>>> pipe = pipeline(
...     "text-classification",
...     model=origin,
...     # NorBERT3 models ship custom model code, so remote code must be trusted
...     trust_remote_code=origin.startswith("ltg/norbert3"),
...     config=origin,
...     tokenizer=AutoTokenizer.from_pretrained(origin),
... )
>>> preds = pipe([
...     "Hans hese, litt såre stemme kler bluesen, men denne platen kommer neppe til å bli blant hans største kommersielle suksesser.",
...     "Borten-regjeringen gjorde ikke jobben sin.",
... ])
>>> for p in preds:
...     print(p)
```

Output (the first line is a warning printed by the pipeline; the predictions follow):

```
The model 'NorbertForSequenceClassification' is not supported for text-classification. Supported models are ['AlbertForSequenceClassification', ...
{'label': 'Mixed', 'score': 0.9230353236198425}
{'label': 'Negative', 'score': 0.7348112463951111}
```

## Training hyperparameters

- per_device_train_batch_size: 16
- learning_rate: 1e-05
- gradient_accumulation_steps: 1
- num_train_epochs: 10 (best epoch 5)

## Evaluation

| Category        |       F1 |
|:----------------|---------:|
| Negative_F1     | 0.580247 |
| Positive_F1     | 0.781699 |
| Neutral_F1      | 0.825197 |
| Mixed_F1        | 0.648649 |
| Weighted_avg_F1 | 0.763806 |
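
The evaluation scores above are per-category F1 values plus a weighted average. The following is a minimal sketch of how such numbers can be computed, assuming the weighted average follows the common support-weighted convention (each category's F1 weighted by its number of gold examples); this card does not state which convention was used. The helper names `per_class_f1` and `weighted_avg_f1` and the toy `gold`/`pred` lists are hypothetical and are not taken from the model's actual evaluation.

```python
from collections import Counter

# Category names as listed in this card.
LABELS = ["Negative", "Positive", "Neutral", "Mixed"]


def per_class_f1(gold, pred, labels=LABELS):
    """Return {label: F1}, treating each label one-vs-rest."""
    scores = {}
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the label never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def weighted_avg_f1(gold, pred, labels=LABELS):
    """Average per-class F1, weighting each class by its gold support (assumed convention)."""
    support = Counter(gold)
    scores = per_class_f1(gold, pred, labels)
    return sum(scores[label] * support[label] for label in labels) / len(gold)


# Hypothetical toy data for illustration only.
gold = ["Positive", "Positive", "Negative", "Neutral", "Mixed", "Neutral"]
pred = ["Positive", "Neutral", "Negative", "Neutral", "Negative", "Neutral"]

for label, score in per_class_f1(gold, pred).items():
    print(f"{label}_F1: {score:.6f}")
print(f"Weighted_avg_F1: {weighted_avg_f1(gold, pred):.6f}")
```

To apply this to real predictions, `pred` could be filled with the `label` field of each dictionary returned by the pipeline in the quick start, and `gold` with the matching reference labels.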