---
language:
- 'no'
- nb
- nn
license: cc-by-4.0
datasets:
- ltg/norec_sentence
pipeline_tag: text-classification
---
# Sentence-level Sentiment Analysis model for Norwegian text
This model is a fine-tuned version of [ltg/norbert3-base](https://huggingface.co/ltg/norbert3-base) for text classification.
## Training data
The model was fine-tuned on [ltg/norec_sentence](https://huggingface.co/datasets/ltg/norec_sentence), using the `mixed` subset with four sentiment categories:
```
[0]: Negative
[1]: Positive
[2]: Neutral
[0,1]: Mixed
```
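
As a minimal sketch of how these label lists could be turned into category names: the helper `labels_to_category` below is hypothetical and not part of the dataset or the model code. It assumes each example carries a list of integer labels exactly as listed above.

```python
# Hypothetical lookup table built from the four label lists shown above.
CATEGORY_BY_LABELS = {
    (0,): "Negative",
    (1,): "Positive",
    (2,): "Neutral",
    (0, 1): "Mixed",
}


def labels_to_category(labels):
    """Return the category name for a list of integer labels."""
    # Sort so that [1, 0] and [0, 1] are treated the same way.
    key = tuple(sorted(labels))
    if key not in CATEGORY_BY_LABELS:
        raise ValueError(f"Unexpected label combination: {labels}")
    return CATEGORY_BY_LABELS[key]


print(labels_to_category([0, 1]))  # Mixed
print(labels_to_category([2]))     # Neutral
```

Raising on unknown combinations, rather than silently returning a default, makes it easier to notice if the dataset layout differs from what is assumed here.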
## Quick start
You can use this model for inference as follows:
```
>>> from transformers import AutoTokenizer, pipeline
>>> origin = "ltg/norbert3-large_sentence-sentiment"
>>> # NorBERT3 checkpoints use custom model code, hence trust_remote_code
>>> pipe = pipeline("text-classification",
...     model=origin,
...     trust_remote_code=origin.startswith("ltg/norbert3"),
...     config=origin,
...     tokenizer=AutoTokenizer.from_pretrained(origin)
... )
>>> preds = pipe(["Hans hese, litt såre stemme kler bluesen, men denne platen kommer neppe til å bli blant hans største kommersielle suksesser.",
... "Borten-regjeringen gjorde ikke jobben sin." ])
>>> for p in preds:
... print(p)
```
Output:
```
The model 'NorbertForSequenceClassification' is not supported for text-classification. Supported models are ['AlbertForSequenceClassification', ...
{'label': 'Mixed', 'score': 0.7435498237609863}
{'label': 'Negative', 'score': 0.765734851360321}
```
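
Each prediction is a dictionary with a `label` and a `score`. The sketch below shows one way such predictions might be post-processed, by discarding low-confidence results. The helper `confident_labels` and the threshold of 0.75 are illustrative assumptions, not part of the model or the `transformers` library.

```python
# Predictions copied from the output above.
preds = [
    {"label": "Mixed", "score": 0.7435498237609863},
    {"label": "Negative", "score": 0.765734851360321},
]


def confident_labels(predictions, threshold=0.75):
    """Return each label, or None when its score is below the threshold.

    The default threshold is an arbitrary value chosen for illustration.
    """
    return [
        p["label"] if p["score"] >= threshold else None
        for p in predictions
    ]


print(confident_labels(preds))  # [None, 'Negative']
```

A suitable threshold depends on the application and would normally be tuned on held-out data.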
## Training hyperparameters
- per_device_train_batch_size: 32
- learning_rate: 1e-05
- gradient_accumulation_steps: 1
- num_train_epochs: 10 (best epoch 2)
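
The names above match parameters of `transformers.TrainingArguments`, although the card does not state which training setup was used. As a rough sketch, the values can be collected in a plain dictionary and used to work out the effective batch size. The number of devices is not given in the card, so `num_devices=1` below is an assumption, and `effective_batch_size` is a hypothetical helper.

```python
# Hyperparameters as listed above.
hyperparameters = {
    "per_device_train_batch_size": 32,
    "learning_rate": 1e-05,
    "gradient_accumulation_steps": 1,
    "num_train_epochs": 10,
}


def effective_batch_size(params, num_devices=1):
    """Number of examples contributing to each optimizer step.

    num_devices is an assumption; the card does not state it.
    """
    return (
        params["per_device_train_batch_size"]
        * params["gradient_accumulation_steps"]
        * num_devices
    )


print(effective_batch_size(hyperparameters))  # 32
```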
## Evaluation
| Category     |       F1 |
|:-------------|---------:|
| Negative     | 0.670241 |
| Positive     | 0.832918 |
| Neutral      | 0.850082 |
| Mixed        | 0.580645 |
| Weighted avg | 0.799663 |
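
The weighted average is presumably weighted by the number of examples per class, which the card does not list, so it cannot be recomputed from the table alone. The unweighted (macro) average can be, as this small sketch shows; `macro_f1` is a hypothetical helper and the macro figure is derived here rather than reported by the authors.

```python
# Per-category F1 scores copied from the table above.
f1_by_category = {
    "Negative": 0.670241,
    "Positive": 0.832918,
    "Neutral": 0.850082,
    "Mixed": 0.580645,
}


def macro_f1(scores):
    """Unweighted mean of the per-category F1 scores."""
    return sum(scores.values()) / len(scores)


print(round(macro_f1(f1_by_category), 4))  # 0.7335
```

The macro average is lower than the reported weighted average of 0.799663, which is consistent with the weaker Negative and Mixed categories carrying less weight in the weighted figure.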