---
language: da
tags:
- danish
- bert
- sentiment
- polarity
license: cc-by-4.0
widget:
- text: "Sikke en dejlig dag det er i dag"
---
# Danish BERT fine-tuned for Sentiment Analysis with `senda`

This model detects polarity ('positive', 'neutral', 'negative') of Danish texts.

It was trained and tested on tweets annotated by the [Alexandra Institute](https://github.com/alexandrainst), using the [`senda`](https://github.com/ebanalyse/senda) package.

Here is an example of how to load the model in PyTorch using the [🤗Transformers](https://github.com/huggingface/transformers) library:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
tokenizer = AutoTokenizer.from_pretrained("pin/senda")
model = AutoModelForSequenceClassification.from_pretrained("pin/senda")

# create 'senda' sentiment analysis pipeline 
senda_pipeline = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)

text = "Sikke en dejlig dag det er i dag"
# in English: 'what a lovely day'
senda_pipeline(text)
```
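To score several texts at once, a Transformers pipeline can be called with a list of strings; it then returns one dict per text with a `label` and a `score` key. The sketch below tallies the predicted labels for a batch. The helper name `tally_labels` is hypothetical and not part of `senda`, and the exact label strings the model emits are an assumption based on the polarity classes listed above.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List


def tally_labels(
    classify: Callable[[List[str]], List[Dict]],
    texts: Iterable[str],
) -> Counter:
    """Count predicted labels over a batch of texts.

    `classify` is any callable that maps a list of strings to a list of
    dicts with a "label" key, e.g. a Transformers sentiment pipeline.
    This helper is hypothetical and not part of the `senda` package.
    """
    predictions = classify(list(texts))
    return Counter(prediction["label"] for prediction in predictions)
```

With the pipeline from the example above, this could be used as `tally_labels(senda_pipeline, ["Sikke en dejlig dag det er i dag"])`.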

## Performance 
The `senda` model achieves an accuracy of 0.77 and a macro-averaged F1-score of 0.73 on a small test set provided by the [Alexandra Institute](https://github.com/alexandrainst/danlp/blob/master/docs/docs/datasets.md#twitter-sentiment). There is likely room for improvement, and we encourage NLP enthusiasts to give it their best shot using the [`senda`](https://github.com/ebanalyse/senda) package.
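For reference, the two metrics reported here can be computed from gold and predicted labels as in the minimal sketch below. It follows the standard definitions of accuracy and macro-averaged F1, and is not necessarily the exact evaluation code behind the numbers above. The function names are illustrative, and scoring a class with no true or predicted instances as 0.0 is an assumed convention.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    labels: Sequence[str] = ("positive", "neutral", "negative"),
) -> float:
    """Unweighted mean of the per-class F1 scores."""
    per_class = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denominator = 2 * tp + fp + fn
        # Assumed convention: a class with no support and no predictions scores 0.0.
        per_class.append(2 * tp / denominator if denominator else 0.0)
    return sum(per_class) / len(per_class)
```

Because macro averaging weights every class equally, a weak minority class pulls the macro F1 below the accuracy, which is consistent with the gap between the two scores reported above.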

#### Contact
Feel free to contact the author, Lars Kjeldgaard, at [lars.kjeldgaard@eb.dk](mailto:lars.kjeldgaard@eb.dk).

#### Shout-outs

Props to [Malte Højmark-Berthelsen](mailto:hjb@kmd.dk) for pretraining Danish BERT and for helping add a TensorFlow backend to `senda`.