pipeline_tag: zero-shot-classification
language:
- da
- 'no'
- nb
- sv
license: mit
datasets:
- strombergnlp/danfever
- KBLab/overlim
- MoritzLaurer/multilingual-NLI-26lang-2mil7
model-index:
- name: nb-bert-large-nli-scandi
  results: []
widget:
- example_title: Danish
  text: >-
    Mexicansk bokser advarer Messi - 'Du skal bede til gud, om at jeg ikke
    finder dig'
  candidate_labels: sundhed, politik, sport, religion
- example_title: Norwegian
  text: >-
    Regjeringen i Russland hevder Norge fører en politikk som vil føre til
    opptrapping i Arktis og «den endelige ødeleggelsen av russisk-norske
    relasjoner».
  candidate_labels: helse, politikk, sport, religion
- example_title: Swedish
  text: Så luras kroppens immunförsvar att bota cancer
  candidate_labels: hälsa, politik, sport, religion
inference:
  parameters:
    hypothesis_template: Dette eksempel handler om {}
ScandiNLI - Natural Language Inference model for Scandinavian Languages
This model is a fine-tuned version of NbAiLab/nb-bert-large for Natural Language Inference in Danish, Norwegian Bokmål and Swedish.
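We have released three models for Scandinavian NLI, of different sizes:
- alexandrainst/scandi-nli-large (this)
- alexandrainst/scandi-nli-base
- alexandrainst/scandi-nli-small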
The performance and model size of each of them can be found in the Performance section below.
Quick start
You can use this model in your scripts as follows:
>>> from transformers import pipeline
>>> classifier = pipeline(
... "zero-shot-classification",
... model="alexandrainst/nb-bert-large-nli-scandi",
... )
>>> classifier(
... "Mexicansk bokser advarer Messi - 'Du skal bede til gud, om at jeg ikke finder dig'",
... candidate_labels=['sundhed', 'politik', 'sport', 'religion'],
... hypothesis_template="Dette eksempel handler om {}",
... )
{'sequence': "Mexicansk bokser advarer Messi - 'Du skal bede til gud, om at jeg ikke finder dig'",
'labels': ['sport', 'religion', 'politik', 'sundhed'],
'scores': [0.6134647727012634,
0.30309760570526123,
0.05021871626377106,
0.03321893885731697]}
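To make the role of `hypothesis_template` more concrete, the following is a minimal sketch of how zero-shot classification is typically reduced to NLI: each candidate label is inserted into the template to form a hypothesis, the model scores how strongly the text entails each hypothesis, and the entailment scores are normalised across labels. The `entailment_logits` values are hypothetical placeholders, not outputs of this model, and the `softmax` helper is written out here for illustration only.

```python
import math

hypothesis_template = "Dette eksempel handler om {}"
candidate_labels = ["sundhed", "politik", "sport", "religion"]
premise = (
    "Mexicansk bokser advarer Messi - "
    "'Du skal bede til gud, om at jeg ikke finder dig'"
)

# One (premise, hypothesis) pair is built per candidate label.
pairs = [(premise, hypothesis_template.format(label)) for label in candidate_labels]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical entailment logits, one per pair (illustrative values only).
entailment_logits = [-1.2, -0.8, 1.7, 1.0]

# In single-label mode, entailment logits are normalised across the labels.
scores = softmax(entailment_logits)
ranked = sorted(zip(candidate_labels, scores), key=lambda item: item[1], reverse=True)

for label, score in ranked:
    print(f"{label}: {score:.3f}")
```

Because the template is written in Danish, it may be worth trying a template in the language of the input text when classifying Norwegian or Swedish.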
Performance
To our knowledge, Danish is the only Scandinavian language with a gold-standard NLI dataset, namely DanFEVER, so we report evaluation scores on the test split of that dataset.
We report the Matthews correlation coefficient (MCC), the macro-averaged F1-score, and accuracy.
| Model | MCC | Macro-F1 | Accuracy | Number of Parameters |
|---|---|---|---|---|
| alexandrainst/scandi-nli-large (this) | 73.80% | 58.41% | 86.98% | 354M |
| alexandrainst/scandi-nli-base | 62.44% | 55.00% | 80.42% | 178M |
| alexandrainst/scandi-nli-small | 47.28% | 48.88% | 73.46% | 22M |
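For readers who want to see how the three reported metrics relate to each other, here is a small dependency-free sketch that computes accuracy, macro-averaged F1 and the multiclass MCC from lists of gold and predicted labels. The function name `nli_metrics` and the toy labels are assumptions made for this illustration; this is not the evaluation script used to produce the table.

```python
import math
from collections import Counter


def nli_metrics(y_true, y_pred):
    """Return (accuracy, macro_f1, mcc) for multiclass label lists."""
    labels = sorted(set(y_true) | set(y_pred))
    total = len(y_true)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Per-class true positives: gold and prediction agree.
    tp = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    correct = sum(tp.values())

    accuracy = correct / total

    # Per-class F1 = 2*TP / (2*TP + FP + FN) = 2*TP / (#gold + #predicted).
    f1_scores = [
        2 * tp[label] / (true_counts[label] + pred_counts[label])
        for label in labels
    ]
    macro_f1 = sum(f1_scores) / len(f1_scores)

    # Multiclass MCC computed from class totals.
    numerator = correct * total - sum(
        true_counts[label] * pred_counts[label] for label in labels
    )
    pred_term = total**2 - sum(pred_counts[label] ** 2 for label in labels)
    true_term = total**2 - sum(true_counts[label] ** 2 for label in labels)
    denominator = math.sqrt(pred_term * true_term)
    mcc = numerator / denominator if denominator else 0.0

    return accuracy, macro_f1, mcc


# Toy labels for illustration only.
y_true = ["entailment", "entailment", "neutral", "contradiction", "contradiction", "neutral"]
y_pred = ["entailment", "neutral", "neutral", "contradiction", "contradiction", "contradiction"]

accuracy, macro_f1, mcc = nli_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.4f} macro_f1={macro_f1:.4f} mcc={mcc:.4f}")
```

Macro-F1 weights every class equally, which is one reason it can sit well below accuracy on a dataset with an uneven label distribution.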
Training procedure
The model has been fine-tuned on a dataset composed of DanFEVER as well as machine-translated versions of MultiNLI and CommitmentBank in all three languages, and machine-translated versions of FEVER and Adversarial NLI in Swedish.
The three languages are sampled equally during training. The model is validated on the validation split of DanFEVER together with machine-translated versions of MultiNLI for Swedish and Norwegian Bokmål, with the languages again sampled equally.
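One way to picture equal sampling across languages is the sketch below: a language is first drawn uniformly at random, then an example is drawn from that language's pool, so each language contributes roughly a third of the training examples regardless of how large its pool is. The helper `sample_equally` and the toy pools are hypothetical; the actual training code may implement this differently.

```python
import random
from collections import Counter


def sample_equally(sources, num_samples, seed=4242):
    """Yield (language, example) pairs, picking each language with equal probability."""
    rng = random.Random(seed)
    languages = sorted(sources)
    for _ in range(num_samples):
        language = rng.choice(languages)  # uniform over languages, not over examples
        yield language, rng.choice(sources[language])


# Toy pools of deliberately different sizes (placeholders, not real data).
toy_sources = {
    "da": [f"da-{i}" for i in range(1000)],
    "nb": [f"nb-{i}" for i in range(100)],
    "sv": [f"sv-{i}" for i in range(10)],
}

counts = Counter(language for language, _ in sample_equally(toy_sources, 3000))
print(counts)
```

Sampling the language first keeps the mix balanced even when one language has far more source data than the others.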
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 4242
- gradient_accumulation_steps: 16
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- max_steps: 50,000
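
As a quick sanity check on how these values fit together, the sketch below derives the effective batch size and evaluates a linear learning-rate schedule with warmup at a few steps. It assumes training on a single device, which is consistent with 2 × 16 = 32, and it assumes the common schedule shape of a linear ramp from 0 during warmup followed by a linear decay to 0 at `max_steps`. The helper `linear_lr` is written here for illustration and is not taken from the training code.

```python
learning_rate = 2e-5
train_batch_size = 2
gradient_accumulation_steps = 16
warmup_steps = 500
max_steps = 50_000
num_devices = 1  # assumption: single-device training

# Effective batch size per optimizer update.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices


def linear_lr(step):
    """Learning rate at an optimizer step under linear warmup, then linear decay."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    remaining = max(0, max_steps - step)
    return learning_rate * remaining / (max_steps - warmup_steps)


print(total_train_batch_size)
for step in (0, 250, 500, 25_250, 50_000):
    print(step, linear_lr(step))
```

With these settings, the learning rate peaks at 2e-05 after the 500 warmup steps, and the full run covers 50,000 × 32 = 1.6 million training examples.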