metadata
pipeline_tag: zero-shot-classification
language:
- da
- 'no'
- nb
- sv
license: mit
datasets:
- strombergnlp/danfever
- mnli_da
- mnli_sv
- mnli_nb
- cb_da
- cb_sv
- cb_nb
- fever_sv
- anli_sv
model-index:
- name: nb-bert-large-nli-scandi
results: []
widget:
- example_title: Danish
text: >-
Mexicansk bokser advarer Messi - 'Du skal bede til gud, om at jeg ikke
finder dig'
candidate_labels: sundhed, politik, sport, religion
hypothesis_template: Dette eksempel handler om {}
- example_title: Norwegian
text: >-
Folkehelseinstituttets mest optimistiske anslag er at alle voksne er
ferdigvaksinert innen midten av september.
candidate_labels: helse, politikk, sport, religion
hypothesis_template: Dette eksemplet handler om {}
- example_title: Swedish
text: >-
”Öppna, öppna” - hundratals demonstrerade mot hårda covidrestriktioner i
Peking
candidate_labels: hälsa, politik, sport, religion
hypothesis_template: Det här exemplet handlar om {}
ScandiNLI - Natural Language Inference model for Scandinavian Languages
This model is a fine-tuned version of NbAiLab/nb-bert-large for Natural Language Inference in Danish, Norwegian Bokmål and Swedish.
It has been fine-tuned on a dataset composed of DanFEVER, machine-translated versions of MultiNLI and CommitmentBank in all three languages, and machine-translated versions of FEVER and Adversarial NLI in Swedish.
The three languages are sampled equally during training. The model is validated on the validation split of DanFEVER and on the validation splits of the machine-translated MultiNLI for Swedish and Norwegian Bokmål, again sampled equally across languages.
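As a rough illustration of what sampling the languages equally can look like, the sketch below picks a language uniformly at random for every training example and then takes the next example from that language. This is not the training code used for this model: the function `sample_equally`, the toy data and the choice to cycle through smaller sources are all assumptions made for the example.

```python
import random
from itertools import cycle


def sample_equally(sources, num_examples, seed=4242):
    """Draw examples so that every language is picked with equal probability.

    `sources` maps a language code to a list of examples. Smaller sources are
    cycled, so they are revisited more often than larger ones (an assumption).
    """
    rng = random.Random(seed)
    iterators = {lang: cycle(examples) for lang, examples in sources.items()}
    languages = sorted(iterators)
    mixed = []
    for _ in range(num_examples):
        lang = rng.choice(languages)  # uniform over languages, not over examples
        mixed.append((lang, next(iterators[lang])))
    return mixed


# Toy placeholders standing in for the Danish, Norwegian and Swedish NLI data.
toy_sources = {
    "da": ["da-0", "da-1"],
    "nb": ["nb-0", "nb-1", "nb-2"],
    "sv": ["sv-0", "sv-1", "sv-2", "sv-3", "sv-4"],
}

mixed = sample_equally(toy_sources, num_examples=3000)
counts = {lang: sum(1 for l, _ in mixed if l == lang) for lang in toy_sources}
print(counts)  # roughly a third of the draws per language
```

Sampling by language rather than by example keeps Swedish, which has the most source datasets, from dominating the training mix.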
Quick start
You can use this model in your scripts as follows:
>>> from transformers import pipeline
>>> classifier = pipeline(
... "zero-shot-classification",
... model="alexandrainst/nb-bert-large-nli-scandi",
... )
>>> classifier(
... "Mexicansk bokser advarer Messi - 'Du skal bede til gud, om at jeg ikke finder dig'",
... candidate_labels=['sundhed', 'politik', 'sport', 'religion'],
... hypothesis_template="Dette eksempel handler om {}",
... )
{'sequence': "Mexicansk bokser advarer Messi - 'Du skal bede til gud, om at jeg ikke finder dig'",
'labels': ['sport', 'religion', 'politik', 'sundhed'],
'scores': [0.6134647727012634,
0.30309760570526123,
0.05021871626377106,
0.03321893885731697]}
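To make the role of `candidate_labels` and `hypothesis_template` more concrete, here is a minimal sketch of how zero-shot classification is built on top of an NLI model: each label is inserted into the template to form a hypothesis, the model scores how strongly the text entails each hypothesis, and in the default single-label mode those entailment scores are normalised across labels with a softmax. The logits below are hypothetical placeholders rather than outputs of this model, and the helper names are made up for the example.

```python
import math


def build_hypotheses(candidate_labels, hypothesis_template):
    """Turn every candidate label into one NLI hypothesis."""
    return [hypothesis_template.format(label) for label in candidate_labels]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


labels = ["sundhed", "politik", "sport", "religion"]
hypotheses = build_hypotheses(labels, "Dette eksempel handler om {}")

# Hypothetical entailment logits, one per (text, hypothesis) pair.
# A real run would obtain these from the NLI model.
toy_entailment_logits = [-1.0, -0.5, 2.0, 1.3]

scores = softmax(toy_entailment_logits)
ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)

for hypothesis in hypotheses:
    print(hypothesis)
print(ranked)
```

Because the template is filled in verbatim, it should be written in the same language as the input text, as in the Danish, Norwegian and Swedish widget examples.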
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 4242
- gradient_accumulation_steps: 16
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- max_steps: 50,000
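The sketch below shows how some of these values relate to each other. It assumes training on a single device, so that the total batch size is the per-device batch size times the number of gradient accumulation steps, and it assumes the linear scheduler warms up from zero and then decays linearly to zero at `max_steps`, which is how the linear schedule in the transformers library behaves. The function name `linear_schedule_lr` is made up for the example.

```python
train_batch_size = 2
gradient_accumulation_steps = 16

# Assumes a single device; with several devices this would be multiplied further.
effective_batch_size = train_batch_size * gradient_accumulation_steps


def linear_schedule_lr(step, base_lr=2e-5, warmup_steps=500, max_steps=50_000):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Warmup: grow linearly from 0 to base_lr.
        return base_lr * step / warmup_steps
    # Decay: shrink linearly from base_lr at the end of warmup to 0 at max_steps.
    remaining = max(0, max_steps - step)
    return base_lr * remaining / (max_steps - warmup_steps)


print(effective_batch_size)
for step in (0, 250, 500, 25_250, 50_000):
    print(step, linear_schedule_lr(step))
```

With these settings the learning rate peaks at 2e-05 after 500 optimizer steps and falls back to zero at step 50,000.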