metadata
widget:
- text: >-
    KOMMISSIONENS BESLUTNING
    af 6. marts 2006
    om klassificering af visse byggevarers ydeevne med hensyn til reaktion ved
    brand for så vidt angår trægulve samt vægpaneler og vægbeklædning i
    massivt træ
    (meddelt under nummer K(2006) 655
datasets:
- multi_eurlex
metrics:
- accuracy
model-index:
- name: coastalcph/danish-legal-longformer-eurlex
  results:
  - task:
      type: text-classification
      name: Danish EURLEX (Level 2)
    dataset:
      name: multi_eurlex
      type: multi_eurlex
      config: multi_eurlex
      split: validation
    metrics:
    - name: Micro-F1
      type: micro-f1
      value: 0.75748
    - name: Macro-F1
      type: macro-f1
      value: 0.52883
Model description
This model is a fine-tuned version of coastalcph/danish-legal-longformer-base, trained on the Danish part of the MultiEURLEX dataset.
Training and evaluation data
The Danish part of the MultiEURLEX dataset.
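The metadata above reports both a Micro-F1 (0.75748) and a Macro-F1 (0.52883) on the validation split. The sketch below illustrates, on toy data, why the two can differ when some labels are rare. It is not the evaluation script used for this model: the function name `f1_scores`, the toy matrices, and the convention of scoring a label with no predictions and no positives as 0 are all assumptions made for illustration.

```python
from typing import Tuple

import numpy as np


def f1_scores(y_true: np.ndarray, y_pred: np.ndarray) -> Tuple[float, float]:
    """Return (micro_f1, macro_f1) for 0/1 indicator matrices of shape (n_docs, n_labels)."""
    tp = np.sum((y_true == 1) & (y_pred == 1), axis=0).astype(float)
    fp = np.sum((y_true == 0) & (y_pred == 1), axis=0).astype(float)
    fn = np.sum((y_true == 1) & (y_pred == 0), axis=0).astype(float)

    # Micro: pool the counts over all labels, then compute a single F1.
    micro_denom = 2 * tp.sum() + fp.sum() + fn.sum()
    micro = 2 * tp.sum() / micro_denom if micro_denom > 0 else 0.0

    # Macro: compute F1 per label, then average with equal weight per label.
    denom = 2 * tp + fp + fn
    per_label = np.divide(2 * tp, denom, out=np.zeros_like(tp), where=denom > 0)
    macro = per_label.mean()
    return float(micro), float(macro)


# Toy data (hypothetical): 4 documents, 3 labels; the rare third label is missed.
y_true = np.array([[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]])
y_pred = np.array([[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0]])

micro, macro = f1_scores(y_true, y_pred)
print(f"micro-F1={micro:.3f}, macro-F1={macro:.3f}")
# micro-F1=0.750, macro-F1=0.556
```

In this toy case the missed rare label pulls the macro average down much more than the micro average, which is one plausible reading of the gap between the two reported scores.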
Use of Model
As a text classifier:
from transformers import pipeline

# Init text classification pipeline.
# use_auth_token=True reads the token saved by `huggingface-cli login`
# (only needed if the model repository requires authentication);
# avoid hard-coding access tokens in source code.
text_cls_pipe = pipeline(task="text-classification",
                         model="coastalcph/danish-legal-longformer-eurlex",
                         use_auth_token=True)

# Encode and classify document
predictions = text_cls_pipe("KOMMISSIONENS BESLUTNING\naf 6. marts 2006\nom klassificering af visse byggevarers "
                            "ydeevne med hensyn til reaktion ved brand for så vidt angår trægulve samt vægpaneler "
                            "og vægbeklædning i massivt træ\n(meddelt under nummer K(2006) 655")

# Print prediction
print(predictions)
# [{'label': 'building and public works', 'score': 0.9626012444496155}]
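The classifier returns label/score dictionaries like the one printed above. If scores for every label are requested (in Transformers 4.18 the text-classification pipeline accepts `return_all_scores=True`), each document yields a list of such dictionaries. The sketch below shows one way to keep only the labels above a cut-off. The helper `select_labels`, the 0.5 threshold, and the example scores are hypothetical, and whether per-label scores can be thresholded independently depends on how the model head is configured.

```python
from typing import List


def select_labels(scores: List[dict], threshold: float = 0.5) -> List[str]:
    """Return the labels whose score is at least `threshold`, highest score first."""
    kept = [s for s in scores if s["score"] >= threshold]
    kept.sort(key=lambda s: s["score"], reverse=True)
    return [s["label"] for s in kept]


# Hypothetical per-label scores for one document (placeholder labels and values,
# not real model output).
example_scores = [
    {"label": "placeholder label A", "score": 0.61},
    {"label": "building and public works", "score": 0.96},
    {"label": "placeholder label B", "score": 0.03},
]

print(select_labels(example_scores, threshold=0.5))
# ['building and public works', 'placeholder label A']
```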
As a feature extractor (document embedder):
from transformers import pipeline
import numpy as np

# Init feature extraction pipeline.
# use_auth_token=True reads the token saved by `huggingface-cli login`
# (only needed if the model repository requires authentication);
# avoid hard-coding access tokens in source code.
feature_extraction_pipe = pipeline(task="feature-extraction",
                                   model="coastalcph/danish-legal-longformer-eurlex",
                                   use_auth_token=True)

# Encode document; the result is a nested list indexed as [batch][token][hidden dim]
token_wise_features = feature_extraction_pipe("KOMMISSIONENS BESLUTNING\naf 6. marts 2006\nom klassificering af visse byggevarers "
                                              "ydeevne med hensyn til reaktion ved brand for så vidt angår trægulve samt vægpaneler "
                                              "og vægbeklædning i massivt træ\n(meddelt under nummer K(2006) 655")

# Use CLS token representation (first token) as document embedding
document_features = np.array(token_wise_features[0][0])
print(document_features.shape)
# (768,)
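A document embedding of this kind is often used to compare documents. The following is a minimal sketch of cosine similarity between two such vectors. The function `cosine_similarity` and the toy 4-dimensional vectors (standing in for the 768-dimensional CLS features) are assumptions for illustration, not part of the model's API.

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two 1-D vectors; 0.0 if either has zero norm."""
    norm = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / norm) if norm > 0 else 0.0


# Toy vectors (hypothetical) in place of real document embeddings.
doc_a = np.array([1.0, 0.0, 2.0, 0.0])
doc_b = np.array([2.0, 0.0, 4.0, 0.0])  # same direction as doc_a
doc_c = np.array([0.0, 3.0, 0.0, 1.0])  # orthogonal to doc_a

print(cosine_similarity(doc_a, doc_b))  # close to 1.0: same direction
print(cosine_similarity(doc_a, doc_c))  # 0.0: no overlap
```

With real embeddings, `doc_a` and `doc_b` would be the `document_features` arrays obtained for two different documents.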
Framework versions
- Transformers 4.18.0
- PyTorch 1.12.0+cu113
- Datasets 2.0.0
- Tokenizers 0.12.1