metadata
language: pl
tags:
- T5
- lemmatization
license: apache-2.0
PoLemma Base
PoLemma models are intended for lemmatization of named entities and multi-word expressions in the Polish language.
They were fine-tuned from the allegro/plT5 family of models, for example allegro/plt5-base.
Usage
Sample usage:
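from transformers import pipeline
# Load the PoLemma Base model and its tokenizer into a text-to-text generation pipeline.
pipe = pipeline(task="text2text-generation", model="amu-cai/polemma-base", tokenizer="amu-cai/polemma-base")
# Lemmatize one inflected phrase with beam search (5 beams) and take the first result.
hyp = [res['generated_text'] for res in pipe(["federalnego urzędu statystycznego"], clean_up_tokenization_spaces=True, num_beams=5)][0]
print(hyp)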
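To lemmatize several phrases at once, the call from the sample can be wrapped in a small helper. This is only a sketch: the name `lemmatize` is hypothetical and not part of the model or of `transformers`, and it assumes the pipeline returns one dict with a `generated_text` key per input string, as it does in the sample.

```python
from typing import Callable, List


def lemmatize(pipe: Callable, phrases: List[str], num_beams: int = 5) -> List[str]:
    """Lemmatize a batch of phrases with a text2text-generation pipeline.

    `pipe` is assumed to be a pipeline object like the one built in the sample;
    `lemmatize` itself is a hypothetical helper, not a library function.
    """
    if not phrases:
        return []
    # Same generation arguments as in the sample usage.
    outputs = pipe(phrases, clean_up_tokenization_spaces=True, num_beams=num_beams)
    # Assumption: one dict per input phrase, each with a 'generated_text' key.
    return [out["generated_text"] for out in outputs]
```

With `pipe` created as in the sample, `lemmatize(pipe, ["federalnego urzędu statystycznego"])` would return a one-element list holding the predicted lemma.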
Evaluation results
Lemmatization Exact Match was computed on the SlavNER 2021 test set.
Model | Exact Match
---|---
polemma-large | 92.61
polemma-base | 91.34
polemma-small | 88.46
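For readers unfamiliar with the metric, Exact Match can be read as the percentage of predicted lemmas that are identical to the reference lemma. The sketch below is an illustration only: the actual evaluation script is not given here, so the normalization (stripping surrounding whitespace, case-sensitive comparison) is an assumption, and the example phrases are made up rather than taken from SlavNER.

```python
from typing import Sequence


def exact_match(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Return the percentage of predictions identical to their reference lemma."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    # Assumption: case-sensitive comparison after stripping surrounding whitespace.
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)


# Hypothetical example data, not taken from the SlavNER test set.
preds = ["federalny urząd statystyczny", "Unia Europejska", "Jan Kowalski"]
refs = ["federalny urząd statystyczny", "Unia Europejska", "Jana Kowalski"]
score = exact_match(preds, refs)  # two of the three predictions match
```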
Citation
If you use the model, please cite the following paper:
TBD
Framework versions
- Transformers 4.26.0
- PyTorch 1.13.1.post200
- Datasets 2.9.0
- Tokenizers 0.13.2