metadata
language: pl
tags:
  - T5
  - lemmatization
license: apache-2.0

PoLemma Base

PoLemma models are intended for lemmatization of named entities and multi-word expressions in the Polish language.

They were fine-tuned from the allegro/plT5 models, e.g., allegro/plt5-base.

Usage

Sample usage:

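from transformers import pipeline

# Load the fine-tuned plT5 checkpoint as a text-to-text generation pipeline.
pipe = pipeline(task="text2text-generation", model="amu-cai/polemma-base", tokenizer="amu-cai/polemma-base")

# Lemmatize one phrase with beam search; the pipeline returns one dict per input.
results = pipe(["federalnego urzędu statystycznego"], clean_up_tokenization_spaces=True, num_beams=5)
hyp = [res["generated_text"] for res in results][0]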
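
To lemmatize several phrases at once, the call above can be wrapped in a small helper. The sketch below is only an illustration: the name `lemmatize` and its parameters are hypothetical and not part of the model's API. It assumes `pipe` behaves like the text2text-generation pipeline from the sample, returning one dict with a `generated_text` key per input.

```python
def lemmatize(pipe, phrases, num_beams=5):
    """Return one lemmatized string per input phrase.

    `pipe` is assumed to be a text2text-generation pipeline (or any callable
    with the same interface); `lemmatize` itself is a hypothetical helper.
    """
    phrases = list(phrases)
    if not phrases:
        # Nothing to do; avoid calling the pipeline with an empty batch.
        return []
    outputs = pipe(phrases, clean_up_tokenization_spaces=True, num_beams=num_beams)
    # Keep only the generated text from each result dict, in input order.
    return [out["generated_text"] for out in outputs]
```

With the `pipe` object created in the sample usage, `lemmatize(pipe, ["federalnego urzędu statystycznego"])` should return a one-element list holding the same string as `hyp`.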

Evaluation results

Lemmatization Exact Match was computed on the SlavNER 2021 test set.

| Model | Exact Match |
| --- | --- |
| polemma-large | 92.61 |
| polemma-base | 91.34 |
| polemma-small | 88.46 |
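
For readers who want to score their own predictions, here is a minimal sketch of an exact-match metric. It is not the evaluation script behind the numbers above. The function name `exact_match` is hypothetical, and the normalization (case-sensitive comparison after stripping surrounding whitespace) is an assumption, since the exact protocol is not described here. The example phrases are hand-written, not SlavNER data or model output.

```python
def exact_match(predictions, references):
    """Percentage of predictions that are identical to their reference lemma."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    # Assumption: case-sensitive comparison after stripping outer whitespace.
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)


# Hand-written illustrative data only.
preds = ["federalny urząd statystyczny", "Unia Europejska", "Jan Kowalskiego"]
golds = ["federalny urząd statystyczny", "Unia Europejska", "Jan Kowalski"]
score = exact_match(preds, golds)
print(f"{score:.2f}")  # two of three match, so this prints 66.67
```

A stricter or looser normalization (for example, case folding) would change the score, so the choice should match whatever protocol is being compared against.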

Citation

If you use the model, please cite the following paper:

TBD

Framework versions

  • Transformers 4.26.0
  • PyTorch 1.13.1.post200
  • Datasets 2.9.0
  • Tokenizers 0.13.2