---
license: cc-by-sa-4.0
datasets:
- cjvt/cc_gigafida
- cjvt/solar3
- cjvt/sloleks
language:
- sl
tags:
- word spelling correction
---
# T5-incorrect-word-spelling-corrector
This T5 model is designed to identify and correct words with incorrect spelling in the Slovenian language.
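
One way to run the model is sketched below, assuming it is published as a standard sequence-to-sequence checkpoint that the `transformers` auto classes can load. The helper names `load_corrector` and `correct_text` are illustrative, the repository ID in the usage comment is a placeholder rather than the actual model name, and whether the model expects a task prefix or sentence-level input is an open assumption.

```python
def load_corrector(model_id):
    """Load a tokenizer and seq2seq model; `model_id` is supplied by the caller."""
    # Imported lazily so correct_text() can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def correct_text(text, tokenizer, model, max_new_tokens=128):
    """Return the model's corrected version of `text` (assumes no task prefix)."""
    # Tokenize the raw sentence into PyTorch tensors.
    inputs = tokenizer(text, return_tensors="pt")
    # Generate the corrected sequence; max_new_tokens is an arbitrary cap.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the first (only) generated sequence back to plain text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Usage (placeholder ID, replace with the real repository name):
# tokenizer, model = load_corrector("<namespace>/<model-name>")
# print(correct_text("Model v besedlu popravi napaake.", tokenizer, model))
```

Passing the tokenizer and model in as arguments keeps loading separate from inference, so the same objects can be reused across many sentences.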
## Model Output Example
Consider the following Slovenian text:
_Model v besedlu popravi napaake v nepravilno črkovanih besedah._
The model might return the following text (note: this prediction was chosen for illustration, not for reproducibility):
_Model v besedilu popravi napake v nepravilno črkovanih besedah._
We observe that in the input sentence, the words `besedlu` and `napaake` are incorrectly spelled, so the model corrects them to `besedilu` and `napake`.
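
The comparison above can also be automated. The following is a minimal sketch, not part of the model itself, that aligns the input and output word by word with Python's standard `difflib` module and lists the changed words. The function name `changed_words` is hypothetical, and whitespace tokenization is a simplifying assumption.

```python
from difflib import SequenceMatcher


def changed_words(original, corrected):
    """Return (original, corrected) word pairs that differ between two texts."""
    src = original.split()
    dst = corrected.split()
    changes = []
    # Opcodes describe how to turn `src` into `dst`; keep every non-equal span.
    for tag, i1, i2, j1, j2 in SequenceMatcher(a=src, b=dst).get_opcodes():
        if tag != "equal":
            changes.append((" ".join(src[i1:i2]), " ".join(dst[j1:j2])))
    return changes


source = "Model v besedlu popravi napaake v nepravilno črkovanih besedah."
output = "Model v besedilu popravi napake v nepravilno črkovanih besedah."
print(changed_words(source, output))
# [('besedlu', 'besedilu'), ('napaake', 'napake')]
```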
## More details
Testing the model on **generated** test sets gives the following results (combining detection and correction of incorrectly spelled words):
- `Precision`: 0.986
- `Recall`: 0.935
- `F1`: 0.960
Testing the model in combination with **cjvt/SloBERTa-slo-word-spelling-annotator** on test sets constructed from the **Šolar Eval** dataset gives the following results (combining detection and correction of incorrectly spelled words):
- `Precision`: 0.823
- `Recall`: 0.796
- `F1`: 0.810
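
For readers who want to relate the three metrics, the sketch below assumes that F1 is the usual harmonic mean of precision and recall; the exact evaluation procedure is not described here. The function name `f1_score` is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for the generated test sets.
print(f"{f1_score(0.986, 0.935):.3f}")  # 0.960
```

Because the reported precision and recall are themselves rounded, recomputing F1 from them can differ from a reported value in the last digit.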
## Acknowledgement
The authors acknowledge the financial support from the Slovenian Research and Innovation Agency - research core funding No. P6-0411: Language Resources and Technologies for Slovene and research project No. J7-3159: Empirical foundations for digitally-supported development of writing skills.
## Authors
Thanks to Martin Božič, Marko Robnik-Šikonja and Špela Arhar Holdt for developing these models.