
Spanish truecasing model

This is a Spanish truecasing model for use with Dalton Fury's truecase Python project:

https://github.com/daltonfury42/truecase

The truecase package is available on PyPI:

https://pypi.org/project/truecase/

Quick start

To use the Spanish model, use the TrueCaser.py file uploaded to this repository:

https://huggingface.co/HURIDOCS/spanish_truecasing/blob/main/TrueCaser.py

Install the requirements:

pip install nltk

Then load the model and truecase some text:

from TrueCaser import TrueCaser

model_path = "spanish.dist"
spanish_truecasing = TrueCaser(model_path)

text = 'informe no.78/08. petición 785-05 admisibilidad. vicente arturo villanueva ortega y otros.'
print(spanish_truecasing.get_true_case(text))
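
To truecase several lines at once, a small wrapper along these lines may help. This is a minimal sketch, not part of this repository: `truecase_lines` and `SupportsTrueCase` are hypothetical names, and the only thing assumed about the model object is the `get_true_case(text)` method shown above.

```python
from typing import Iterable, List, Protocol


class SupportsTrueCase(Protocol):
    """Hypothetical interface: any object exposing get_true_case(text)."""

    def get_true_case(self, text: str) -> str: ...


def truecase_lines(caser: SupportsTrueCase, lines: Iterable[str]) -> List[str]:
    """Truecase each non-blank line; blank lines pass through unchanged."""
    result = []
    for line in lines:
        stripped = line.strip()
        # Only call the model when there is text to restore casing for.
        result.append(caser.get_true_case(stripped) if stripped else line)
    return result
```

With the model loaded as above, `truecase_lines(spanish_truecasing, open("input.txt", encoding="utf-8"))` would return one truecased string per line; `input.txt` is a placeholder file name.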

Notes

The model was trained on the Europarl dataset, which contains transcriptions of European Parliament discussions:

Philipp Koehn. Europarl: A Parallel Corpus for Statistical Machine Translation. MT Summit 2005.
https://www.statmt.org/europarl/

The corpus was loaded with load_dataset from the Hugging Face datasets library:

from datasets import load_dataset
europarl = load_dataset('large_spanish_corpus', name='Europarl')
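
To give a rough idea of what a statistical truecaser learns from such a corpus, the sketch below counts how often each word appears with each casing and picks the most frequent form. It is a deliberately simplified, unigram-only illustration and is not the training code used for spanish.dist. The function names, the whitespace tokenization, and the choice to skip sentence-initial tokens are all assumptions made for this example.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable


def count_casings(sentences: Iterable[str]) -> Dict[str, Counter]:
    """Map each lowercased token to a Counter of the surface forms seen."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()  # assumption: plain whitespace tokenization
        # Skip the first token: its capital usually reflects sentence
        # position rather than the word's own casing.
        for token in tokens[1:]:
            counts[token.lower()][token] += 1
    return counts


def most_common_casing(counts: Dict[str, Counter], token: str) -> str:
    """Return the most frequent casing of token, or token itself if unseen."""
    forms = counts.get(token.lower())
    return forms.most_common(1)[0][0] if forms else token
```

Feeding it the sentences of a corpus and then calling `most_common_casing(counts, "parlamento")` returns whichever casing of that word was seen most often.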
