XLM-ROBERTA-LARGE-VIEILLE-FRANCE
This is a fine-tuned version of 'FacebookAI/xlm-roberta-large', trained to identify person names (PERS), locations (LOC) and dates (DATE) in texts written in ancient French. The hope is that cross-lingual transfer will occur.
The model was fine-tuned on a corpus of hand-annotated texts made public by the University of Tours. Unfortunately, the curated dataset cannot be republished as a Hugging Face dataset. Both a cased and an uncased version of the corpus were used for training.
Note
The model is very slow, but it can nevertheless run on my laptop CPU.
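Since the model runs on a CPU, a minimal usage sketch with the `transformers` token-classification pipeline might look like the following. The repository id in `MODEL_ID` is a hypothetical placeholder (the actual Hub id is not stated in this card), and the helper names `group_entities` and `tag_text` are illustrative, not part of any library.

```python
from collections import defaultdict

# Hypothetical placeholder: the Hub repository id is not stated in this card.
MODEL_ID = "<namespace>/xlm-roberta-large-vieille-france"


def group_entities(predictions):
    """Group aggregated pipeline predictions by entity type (DATE, LOC, PERS)."""
    grouped = defaultdict(list)
    for pred in predictions:
        grouped[pred["entity_group"]].append(pred["word"])
    return dict(grouped)


def tag_text(text, model_id=MODEL_ID):
    """Run the token-classification pipeline on CPU and group the results."""
    # Imported lazily so group_entities stays usable without transformers.
    from transformers import pipeline

    ner = pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge sub-word tokens into whole entities
        device=-1,  # -1 selects the CPU
    )
    return group_entities(ner(text))
```

Calling `tag_text(...)` on a sentence should return a dictionary mapping each entity type to the list of matched spans. Expect the first call to be slow, since the large checkpoint has to be downloaded and loaded.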
Evaluation
On the 'test' split of our unpublished dataset, the classification report produced by seqeval was as follows:
              precision    recall  f1-score   support
        DATE       0.99      1.00      0.99       492
         LOC       1.00      1.00      1.00      1004
        PERS       1.00      1.00      1.00       807
   micro avg       1.00      1.00      1.00      2303
   macro avg       1.00      1.00      1.00      2303
weighted avg       1.00      1.00      1.00      2303
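As a quick sanity check on the report above, the short sketch below recomputes the macro and weighted precision averages from the per-class rows. It works from the already-rounded values in the table, so it only shows that a DATE precision of 0.99 is consistent with averages that round to 1.00. It does not reproduce seqeval's exact figures.

```python
# Per-class precision and support, copied from the report above (already rounded).
precision = {"DATE": 0.99, "LOC": 1.00, "PERS": 1.00}
support = {"DATE": 492, "LOC": 1004, "PERS": 807}

total = sum(support.values())

# Macro average: unweighted mean over the three classes.
macro = sum(precision.values()) / len(precision)

# Weighted average: mean over classes, weighted by each class's support.
weighted = sum(precision[k] * support[k] for k in precision) / total

print(total, f"{macro:.2f}", f"{weighted:.2f}")  # prints: 2303 1.00 1.00
```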
Confusion matrix