# BERTino: an Italian DistilBERT model
This repository hosts BERTino, an Italian DistilBERT model pre-trained by indigo.ai on a large general-domain Italian corpus. BERTino is task-agnostic and can be fine-tuned for any downstream task.
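As a quick-start illustration, the sketch below loads the model with the Hugging Face `transformers` library and extracts contextual embeddings. The repository id `indigo-ai/BERTino` is an assumption; adjust it if the model is hosted under a different path.

```python
# Minimal usage sketch (the repository id "indigo-ai/BERTino" is assumed).
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("indigo-ai/BERTino")
model = AutoModel.from_pretrained("indigo-ai/BERTino")

# Encode an Italian sentence and extract contextual embeddings.
inputs = tokenizer("Ciao, come stai?", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch size, sequence length, hidden size)
```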
## Corpus
The pre-training corpus we used is the union of the Paisa and ItWaC corpora. The final corpus contains 14 million sentences, for a total of 12 GB of text.
## Downstream Results
To validate our pre-training, we evaluated BERTino on the Italian ParTUT, Italian ISDT, and Italian WikiNER tasks, as well as on a multi-class sentence classification task. For comparison, we report the results obtained by the teacher model fine-tuned on the same tasks for the same number of epochs.
### Italian ISDT

| Model | F1 score | Fine-tuning time | Evaluation time |
| --- | --- | --- | --- |
| BERTino | 0.9801 | 9m 4s | 3s |
| Teacher | 0.983 | 16m 28s | 5s |
### Italian ParTUT

| Model | F1 score | Fine-tuning time | Evaluation time |
| --- | --- | --- | --- |
| BERTino | 0.9268 | 1m 18s | 1s |
| Teacher | 0.9688 | 2m 18s | 1s |
### Italian WikiNER

| Model | F1 score | Fine-tuning time | Evaluation time |
| --- | --- | --- | --- |
| BERTino | 0.9038 | 35m 35s | 3m 1s |
| Teacher | 0.9178 | 67m 8s | 5m 16s |
### Multi-class sentence classification

| Model | F1 score | Fine-tuning time | Evaluation time |
| --- | --- | --- | --- |
| BERTino | 0.7788 | 4m 40s | 6s |
| Teacher | 0.7986 | 8m 52s | 9s |
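For reference, here is a minimal sketch of what a sequence-classification fine-tuning setup could look like with the `transformers` `Trainer`. It is not the exact script behind the numbers above: the repository id, the CSV files, the label count, and the hyperparameters are all placeholder assumptions to be replaced with your own data and settings.

```python
# Hedged fine-tuning sketch: repository id, data files, label count, and
# hyperparameters below are assumptions, not the original training setup.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_id = "indigo-ai/BERTino"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id, num_labels=5  # hypothetical number of classes
)

# Hypothetical CSV files with "text" and "label" columns.
dataset = load_dataset("csv", data_files={"train": "train.csv", "test": "test.csv"})

def tokenize(batch):
    # Truncate/pad each sentence to a fixed length for batching.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="bertino-finetuned",
    num_train_epochs=3,  # hypothetical; match the teacher for a fair comparison
    per_device_train_batch_size=16,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
)
trainer.train()
```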