BERTino: an Italian DistilBERT model

This repository hosts BERTino, an Italian DistilBERT model pre-trained by indigo.ai on a large general-domain Italian corpus. BERTino is task-agnostic and can be fine-tuned for any downstream task.
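As a minimal usage sketch, the model can be loaded with the Hugging Face transformers library for feature extraction. The Hub identifier indigo-ai/BERTino is an assumption here; adjust it if the published name differs.

```python
# Minimal sketch: extracting contextual features with BERTino.
# "indigo-ai/BERTino" is an assumed Hub identifier; adjust if it differs.
from transformers import AutoTokenizer, AutoModel
import torch

model_id = "indigo-ai/BERTino"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

text = "Questa è una frase di esempio."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Token-level hidden states: (batch_size, sequence_length, hidden_size)
print(outputs.last_hidden_state.shape)
```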

Corpus

The pre-training corpus is the union of the Paisa and ItWaC corpora. The final corpus contains 14 million sentences, for a total of 12 GB of text.

Downstream Results

To validate the pre-training, we evaluated BERTino on the Italian ParTUT, Italian ISDT, and Italian WikiNER tasks, as well as on a multi-class sentence classification task. For comparison, we also report the results obtained by the teacher model fine-tuned on the same tasks for the same number of epochs.
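For reference, the token-level runs correspond to standard fine-tuning with a token-classification head. The sketch below is hedged: the label count, epoch count, and hyperparameters are illustrative assumptions (the card does not state them), and building the tokenized, label-aligned datasets is elided.

```python
# Hedged sketch: fine-tuning BERTino on a token-level task (e.g. POS
# tagging or NER). num_labels, the epoch count, and the Hub identifier
# are illustrative assumptions; dataset construction is elided.
from transformers import (
    AutoTokenizer,
    AutoModelForTokenClassification,
    Trainer,
    TrainingArguments,
)

model_id = "indigo-ai/BERTino"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id, num_labels=17)

args = TrainingArguments(
    output_dir="bertino-token-cls",
    num_train_epochs=3,              # the card does not state the epoch count
    per_device_train_batch_size=16,
    learning_rate=5e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    # train_dataset=...,  # tokenized dataset with word-aligned label ids
    # eval_dataset=...,
)
# trainer.train()
```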

Italian ISDT:

| Model | F1 score | Fine-tuning time | Evaluation time |
| --- | --- | --- | --- |
| BERTino | 0.9801 | 9m, 4s | 3s |
| Teacher | 0.983 | 16m, 28s | 5s |

Italian ParTUT:

| Model | F1 score | Fine-tuning time | Evaluation time |
| --- | --- | --- | --- |
| BERTino | 0.9268 | 1m, 18s | 1s |
| Teacher | 0.9688 | 2m, 18s | 1s |

Italian WikiNER:

| Model | F1 score | Fine-tuning time | Evaluation time |
| --- | --- | --- | --- |
| BERTino | 0.9038 | 35m, 35s | 3m, 1s |
| Teacher | 0.9178 | 67m, 8s | 5m, 16s |

Multi-class sentence classification:

| Model | F1 score | Fine-tuning time | Evaluation time |
| --- | --- | --- | --- |
| BERTino | 0.7788 | 4m, 40s | 6s |
| Teacher | 0.7986 | 8m, 52s | 9s |
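Analogously, the multi-class sentence classification run corresponds to fine-tuning with a sequence-classification head. A short hedged sketch follows; the class count is an illustrative assumption, and the card does not name the dataset used.

```python
# Hedged sketch: multi-class sentence classification with BERTino.
# num_labels is an illustrative assumption.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "indigo-ai/BERTino"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=10)

# Fine-tune with the same Trainer recipe as the token-level sketch above,
# swapping in a dataset of (sentence, label) pairs.
```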