bert-tiny-ita is an Italian foundation model (based on bert-tiny), pretrained from scratch on 20k Italian Wikipedia articles and on a broad collection of Italian words and dictionary definitions. It uses a 512-token context window.
The project is still a work in progress; new versions will be released over time.
Use it as a foundation model to be fine-tuned for specific Italian tasks, as in the loading sketch below.
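A minimal usage sketch, assuming the checkpoint is published on the Hugging Face Hub; the repository id below is a placeholder, not confirmed by this card:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "bert-tiny-ita"  # placeholder, substitute the real Hub repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
# Attach a fresh classification head on the pretrained encoder,
# e.g. for a hypothetical binary Italian sentiment task.
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID, num_labels=2)

inputs = tokenizer(
    "Questo film è fantastico!",
    return_tensors="pt",
    truncation=True,
    max_length=512,  # the model's full context window
)
print(model(**inputs).logits)
```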
## Training
- epochs: 250
- lr: 1e-5
- optim: AdamW
- weight_decay: 1e-4
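For reference, this is how the hyperparameters above would map onto transformers' `TrainingArguments`; the output directory and the specific AdamW implementation are assumptions, and the data pipeline is omitted:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-tiny-ita-pretrain",  # hypothetical output path
    num_train_epochs=250,   # epochs: 250
    learning_rate=1e-5,     # lr: 1e-5
    optim="adamw_torch",    # optim: AdamW (implementation assumed)
    weight_decay=1e-4,      # weight_decay: 1e-4
)
```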
## Eval
- perplexity: 45 (it's a 12MB model!)
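The card does not state how perplexity was measured; for a masked LM, one common illustrative approach is to mask tokens and exponentiate the masked-token cross-entropy. A rough sketch under that assumption, with the repository id again a placeholder:

```python
import math
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL_ID = "bert-tiny-ita"  # placeholder, substitute the real Hub repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForMaskedLM.from_pretrained(MODEL_ID)
model.eval()

text = "Roma è la capitale d'Italia."
enc = tokenizer(text, return_tensors="pt")
input_ids = enc["input_ids"].clone()

# Randomly mask 15% of non-special tokens (the standard BERT masking rate).
special = torch.tensor(
    tokenizer.get_special_tokens_mask(
        input_ids[0].tolist(), already_has_special_tokens=True
    )
).bool()
mask = (torch.rand(input_ids.shape) < 0.15) & ~special
labels = input_ids.masked_fill(~mask, -100)  # loss on masked positions only
input_ids[mask] = tokenizer.mask_token_id

with torch.no_grad():
    loss = model(
        input_ids=input_ids, attention_mask=enc["attention_mask"], labels=labels
    ).loss

# Noisy for a single short sentence; average over a corpus in practice.
print(f"perplexity ≈ {math.exp(loss.item()):.1f}")
```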