bert-tiny-ita is an Italian foundational model (based on bert-tiny), pretrained from scratch on 20k Italian Wikipedia articles and on a broad collection of Italian words and dictionary definitions. It uses a 512-token context window.

The project is still a work in progress; new versions will be released over time.

Use it as a foundational model to be fine-tuned for specific Italian tasks.
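
As a minimal usage sketch, the checkpoint can be loaded through the transformers library and queried with the fill-mask pipeline. This assumes the model exposes a standard BERT masked-LM head (the usual setup for bert-tiny derivatives); the example sentence is illustrative only.

```python
# Minimal sketch: load bert-tiny-ita and fill a masked token.
# Assumes a standard BERT masked-LM head with the [MASK] token.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="mascIT/bert-tiny-ita")

# "Rome is the [MASK] of Italy." (illustrative sentence)
for prediction in fill_mask("Roma è la [MASK] d'Italia."):
    print(f"{prediction['token_str']!r}: {prediction['score']:.3f}")
```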

Training

  • epochs: 250
  • lr: 1e-5
  • optim: AdamW
  • weight_decay: 1e-4
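
As a rough sketch, these hyperparameters map onto a transformers masked-LM training run as shown below. Only the epochs, learning rate, optimizer, and weight decay come from this card; the batch size, output path, and dataset wiring are illustrative assumptions.

```python
# Sketch of a masked-LM pretraining setup matching the listed
# hyperparameters; values marked "assumed" are not from the card.
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("mascIT/bert-tiny-ita")
model = AutoModelForMaskedLM.from_pretrained("mascIT/bert-tiny-ita")

args = TrainingArguments(
    output_dir="bert-tiny-ita-pretrain",  # assumed path
    num_train_epochs=250,                 # epochs: 250
    learning_rate=1e-5,                   # lr: 1e-5
    weight_decay=1e-4,                    # weight_decay: 1e-4
    optim="adamw_torch",                  # optim: AdamW
    per_device_train_batch_size=64,       # assumed; not in the card
)

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True)
# train_dataset: tokenized Italian Wikipedia articles (not shown here)
# trainer = Trainer(model=model, args=args, data_collator=collator,
#                   train_dataset=train_dataset)
# trainer.train()
```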

Eval

  • perplexity: 45 (it's a 12 MB model!); see the measurement sketch below
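
A common way to estimate masked-LM perplexity is exp(mean cross-entropy) over randomly masked tokens on held-out text. The sketch below follows that recipe; the card does not state its exact evaluation setup, and the single Italian sentence stands in for a real held-out set.

```python
# Sketch: masked-LM perplexity as exp(mean cross-entropy) over
# randomly masked tokens. Use many held-out sentences in practice
# for a stable estimate; one sentence is shown for brevity.
import math
import torch
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
)

tokenizer = AutoTokenizer.from_pretrained("mascIT/bert-tiny-ita")
model = AutoModelForMaskedLM.from_pretrained("mascIT/bert-tiny-ita")
model.eval()

texts = ["La Divina Commedia è un poema di Dante Alighieri."]  # assumed held-out text
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

encodings = [tokenizer(t, truncation=True, max_length=512) for t in texts]
batch = collator(encodings)  # pads and randomly masks 15% of tokens

with torch.no_grad():
    loss = model(**batch).loss  # mean cross-entropy over masked positions

print(f"perplexity: {math.exp(loss.item()):.1f}")
```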
Model size: 3.06M parameters (F32, Safetensors)
