metadata

language:
  - it
pipeline_tag: fill-mask

This model (based on bert-tiny) was trained from scratch on 20k Italian Wikipedia articles and on a large collection of Italian words and dictionary definitions.

The project is still a work in progress; new versions will be released over time.
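Since the pipeline tag is fill-mask, a quick way to try the model is through the `transformers` fill-mask pipeline. The snippet below is only a sketch: the model id `your-namespace/your-model-id` is a hypothetical placeholder (replace it with this repository's id), the helper names `top_predictions` and `demo` are made up for illustration, and the example sentence is arbitrary.

```python
def top_predictions(fill_mask, sentence, k=3):
    """Return (token, score) pairs for the single masked position in `sentence`.

    `fill_mask` is any callable that behaves like a transformers
    fill-mask pipeline (returns a list of dicts for one mask).
    """
    results = fill_mask(sentence, top_k=k)
    return [(r["token_str"], r["score"]) for r in results]


def demo(model_id="your-namespace/your-model-id"):  # hypothetical placeholder id
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=model_id)
    # Read the mask token from the tokenizer instead of hard-coding it.
    mask = fill_mask.tokenizer.mask_token
    sentence = f"Roma è la {mask} d'Italia."
    for token, score in top_predictions(fill_mask, sentence):
        print(f"{token}\t{score:.3f}")
```

Calling `demo("<this repo's id>")` downloads the model and prints the top three candidates for the masked word with their scores.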

Training

epochs: 200
lr: 1e-5
optim: AdamW
weight_decay: 1e-3
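
For reference, the settings above could be wired into an optimizer roughly as follows. This is a sketch under assumptions: the card does not say which framework was used, so PyTorch is assumed here, `HPARAMS` and `build_optimizer` are illustrative names, and every AdamW argument not listed above is left at the library default.

```python
# Values copied from the list above; nothing else is known about the training run.
HPARAMS = {"epochs": 200, "lr": 1e-5, "weight_decay": 1e-3}


def build_optimizer(model):
    """Create an AdamW optimizer for `model` (assumes a PyTorch nn.Module)."""
    import torch  # imported lazily; PyTorch is an assumption, not stated in the card

    return torch.optim.AdamW(
        model.parameters(),
        lr=HPARAMS["lr"],
        weight_decay=HPARAMS["weight_decay"],
    )
```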

Eval

perplexity: 50 (it's a 12 MB model, so don't expect it to be ChatGPT anytime soon :)
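
To put that number in context: perplexity is commonly reported as the exponential of the mean cross-entropy loss on the evaluation set. The card does not state how the figure was computed, so the conversion below is just that common convention, and the function names are illustrative.

```python
import math


def perplexity_from_loss(mean_cross_entropy):
    """Perplexity as exp of the mean cross-entropy (in nats)."""
    return math.exp(mean_cross_entropy)


def loss_from_perplexity(ppl):
    """Inverse mapping: the mean cross-entropy that yields a given perplexity."""
    return math.log(ppl)


# Under this convention, a perplexity of 50 corresponds to a loss of about 3.912 nats.
print(round(loss_from_perplexity(50), 3))
```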