---
datasets:
- c4
language:
- en
metrics:
- accuracy
pipeline_tag: fill-mask
---
A small version of DeBERTa trained on the clean version of the Google C4 dataset. For more information about the model's size, see `config.json`.
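
A minimal sketch of inspecting those hyperparameters with Hugging Face Transformers; the repository ID `your-namespace/deberta-small-c4` is a placeholder, so substitute this model's actual ID:

```python
from transformers import AutoConfig

# Load the configuration (config.json) to inspect the architecture.
# "your-namespace/deberta-small-c4" is a placeholder repository ID.
config = AutoConfig.from_pretrained("your-namespace/deberta-small-c4")

print(config.hidden_size)         # hidden / embedding dimension
print(config.num_hidden_layers)   # number of transformer layers
print(config.num_attention_heads) # attention heads per layer
```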
The model was trained for 100K steps with a batch size of 2048 and a sequence length of 512, for a total of 104B tokens (100,000 × 2048 × 512 ≈ 104.9B).
The vocabulary and the tokenizer are the same as those of [microsoft/deberta-base](https://huggingface.co/microsoft/deberta-base).
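
A minimal fill-mask usage sketch; again, `your-namespace/deberta-small-c4` stands in for this model's actual repository ID:

```python
from transformers import AutoTokenizer, pipeline

# The tokenizer is shared with microsoft/deberta-base, so either
# repository ID yields the same vocabulary.
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-base")

# Fill-mask inference with the pretrained model (placeholder ID).
fill = pipeline(
    "fill-mask",
    model="your-namespace/deberta-small-c4",
    tokenizer=tokenizer,
)
print(fill("Paris is the [MASK] of France."))
```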