---
datasets:
- c4
language:
- en
metrics:
- accuracy
pipeline_tag: fill-mask
---
A small version of DeBERTa trained on the clean version of the Google C4 dataset. For more information about the model's size, see `config.json`.
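
A minimal sketch of inspecting those hyperparameters with Hugging Face Transformers; the repository ID `your-namespace/deberta-small-c4` is a placeholder, so substitute this model's actual ID:

```python
from transformers import AutoConfig

# Load the configuration (config.json) to inspect the architecture.
# "your-namespace/deberta-small-c4" is a placeholder repository ID.
config = AutoConfig.from_pretrained("your-namespace/deberta-small-c4")

print(config.hidden_size)         # hidden / embedding dimension
print(config.num_hidden_layers)   # number of transformer layers
print(config.num_attention_heads) # attention heads per layer
```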
The model was trained for 100K steps with a batch size of 2048 and a sequence length of 512, for a total of 104B tokens (100,000 × 2048 × 512 ≈ 104.9B).
The vocabulary and the tokenizer are the same as those of [microsoft/deberta-base](https://huggingface.co/microsoft/deberta-base).
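
A minimal fill-mask usage sketch; again, `your-namespace/deberta-small-c4` stands in for this model's actual repository ID:

```python
from transformers import AutoTokenizer, pipeline

# The tokenizer is shared with microsoft/deberta-base, so either
# repository ID yields the same vocabulary.
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-base")

# Fill-mask inference with the pretrained model (placeholder ID).
fill = pipeline(
    "fill-mask",
    model="your-namespace/deberta-small-c4",
    tokenizer=tokenizer,
)
print(fill("Paris is the [MASK] of France."))
```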