---
language:
- ar
datasets:
- mc4
- oscar
- arabic_billion_words
---
# arabic-t5-small
This is a T5v1.1 (small) model trained on the concatenation of the Arabic Billion Words corpus and the Arabic subsets of the mC4 and OSCAR datasets. Due to time limitations, the model was trained on only about 10% of the full dataset.
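The checkpoint can be loaded with the standard `transformers` API. A minimal usage sketch; the Arabic input string here is made up for illustration, and since this is a pretrained (not finetuned) model, meaningful generations generally require finetuning on a downstream task first:

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("flax-community/arabic-t5-small")
model = T5ForConditionalGeneration.from_pretrained("flax-community/arabic-t5-small")

# Hypothetical input, for illustration only.
inputs = tokenizer("مرحبا بالعالم", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```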
## Training parameters
| Parameter | Value |
| --- | --- |
| Training steps | 22'000 |
| Training batch size | 384 |
| Evaluation batch size | 768 |
| Learning rate | 1e-2 |
| dtype | `jnp.float32` |
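For reference, the hyperparameters above can be written down as a Flax/optax setup. This is only a sketch, not the actual training script; in particular, the optimizer choice (Adafactor, common for T5 pretraining) is an assumption the card does not confirm:

```python
import jax.numpy as jnp
import optax

# Values taken from the table above.
TRAIN_STEPS = 22_000
TRAIN_BATCH_SIZE = 384
EVAL_BATCH_SIZE = 768
LEARNING_RATE = 1e-2
DTYPE = jnp.float32  # parameter/computation dtype

# Assumption: Adafactor, as commonly used for T5 pretraining;
# the card does not state which optimizer was actually used.
optimizer = optax.adafactor(learning_rate=LEARNING_RATE)
```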
## Note for finetuning
This model was pretrained with dropout turned off, so the default `dropout_rate` in the model config is `0`. To finetune the model, dropout should be turned back on, like this:
```python
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained(
    "flax-community/arabic-t5-small", dropout_rate=0.1
)
```

or:

```python
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained(
    "flax-community/arabic-t5-small", dropout_rate=0.1
)
```
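Either call overrides the value stored in the checkpoint's config, so `model.config.dropout_rate` should now read `0.1`.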