---
language:
  - ar
datasets:
  - mc4
  - oscar
  - arabic_billion_words
---

# arabic-t5-small

This is a T5 v1.1 (small) model trained on the concatenation of the Arabic Billion Words corpus and the Arabic subsets of the mC4 and OSCAR datasets. Due to time limitations, the model was trained on only about 10% of the full dataset.

## Training parameters

|                       |              |
| :-------------------- | :----------- |
| Training steps        | 22,000       |
| Training batch size   | 384          |
| Evaluation batch size | 768          |
| Learning rate         | 1e-2         |
| dtype                 | `jnp.float32` |

## Note for finetuning

This model was pretrained with dropout turned off, so the default `dropout_rate` in the model config is 0. To finetune the model, dropout should be turned back on, like this:

```python
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained(
    "flax-community/arabic-t5-small", dropout_rate=0.1
)
```

or,

```python
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained(
    "flax-community/arabic-t5-small", dropout_rate=0.1
)
```
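This works because any keyword argument passed to `from_pretrained` that matches a config field overrides the value stored in the checkpoint's config. The override can be sanity-checked on the config object alone; a minimal sketch (constructs fresh `T5Config` objects locally rather than downloading this checkpoint):

```python
from transformers import T5Config

# This checkpoint's config stores dropout_rate=0.0, because dropout
# was disabled during pretraining.
pretrain_config = T5Config(dropout_rate=0.0)

# Passing dropout_rate=0.1 (as in the snippets above) yields a config
# with dropout re-enabled for finetuning.
finetune_config = T5Config(dropout_rate=0.1)

print(pretrain_config.dropout_rate)  # 0.0
print(finetune_config.dropout_rate)  # 0.1
```

After loading the model for finetuning, `model.config.dropout_rate` should read back `0.1`; if it still reads `0.0`, the override kwarg was not applied.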