---
language:
- ar
datasets:
- mc4
- oscar
- arabic_billion_words
---

# arabic-t5-small

This is a T5v1.1 (small) model trained on the concatenation of the Arabic Billion Words corpus and the Arabic subsets of the mC4 and Oscar datasets.

Due to time limitations, the model was trained on only about `10%` of the whole dataset.

## Training parameters

|       Parameter       |     Value     |
| :-------------------: | :-----------: |
|         Steps         |   `22'000`    |
|  Training batch size  |     `384`     |
| Evaluation batch size |     `768`     |
|     Learning rate     |    `1e-2`     |
|         dtype         | `jnp.float32` |

## Results

|       Metric        |     Value     |
| :-----------------: | :-----------: |
| Evaluation accuracy |   `56.84%`    |
|   Evaluation loss   |    `2.423`    |
|    Training loss    |    `2.392`    |
|    Training time    | `22h 23m 51s` |

## Note for finetuning

This model was pretrained with dropout turned off, so the default `dropout_rate` in the model config is `0`.
To finetune the model, turn dropout back on by passing `dropout_rate` when loading it, like this:

```python
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("flax-community/arabic-t5-small", dropout_rate=0.1)
```

or,

```python
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("flax-community/arabic-t5-small", dropout_rate=0.1)
```
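
## Rough throughput estimate

The step count, training batch size, and training time reported above can be combined into a rough estimate of how many sequences the model saw and how fast. This is only a sketch. It assumes that `384` is the global batch size, that every step used a full batch, and that the reported training time covers only those steps. None of these assumptions is confirmed by the card. The helper name `training_throughput` is made up for illustration.

```python
# Figures copied from the "Training parameters" and "Results" tables.
STEPS = 22_000
TRAIN_BATCH_SIZE = 384  # assumed to be the global batch size
TRAINING_TIME = (22, 23, 51)  # hours, minutes, seconds


def training_throughput(steps, batch_size, hms):
    """Return (sequences seen, wall-clock seconds, sequences per second)."""
    hours, minutes, seconds = hms
    total_seconds = hours * 3600 + minutes * 60 + seconds
    sequences = steps * batch_size  # assumes every step used a full batch
    return sequences, total_seconds, sequences / total_seconds


sequences, total_seconds, seq_per_sec = training_throughput(
    STEPS, TRAIN_BATCH_SIZE, TRAINING_TIME
)
print(f"sequences seen: {sequences:,}")
print(f"wall-clock seconds: {total_seconds:,}")
print(f"throughput: {seq_per_sec:.1f} sequences/s")
```

Under these assumptions the run processed about 8.4 million training sequences. If the reported time also includes evaluation or checkpointing, the real training throughput would be somewhat higher than this estimate.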