salti committed on
Commit
37e00b7
1 Parent(s): 93d26be

Undo readme overwrite mistake

Files changed (1)
  1. README.md +37 -2
README.md CHANGED
@@ -1,3 +1,38 @@
- The model and logs in this directory are for a faulty run where `dropout_rate` was mistakenly set to `0.1` instead of `0`.
 
- The model here was trained only for `10'000` steps.
+ ---
+ language:
+ - ar
+ datasets:
+ - mc4
+ - oscar
+ - arabic_billion_words
+ ---
 
+ # arabic-t5-small
+
+ This is a T5 v1.1 (small) model trained on the concatenation of the Arabic Billion Words corpus and the Arabic subsets of the mC4 and OSCAR datasets. Due to time limitations, the model was trained on only about `10%` of the full dataset.
+
+ ## Training parameters
+
+ | Parameter             |     Value     |
+ | :-------------------: | :-----------: |
+ | Steps                 | `22'000`      |
+ | Training batch size   | `384`         |
+ | Evaluation batch size | `768`         |
+ | Learning rate         | `1e-2`        |
+ | dtype                 | `jnp.float32` |
+
+
+ ## Note for finetuning:
+
+ This model was pretrained with dropout turned off, so the default `dropout_rate` in the model config is `0`.
+ To finetune the model, dropout should be turned back on, like this:
+
+ ```python
+ from transformers import T5ForConditionalGeneration
+ model = T5ForConditionalGeneration.from_pretrained("flax-community/arabic-t5-small", dropout_rate=0.1)
+ ```
+
+ or,
+
+ ```python
+ from transformers import AutoModelForSeq2SeqLM
+ model = AutoModelForSeq2SeqLM.from_pretrained("flax-community/arabic-t5-small", dropout_rate=0.1)
+ ```
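
As a rough sanity check on the training parameters table above, the number of sequences the model saw is the step count times the training batch size. The sketch below assumes the listed batch size of `384` is the global batch size (summed over all devices) and that each step consumes exactly one batch; the README states neither. `sequences_seen` is an illustrative helper, not part of the training code.

```python
def sequences_seen(steps: int, batch_size: int) -> int:
    """Number of training sequences processed, assuming one batch per step."""
    return steps * batch_size


# Values taken from the training parameters table.
# Assumption: 384 is the global (all-device) batch size.
total = sequences_seen(steps=22_000, batch_size=384)
print(f"{total:,} sequences")  # prints "8,448,000 sequences"
```

If the batch size is actually per device, the result has to be multiplied by the device count.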