922CA committed on
Commit
c2b337c
1 Parent(s): ed80df3

Update README.md

Files changed (1)
  1. README.md +22 -4
README.md CHANGED
@@ -1,15 +1,33 @@
  ---
- license: cc-by-sa-4.0
+ license: other
  datasets:
  - facebook/belebele
  ---
- Pretrained toy model. Made with Andrej Karpathy's NanoGPT, ~2023. Trained on part of Tagalog portion of Belebele.
+ Pretrained toy models. Made with Andrej Karpathy's NanoGPT.

- Parameters:
+ # nano_35m
+ * Trained late 2023 on part of Tagalog portion of Belebele.
  * batch_size = 64
  * block_size = 256
  * n_layer = 8
  * n_head = 8
  * n_embd = 768
+ * Everything else is left as is.

- Everything else is left as is.
+ # nano_76m
+ * Trained January 2024 on part of Tagalog portion of Belebele.
+ * batch_size = 64
+ * block_size = 256
+ * n_layer = 11
+ * n_head = 16
+ * n_embd = 768
+ * Everything else is left as is.
+
+ # nano-ito_35m
+ * Trained March 2024 on part of PALITO Tagalog dataset.
+ * batch_size = 64
+ * block_size = 256
+ * n_layer = 11
+ * n_head = 16
+ * n_embd = 512
+ * Everything else is left as is.
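
For anyone reproducing these runs, the three hyperparameter sets in the updated README can be written down as data and sanity-checked. The sketch below is only an illustration, not the author's actual training config: the `ModelConfig` class, the `CONFIGS` dict, and both helper methods are hypothetical names introduced here. Only the five values per model come from the README. It checks that `n_embd` divides evenly by `n_head`, which multi-head attention requires, and reports the resulting per-head width.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical container; field names mirror the README's parameter names."""
    batch_size: int
    block_size: int
    n_layer: int
    n_head: int
    n_embd: int

    def head_dim(self) -> int:
        # Multi-head attention splits the embedding evenly across heads,
        # so n_embd must be a multiple of n_head.
        if self.n_embd % self.n_head != 0:
            raise ValueError("n_embd must be divisible by n_head")
        return self.n_embd // self.n_head

    def tokens_per_micro_batch(self) -> int:
        # Tokens seen in one forward/backward pass, before any
        # gradient accumulation (which the README does not specify).
        return self.batch_size * self.block_size


# Values copied from the README; everything else is assumed to be left at defaults.
CONFIGS = {
    "nano_35m": ModelConfig(batch_size=64, block_size=256, n_layer=8, n_head=8, n_embd=768),
    "nano_76m": ModelConfig(batch_size=64, block_size=256, n_layer=11, n_head=16, n_embd=768),
    "nano-ito_35m": ModelConfig(batch_size=64, block_size=256, n_layer=11, n_head=16, n_embd=512),
}

if __name__ == "__main__":
    for name, cfg in CONFIGS.items():
        print(f"{name}: head_dim={cfg.head_dim()}, "
              f"tokens_per_micro_batch={cfg.tokens_per_micro_batch()}")
```

All three configurations pass the divisibility check, so each can be used as listed.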