something-else committed on
Commit 1e11993 · verified · 1 Parent(s): 2adea4e

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -28,7 +28,7 @@ tags:
 
  7B rocm-rwkv pth record: I called this model Tlanuwa since I added an extra training focusing on cherokee after each run.
 
- 9B rocm-rwkv pth record: 40 layers embd=4096 ctx= 16384 I am calling this model Quetzal. I called this model Quetzal since it is a green model and I am adding an extra training focusing on Spanish and the dataset Axolotl-Spanish-Nahuatl after each run.
+ 9B rocm-rwkv pth record: 40 layers embd=4096 ctx= 16384 I am calling this model Quetzal. I called this model Quetzal since it is a green model that flies and I am adding an extra training focusing on Spanish and the dataset Axolotl-Spanish-Nahuatl after each run.
  - rwkv-9Q-stp101-N8.pth: 9B rocm-rwkv model trained with Slim pajama chunk1-10 for the first epoch and an aditional training with chunk1-2 and a mix of multi-language and code after that I am using the N8 dataset. I am currendly with the N8 dataset 4.222 GTokes. This pth has a loss of 1.904 regarding the N8 dataset.
 
 