YikangS committed on
Commit 3dcf664
1 Parent(s): 7ad0442

update readme

Files changed (1)
  1. README.md +0 -2
README.md CHANGED
@@ -57,8 +57,6 @@ Each MoA and MoE layer has 8 experts, and 2 experts are activated for each input
 It has 8 billion parameters in total and 2.2B active parameters.
 JetMoE-8B is trained on 1.25T tokens from publicly available datasets, with a learning rate of 5.0 x 10<sup>-4</sup> and a global batch-size of 4M tokens.

-**Model Developers** JetMoE is developed by Yikang Shen and MyShell.
-
 **Input** Models input text only.

 **Output** Models generate text only.
 
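For context, the README this commit edits describes JetMoE-8B as a text-in, text-out model with 8B total and 2.2B active parameters. Below is a minimal loading sketch using the standard `transformers` AutoModel API; the Hub id `jetmoe/jetmoe-8b` and the `trust_remote_code=True` flag are assumptions (the MoA/MoE layers may ship as custom modeling code), not instructions taken from this repository.

```python
# Hedged sketch: load JetMoE-8B for text generation with transformers.
# Assumptions: model id "jetmoe/jetmoe-8b" and trust_remote_code=True.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jetmoe/jetmoe-8b"  # assumed Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    device_map="auto",  # place the 8B-parameter (2.2B active) weights automatically
)

# Input: text only; output: text only, as the README states.
inputs = tokenizer("JetMoE-8B is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```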