update readme
README.md
CHANGED
@@ -15,11 +15,11 @@ To our surprise, JetMoE-8B performs even better than LLaMA2-7B, LLaMA-13B, and D
 Compared to a model with similar training and inference computation, like Gemma-2B, JetMoE-8B achieves significantly better performance.
 
 ## Evaluation Results
 
-For most benchmarks, we use the same evaluation methodology as in the Open LLM leaderboard. For code benchmarks, we use the same evaluation methodology as in the LLaMA2 and Deepseek MoE paper. The evaluation results are as follows:
+For most benchmarks, we use the same evaluation methodology as in the [Open LLM leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard). For code benchmarks, we use the same evaluation methodology as in the LLaMA2 and Deepseek MoE paper. The evaluation results are as follows:
 |Model|Activate Params|Training Tokens|ARC-challenge|Hellaswag|MMLU|TruthfulQA|WinoGrande|GSM8k|Open LLM Leaderboard Average|MBPP|HumanEval|
 |---|---|---|---|---|---|---|---|---|---|---|---|
-
-
+|Shot|||25|10|5|0|5|5||3|0|
+|Metric|||acc_norm|acc_norm|acc|mc2|acc|acc||Pass@1|Pass@1|
 |LLaMA2-7B|7B|2T|53.1|78.6|46.9|38.8|74|14.5|51.0|20.8|12.8|
 |LLaMA-13B|13B|1T|**56.2**|**80.9**|47.7|39.5|**76.2**|7.6|51.4|22.0|15.8|
 |DeepseekMoE-16B|2.8B|2T|53.2|79.8|46.3|36.1|73.7|17.3|51.1|34.0|**25.0**|
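The "Open LLM Leaderboard Average" column in the table is the unweighted mean of the six leaderboard benchmark scores (ARC-challenge, Hellaswag, MMLU, TruthfulQA, WinoGrande, GSM8k), rounded to one decimal place; MBPP and HumanEval are excluded. A quick sketch checking this against the rows above:

```python
# Benchmark scores copied from the table, in column order:
# ARC-challenge, Hellaswag, MMLU, TruthfulQA, WinoGrande, GSM8k
# paired with the reported "Open LLM Leaderboard Average".
scores = {
    "LLaMA2-7B":       ([53.1, 78.6, 46.9, 38.8, 74.0, 14.5], 51.0),
    "LLaMA-13B":       ([56.2, 80.9, 47.7, 39.5, 76.2, 7.6], 51.4),
    "DeepseekMoE-16B": ([53.2, 79.8, 46.3, 36.1, 73.7, 17.3], 51.1),
}

for model, (benchmarks, reported) in scores.items():
    mean = sum(benchmarks) / len(benchmarks)
    # The reported average equals the unweighted mean to one decimal place.
    assert abs(mean - reported) < 0.051, (model, mean, reported)
    print(f"{model}: mean={mean:.2f}, reported={reported}")
```

Note that MBPP and HumanEval sit outside the average, which is why a model like DeepseekMoE-16B can lead on code while its leaderboard average stays close to the others.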