sukuya commited on
Commit
0b14f6f
1 Parent(s): 5983242

update scores for Nejumi leaderboard and date

Browse files
Files changed (1) hide show
  1. README.md +1 -1
README.md CHANGED
@@ -45,7 +45,7 @@ Our model achieves the highest average score, more than 3 points ahead of the ne
45
 
46
  Our model achieves the highest average score, more than 5 points ahead of the next best model. We use the following commit for English LM-Harness https://github.com/EleutherAI/lm-evaluation-harness/tree/b281b0921b636bc36ad05c0b0b0763bd6dd43463.
47
 
48
- An independent evaluation by Kamata et.al. for [Nejumi LLMリーダーボード Neo](https://wandb.ai/wandb-japan/llm-leaderboard/reports/Nejumi-LLM-Neo--Vmlldzo2MTkyMTU0#総合評価) using a weighted average of [llm-jp-eval](https://github.com/llm-jp/llm-jp-eval) and [Japanese MT-bench](https://github.com/Stability-AI/FastChat/tree/jp-stable/fastchat/llm_judge) also confirms the highest performance of instruct/chat versions of RakutenAI-7B.
49
 
50
  ## Usage
51
 
 
45
 
46
  Our model achieves the highest average score, more than 5 points ahead of the next best model. We use the following commit for English LM-Harness https://github.com/EleutherAI/lm-evaluation-harness/tree/b281b0921b636bc36ad05c0b0b0763bd6dd43463.
47
 
48
+ An independent evaluation by Kamata et.al. for [Nejumi LLMリーダーボード Neo](https://wandb.ai/wandb-japan/llm-leaderboard/reports/Nejumi-LLM-Neo--Vmlldzo2MTkyMTU0#総合評価) using a weighted average of [llm-jp-eval](https://github.com/llm-jp/llm-jp-eval) and [Japanese MT-bench](https://github.com/Stability-AI/FastChat/tree/jp-stable/fastchat/llm_judge) also confirms the highest performance of chat/instruct versions of RakutenAI-7B among Open LLMs of similar sizes, with a score of 0.393/0.331 respectively, as of 22nd March 2024.
49
 
50
  ## Usage
51