Ludwig Stumpp committed on
Commit
f3a8621
•
1 Parent(s): 1d376a9

Remove links in table headers

Files changed (1)
  1. README.md +23 -22
README.md CHANGED
@@ -1,6 +1,7 @@
1
  # 🏆 llm-leaderboard
2
 
3
  A joint community effort to create one central leaderboard for LLMs. Contributions and corrections welcome!
 
4
 
5
  ## Interactive Dashboard
6
 
@@ -20,28 +21,28 @@ We are always happy for contributions! You can contribute by the following:
20
 
21
  ## Leaderboard
22
 
23
- | Model Name | [Chatbot Arena Elo](https://lmsys.org/blog/2023-05-03-arena/) | [LAMBADA (zero-shot)](https://arxiv.org/abs/1606.06031) | [TriviaQA (zero-shot)](https://arxiv.org/abs/1705.03551v2 ) |
24
- | -------------------------------------------------------------------------------------- | ------------------------------------------------------------- | ------------------------------------------------------- | ----------------------------------------------------------- |
25
- | [alpaca-13b](https://crfm.stanford.edu/2023/03/13/alpaca.html) | [1008](https://lmsys.org/blog/2023-05-03-arena/) | | |
26
- | [cerebras-gpt-7b](https://huggingface.co/cerebras/Cerebras-GPT-6.7B) | | [0.636](https://www.mosaicml.com/blog/mpt-7b) | [0.141](https://www.mosaicml.com/blog/mpt-7b) |
27
- | [cerebras-gpt-13b](https://huggingface.co/cerebras/Cerebras-GPT-13B) | | [0.635](https://www.mosaicml.com/blog/mpt-7b) | [0.146](https://www.mosaicml.com/blog/mpt-7b) |
28
- | [chatglm-6b](https://chatglm.cn/blog) | [985](https://lmsys.org/blog/2023-05-03-arena/) | | |
29
- | [dolly-v2-12b](https://huggingface.co/databricks/dolly-v2-12b) | [944](https://lmsys.org/blog/2023-05-03-arena/) | | |
30
- | [eleuther-pythia-7b](https://huggingface.co/EleutherAI/pythia-6.9b) | | [0.667](https://www.mosaicml.com/blog/mpt-7b) | [0.198](https://www.mosaicml.com/blog/mpt-7b) |
31
- | [eleuther-pythia-12b](https://huggingface.co/EleutherAI/pythia-12b) | | [0.704](https://www.mosaicml.com/blog/mpt-7b) | [0.233](https://www.mosaicml.com/blog/mpt-7b) |
32
- | [fastchat-t5-3b](https://huggingface.co/lmsys/fastchat-t5-3b-v1.0) | [951](https://lmsys.org/blog/2023-05-03-arena/) | | |
33
- | [gpt-neox-20b](https://huggingface.co/EleutherAI/gpt-neox-20b) | | [0.719](https://www.mosaicml.com/blog/mpt-7b) | [0.347](https://www.mosaicml.com/blog/mpt-7b) |
34
- | [gptj-6b](https://huggingface.co/EleutherAI/gpt-j-6b) | | [0.683](https://www.mosaicml.com/blog/mpt-7b) | [0.234](https://www.mosaicml.com/blog/mpt-7b) |
35
- | [koala-13b](https://bair.berkeley.edu/blog/2023/04/03/koala/) | [1082](https://lmsys.org/blog/2023-05-03-arena/) | | |
36
- | [llama-7b](https://arxiv.org/abs/2302.13971) | | [0.738](https://www.mosaicml.com/blog/mpt-7b) | [0.443](https://www.mosaicml.com/blog/mpt-7b) |
37
- | [llama-13b](https://arxiv.org/abs/2302.13971) | [932](https://lmsys.org/blog/2023-05-03-arena/) | | |
38
- | [mpt-7b](https://huggingface.co/mosaicml/mpt-7b) | | [0.702](https://www.mosaicml.com/blog/mpt-7b) | [0.343](https://www.mosaicml.com/blog/mpt-7b) |
39
- | [oasst-pythia-12b](https://huggingface.co/OpenAssistant/pythia-12b-pre-v8-12.5k-steps) | [1065](https://lmsys.org/blog/2023-05-03-arena/) | | |
40
- | [opt-7b](https://huggingface.co/facebook/opt-6.7b) | | [0.677](https://www.mosaicml.com/blog/mpt-7b) | [0.227](https://www.mosaicml.com/blog/mpt-7b) |
41
- | [opt-13b](https://huggingface.co/facebook/opt-13b) | | [0.692](https://www.mosaicml.com/blog/mpt-7b) | [0.282](https://www.mosaicml.com/blog/mpt-7b) |
42
- | [stablelm-base-alpha-7b](https://huggingface.co/stabilityai/stablelm-base-alpha-7b) | | [0.533](https://www.mosaicml.com/blog/mpt-7b) | [0.049](https://www.mosaicml.com/blog/mpt-7b) |
43
- | [stablelm-tuned-alpha-7b](https://huggingface.co/stabilityai/stablelm-tuned-alpha-7b) | [858](https://lmsys.org/blog/2023-05-03-arena/) | | |
44
- | [vicuna-13b](https://huggingface.co/lmsys/vicuna-13b-delta-v0) | [1169](https://lmsys.org/blog/2023-05-03-arena/) | | |
45
 
46
  ## Benchmarks
47
 
 
1
  # 🏆 llm-leaderboard
2
 
3
  A joint community effort to create one central leaderboard for LLMs. Contributions and corrections welcome!
4
+ Sources for the numbers are
5
 
6
  ## Interactive Dashboard
7
 
 
21
 
22
  ## Leaderboard
23
 
24
+ | Model Name | Chatbot Arena Elo | LAMBADA (zero-shot) | TriviaQA (zero-shot) |
25
+ | -------------------------------------------------------------------------------------- | ------------------------------------------------ | --------------------------------------------- | --------------------------------------------- |
26
+ | [alpaca-13b](https://crfm.stanford.edu/2023/03/13/alpaca.html) | [1008](https://lmsys.org/blog/2023-05-03-arena/) | | |
27
+ | [cerebras-gpt-7b](https://huggingface.co/cerebras/Cerebras-GPT-6.7B) | | [0.636](https://www.mosaicml.com/blog/mpt-7b) | [0.141](https://www.mosaicml.com/blog/mpt-7b) |
28
+ | [cerebras-gpt-13b](https://huggingface.co/cerebras/Cerebras-GPT-13B) | | [0.635](https://www.mosaicml.com/blog/mpt-7b) | [0.146](https://www.mosaicml.com/blog/mpt-7b) |
29
+ | [chatglm-6b](https://chatglm.cn/blog) | [985](https://lmsys.org/blog/2023-05-03-arena/) | | |
30
+ | [dolly-v2-12b](https://huggingface.co/databricks/dolly-v2-12b) | [944](https://lmsys.org/blog/2023-05-03-arena/) | | |
31
+ | [eleuther-pythia-7b](https://huggingface.co/EleutherAI/pythia-6.9b) | | [0.667](https://www.mosaicml.com/blog/mpt-7b) | [0.198](https://www.mosaicml.com/blog/mpt-7b) |
32
+ | [eleuther-pythia-12b](https://huggingface.co/EleutherAI/pythia-12b) | | [0.704](https://www.mosaicml.com/blog/mpt-7b) | [0.233](https://www.mosaicml.com/blog/mpt-7b) |
33
+ | [fastchat-t5-3b](https://huggingface.co/lmsys/fastchat-t5-3b-v1.0) | [951](https://lmsys.org/blog/2023-05-03-arena/) | | |
34
+ | [gpt-neox-20b](https://huggingface.co/EleutherAI/gpt-neox-20b) | | [0.719](https://www.mosaicml.com/blog/mpt-7b) | [0.347](https://www.mosaicml.com/blog/mpt-7b) |
35
+ | [gptj-6b](https://huggingface.co/EleutherAI/gpt-j-6b) | | [0.683](https://www.mosaicml.com/blog/mpt-7b) | [0.234](https://www.mosaicml.com/blog/mpt-7b) |
36
+ | [koala-13b](https://bair.berkeley.edu/blog/2023/04/03/koala/) | [1082](https://lmsys.org/blog/2023-05-03-arena/) | | |
37
+ | [llama-7b](https://arxiv.org/abs/2302.13971) | | [0.738](https://www.mosaicml.com/blog/mpt-7b) | [0.443](https://www.mosaicml.com/blog/mpt-7b) |
38
+ | [llama-13b](https://arxiv.org/abs/2302.13971) | [932](https://lmsys.org/blog/2023-05-03-arena/) | | |
39
+ | [mpt-7b](https://huggingface.co/mosaicml/mpt-7b) | | [0.702](https://www.mosaicml.com/blog/mpt-7b) | [0.343](https://www.mosaicml.com/blog/mpt-7b) |
40
+ | [oasst-pythia-12b](https://huggingface.co/OpenAssistant/pythia-12b-pre-v8-12.5k-steps) | [1065](https://lmsys.org/blog/2023-05-03-arena/) | | |
41
+ | [opt-7b](https://huggingface.co/facebook/opt-6.7b) | | [0.677](https://www.mosaicml.com/blog/mpt-7b) | [0.227](https://www.mosaicml.com/blog/mpt-7b) |
42
+ | [opt-13b](https://huggingface.co/facebook/opt-13b) | | [0.692](https://www.mosaicml.com/blog/mpt-7b) | [0.282](https://www.mosaicml.com/blog/mpt-7b) |
43
+ | [stablelm-base-alpha-7b](https://huggingface.co/stabilityai/stablelm-base-alpha-7b) | | [0.533](https://www.mosaicml.com/blog/mpt-7b) | [0.049](https://www.mosaicml.com/blog/mpt-7b) |
44
+ | [stablelm-tuned-alpha-7b](https://huggingface.co/stabilityai/stablelm-tuned-alpha-7b) | [858](https://lmsys.org/blog/2023-05-03-arena/) | | |
45
+ | [vicuna-13b](https://huggingface.co/lmsys/vicuna-13b-delta-v0) | [1169](https://lmsys.org/blog/2023-05-03-arena/) | | |
46
 
47
  ## Benchmarks
48