llm-leaderboard / README.md

Commit History

Fix
fbe8ba2

Ludwig Stumpp committed on

Fix altair
4b731ad

Ludwig Stumpp committed on

Merge branch 'main' into hf-launch
cfeff2f

Ludwig Stumpp committed on

Add llama 2 model
412a418

Ludwig Stumpp committed on

Merge remote-tracking branch 'origin/main' into hf-launch
5a5e2af

Ludwig Stumpp committed on

Add falcon 7b model
af4ecbe

Ludwig Stumpp committed on

Add falcon 40b model
e4e6ff0

Ludwig Stumpp committed on

Add gpt4all-13b-snoozy model
2544151

Ludwig Stumpp committed on

Add text-davinci-003 results on HellaSwag and WinoGrande zero-shot
15b03fa

Ludwig Stumpp committed on

Add koala results on HellaSwag and WinoGrande zero-shot
265c39e

Ludwig Stumpp committed on

Add stablelm results on HellaSwag and WinoGrande zero-shot
a011af1

Ludwig Stumpp committed on

Add oasst/pythia-12b HellaSwag and WinoGrande zero-shot results
12a4ec3

Ludwig Stumpp committed on

Add Pythia models WinoGrande (zero shot)
a10f910

Ludwig Stumpp committed on

Add alpaca 7b model
b199af5

Ludwig Stumpp committed on

Add dolly-v2-12b results
b75e1d2

Ludwig Stumpp committed on

Test styling in HF
2148115

Ludwig Stumpp committed on

Move over to hf spaces
f952412

Ludwig Stumpp committed on

Add python version for HF spaces
667b277

Ludwig Stumpp committed on

Prepare HF spaces launch
b9e518e

Ludwig Stumpp committed on

Remove prompted StartCoder
205deb7

Ludwig Stumpp committed on

Notes on definition of "open" model
2322286

Ludwig Stumpp committed on

Remove codeT results for code-davinci-002 as not comparable to other HumanEval results, due to additional explicit testing of outputs
72edf21

Ludwig Stumpp committed on

Add link to hf space
b7e4ee9

Ludwig Stumpp committed on

Replace commercial column with open
1c52cdd

Ludwig Stumpp committed on

Add WinoGrande zero-shot and results
f452fea

Ludwig Stumpp committed on

Add WinoGrande few shot results for gpt4 and 3.5
eedd6a6

Ludwig Stumpp committed on

Shown values in categorical filter now sorted
9770a07

Ludwig Stumpp committed on

For now set PaLM2 commercial use to no until clear
f3fd684

Ludwig Stumpp committed on

Starting to add PaLM2 benchmark results
fe8088e

Ludwig Stumpp committed on

Add download button for table + update todos
ea40e33

Ludwig Stumpp committed on

Add column for publisher
9d7638e

Ludwig Stumpp committed on

Add further results from HELM
8f06941

Ludwig Stumpp committed on

Add / modify gpt models according to HELM benchmark
4373b29

Ludwig Stumpp committed on

Clarifying gpt model names
669c882

Ludwig Stumpp committed on

Text work
84a7c6d

Ludwig Stumpp committed on

Fix GPT -3 commercial use
4d54a13

Ludwig Stumpp committed on

Add special thanks
d9a0906

Ludwig Stumpp committed on

Add replit code
8c37256

Ludwig Stumpp committed on

Add HellaSwag few shot
9e47a75

Ludwig Stumpp committed on

Add llama results on hellaswag zero shot
6147ea1

Ludwig Stumpp committed on

Add HellaSwag Benchmark
e1aeb72

Ludwig Stumpp committed on

Align human eval format
5e1e4f6

Ludwig Stumpp committed on

Add BLOOM model
360209c

Ludwig Stumpp committed on

Add MMLU few shot
9c17477

Ludwig Stumpp committed on

Add galactica model
21aaac9

Ludwig Stumpp committed on

Rearrange and link to open-llms repo
a60d3ed

Ludwig Stumpp committed on

Align writing
a3504d1

Ludwig Stumpp committed on

Adding missing links to eval scores for MLU task
f7cfe3e

Ludwig Stumpp committed on

Add column for commercial use + logic in streamlit app + disclaimer
5323497

Ludwig Stumpp committed on

Adding MMLU dataset and removing source table
c0dd25e

Ludwig Stumpp committed on