Commits · ludwigstumpp/llm-leaderboard

Fix

fbe8ba2

Ludwig Stumpp committed on Aug 23

Fix altair

4b731ad

Ludwig Stumpp committed on Aug 23

Merge branch 'main' into hf-launch

cfeff2f

Ludwig Stumpp committed on Aug 23

Add llama 2 model

412a418

Ludwig Stumpp committed on Aug 25, 2023

Merge remote-tracking branch 'origin/main' into hf-launch

5a5e2af

Ludwig Stumpp committed on Jun 1, 2023

Add falcon 7b model

af4ecbe

Ludwig Stumpp committed on May 28, 2023

Add falcon 40b model

e4e6ff0

Ludwig Stumpp committed on May 28, 2023

Add gpt4all-13b-snoozy model

2544151

Ludwig Stumpp committed on May 18, 2023

Add text-davinci-003 results on HellaSwag and WinoGrande zero-shot

15b03fa

Ludwig Stumpp committed on May 18, 2023

Add koala results on HellaSwag and WinoGrande zero-shot

265c39e

Ludwig Stumpp committed on May 18, 2023

Add stablelm results on HellaSwag and WinoGrande zero-shot

a011af1

Ludwig Stumpp committed on May 18, 2023

Add oasst/pythia-12b HellaSwag and WinoGrande zero-shot results

12a4ec3

Ludwig Stumpp committed on May 18, 2023

Add Pythia models WinoGrande (zero shot)

a10f910

Ludwig Stumpp committed on May 18, 2023

Add alpaca 7b model

b199af5

Ludwig Stumpp committed on May 18, 2023

Add dolly-v2-12b results

b75e1d2

Ludwig Stumpp committed on May 18, 2023

Test styling in HF

2148115

Ludwig Stumpp committed on May 11, 2023

Move over to hf spaces

f952412

Ludwig Stumpp committed on May 11, 2023

Add python version for HF spaces

667b277

Ludwig Stumpp committed on May 11, 2023

Prepare HF spaces launch

b9e518e

Ludwig Stumpp committed on May 11, 2023

Remove prompted StarCoder

205deb7

Ludwig Stumpp committed on May 16, 2023

Notes on definition of "open" model

2322286

Ludwig Stumpp committed on May 16, 2023

Remove codeT results for code-davinci-002 as not comparable to other HumanEval results, due to additional explicit testing of outputs

72edf21

Ludwig Stumpp committed on May 16, 2023

Add link to hf space

b7e4ee9

Ludwig Stumpp committed on May 11, 2023

Replace commercial column with open

1c52cdd

Ludwig Stumpp committed on May 11, 2023

Add WinoGrande zero-shot and results

f452fea

Ludwig Stumpp committed on May 11, 2023

Add WinoGrande few shot results for gpt4 and 3.5

eedd6a6

Ludwig Stumpp committed on May 11, 2023

Shown values in categorical filter now sorted

9770a07

Ludwig Stumpp committed on May 10, 2023

For now set PaLM2 commercial use to no until clear

f3fd684

Ludwig Stumpp committed on May 10, 2023

Starting to add PaLM2 benchmark results

fe8088e

Ludwig Stumpp committed on May 10, 2023

Add download button for table + update todos

ea40e33

Ludwig Stumpp committed on May 10, 2023

Add column for publisher

9d7638e

Ludwig Stumpp committed on May 10, 2023

Add further results from HELM

8f06941

Ludwig Stumpp committed on May 10, 2023

Add / modify gpt models according to HELM benchmark

4373b29

Ludwig Stumpp committed on May 10, 2023