AIR-Bench / leaderboard / src / benchmarks.py

Commit History

fix: fix the typo
f3888bb

nan committed on

feat: update the default metric
4aa2126

nan committed on

feat-use-recall-as-default-metric-0605 (#18)
bbfe4c1
verified

nan committed on

fix a bug in METRIC_LIST
443f557

hanhainebula committed on

Fix bug in dataset_dict: "gpt-3" -> "gpt3"
8102fce
verified

hanhainebula committed on

Fix bug in dataset_dict: "health" -> "healthcare"
4a44211
verified

hanhainebula committed on

Add msmarco for qa task
43fbed5
verified

hanhainebula committed on

feat: improve the layout
32ebf18

nan committed on

feat: adapt to the latest data format
1a2dba5

nan committed on

chore: clean up
a96f80a

nan committed on

feat: fix the table updating
f30cbcc

nan committed on

feat: adapt UI in app.py
e8879cc

nan committed on

feat: adapt the utils in app.py
9c49811

nan committed on

feat: separate the qa and longdoc tasks
9134169

nan committed on

feat: adapt the data loading part
8b7a945

nan committed on