Commit 99d07f3 (verified) by dustalov
1 Parent(s): d351868

Add chatbot_arena_20240814.csv

Files changed (4)
  1. .gitattributes +1 -0
  2. README.md +2 -0
  3. app.py +2 -0
  4. chatbot_arena_20240814.csv +3 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+chatbot_arena_20240814.csv filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -16,3 +16,5 @@ license: apache-2.0
 [Evalica](https://github.com/dustalov/evalica) is a library for pairwise comparisons as described in paper
 Reliable, Reproducible, and Really Fast Leaderboards with Evalica
 ([arXiv](https://arxiv.org/abs/2412.11314)).
+
+Chatbot Arena dataset `chatbot_arena_20240814.csv` was derived from the [clean_battle_20240814_public.json](https://storage.googleapis.com/arena_external_data/public/clean_battle_20240814_public.json) dataset available from <https://lmarena.ai/>.
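The README line above says the CSV was derived from the public battle JSON but does not show how. A minimal sketch of one possible conversion follows. It assumes each JSON record carries `model_a`, `model_b`, and `winner` fields, and that the target CSV uses `left`, `right`, and `winner` columns; neither layout is confirmed by this commit, and `battles_to_csv` and `WINNER_MAP` are hypothetical names used only for illustration.

```python
import csv
import io
import json

# Assumed mapping from Arena outcome labels to the labels used in the CSV.
WINNER_MAP = {
    "model_a": "left",
    "model_b": "right",
    "tie": "tie",
    "tie (bothbad)": "tie",
}


def battles_to_csv(battles):
    """Render battle records as CSV text with left/right/winner columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["left", "right", "winner"])
    for battle in battles:
        winner = WINNER_MAP.get(battle["winner"])
        if winner is None:
            continue  # skip records with an unrecognized outcome label
        writer.writerow([battle["model_a"], battle["model_b"], winner])
    return buffer.getvalue()


# Tiny in-memory stand-in for the downloaded JSON file (not real data).
sample = json.loads(
    '[{"model_a": "m1", "model_b": "m2", "winner": "model_a"},'
    ' {"model_a": "m2", "model_b": "m3", "winner": "tie (bothbad)"}]'
)
csv_text = battles_to_csv(sample)
```

For the real file, the list passed to `battles_to_csv` would come from `json.load` on the downloaded JSON, and the returned text would be written to `chatbot_arena_20240814.csv`.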
app.py CHANGED
@@ -277,6 +277,8 @@ def main() -> None:
 ["llmfao.csv", "Average Win Rate", False, True, 100],
 ["llmfao.csv", "Bradley-Terry (1952)", False, True, 100],
 ["llmfao.csv", "Elo (1960)", False, True, 100],
+["llmfao.csv", "Elo (1960)", False, True, 100],
+["chatbot_arena_20240814.csv", "Bradley-Terry (1952)", False, False, 0],
 ],
 title="Evalica: Turn Your Side-by-Side Comparisons into Ranking!",
 description="""
chatbot_arena_20240814.csv ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9ad4480918fe15df6a60860ee6e498cac730f5585588981f9ac8ee889f4e6e82
+size 74002919
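The three added lines are a Git LFS pointer, not the CSV itself: the repository stores only the SHA-256 digest and byte size of the roughly 74 MB file. As a rough sketch, a downloaded copy could be checked against the pointer as below. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any Git LFS tooling.

```python
import hashlib


def parse_lfs_pointer(text):
    """Split the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(pointer, data):
    """Return True if the bytes have the size and digest the pointer records."""
    if pointer["algorithm"] != "sha256":
        raise ValueError("this sketch only handles sha256 object IDs")
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# Pointer contents copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:9ad4480918fe15df6a60860ee6e498cac730f5585588981f9ac8ee889f4e6e82
size 74002919
"""
pointer = parse_lfs_pointer(POINTER)
```

Passing the bytes of the downloaded CSV to `matches_pointer(pointer, data)` should return `True` only when the file is exactly the object this commit refers to. For a file of this size, hashing in chunks rather than reading everything into memory may be preferable.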