---
title: Evalica
emoji: π
colorFrom: green
colorTo: purple
sdk: gradio
python_version: 3.11
sdk_version: 5.12.0
app_file: app.py
pinned: true
license: apache-2.0
---

# Evalica

[Evalica](https://github.com/dustalov/evalica) is an easy-to-use tool that transforms pairwise comparisons (*aka* side-by-side) into a meaningful ranking of items.

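To illustrate what turning pairwise comparisons into a ranking can look like, here is a minimal plain-Python sketch of one well-known approach, the Bradley-Terry model fitted with a simple iterative update. It is not Evalica's implementation or API: the function name `bradley_terry`, the `(winner, loser)` input format, and the toy data are assumptions made for this example, and ties are not handled.

```python
from collections import defaultdict


def bradley_terry(comparisons, iterations=100):
    """Estimate Bradley-Terry scores from (winner, loser) pairs.

    Hypothetical helper for illustration only; not part of Evalica.
    """
    items = sorted({item for pair in comparisons for item in pair})

    wins = defaultdict(float)   # number of wins per item
    games = defaultdict(float)  # number of games per unordered pair
    for winner, loser in comparisons:
        wins[winner] += 1.0
        games[frozenset((winner, loser))] += 1.0

    # Start from uniform scores and refine them iteratively.
    scores = {item: 1.0 / len(items) for item in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            denominator = 0.0
            for j in items:
                n_ij = games.get(frozenset((i, j)), 0.0)
                if j != i and n_ij > 0:
                    denominator += n_ij / (scores[i] + scores[j])
            updated[i] = wins[i] / denominator if denominator > 0 else scores[i]
        # Normalize so that the scores sum to one.
        total = sum(updated.values())
        scores = {item: value / total for item, value in updated.items()}
    return scores


# Toy data (made up): each tuple is (winner, loser).
comparisons = [
    ("pizza", "burger"), ("pizza", "burger"), ("burger", "pizza"),
    ("pizza", "sushi"), ("pizza", "sushi"), ("sushi", "pizza"),
    ("burger", "sushi"), ("burger", "sushi"), ("sushi", "burger"),
]

scores = bradley_terry(comparisons)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Items with higher scores are the ones that tend to win their comparisons, so sorting by score yields the ranking.
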
- Ustalov, D. [Reliable, Reproducible, and Really Fast Leaderboards with Evalica](https://aclanthology.org/2025.coling-demos.6). 2025. Proceedings of the 31st International Conference on Computational Linguistics: System Demonstrations. 46–53. arXiv: [2412.11314 [cs.CL]](https://arxiv.org/abs/2412.11314).

The Chatbot Arena dataset `chatbot_arena_20240814.csv` was derived from the [clean_battle_20240814_public.json](https://storage.googleapis.com/arena_external_data/public/clean_battle_20240814_public.json) dataset available from <https://lmarena.ai/>.