from pathlib import Path



banner_url = "https://huggingface.co/spaces/WildEval/WildBench-Leaderboard/resolve/main/%E2%80%8Eleaderboard_logo_v2.png" # the same repo here.
BANNER = f'<div style="display: flex; justify-content: space-around;"><img src="{banner_url}" alt="Banner" style="width: 40vw; min-width: 300px; max-width: 600px;"> </div>'

INTRODUCTION_TEXT = """
# OSQ Benchmark (Evaluating LLMs with OSQs and MCQs)
🔗 [Website](https://vila-lab.github.io/Open-LLM-Leaderboard-Website/) | 💻 [GitHub](https://github.com/VILA-Lab/Open-LLM-Leaderboard) | 📖 [Paper](#) | 🐦 [X1](https://x.com/open_llm_lb) | 🐦 [X2](https://x.com/szq0214)

> ### Open-LLM-Leaderboard evaluates large language models (LLMs) by transitioning from multiple-choice questions (MCQs) to open-style questions (OSQs).
This approach addresses the inherent biases and limitations of MCQs, such as selection bias and the effect of random guessing. By using open-style questions,
the framework aims to assess LLMs' abilities more accurately across various benchmarks, so that the evaluation reflects true capabilities,
particularly in language understanding and reasoning.

"""

CITATION_TEXT = """@article{,
  title={Open-LLM-Leaderboard: From Multi-choice to Open-style Questions for LLMs Evaluation, Benchmark, and Arena},
  author={Aidar Myrzakhan and Sondos Mahmoud Bsharat and Zhiqiang Shen},
  journal={arXiv preprint },
  year={2024},
}
"""