Bram Vanroy committed
Commit c66a031 • 1 Parent(s): b268b1d
get rid of whitespace

content.py CHANGED (+6 -5)
@@ -8,11 +8,12 @@ This is a fork of the [Open Multilingual LLM Evaluation Leaderboard](https://hug
 We test the models on the following benchmarks **for the Dutch version only!!**, which have been translated into Dutch automatically by the original authors of the Open Multilingual LLM Evaluation Leaderboard with `gpt-35-turbo`.
 I did not verify their translations and I do not maintain the datasets, I only run the benchmarks and add the results to this space. For questions regarding the test sets or running them yourself, see [the original Github repository](https://github.com/laiviet/lm-evaluation-harness).
 
-
-
-
-
-
+<p align="center">
+<a href="https://arxiv.org/abs/1803.05457" target="_blank">AI2 Reasoning Challenge </a> (25-shot) |
+<a href="https://arxiv.org/abs/1905.07830" target="_blank">HellaSwag</a> (10-shot) |
+<a href="https://arxiv.org/abs/2009.03300" target="_blank">MMLU</a> (5-shot) |
+<a href="https://arxiv.org/abs/2109.07958" target="_blank">TruthfulQA</a> (0-shot)
+</p>
 """
 
 DISCLAIMER = """## Disclaimer