BramVanroy leaderboard-pr-bot committed on
Commit
0c8f0de
1 Parent(s): 364e785

Adding Evaluation Results (#1)

- Adding Evaluation Results (abd70264799da3090e06af828a642e191f01554c)


Co-authored-by: Open LLM Leaderboard PR Bot <leaderboard-pr-bot@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +19 -6
README.md CHANGED
@@ -1,21 +1,21 @@
 ---
+language:
+- nl
 license: mit
-base_model: BramVanroy/fietje-2-instruct
 tags:
 - trl
 - fietje
 - alignment-handbook
 - dpo
+base_model: BramVanroy/fietje-2-instruct
 datasets:
 - BramVanroy/ultra_feedback_dutch_cleaned
 - BramVanroy/orca_dpo_pairs_dutch_cleaned
+pipeline_tag: text-generation
+inference: false
 model-index:
 - name: fietje-2-chat
   results: []
-pipeline_tag: text-generation
-inference: false
-language:
-- nl
 ---
 
 <p align="center" style="margin:0;padding:0">
@@ -91,4 +91,17 @@ The following hyperparameters were used during training:
 - Transformers 4.39.1
 - Pytorch 2.1.2+cu121
 - Datasets 2.18.0
-- Tokenizers 0.15.2
+- Tokenizers 0.15.2
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_BramVanroy__fietje-2-chat)
+
+|      Metric       |Value|
+|-------------------|----:|
+|Avg.               |10.39|
+|IFEval (0-Shot)    |29.17|
+|BBH (3-Shot)       |17.72|
+|MATH Lvl 5 (4-Shot)| 0.53|
+|GPQA (0-shot)      | 0.00|
+|MuSR (0-shot)      | 3.20|
+|MMLU-PRO (5-shot)  |11.72|
+
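
A quick sanity check on the added table: the `Avg.` row appears to be the unweighted mean of the six benchmark rows. The commit itself does not say how the average is computed, so that is an assumption here, and the `scores` dictionary and `mean_score` helper below are hypothetical names used only for illustration. A minimal sketch:

```python
# Scores copied from the table added to README.md in this commit.
scores = {
    "IFEval (0-Shot)": 29.17,
    "BBH (3-Shot)": 17.72,
    "MATH Lvl 5 (4-Shot)": 0.53,
    "GPQA (0-shot)": 0.00,
    "MuSR (0-shot)": 3.20,
    "MMLU-PRO (5-shot)": 11.72,
}


def mean_score(values):
    """Unweighted arithmetic mean, rounded to two decimals like the table.

    Assumption: the leaderboard average weights every benchmark equally.
    """
    values = list(values)
    return round(sum(values) / len(values), 2)


average = mean_score(scores.values())
print(f"Avg. {average:.2f}")
```

Under that assumption the six values sum to 62.34, and dividing by 6 reproduces the reported 10.39.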