leaderboard-pr-bot committed on
Commit
a66cc15
1 Parent(s): 9993be3

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +22 -15
README.md CHANGED
@@ -1,5 +1,7 @@
1
  ---
2
  license: cc-by-nc-4.0
 
 
3
  model-index:
4
  - name: Seraph-7B
5
  results:
@@ -18,8 +20,7 @@ model-index:
18
  value: 67.83
19
  name: normalized accuracy
20
  source:
21
- url: >-
22
- https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Seraph-7B
23
  name: Open LLM Leaderboard
24
  - task:
25
  type: text-generation
@@ -35,8 +36,7 @@ model-index:
35
  value: 86.22
36
  name: normalized accuracy
37
  source:
38
- url: >-
39
- https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Seraph-7B
40
  name: Open LLM Leaderboard
41
  - task:
42
  type: text-generation
@@ -53,8 +53,7 @@ model-index:
53
  value: 65.07
54
  name: accuracy
55
  source:
56
- url: >-
57
- https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Seraph-7B
58
  name: Open LLM Leaderboard
59
  - task:
60
  type: text-generation
@@ -70,8 +69,7 @@ model-index:
70
  - type: mc2
71
  value: 59.49
72
  source:
73
- url: >-
74
- https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Seraph-7B
75
  name: Open LLM Leaderboard
76
  - task:
77
  type: text-generation
@@ -88,8 +86,7 @@ model-index:
88
  value: 80.66
89
  name: accuracy
90
  source:
91
- url: >-
92
- https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Seraph-7B
93
  name: Open LLM Leaderboard
94
  - task:
95
  type: text-generation
@@ -106,11 +103,8 @@ model-index:
106
  value: 71.87
107
  name: accuracy
108
  source:
109
- url: >-
110
- https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Seraph-7B
111
  name: Open LLM Leaderboard
112
- tags:
113
- - merge
114
  ---
115
 
116
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/ddzjZ1irvtLcDRCWei9vQ.png)
@@ -198,4 +192,17 @@ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-le
198
 
199
  If you would like to support me:
200
 
201
- [☕ Buy Me a Coffee](https://www.buymeacoffee.com/weyaxi)
 
1
  ---
2
  license: cc-by-nc-4.0
3
+ tags:
4
+ - merge
5
  model-index:
6
  - name: Seraph-7B
7
  results:
 
20
  value: 67.83
21
  name: normalized accuracy
22
  source:
23
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Seraph-7B
 
24
  name: Open LLM Leaderboard
25
  - task:
26
  type: text-generation
 
36
  value: 86.22
37
  name: normalized accuracy
38
  source:
39
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Seraph-7B
 
40
  name: Open LLM Leaderboard
41
  - task:
42
  type: text-generation
 
53
  value: 65.07
54
  name: accuracy
55
  source:
56
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Seraph-7B
 
57
  name: Open LLM Leaderboard
58
  - task:
59
  type: text-generation
 
69
  - type: mc2
70
  value: 59.49
71
  source:
72
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Seraph-7B
 
73
  name: Open LLM Leaderboard
74
  - task:
75
  type: text-generation
 
86
  value: 80.66
87
  name: accuracy
88
  source:
89
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Seraph-7B
 
90
  name: Open LLM Leaderboard
91
  - task:
92
  type: text-generation
 
103
  value: 71.87
104
  name: accuracy
105
  source:
106
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Seraph-7B
 
107
  name: Open LLM Leaderboard
 
 
108
  ---
109
 
110
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/ddzjZ1irvtLcDRCWei9vQ.png)
 
192
 
193
  If you would like to support me:
194
 
195
+ [☕ Buy Me a Coffee](https://www.buymeacoffee.com/weyaxi)
196
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
197
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__Seraph-7B)
198
+
199
+ | Metric |Value|
200
+ |---------------------------------|----:|
201
+ |Avg. |71.86|
202
+ |AI2 Reasoning Challenge (25-Shot)|67.83|
203
+ |HellaSwag (10-Shot) |86.22|
204
+ |MMLU (5-Shot) |65.07|
205
+ |TruthfulQA (0-shot) |59.49|
206
+ |Winogrande (5-shot) |80.66|
207
+ |GSM8k (5-shot) |71.87|
208
+
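
As a quick sanity check on the table this PR adds, the sketch below recomputes the `Avg.` row from the six benchmark scores. It assumes `Avg.` is the unweighted mean of those six values rounded to two decimals; the dictionary name `scores` and its keys are illustrative and not part of the PR.

```python
# Benchmark scores copied from the table added to README.md in this PR.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 67.83,
    "HellaSwag (10-Shot)": 86.22,
    "MMLU (5-Shot)": 65.07,
    "TruthfulQA (0-shot)": 59.49,
    "Winogrande (5-shot)": 80.66,
    "GSM8k (5-shot)": 71.87,
}

# Assumption: "Avg." is the plain arithmetic mean, rounded to two decimals.
average = round(sum(scores.values()) / len(scores), 2)
print(average)
```

Under that assumption the result matches the 71.86 reported in the `Avg.` row.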