lamhieu leaderboard-pr-bot committed on
Commit
3fc77d3
1 Parent(s): 4814575

Adding Evaluation Results (#1)


- Adding Evaluation Results (d4540048eab11a55d6e3381ed8d41d34277bd896)


Co-authored-by: Open LLM Leaderboard PR Bot <leaderboard-pr-bot@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +198 -74
README.md CHANGED
@@ -1,83 +1,194 @@
  ---
- library_name: transformers
- license: mit
  language:
  - en
  - vi
- pipeline_tag: text-generation
- base_model: HuggingFaceH4/zephyr-7b-beta
  tags:
  - ghost
- model-index:
- - name: lamhieu/ghost-7b-v0.9.0
-   results:
-   - task:
-       type: text-generation
-     dataset:
-       type: vmlu_v1.5
-       name: VMLU
-     metrics:
-     - name: Average
-       type: avg
-       value: 36.06
-       verified: true
-     - name: STEM
-       type: stem
-       value: 33.54
-       verified: true
-     - name: Social science
-       type: ss
-       value: 38.74
-       verified: true
-     - name: Humanities
-       type: hm
-       value: 37.15
-       verified: true
-     - name: Other
-       type: ot
-       value: 36.78
-       verified: true
-   - task:
-       type: text-generation
-     dataset:
-       type: open_llm_leaderboard
-       name: Open LLM Leaderboard
-     source:
-       name: Open LLM Leaderboard
-       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lamhieu/ghost-7b-v0.9.0
-     metrics:
-     - name: Average
-       type: avg
-       value: 56.89
-       verified: true
-     - name: ARC
-       type: arc
-       value: 53.07
-       verified: true
-     - name: HellaSwag
-       type: hs
-       value: 77.93
-       verified: true
-     - name: HellaSwag
-       type: hs
-       value: 77.93
-       verified: true
-     - name: MMLU
-       type: mmlu
-       value: 55.09
-       verified: true
-     - name: Winogrande
-       type: wg
-       value: 73.72
-       verified: true
-     - name: GSM8K
-       type: gsm8k
-       value: 33.74
-       verified: true
  widget:
- - text: "<|system|>\nYou are a helpful assistant.</s>\n<|user|>\nThông tin về Peristernia despecta</s>\n<|assistant|>\n"
-   output:
-     text: "Peristernia despecta một loài ốc biển, là động vật thân mềm chân bụng sống ở biển trong họ Fasciolariidae."
  ---

  # Model Card for Model ID
@@ -291,4 +402,17 @@ Many thanks for

  ## Model Card Contact

- **Lam H** (lamhieu.vk@gmail.com)

  ---
  language:
  - en
  - vi
+ license: mit
+ library_name: transformers
  tags:
  - ghost
+ pipeline_tag: text-generation
+ base_model: HuggingFaceH4/zephyr-7b-beta
  widget:
+ - text: '<|system|>
+
+     You are a helpful assistant.</s>
+
+     <|user|>
+
+     Thông tin về Peristernia despecta</s>
+
+     <|assistant|>
+
+     '
+   output:
+     text: Peristernia despecta là một loài ốc biển, là động vật thân mềm chân bụng
+       sống ở biển trong họ Fasciolariidae.
+ model-index:
+ - name: lamhieu/ghost-7b-v0.9.0
+   results:
+   - task:
+       type: text-generation
+     dataset:
+       name: VMLU
+       type: vmlu_v1.5
+     metrics:
+     - type: avg
+       value: 36.06
+       name: Average
+       verified: true
+     - type: stem
+       value: 33.54
+       name: STEM
+       verified: true
+     - type: ss
+       value: 38.74
+       name: Social science
+       verified: true
+     - type: hm
+       value: 37.15
+       name: Humanities
+       verified: true
+     - type: ot
+       value: 36.78
+       name: Other
+       verified: true
+   - task:
+       type: text-generation
+     dataset:
+       name: Open LLM Leaderboard
+       type: open_llm_leaderboard
+     metrics:
+     - type: avg
+       value: 56.89
+       name: Average
+       verified: true
+     - type: arc
+       value: 53.07
+       name: ARC
+       verified: true
+     - type: hs
+       value: 77.93
+       name: HellaSwag
+       verified: true
+     - type: hs
+       value: 77.93
+       name: HellaSwag
+       verified: true
+     - type: mmlu
+       value: 55.09
+       name: MMLU
+       verified: true
+     - type: wg
+       value: 73.72
+       name: Winogrande
+       verified: true
+     - type: gsm8k
+       value: 33.74
+       name: GSM8K
+       verified: true
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lamhieu/ghost-7b-v0.9.0
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: acc_norm
+       value: 53.07
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lamhieu/ghost-7b-v0.9.0
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: acc_norm
+       value: 77.93
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lamhieu/ghost-7b-v0.9.0
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 55.09
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lamhieu/ghost-7b-v0.9.0
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: mc2
+       value: 47.79
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lamhieu/ghost-7b-v0.9.0
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 73.72
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lamhieu/ghost-7b-v0.9.0
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 33.74
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lamhieu/ghost-7b-v0.9.0
+       name: Open LLM Leaderboard
  ---

  # Model Card for Model ID
 

  ## Model Card Contact

+ **Lam H** (lamhieu.vk@gmail.com)
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_lamhieu__ghost-7b-v0.9.0)
+
+ |             Metric              |Value|
+ |---------------------------------|----:|
+ |Avg.                             |56.89|
+ |AI2 Reasoning Challenge (25-Shot)|53.07|
+ |HellaSwag (10-Shot)              |77.93|
+ |MMLU (5-Shot)                    |55.09|
+ |TruthfulQA (0-shot)              |47.79|
+ |Winogrande (5-shot)              |73.72|
+ |GSM8k (5-shot)                   |33.74|
+
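
The `Avg.` row in the added results table appears to be the unweighted mean of the six benchmark scores listed beneath it. The commit does not state how the average is computed, so the following is only a minimal sketch to check that reading; the `scores` dictionary and the `average` variable are illustrative names, not part of the commit.

```python
# Scores copied from the results table added to README.md in this commit.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 53.07,
    "HellaSwag (10-Shot)": 77.93,
    "MMLU (5-Shot)": 55.09,
    "TruthfulQA (0-shot)": 47.79,
    "Winogrande (5-shot)": 73.72,
    "GSM8k (5-shot)": 33.74,
}

# Assumption: "Avg." is the plain arithmetic mean, rounded to two decimals.
average = round(sum(scores.values()) / len(scores), 2)

print(f"Avg. = {average:.2f}")  # prints "Avg. = 56.89"
```

Under that assumption the computed value matches the 56.89 reported in the table. Note that the hand-written `open_llm_leaderboard` summary block in the front matter omits TruthfulQA, so averaging only its listed metrics would not reproduce this figure.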