Adding Evaluation Results

#1
Files changed (1)
  1. README.md +118 -1
README.md CHANGED
@@ -1,4 +1,5 @@
  ---
  tags:
  - merge
  - mergekit
@@ -14,7 +15,109 @@ base_model:
  - MSL7/INEX12-7b
  - automerger/YamShadow-7B
  - Kukedlc/NeuralSirKrishna-7b
- license: apache-2.0
  ---

  # NeuralArjuna-7B-DT
@@ -158,3 +261,17 @@ concrete and the abstract, the empirical and the speculative. While the journey
  we must continue to explore these frontiers, drawing upon the rich tapestry of human knowledge, in the hope of forging a more comprehensive narrative of our cosmos and
  our place within it.
  ```
  ---
+ license: apache-2.0
  tags:
  - merge
  - mergekit
 
  - MSL7/INEX12-7b
  - automerger/YamShadow-7B
  - Kukedlc/NeuralSirKrishna-7b
+ model-index:
+ - name: NeuralArjuna-7B-DT
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: acc_norm
+       value: 73.12
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Kukedlc/NeuralArjuna-7B-DT
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: acc_norm
+       value: 88.97
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Kukedlc/NeuralArjuna-7B-DT
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 64.63
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Kukedlc/NeuralArjuna-7B-DT
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: mc2
+       value: 76.68
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Kukedlc/NeuralArjuna-7B-DT
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 85.24
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Kukedlc/NeuralArjuna-7B-DT
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 70.81
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Kukedlc/NeuralArjuna-7B-DT
+       name: Open LLM Leaderboard
  ---

  # NeuralArjuna-7B-DT
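
The `model-index` block added in this hunk repeats one shape six times: a `task`, a `dataset`, a list of `metrics`, and a `source`. As a rough illustration, the first entry can be written as a nested Python dictionary and checked for the fields a consumer would most likely need. The nesting follows the usual model card layout, and the set of fields treated as mandatory is an assumption made for this sketch, not a statement of what the Hub enforces; `missing_fields` is a hypothetical helper, not part of any library.

```python
# First model-index result from the diff, written as a nested dict.
# The nesting is assumed from the usual model card metadata layout.
entry = {
    "task": {"type": "text-generation", "name": "Text Generation"},
    "dataset": {
        "name": "AI2 Reasoning Challenge (25-Shot)",
        "type": "ai2_arc",
        "config": "ARC-Challenge",
        "split": "test",
        "args": {"num_few_shot": 25},
    },
    "metrics": [
        {"type": "acc_norm", "value": 73.12, "name": "normalized accuracy"}
    ],
    "source": {
        "url": "https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Kukedlc/NeuralArjuna-7B-DT",
        "name": "Open LLM Leaderboard",
    },
}


def missing_fields(result: dict) -> list[str]:
    """Return dotted paths of fields this sketch treats as mandatory but absent.

    Hypothetical helper: the choice of mandatory fields is an assumption.
    """
    missing = []
    for section, key in (("task", "type"), ("dataset", "type"), ("dataset", "name")):
        if key not in result.get(section, {}):
            missing.append(f"{section}.{key}")
    metrics = result.get("metrics", [])
    if not metrics:
        missing.append("metrics")
    for i, metric in enumerate(metrics):
        for key in ("type", "value"):
            if key not in metric:
                missing.append(f"metrics[{i}].{key}")
    return missing


print(missing_fields(entry))  # prints []
```

Under these assumptions the metric `name` is optional, which matches the TruthfulQA entry in the diff: it carries `type: mc2` and a `value` but no `name`.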
 
  we must continue to explore these frontiers, drawing upon the rich tapestry of human knowledge, in the hope of forging a more comprehensive narrative of our cosmos and
  our place within it.
  ```
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Kukedlc__NeuralArjuna-7B-DT)
+
+ |             Metric              |Value|
+ |---------------------------------|----:|
+ |Avg.                             |76.58|
+ |AI2 Reasoning Challenge (25-Shot)|73.12|
+ |HellaSwag (10-Shot)              |88.97|
+ |MMLU (5-Shot)                    |64.63|
+ |TruthfulQA (0-shot)              |76.68|
+ |Winogrande (5-shot)              |85.24|
+ |GSM8k (5-shot)                   |70.81|
+
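
The `Avg.` row in the table appears to be the unweighted mean of the six benchmark scores, rounded to two decimal places. The sketch below checks that reading against the numbers in the diff. The equal weighting and the half-up rounding are assumptions, and `mean_score` is a hypothetical helper. `Decimal` is used instead of `float` so that the midpoint value 76.575 rounds predictably.

```python
from decimal import Decimal, ROUND_HALF_UP

# Scores copied from the table in the diff.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": Decimal("73.12"),
    "HellaSwag (10-Shot)": Decimal("88.97"),
    "MMLU (5-Shot)": Decimal("64.63"),
    "TruthfulQA (0-shot)": Decimal("76.68"),
    "Winogrande (5-shot)": Decimal("85.24"),
    "GSM8k (5-shot)": Decimal("70.81"),
}


def mean_score(values) -> Decimal:
    """Unweighted mean, rounded half-up to two decimals (assumed convention)."""
    values = list(values)
    total = sum(values, Decimal("0"))
    return (total / len(values)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


average = mean_score(scores.values())
print(average)  # prints 76.58
```

The scores sum to 459.45, so the mean is exactly 76.575, which rounds to the 76.58 reported in the table.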