leaderboard-pr-bot committed on
Commit
e094a15
1 Parent(s): 2cbe1e9

Adding Evaluation Results


This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +118 -2
README.md CHANGED
@@ -1,11 +1,114 @@
  ---
- license: cc-by-nc-4.0
  language:
  - en
  tags:
  - mixtral
  - uncensored
  - high-intelligence
  ---

  # Orochi (Alternate Version)
@@ -39,4 +142,17 @@ As an uncensored model, Orochi may generate content that is unsuitable for all a

  Orochi is a product of numerous contributions from the fields of machine learning and language modeling. Special thanks to the teams behind Mixtral, mergekit, and all the individual models integrated into Orochi.

- ---
  ---
  language:
  - en
+ license: cc-by-nc-4.0
  tags:
  - mixtral
  - uncensored
  - high-intelligence
+ model-index:
+ - name: MixtralOrochi8x7B-Alt
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: acc_norm
+       value: 67.92
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=smelborp/MixtralOrochi8x7B-Alt
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: acc_norm
+       value: 86.25
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=smelborp/MixtralOrochi8x7B-Alt
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 70.06
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=smelborp/MixtralOrochi8x7B-Alt
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: mc2
+       value: 64.03
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=smelborp/MixtralOrochi8x7B-Alt
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 80.03
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=smelborp/MixtralOrochi8x7B-Alt
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 0.0
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=smelborp/MixtralOrochi8x7B-Alt
+       name: Open LLM Leaderboard
  ---

  # Orochi (Alternate Version)
 

  Orochi is a product of numerous contributions from the fields of machine learning and language modeling. Special thanks to the teams behind Mixtral, mergekit, and all the individual models integrated into Orochi.

+ ---
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_smelborp__MixtralOrochi8x7B-Alt)
+
+ |             Metric              |Value|
+ |---------------------------------|----:|
+ |Avg.                             |61.38|
+ |AI2 Reasoning Challenge (25-Shot)|67.92|
+ |HellaSwag (10-Shot)              |86.25|
+ |MMLU (5-Shot)                    |70.06|
+ |TruthfulQA (0-shot)              |64.03|
+ |Winogrande (5-shot)              |80.03|
+ |GSM8k (5-shot)                   | 0.00|
+
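
The `Avg.` row in the results table appears to be the unweighted mean of the six benchmark scores. The sketch below checks that arithmetic in Python. The `scores` dictionary and the `leaderboard_average` helper are illustrative names, not part of this PR or of any leaderboard tooling, and the rounding rule (two decimals) is an assumption based on how the table is printed.

```python
# Benchmark scores copied from the results table in this PR.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 67.92,
    "HellaSwag (10-Shot)": 86.25,
    "MMLU (5-Shot)": 70.06,
    "TruthfulQA (0-shot)": 64.03,
    "Winogrande (5-shot)": 80.03,
    "GSM8k (5-shot)": 0.00,
}


def leaderboard_average(values):
    """Return the unweighted mean, rounded to two decimals.

    Assumption: every benchmark counts equally, including a score of 0.
    """
    values = list(values)
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(f"Avg. = {average:.2f}")
```

Under these assumptions the computed value matches the 61.38 reported in the table. The GSM8k score of 0.00 is included as a full term in the mean.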