Wanfq committed on
Commit
246adf7
1 Parent(s): 2df4cd6

Update README.md

Files changed (1)
  1. README.md +132 -1
README.md CHANGED
@@ -2,6 +2,9 @@
2
  license: apache-2.0
3
  language:
4
  - en
5
  pipeline_tag: text-generation
6
  tags:
7
  - mistral
@@ -10,6 +13,121 @@ tags:
10
  - model-fusion
11
  - fusechat
12
  library_name: transformers
13
  ---
14
  <p align="center" width="100%">
15
  </p>
@@ -412,4 +530,17 @@ If you find this work is relevant with your research or applications, please fee
412
  journal={arXiv preprint arXiv:2402.16107},
413
  year={2024}
414
  }
415
- ```
2
  license: apache-2.0
3
  language:
4
  - en
5
+ base_model: openchat/openchat_3.5
6
+ datasets:
7
+ - FuseAI/FuseChat-Mixture
8
  pipeline_tag: text-generation
9
  tags:
10
  - mistral
 
13
  - model-fusion
14
  - fusechat
15
  library_name: transformers
16
+ model-index:
17
+ - name: OpenChat-3.5-7B-Mixtral
18
+   results:
19
+   - task:
20
+       type: text-generation
21
+       name: Text Generation
22
+     dataset:
23
+       name: MT-Bench
24
+       type: unknown
25
+     metrics:
26
+     - type: unknown
27
+       value: 8.08
28
+       name: score
29
+     source:
30
+       url: https://huggingface.co/spaces/lmsys/mt-bench
31
+   - task:
32
+       type: text-generation
33
+       name: Text Generation
34
+     dataset:
35
+       name: AI2 Reasoning Challenge (25-Shot)
36
+       type: ai2_arc
37
+       config: ARC-Challenge
38
+       split: test
39
+       args:
40
+         num_few_shot: 25
41
+     metrics:
42
+     - type: acc_norm
43
+       value: 62.8
44
+       name: normalized accuracy
45
+     source:
46
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=FuseAI/OpenChat-3.5-7B-Mixtral
47
+       name: Open LLM Leaderboard
48
+   - task:
49
+       type: text-generation
50
+       name: Text Generation
51
+     dataset:
52
+       name: HellaSwag (10-Shot)
53
+       type: hellaswag
54
+       split: validation
55
+       args:
56
+         num_few_shot: 10
57
+     metrics:
58
+     - type: acc_norm
59
+       value: 84.24
60
+       name: normalized accuracy
61
+     source:
62
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=FuseAI/OpenChat-3.5-7B-Mixtral
63
+       name: Open LLM Leaderboard
64
+   - task:
65
+       type: text-generation
66
+       name: Text Generation
67
+     dataset:
68
+       name: MMLU (5-Shot)
69
+       type: cais/mmlu
70
+       config: all
71
+       split: test
72
+       args:
73
+         num_few_shot: 5
74
+     metrics:
75
+     - type: acc
76
+       value: 63.95
77
+       name: accuracy
78
+     source:
79
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=FuseAI/OpenChat-3.5-7B-Mixtral
80
+       name: Open LLM Leaderboard
81
+   - task:
82
+       type: text-generation
83
+       name: Text Generation
84
+     dataset:
85
+       name: TruthfulQA (0-shot)
86
+       type: truthful_qa
87
+       config: multiple_choice
88
+       split: validation
89
+       args:
90
+         num_few_shot: 0
91
+     metrics:
92
+     - type: mc2
93
+       value: 45.68
94
+     source:
95
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=FuseAI/OpenChat-3.5-7B-Mixtral
96
+       name: Open LLM Leaderboard
97
+   - task:
98
+       type: text-generation
99
+       name: Text Generation
100
+     dataset:
101
+       name: Winogrande (5-shot)
102
+       type: winogrande
103
+       config: winogrande_xl
104
+       split: validation
105
+       args:
106
+         num_few_shot: 5
107
+     metrics:
108
+     - type: acc
109
+       value: 79.64
110
+       name: accuracy
111
+     source:
112
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=FuseAI/OpenChat-3.5-7B-Mixtral
113
+       name: Open LLM Leaderboard
114
+   - task:
115
+       type: text-generation
116
+       name: Text Generation
117
+     dataset:
118
+       name: GSM8k (5-shot)
119
+       type: gsm8k
120
+       config: main
121
+       split: test
122
+       args:
123
+         num_few_shot: 5
124
+     metrics:
125
+     - type: acc
126
+       value: 62.09
127
+       name: accuracy
128
+     source:
129
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=FuseAI/OpenChat-3.5-7B-Mixtral
130
+       name: Open LLM Leaderboard
131
  ---
132
  <p align="center" width="100%">
133
  </p>
 
530
  journal={arXiv preprint arXiv:2402.16107},
531
  year={2024}
532
  }
533
+ ```
534
+
535
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
536
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_FuseAI__OpenChat-3.5-7B-Mixtral)
537
+
538
+ | Metric |Value|
539
+ |---------------------------------|----:|
540
+ |Avg. |66.40|
541
+ |AI2 Reasoning Challenge (25-Shot)|62.80|
542
+ |HellaSwag (10-Shot) |84.24|
543
+ |MMLU (5-Shot) |63.95|
544
+ |TruthfulQA (0-shot) |45.68|
545
+ |Winogrande (5-shot) |79.64|
546
+ |GSM8k (5-shot) |62.09|
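
The `Avg.` row in the table above appears to be the unweighted mean of the six benchmark scores. That is an assumption, not something the commit states, but the numbers are consistent with it. A minimal sketch to check it follows; the `scores` dictionary and the `leaderboard_average` helper are hypothetical names introduced here, not part of the README or of any leaderboard tooling. The MT-Bench score (8.08) is left out because it is on a different scale and is not listed in the table.

```python
# Scores copied from the Open LLM Leaderboard table added in this commit.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 62.80,
    "HellaSwag (10-Shot)": 84.24,
    "MMLU (5-Shot)": 63.95,
    "TruthfulQA (0-shot)": 45.68,
    "Winogrande (5-shot)": 79.64,
    "GSM8k (5-shot)": 62.09,
}


def leaderboard_average(values):
    """Return the unweighted arithmetic mean, rounded to two decimals.

    Assumption: the leaderboard's "Avg." is a plain mean of the six scores.
    """
    values = list(values)
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(f"Avg. {average:.2f}")  # prints "Avg. 66.40"
```

The result matches the `66.40` reported in the table, which supports the plain-mean reading.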