Aarifkhan leaderboard-pr-bot committed on
Commit
f558892
•
1 Parent(s): 02f191d

Adding Evaluation Results (#1)

- Adding Evaluation Results (67d567c98dae94ade9bb4db9c6333afb934d97c3)


Co-authored-by: Open LLM Leaderboard PR Bot <leaderboard-pr-bot@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +112 -4
README.md CHANGED
@@ -1,8 +1,5 @@
 ---
 license: other
-license_name: helpingai
-license_link: LICENSE.md
-pipeline_tag: text-generation
 tags:
 - HelpingAI
 - Emotionally Intelligent
@@ -11,6 +8,104 @@ datasets:
 - OEvortex/SentimentSynth
 - OEvortex/EmotionalIntelligence-75k
 - Abhaykoul/Emotion
+license_name: helpingai
+license_link: LICENSE.md
+pipeline_tag: text-generation
+model-index:
+- name: HelpingAI-15B
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: IFEval (0-Shot)
+      type: HuggingFaceH4/ifeval
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: inst_level_strict_acc and prompt_level_strict_acc
+      value: 20.3
+      name: strict accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=OEvortex/HelpingAI-15B
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: BBH (3-Shot)
+      type: BBH
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc_norm
+      value: 1.82
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=OEvortex/HelpingAI-15B
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MATH Lvl 5 (4-Shot)
+      type: hendrycks/competition_math
+      args:
+        num_few_shot: 4
+    metrics:
+    - type: exact_match
+      value: 0.0
+      name: exact match
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=OEvortex/HelpingAI-15B
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GPQA (0-shot)
+      type: Idavidrein/gpqa
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc_norm
+      value: 1.01
+      name: acc_norm
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=OEvortex/HelpingAI-15B
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MuSR (0-shot)
+      type: TAUR-Lab/MuSR
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc_norm
+      value: 2.73
+      name: acc_norm
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=OEvortex/HelpingAI-15B
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU-PRO (5-shot)
+      type: TIGER-Lab/MMLU-Pro
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 1.24
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=OEvortex/HelpingAI-15B
+      name: Open LLM Leaderboard
 ---
 
 # HelpingAI-15B: Emotionally Intelligent Conversational AI
@@ -166,4 +261,17 @@ Remember, it's important to choose devices that fit your specific needs and budg
 
 Also, make sure to test and maintain these devices regularly to ensure they're in working order. 🔧
 
-If you have any specific questions about any of these devices, feel free to ask! I'm here to help you stay safe and secure! 🌈
+If you have any specific questions about any of these devices, feel free to ask! I'm here to help you stay safe and secure! 🌈
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_OEvortex__HelpingAI-15B)
+
+| Metric            |Value|
+|-------------------|----:|
+|Avg.               | 4.52|
+|IFEval (0-Shot)    |20.30|
+|BBH (3-Shot)       | 1.82|
+|MATH Lvl 5 (4-Shot)| 0.00|
+|GPQA (0-shot)      | 1.01|
+|MuSR (0-shot)      | 2.73|
+|MMLU-PRO (5-shot)  | 1.24|
+
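
The Avg. row in the added table is the unweighted mean of the six benchmark scores. A minimal sanity check, with the values copied from the table above (the two-decimal rounding is an assumption about the leaderboard's convention):

```python
# Quick arithmetic check of the Avg. row, using the six scores from the table above.
scores = {
    "IFEval (0-Shot)": 20.30,
    "BBH (3-Shot)": 1.82,
    "MATH Lvl 5 (4-Shot)": 0.00,
    "GPQA (0-shot)": 1.01,
    "MuSR (0-shot)": 2.73,
    "MMLU-PRO (5-shot)": 1.24,
}
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 4.52, matching the reported Avg.
```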
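Because the same scores are encoded in the `model-index` block added to the front matter, they can also be read back programmatically. A sketch, assuming a recent `huggingface_hub` release, network access, and that the merged card is published under `OEvortex/HelpingAI-15B` (the repo id is taken from the leaderboard query URLs in the diff):

```python
# Sketch: read the eval results that the model-index metadata above encodes.
# Assumes a recent huggingface_hub release; the repo id is taken from the
# leaderboard query URLs in the diff, not verified here.
from huggingface_hub import ModelCard

card = ModelCard.load("OEvortex/HelpingAI-15B")

# card.data.eval_results is populated from the model-index YAML block.
for result in card.data.eval_results or []:
    print(f"{result.dataset_name}: {result.metric_value} ({result.metric_type})")
```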