Files changed (1): README.md (+115 -12)
README.md CHANGED
@@ -1,5 +1,14 @@
 ---
 license: osl-3.0
+widget:
+- example_title: वर्तमान प्रधानमंत्री
+  messages:
+  - role: user
+    content: भारत के वर्तमान प्रधानमंत्री कौन हैं?
+- example_title: होली का महत्व
+  messages:
+  - role: user
+    content: होली का महत्व क्या है?
 model-index:
 - name: indus_1.175B
   results:
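The widget block added in this hunk registers two Hindi chat examples for the Hub inference widget: "भारत के वर्तमान प्रधानमंत्री कौन हैं?" ("Who is the current Prime Minister of India?") and "होली का महत्व क्या है?" ("What is the significance of Holi?"). A minimal sketch of sending the first prompt to the model with the `transformers` text-generation pipeline is shown below; the repo id is copied from the leaderboard query URL in this card and may not be the actual hosting repo, and the generation settings are illustrative only.

```python
from transformers import pipeline

# Assumed repo id (taken from the leaderboard query URL in this card);
# replace with the actual model repository if it differs.
generator = pipeline("text-generation", model="nickmalhotra/indus_1.175B")

# Same prompt as the first widget example:
# "Who is the current Prime Minister of India?"
prompt = "भारत के वर्तमान प्रधानमंत्री कौन हैं?"

# Greedy decoding with a short completion; tune these for real use.
output = generator(prompt, max_new_tokens=64, do_sample=False)
print(output[0]["generated_text"])
```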
 
@@ -103,17 +112,98 @@ model-index:
     source:
       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=nickmalhotra/indus_1.175B
       name: Open LLM Leaderboard
-widget:
-- example_title: वर्तमान प्रधानमंत्री
-  messages:
-  - role: user
-    content: >-
-      भारत के वर्तमान प्रधानमंत्री कौन हैं?
-- example_title: होली का महत्व
-  messages:
-  - role: user
-    content: >-
-      होली का महत्व क्या है?
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: IFEval (0-Shot)
+      type: HuggingFaceH4/ifeval
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: inst_level_strict_acc and prompt_level_strict_acc
+      value: 24.4
+      name: strict accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=nbrahme/IndusQ
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: BBH (3-Shot)
+      type: BBH
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc_norm
+      value: 3.75
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=nbrahme/IndusQ
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MATH Lvl 5 (4-Shot)
+      type: hendrycks/competition_math
+      args:
+        num_few_shot: 4
+    metrics:
+    - type: exact_match
+      value: 0.0
+      name: exact match
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=nbrahme/IndusQ
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GPQA (0-shot)
+      type: Idavidrein/gpqa
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc_norm
+      value: 2.01
+      name: acc_norm
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=nbrahme/IndusQ
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MuSR (0-shot)
+      type: TAUR-Lab/MuSR
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc_norm
+      value: 2.25
+      name: acc_norm
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=nbrahme/IndusQ
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU-PRO (5-shot)
+      type: TIGER-Lab/MMLU-Pro
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 1.34
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=nbrahme/IndusQ
+      name: Open LLM Leaderboard
 ---
 
 # Model Card for Project Indus
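The model-index entries added in this hunk mirror the Open LLM Leaderboard task set (IFEval, BBH, MATH Lvl 5, GPQA, MuSR, MMLU-PRO), each pointing back to the leaderboard as its source. A hedged sketch of running a comparable local evaluation with lm-evaluation-harness is below; the `leaderboard_*` task names, the repo id, and the exact `simple_evaluate` arguments are assumptions that can vary across harness versions, so this is not the leaderboard's official pipeline.

```python
# pip install lm-eval  (EleutherAI lm-evaluation-harness)
import lm_eval

# Sketch only: the task names and model repo id are assumptions;
# the official leaderboard runs its own pinned harness configuration.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=nickmalhotra/indus_1.175B",
    tasks=["leaderboard_ifeval", "leaderboard_bbh"],
)

# Per-task metrics (e.g. strict accuracy for IFEval, acc_norm for BBH).
print(results["results"])
```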
 
@@ -474,4 +564,17 @@ Project Indus LLM is designed as a foundational model suitable for further devel
 - **Curate Targeted Data**: Ensure the training data is relevant and of high quality to improve model performance.
 - **Continuous Evaluation**: Regularly assess the model's performance during and after fine-tuning to maintain accuracy and reduce biases.
 
-This disclaimer aims to provide users with a clear understanding of the model's capabilities and limitations, facilitating its effective application and development.
+This disclaimer aims to provide users with a clear understanding of the model's capabilities and limitations, facilitating its effective application and development.
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_nbrahme__IndusQ)
+
+| Metric              | Value |
+|---------------------|------:|
+| Avg.                |  5.62 |
+| IFEval (0-Shot)     | 24.40 |
+| BBH (3-Shot)        |  3.75 |
+| MATH Lvl 5 (4-Shot) |  0.00 |
+| GPQA (0-shot)       |  2.01 |
+| MuSR (0-shot)       |  2.25 |
+| MMLU-PRO (5-shot)   |  1.34 |
+
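The Avg. row in the summary table added above is consistent with the arithmetic mean of the six benchmark scores below it; a quick check with the values copied from the table:

```python
# Scores copied from the summary table above.
scores = {
    "IFEval (0-Shot)": 24.40,
    "BBH (3-Shot)": 3.75,
    "MATH Lvl 5 (4-Shot)": 0.00,
    "GPQA (0-shot)": 2.01,
    "MuSR (0-shot)": 2.25,
    "MMLU-PRO (5-shot)": 1.34,
}

# Mean of the six scores: 33.75 / 6 = 5.625, reported as 5.62.
print(round(sum(scores.values()) / len(scores), 2))
```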