Adding the Open Portuguese LLM Leaderboard Evaluation Results

#3
Files changed (1)
  1. README.md +22 -3
README.md CHANGED
@@ -1,11 +1,11 @@
  ---
- base_model: winglian/m12b-20240721-test010
+ license: apache-2.0
  tags:
  - generated_from_trainer
+ base_model: winglian/m12b-20240721-test010
  model-index:
  - name: outputs/simpo-out
    results: []
- license: apache-2.0
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -153,4 +153,23 @@ The following hyperparameters were used during training:
  - Transformers 4.43.1
  - Pytorch 2.3.1+cu121
  - Datasets 2.19.1
- - Tokenizers 0.19.1
+ - Tokenizers 0.19.1
+
+
+ # Open Portuguese LLM Leaderboard Evaluation Results
+
+ Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/axolotl-ai-co/romulus-mistral-nemo-12b-simpo) and on the [πŸš€ Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+
+ | Metric | Value |
+ |--------------------------|---------|
+ |Average |**71.97**|
+ |ENEM Challenge (No Images)| 70.96|
+ |BLUEX (No Images) | 60.78|
+ |OAB Exams | 53.62|
+ |Assin2 RTE | 90.52|
+ |Assin2 STS | 78.70|
+ |FaQuAD NLI | 68.05|
+ |HateBR Binary | 84.42|
+ |PT Hate Speech Binary | 71.65|
+ |tweetSentBR | 69.05|
+
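
For readers who want to work with these numbers, the sketch below pulls the raw per-task result files from the dataset repo linked in the new section and recomputes the headline average as the unweighted mean of the nine task scores. The mean-of-tasks aggregation is an assumption (it does reproduce the reported 71.97); the scores are copied from the table above, and the repo/subfolder path is taken from the link.

```python
from huggingface_hub import snapshot_download

# Download only the raw result files for this model from the leaderboard's
# results dataset (path taken from the link in the README section above).
local_dir = snapshot_download(
    repo_id="eduagarcia-temp/llm_pt_leaderboard_raw_results",
    repo_type="dataset",
    allow_patterns="axolotl-ai-co/romulus-mistral-nemo-12b-simpo/*",
)
print("raw results downloaded to:", local_dir)

# Per-task scores copied from the table in the PR.
scores = {
    "ENEM Challenge (No Images)": 70.96,
    "BLUEX (No Images)": 60.78,
    "OAB Exams": 53.62,
    "Assin2 RTE": 90.52,
    "Assin2 STS": 78.70,
    "FaQuAD NLI": 68.05,
    "HateBR Binary": 84.42,
    "PT Hate Speech Binary": 71.65,
    "tweetSentBR": 69.05,
}

# Unweighted mean over the nine tasks; rounds to the reported 71.97.
average = sum(scores.values()) / len(scores)
print(f"Average: {average:.2f}")
```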