nicholasKluge committed on
Commit
11fe9b3
1 Parent(s): 8d2d0de

Update README.md

Files changed (1)
  1. README.md +3 -22
README.md CHANGED
@@ -281,9 +281,9 @@ Evaluations on benchmarks were performed using the [Language Model Evaluation Ha

  Evaluations on Brazilian Portuguese benchmarks were performed using a [Portuguese implementation of the EleutherAI LM Evaluation Harness](https://github.com/eduagarcia/lm-evaluation-harness-pt) (created by [Eduardo Garcia](https://github.com/eduagarcia/lm-evaluation-harness-pt)).

- | | **ASSIN2 RTE** | **ASSIN2 STS** | **BLUEX** | **ENEM** | **FAQUAD NLI** | **HateBR** | **OAB Exams** |
- |-----------------------|----------------|----------------|-----------|----------|----------------|------------|---------------|
- | **Mula-4x160-v0.1** | 33.55 | 8.88 | 20.58 | 20.08 | 43.97 | 33.65 | 22.92 |

  ## Cite as 🤗

@@ -301,22 +301,3 @@ Evaluations on Brazilian Portuguese benchmarks were performed using a [Portugues
  ## License

  Mula-4x160-v0.1 is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.
-
-
- # Open Portuguese LLM Leaderboard Evaluation Results
-
- Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/MulaBR/Mula-4x160-v0.1) and on the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
-
- | Metric | Value |
- |--------------------------|---------|
- |Average |**26.24**|
- |ENEM Challenge (No Images)| 21.34|
- |BLUEX (No Images) | 25.17|
- |OAB Exams | 25.06|
- |Assin2 RTE | 33.57|
- |Assin2 STS | 11.35|
- |FaQuAD NLI | 43.97|
- |HateBR Binary | 41.50|
- |PT Hate Speech Binary | 22.99|
- |tweetSentBR | 11.24|
-
 

  Evaluations on Brazilian Portuguese benchmarks were performed using a [Portuguese implementation of the EleutherAI LM Evaluation Harness](https://github.com/eduagarcia/lm-evaluation-harness-pt) (created by [Eduardo Garcia](https://github.com/eduagarcia/lm-evaluation-harness-pt)).

+ | | **ASSIN2 RTE** | **ASSIN2 STS** | **BLUEX** | **ENEM** | **FAQUAD NLI** | **HateBR** | **OAB Exams** | **TweetSentBR** |
+ |-----------------------|----------------|----------------|-----------|----------|----------------|------------|---------------|-----------------|
+ | **Mula-4x160-v0.1** | 33.57 | 11.35 | 25.17 | 21.34 | 43.97 | 41.50 | 25.06 | 11.24 |

  ## Cite as 🤗


  ## License

  Mula-4x160-v0.1 is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.
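
The removed leaderboard section reported an Average of 26.24. Assuming that figure is the unweighted mean of the nine per-task scores listed in that table (the diff itself does not state how it was computed), it can be reproduced with a short Python sketch. The `leaderboard_scores` dictionary and the `unweighted_average` helper are illustrative names, not part of the README or the evaluation harness.

```python
# Per-task scores copied from the removed leaderboard table in this diff.
leaderboard_scores = {
    "ENEM Challenge (No Images)": 21.34,
    "BLUEX (No Images)": 25.17,
    "OAB Exams": 25.06,
    "Assin2 RTE": 33.57,
    "Assin2 STS": 11.35,
    "FaQuAD NLI": 43.97,
    "HateBR Binary": 41.50,
    "PT Hate Speech Binary": 22.99,
    "tweetSentBR": 11.24,
}


def unweighted_average(scores: dict[str, float]) -> float:
    """Return the plain arithmetic mean of the scores, rounded to 2 decimals.

    Assumption: the leaderboard weights every task equally.
    """
    return round(sum(scores.values()) / len(scores), 2)


average = unweighted_average(leaderboard_scores)
print(f"Average over {len(leaderboard_scores)} tasks: {average}")
```

The new single-row table keeps eight of these nine scores and omits PT Hate Speech Binary, so a mean taken over that row alone would not match the removed Average.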