nicholasKluge
committed on
Commit 11fe9b3
1 Parent(s): 8d2d0de
Update README.md
README.md
CHANGED
@@ -281,9 +281,9 @@ Evaluations on benchmarks were performed using the [Language Model Evaluation Ha
 
 Evaluations on Brazilian Portuguese benchmarks were performed using a [Portuguese implementation of the EleutherAI LM Evaluation Harness](https://github.com/eduagarcia/lm-evaluation-harness-pt) (created by [Eduardo Garcia](https://github.com/eduagarcia/lm-evaluation-harness-pt)).
 
-| | **ASSIN2 RTE** | **ASSIN2 STS** | **BLUEX** | **ENEM** | **FAQUAD NLI** | **HateBR** | **OAB Exams** |
-
-| **Mula-4x160-v0.1** | 33.
+| | **ASSIN2 RTE** | **ASSIN2 STS** | **BLUEX** | **ENEM** | **FAQUAD NLI** | **HateBR** | **OAB Exams** | **TweetSentBR** |
+|-----------------------|----------------|----------------|-----------|----------|----------------|------------|---------------|-----------------|
+| **Mula-4x160-v0.1** | 33.57 | 11.35 | 25.17 | 21.34 | 43.97 | 41.50 | 25.06 | 11.24 |
 
 ## Cite as 🤗
 
@@ -301,22 +301,3 @@ Evaluations on Brazilian Portuguese benchmarks were performed using a [Portugues
 ## License
 
 Mula-4x160-v0.1 is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.
-
-
-# Open Portuguese LLM Leaderboard Evaluation Results
-
-Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/MulaBR/Mula-4x160-v0.1) and on the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
-
-| Metric | Value |
-|--------------------------|---------|
-|Average |**26.24**|
-|ENEM Challenge (No Images)| 21.34|
-|BLUEX (No Images) | 25.17|
-|OAB Exams | 25.06|
-|Assin2 RTE | 33.57|
-|Assin2 STS | 11.35|
-|FaQuAD NLI | 43.97|
-|HateBR Binary | 41.50|
-|PT Hate Speech Binary | 22.99|
-|tweetSentBR | 11.24|
-