---
language:
- pt
license: apache-2.0
datasets:
- nicholasKluge/Pt-Corpus
model-index:
- name: Mistral-7B-v0.2-Base_ptbr
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: ENEM Challenge (No Images)
      type: eduagarcia/enem_challenge
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 64.94
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=JJhooww/Mistral-7B-v0.2-Base_ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BLUEX (No Images)
      type: eduagarcia-temp/BLUEX_without_images
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 53.96
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=JJhooww/Mistral-7B-v0.2-Base_ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: OAB Exams
      type: eduagarcia/oab_exams
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 45.42
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=JJhooww/Mistral-7B-v0.2-Base_ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Assin2 RTE
      type: assin2
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: f1_macro
      value: 90.11
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=JJhooww/Mistral-7B-v0.2-Base_ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Assin2 STS
      type: eduagarcia/portuguese_benchmark
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: pearson
      value: 72.51
      name: pearson
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=JJhooww/Mistral-7B-v0.2-Base_ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: FaQuAD NLI
      type: ruanchaves/faquad-nli
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: f1_macro
      value: 69.04
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=JJhooww/Mistral-7B-v0.2-Base_ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HateBR Binary
      type: ruanchaves/hatebr
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 79.62
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=JJhooww/Mistral-7B-v0.2-Base_ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: PT Hate Speech Binary
      type: hate_speech_portuguese
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 58.52
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=JJhooww/Mistral-7B-v0.2-Base_ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: tweetSentBR
      type: eduagarcia/tweetsentbr_fewshot
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 62.32
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=JJhooww/Mistral-7B-v0.2-Base_ptbr
      name: Open Portuguese LLM Leaderboard
---

This is a base model pre-trained on roughly 1B tokens of Portuguese text, initialized from the official model weights. It is intended to be used as a starting point for fine-tuning.
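
Since the card recommends this checkpoint as a starting point for fine-tuning, a minimal loading sketch may be useful. It assumes the `transformers` library (with PyTorch) is installed and that the repository id is `JJhooww/Mistral-7B-v0.2-Base_ptbr`, as it appears in the leaderboard links of this card. The helper names `load_base_model` and `complete` are hypothetical and not part of the original card.

```python
# Repository id taken from the leaderboard URLs in this card (assumption:
# the weights are published under this id on the Hugging Face Hub).
MODEL_ID = "JJhooww/Mistral-7B-v0.2-Base_ptbr"


def load_base_model(model_id: str = MODEL_ID):
    """Load the tokenizer and the causal LM (hypothetical helper)."""
    # Imported lazily so that defining this function does not require
    # transformers to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # torch_dtype="auto" keeps the dtype stored in the checkpoint.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
    return tokenizer, model


def complete(prompt: str, tokenizer, model, max_new_tokens: int = 50) -> str:
    """Generate a plain text continuation of `prompt` (hypothetical helper)."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A typical session would call `tokenizer, model = load_base_model()` and then `complete("A capital do Brasil é", tokenizer, model)`. Because this is a base model, it continues text rather than following instructions, so instruction-style use would first need fine-tuning.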

Note: Awaiting [official results](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)

| Task                          | Mistral Base PTBR | Mistral Base | Improvement |
|-------------------------------|-------------------|--------------|-------------|
| assin2_rte                    | 90.2              | 87.74        | 2.46        |
| assin2_sts                    | 72.45             | 67.05        | 5.4         |
| bluex                         | 53.27             | 53.27        | 0           |
| enem                          | 64.66             | 62.42        | 2.24        |
| faquad_nli                    | 68.11             | 47.63        | 20.48       |
| hatebr_offensive_binary       | 79.65             | 77.63        | 2.02        |
| oab_exams                     | 45.42             | 45.24        | 0.18        |
| portuguese_hate_speech_binary | 59.18             | 55.72        | 3.46        |

# Open Portuguese LLM Leaderboard Evaluation Results

Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/JJhooww/Mistral-7B-v0.2-Base_ptbr) and on the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)

|          Metric          |  Value  |
|--------------------------|---------|
|Average                   |**66.27**|
|ENEM Challenge (No Images)|    64.94|
|BLUEX (No Images)         |    53.96|
|OAB Exams                 |    45.42|
|Assin2 RTE                |    90.11|
|Assin2 STS                |    72.51|
|FaQuAD NLI                |    69.04|
|HateBR Binary             |    79.62|
|PT Hate Speech Binary     |    58.52|
|tweetSentBR               |    62.32|
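
As a quick sanity check, the reported average appears consistent with an unweighted mean of the nine task scores listed in the leaderboard table. The sketch below reproduces that arithmetic; the assumption that the leaderboard uses a plain mean is inferred from the numbers, not stated in this card.

```python
# Scores copied from the leaderboard table in this card.
scores = {
    "ENEM Challenge (No Images)": 64.94,
    "BLUEX (No Images)": 53.96,
    "OAB Exams": 45.42,
    "Assin2 RTE": 90.11,
    "Assin2 STS": 72.51,
    "FaQuAD NLI": 69.04,
    "HateBR Binary": 79.62,
    "PT Hate Speech Binary": 58.52,
    "tweetSentBR": 62.32,
}

# Assumption: the leaderboard average is a simple unweighted mean.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 66.27
```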