# mistral-nemo-wissenschaft-12B
mistralai/Mistral-Nemo-Instruct-2407 finetuned on tasksource/ScienceQA_text_only.
## Method
Finetuned for 1 epoch on an A100 via Google Colab. For each question, the correct answer was used as the "chosen" response, and a randomly selected wrong answer was used as the "rejected" response.
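The pairing scheme above can be sketched as follows. This is a minimal illustration, not the actual training script; the function name and the example question are hypothetical, and the output format assumes the common `prompt`/`chosen`/`rejected` layout used by preference-tuning trainers.

```python
import random

def make_preference_pair(question, choices, answer_idx, seed=None):
    """Build one preference pair from a multiple-choice item:
    the correct answer becomes 'chosen', and one randomly drawn
    wrong answer becomes 'rejected'."""
    rng = random.Random(seed)
    chosen = choices[answer_idx]
    # All answers except the correct one are candidates for rejection.
    wrong = [c for i, c in enumerate(choices) if i != answer_idx]
    rejected = rng.choice(wrong)
    return {"prompt": question, "chosen": chosen, "rejected": rejected}

pair = make_preference_pair(
    "Which gas do plants absorb during photosynthesis?",
    ["Oxygen", "Carbon dioxide", "Nitrogen"],
    answer_idx=1,
    seed=0,
)
print(pair["chosen"])  # Carbon dioxide
```

Applied over every row of tasksource/ScienceQA_text_only, this yields one (chosen, rejected) pair per question.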
## Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric              | Value |
|---------------------|------:|
| Avg.                | 24.58 |
| IFEval (0-Shot)     | 65.20 |
| BBH (3-Shot)        | 29.57 |
| MATH Lvl 5 (4-Shot) |  6.57 |
| GPQA (0-Shot)       |  5.70 |
| MuSR (0-Shot)       | 12.29 |
| MMLU-PRO (5-Shot)   | 28.14 |
## Model tree

- Base model: mistralai/Mistral-Nemo-Base-2407
- Finetuned from: mistralai/Mistral-Nemo-Instruct-2407