Test merge of 7B models for learning purposes.
New in v0.2: I wanted to try a different gate type and bfloat16, along with more detailed prompting, to see whether there is a noticeable difference.
Description: This model is a merge of BAAI/Infinity-Instruct-7M-Gen-mistral-7B, SanjiWatsuki/Kunoichi-7B, Gryphe_Tiamat-7b-1.1-DPO, Senseable_WestLake-7B-v2, and uukuguy/speechless-instruct-mistral-7b-v0.2. This is the first model I have ever uploaded, and I wanted to learn more about the process. Merged using mergekit-moe.
Works up to 8k context, or 16k with 2.5 RoPE scaling.
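As a rough illustration of why a 2.5 scaling value leaves room for 16k tokens, here is a minimal sketch. It assumes "2.5 RoPE scaling" means linear position scaling (positions divided by the factor) and that the native window is 8192 tokens; both are assumptions, and the function names are hypothetical, not part of any library.

```python
# Sketch only: assumes linear RoPE scaling, where each token position is
# divided by the scaling factor before the rotary embedding is applied.
NATIVE_CONTEXT = 8192   # assumed native window ("8k")
SCALE_FACTOR = 2.5      # value quoted in this card


def scaled_position(position: int, factor: float = SCALE_FACTOR) -> float:
    """Return the position the model effectively sees under linear scaling."""
    return position / factor


def max_scaled_context(native: int = NATIVE_CONTEXT, factor: float = SCALE_FACTOR) -> int:
    """Return the largest context whose scaled positions stay inside the native window."""
    return int(native * factor)


if __name__ == "__main__":
    # Token 16384 maps to an effective position below 8192 under this assumption.
    print(scaled_position(16384))
    print(max_scaled_context())
```

Under these assumptions a 16k prompt stays inside the positions the base models were trained on; how the factor is actually passed depends on the inference backend.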
Prompt template: Custom format, or Alpaca
Alpaca: Below is an instruction that describes a task. Write a response that appropriately completes the request.
Instruction:
{prompt}
Response:
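For anyone scripting against the model, the Alpaca layout above can be filled in with a small helper. This is a sketch rather than the author's own tooling: the `### ` prefixes on the section markers follow the commonly used Alpaca template and are an assumption here, and `build_alpaca_prompt` is a hypothetical name.

```python
# Sketch of an Alpaca-style prompt builder.
# Assumption: section markers use the common "### Instruction:" / "### Response:" form.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n"
    "{prompt}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(prompt: str) -> str:
    """Insert the user's instruction into the Alpaca template."""
    return ALPACA_TEMPLATE.format(prompt=prompt.strip())


if __name__ == "__main__":
    print(build_alpaca_prompt("Write a haiku about merging models."))
```

The returned string ends right after the response marker, so the model's completion begins where the answer should go.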
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 19.14 |
| IFEval (0-Shot) | 37.52 |
| BBH (3-Shot) | 30.34 |
| MATH Lvl 5 (4-Shot) | 5.14 |
| GPQA (0-shot) | 6.49 |
| MuSR (0-shot) | 10.96 |
| MMLU-PRO (5-shot) | 24.41 |
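The "Avg." row is consistent with a plain unweighted mean of the six benchmark scores, which a short check can confirm. The dictionary below only restates the values from the table; the variable names are illustrative.

```python
# Recompute the leaderboard average from the six per-benchmark scores in the table.
scores = {
    "IFEval (0-Shot)": 37.52,
    "BBH (3-Shot)": 30.34,
    "MATH Lvl 5 (4-Shot)": 5.14,
    "GPQA (0-shot)": 6.49,
    "MuSR (0-shot)": 10.96,
    "MMLU-PRO (5-shot)": 24.41,
}

# Unweighted mean across benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Avg. = {average:.2f}")  # prints "Avg. = 19.14"
```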
Evaluation results for Jacoby746/Proto-Athena-v0.2-4x7B (Open LLM Leaderboard)
- IFEval (0-Shot), strict accuracy: 37.52
- BBH (3-Shot), normalized accuracy: 30.34
- MATH Lvl 5 (4-Shot), exact match: 5.14
- GPQA (0-shot), acc_norm: 6.49
- MuSR (0-shot), acc_norm: 10.96
- MMLU-PRO (5-shot), accuracy on the test set: 24.41