Prompt template:

### HUMAN:
{prompt}

### RESPONSE:
<leave a newline for the model to answer>
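A minimal sketch of filling in the template above from Python. The helper name `build_prompt` is hypothetical and not part of the model or any library; the only thing taken from this card is the template text itself, including the trailing newline after `### RESPONSE:`.

```python
def build_prompt(user_message: str) -> str:
    """Fill the card's prompt template with a single user message.

    `build_prompt` is an illustrative helper, not an official API.
    The string ends with a newline after "### RESPONSE:" so the model
    starts its answer on a fresh line, as the template asks.
    """
    return f"### HUMAN:\n{user_message}\n\n### RESPONSE:\n"


if __name__ == "__main__":
    # Show the exact text that would be sent to the model.
    print(build_prompt("What is the capital of France?"))
```

The resulting string can be passed as-is to whatever generation API you use; anything the model produces after the final newline is its answer.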

GGML quants available here.
GPTQ quants available here.

Note: Don't expect this model to be good; I was just starting out with fine-tuning when I made it. So please don't roast me!

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	41.71
ARC (25-shot)	43.17
HellaSwag (10-shot)	72.68
MMLU (5-shot)	28.46
TruthfulQA (0-shot)	39.09
Winogrande (5-shot)	65.59
GSM8K (5-shot)	1.29
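As a quick sanity check, here is a small sketch that recomputes the Avg. row from the six scores above. It assumes Avg. is the plain arithmetic mean of the six benchmarks; the card does not state how it is computed, so treat that as an assumption.

```python
# Scores copied from the table above.
scores = {
    "ARC (25-shot)": 43.17,
    "HellaSwag (10-shot)": 72.68,
    "MMLU (5-shot)": 28.46,
    "TruthfulQA (0-shot)": 39.09,
    "Winogrande (5-shot)": 65.59,
    "GSM8K (5-shot)": 1.29,
}

# Assumption: the reported average is an unweighted mean, rounded to 2 places.
average = round(sum(scores.values()) / len(scores), 2)

if __name__ == "__main__":
    print(f"Avg. {average}")
```

Under that assumption the result matches the reported 41.71.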


Model size: 3.43B params (Safetensors)
Tensor types: F32, FP16


Evaluation results (Open LLM Leaderboard)

Benchmark	Metric	Split	Value
AI2 Reasoning Challenge (25-Shot)	normalized accuracy	test	43.17
HellaSwag (10-Shot)	normalized accuracy	validation	72.68
MMLU (5-Shot)	accuracy	test	28.46
TruthfulQA (0-shot)	mc2	validation	39.09
Winogrande (5-shot)	accuracy	validation	65.59
GSM8k (5-shot)	accuracy	test	1.29
