metadata

datasets:
  - BAAI/Infinity-Instruct
base_model:
  - nvidia/Llama-3.1-Minitron-4B-Depth-Base

We fine-tune nvidia/Llama-3.1-Minitron-4B-Depth-Base with the LLM-Neo method, which combines LoRA and knowledge distillation (KD) in a single training procedure. The training data consists of 100k lines sampled from BAAI/Infinity-Instruct.
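To make the idea of combining LoRA and KD concrete, here is a minimal pure-Python sketch. It assumes the student's trainable parameters are a low-rank update on a frozen weight matrix, and that the training loss mixes cross-entropy on the label with a KL term toward the teacher's distribution. The function names, the `kd_weight` mixing coefficient, and the `temperature` parameter are illustrative assumptions; they are not taken from the LLM-Neo code, and the actual training setup may differ.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, scale=1.0):
    """LoRA-style layer: y = W x + scale * B (A x).

    W is the frozen base weight; A (r x d_in) and B (d_out x r) are the
    low-rank matrices that would be trained. `scale` plays the role of
    alpha / r in common LoRA implementations (assumed here).
    """
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))
    return [b + scale * l for b, l in zip(base, low_rank)]


def cross_entropy(student_logits, target_index):
    """Negative log-likelihood of the ground-truth token."""
    return -math.log(softmax(student_logits)[target_index])


def kd_kl(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) between temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def combined_loss(teacher_logits, student_logits, target_index,
                  kd_weight=0.5, temperature=1.0):
    """Hypothetical mix of supervised and distillation losses."""
    ce = cross_entropy(student_logits, target_index)
    kd = kd_kl(teacher_logits, student_logits, temperature)
    return (1.0 - kd_weight) * ce + kd_weight * kd


# Toy example: 2-dimensional input, 3-token vocabulary, LoRA rank 1.
W = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.6]]   # frozen student weight
A = [[0.1, 0.2]]                              # trainable, rank 1
B = [[0.3], [-0.1], [0.2]]                    # trainable, rank 1
x = [1.0, 2.0]

student_logits = lora_forward(W, A, B, x)
teacher_logits = [0.2, 0.9, 1.5]              # made-up teacher output
loss = combined_loss(teacher_logits, student_logits, target_index=2)
print(f"combined loss: {loss:.4f}")
```

In a real setup only `A` and `B` would receive gradients, so the distillation signal from the teacher is absorbed entirely by the low-rank update.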

Benchmarks

In this section, we report the results for Llama-3.1-Minitron-4B-Depth-Neo-10w on standard automatic benchmarks. All evaluations use the lm-evaluation-harness library.
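As a rough guide to reproducing such runs, the sketch below assembles `lm_eval` command lines for the harness's Hugging Face backend. The task names (`bbh`, `ceval-valid`, `mmlu`, `cmmlu`) and the batch size are assumptions, since this card does not state the exact task configurations used; the n-shot values follow the evaluation results reported here. The helper `build_lm_eval_command` is hypothetical and only builds the argument list, it does not run anything.

```python
import shlex


def build_lm_eval_command(model_id, tasks, num_fewshot, batch_size=8):
    """Return the argument list for one lm-evaluation-harness CLI run."""
    return [
        "lm_eval",
        "--model", "hf",
        "--model_args", f"pretrained={model_id}",
        "--tasks", ",".join(tasks),
        "--num_fewshot", str(num_fewshot),
        "--batch_size", str(batch_size),  # assumed value, tune for your GPU
    ]


MODEL_ID = "yang31210999/Llama-3.1-Minitron-4B-Depth-Neo-10w"

# (task names, n-shot); task names are assumptions, not confirmed by the card.
RUNS = [
    (["bbh"], 3),
    (["ceval-valid", "mmlu", "cmmlu"], 0),
]

for tasks, shots in RUNS:
    print(shlex.join(build_lm_eval_command(MODEL_ID, tasks, shots)))
```

The printed lines can be pasted into a shell where lm-evaluation-harness is installed, or the lists can be passed to `subprocess.run` directly.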

Evaluation results

| Category | Benchmark | Version | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| BBH | BBH (General) | N/A | 3 | exact_match | 0.4729 | ± 0.0055 |
| | BBH (Boolean Expressions) | 2 | 3 | exact_match | 0.8120 | ± 0.0248 |
| | BBH (Date Understanding) | 2 | 3 | exact_match | 0.6600 | ± 0.0300 |
| CEVAL | CEVAL (General) | N/A | 0 | acc | 0.4413 | ± 0.0135 |
| | CEVAL (Accountant) | 1 | 0 | acc | 0.3469 | ± 0.0687 |
| | CEVAL (Advanced Mathematics) | 1 | 0 | acc | 0.4737 | ± 0.1177 |
| | CEVAL (Art Studies) | 1 | 0 | acc | 0.4545 | ± 0.0880 |
| MMLU | MMLU (General) | N/A | 0 | acc | 0.6048 | ± 0.0039 |
| | MMLU (Humanities) | N/A | 0 | acc | 0.5552 | ± 0.0067 |
| | MMLU (STEM) | N/A | 0 | acc | 0.5214 | ± 0.0086 |
| CMMLU | CMMLU (General) | N/A | 0 | acc | 0.3548 | ± 0.0044 |
| CMMLU | CMMLU (Normalized) | N/A | 0 | acc_norm | 0.3548 | ± 0.0044 |
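
The Stderr column can be turned into an approximate confidence interval when comparing scores. The sketch below assumes a normal approximation (value ± 1.96 × stderr for roughly 95% coverage), which is a common convention rather than something this card specifies; the helper name `confidence_interval` is illustrative. The input numbers are copied from the results above.

```python
def confidence_interval(value, stderr, z=1.96):
    """Approximate two-sided interval under a normal approximation."""
    half_width = z * stderr
    return value - half_width, value + half_width


# (benchmark, value, stderr) taken from the evaluation results.
ROWS = [
    ("MMLU (General)", 0.6048, 0.0039),
    ("BBH (General)", 0.4729, 0.0055),
    ("CEVAL (Advanced Mathematics)", 0.4737, 0.1177),
]

for name, value, stderr in ROWS:
    low, high = confidence_interval(value, stderr)
    print(f"{name}: {value:.4f} in [{low:.4f}, {high:.4f}]")
```

Subsets with a large stderr, such as the individual CEVAL subjects, give wide intervals, so small differences on those rows should be read with caution.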