metadata

license: mit
base_model: gpt2
tags:
  - generated_from_trainer
model-index:
  - name: distily_bench_gpt2_activation_loss_b
    results: []

distily_bench_gpt2_activation_loss_b

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss
No log	0	0	6.0052
1.8938	0.0808	1000	1.9915
1.7378	0.1616	2000	1.8007
1.7337	0.2424	3000	1.6868
1.5761	0.3232	4000	1.5975
1.5284	0.4040	5000	1.5055
1.3739	0.4848	6000	1.4362
1.4322	0.5657	7000	1.3724
1.3508	0.6465	8000	1.3171
1.3875	0.7273	9000	1.2684
1.3039	0.8081	10000	1.2160
1.2591	0.8889	11000	1.1656
1.2144	0.9697	12000	1.1263