collapse_gemma-2-2b_hs2_massive_iter1_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. At the last logged evaluation (step 95, epoch 0.9700), it reaches a validation loss of 1.0641 after 5,580,576 input tokens seen.

Model description

More information needed

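The hyperparameters used during training are not listed here. The table below records the training loss, validation loss, and cumulative input tokens seen at each evaluation step: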

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.3437	0.0511	5	1.2590	296352
1.1851	0.1021	10	1.1699	589152
1.1271	0.1532	15	1.1327	884504
1.0728	0.2042	20	1.1071	1182424
1.0945	0.2553	25	1.0974	1474984
1.0927	0.3063	30	1.0918	1772592
1.1145	0.3574	35	1.0878	2061504
1.0845	0.4084	40	1.0841	2358064
1.1001	0.4595	45	1.0813	2650896
1.0775	0.5105	50	1.0790	2942864
1.1246	0.5616	55	1.0766	3234512
1.101	0.6126	60	1.0743	3525376
1.0904	0.6637	65	1.0729	3820376
1.1705	0.7147	70	1.0709	4108240
1.0282	0.7658	75	1.0692	4402208
1.1463	0.8168	80	1.0681	4698016
1.0783	0.8679	85	1.0668	4991408
1.0052	0.9190	90	1.0649	5285784
1.0614	0.9700	95	1.0641	5580576
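
The loss values above can be easier to interpret as perplexity. The following is a minimal sketch, assuming the reported validation loss is a mean per-token cross-entropy in nats (the card does not state this). The helper names `perplexity` and `summarize` are hypothetical, and the rows are a hand-copied subset of the table, not an export from the training run.

```python
import math

# (step, validation_loss, input_tokens_seen), copied from selected table rows
rows = [
    (0, 1.3909, 0),
    (5, 1.2590, 296352),
    (50, 1.0790, 2942864),
    (95, 1.0641, 5580576),
]


def perplexity(loss):
    """Convert a cross-entropy loss to perplexity (assumes loss is in nats)."""
    return math.exp(loss)


def summarize(rows):
    """Compare the first and last logged evaluations."""
    first_step, first_loss, _ = rows[0]
    last_step, last_loss, last_tokens = rows[-1]
    return {
        "loss_drop": first_loss - last_loss,
        "ppl_start": perplexity(first_loss),
        "ppl_end": perplexity(last_loss),
        # Average tokens consumed per optimizer step over the logged range
        "tokens_per_step": last_tokens / (last_step - first_step),
    }


summary = summarize(rows)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Under that assumption, most of the improvement happens in the first 20 to 25 steps, after which the validation loss flattens. The per-step token count is only an average, since the card does not give the batch size or sequence length.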