collapse_gemma-2-2b_hs2_iter1_sftsd1

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.2754	0.0511	5	1.2634	285512
1.2153	0.1021	10	1.1738	578296
1.1556	0.1532	15	1.1356	873440
1.1445	0.2042	20	1.1093	1168560
1.0672	0.2553	25	1.0991	1463952
1.1502	0.3063	30	1.0940	1754024
1.0342	0.3574	35	1.0895	2046160
1.0635	0.4084	40	1.0863	2341224
1.1419	0.4595	45	1.0834	2635056
1.0155	0.5105	50	1.0805	2927424
1.0927	0.5616	55	1.0777	3221968
1.1001	0.6126	60	1.0756	3519568
1.0711	0.6637	65	1.0734	3816688
1.0622	0.7147	70	1.0717	4117768
1.0785	0.7658	75	1.0702	4418488
1.154	0.8168	80	1.0691	4709408
1.1034	0.8679	85	1.0678	5000912
1.0458	0.9190	90	1.0663	5295112
1.0685	0.9700	95	1.0651	5591032