collapse_gemma-2-2b_hs2_massive_iter1_sftsd2

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. At the last logged evaluation (step 95, epoch 0.9700), it reaches a validation loss of 1.0644 after 5,572,144 input tokens seen.

Model description

More information needed

Training hyperparameters

More information needed

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.2529	0.0511	5	1.2585	295904
1.1843	0.1021	10	1.1683	591104
1.1523	0.1532	15	1.1318	883280
1.0979	0.2042	20	1.1062	1177976
1.0923	0.2553	25	1.0967	1470072
1.07	0.3063	30	1.0915	1759320
1.1217	0.3574	35	1.0877	2048280
1.0978	0.4084	40	1.0839	2339776
1.0604	0.4595	45	1.0807	2632712
1.0608	0.5105	50	1.0779	2926200
1.1238	0.5616	55	1.0758	3220536
1.0663	0.6126	60	1.0741	3515696
1.0059	0.6637	65	1.0724	3804824
1.0991	0.7147	70	1.0706	4101032
1.1119	0.7658	75	1.0691	4391096
1.0905	0.8168	80	1.0680	4688752
1.0574	0.8679	85	1.0668	4981792
1.1394	0.9190	90	1.0653	5276840
1.1296	0.9700	95	1.0644	5572144
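
The validation losses above can be easier to compare as perplexities. The following is a minimal sketch, assuming the reported loss is a mean per-token cross-entropy in nats (the card does not state this); the `perplexity` helper and the `VAL_LOSS` list are illustrative names, and the values are copied from three rows of the table.

```python
import math

# (step, validation loss) pairs copied from the table above.
VAL_LOSS = [(0, 1.3909), (50, 1.0779), (95, 1.0644)]


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


for step, loss in VAL_LOSS:
    print(f"step {step:>2}: loss {loss:.4f} -> perplexity {perplexity(loss):.3f}")
```

If the loss were instead measured in bits per token, the conversion would be `2 ** loss` rather than `math.exp(loss)`.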