collapse_gemma-2-2b_hs2_replace_iter4_sftsd2

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. Its evaluation results are the validation losses reported in the training log below.

Model description

More information needed

The hyperparameters used during training are not listed. Training and validation loss were logged every 5 steps:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.5632	0.0316	5	1.3061	246640
1.3817	0.0632	10	1.2228	494576
1.0035	0.0947	15	1.2471	744584
0.6544	0.1263	20	1.4073	997672
0.4776	0.1579	25	1.5377	1254560
0.3655	0.1895	30	1.6643	1501936
0.2114	0.2211	35	1.8147	1753752
0.1432	0.2527	40	2.0060	2004664
0.0971	0.2842	45	2.1422	2255696
0.0583	0.3158	50	2.1872	2503680
0.0617	0.3474	55	2.2333	2752312
0.0418	0.3790	60	2.2179	3014008
0.0354	0.4106	65	2.2580	3272640
0.0341	0.4422	70	2.3017	3531768
0.0365	0.4737	75	2.3306	3783288
0.0388	0.5053	80	2.3409	4030184
0.0293	0.5369	85	2.3008	4283032
0.0542	0.5685	90	2.2747	4542640
0.0333	0.6001	95	2.2006	4797104
0.0314	0.6317	100	2.1578	5049888
0.0504	0.6632	105	2.1483	5293872
0.0344	0.6948	110	2.1589	5538240
0.0277	0.7264	115	2.1630	5793368
0.0281	0.7580	120	2.1890	6044800
0.0289	0.7896	125	2.2083	6302168
0.0336	0.8212	130	2.2451	6546744
0.0442	0.8527	135	2.2112	6795368
0.0292	0.8843	140	2.1838	7042064
0.036	0.9159	145	2.2140	7291040
0.0295	0.9475	150	2.2179	7545048
0.0291	0.9791	155	2.1915	7794264
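
The log can be read programmatically to find the step with the lowest validation loss. The following is a minimal sketch, assuming a few rows of the table are pasted in as whitespace-separated text; `parse_log` and the dictionary keys are hypothetical names chosen for illustration, not part of any library.

```python
# A subset of rows copied from the training log above.
# Columns: training loss, epoch, step, validation loss, input tokens seen.
LOG = """\
No log 0 0 1.3956 0
1.5632 0.0316 5 1.3061 246640
1.3817 0.0632 10 1.2228 494576
1.0035 0.0947 15 1.2471 744584
0.6544 0.1263 20 1.4073 997672
0.0291 0.9791 155 2.1915 7794264
"""


def parse_log(text):
    """Parse log rows into dictionaries (hypothetical helper)."""
    rows = []
    for line in text.strip().splitlines():
        # Split from the right so that "No log" stays in one field.
        train, epoch, step, val, tokens = line.rsplit(None, 4)
        rows.append({
            "train_loss": None if train == "No log" else float(train),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val),
            "tokens": int(tokens),
        })
    return rows


rows = parse_log(LOG)

# Pick the logged step with the lowest validation loss.
best = min(rows, key=lambda r: r["val_loss"])
print(best["step"], best["val_loss"])  # prints: 10 1.2228
```

On these rows the minimum is at step 10 (validation loss 1.2228). After that point the training loss keeps falling toward roughly 0.03 while the validation loss rises above 2.1, a pattern consistent with overfitting.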