collapse_gemma-2-2b_hs2_replace_iter6_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. The final evaluation results are not listed; the last validation loss logged during training was 2.4186 at step 155, after 7816368 input tokens.

Model description

More information needed

The hyperparameters used during training are not listed. Training and validation loss were logged every 5 steps:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.6219	0.0316	5	1.3079	253136
1.2932	0.0631	10	1.2378	505728
0.797	0.0947	15	1.2940	753248
0.6825	0.1263	20	1.4640	1008520
0.4083	0.1579	25	1.6136	1265712
0.2934	0.1894	30	1.7972	1522896
0.1426	0.2210	35	1.9343	1771056
0.0768	0.2526	40	2.0985	2021720
0.0598	0.2841	45	2.2231	2266568
0.0343	0.3157	50	2.2738	2525864
0.035	0.3473	55	2.3380	2773832
0.0341	0.3788	60	2.3578	3025992
0.0324	0.4104	65	2.3326	3282432
0.0339	0.4420	70	2.3815	3531744
0.0309	0.4736	75	2.4070	3780960
0.0319	0.5051	80	2.3871	4036832
0.03	0.5367	85	2.3862	4292040
0.0303	0.5683	90	2.3838	4532720
0.0295	0.5998	95	2.3943	4785512
0.0325	0.6314	100	2.3693	5041576
0.0321	0.6630	105	2.3452	5296640
0.0291	0.6946	110	2.3231	5545576
0.0271	0.7261	115	2.3197	5803840
0.025	0.7577	120	2.3552	6061792
0.0245	0.7893	125	2.3695	6314664
0.0595	0.8208	130	2.3968	6573472
0.0259	0.8524	135	2.4351	6821240
0.0262	0.8840	140	2.4190	7072472
0.0264	0.9155	145	2.4247	7323632
0.029	0.9471	150	2.4290	7572360
0.0282	0.9787	155	2.4186	7816368
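
The log above can be read programmatically to see where validation loss bottoms out and how far it drifts from training loss. The following is a minimal sketch, not part of the original training code: the names `LOG_ROWS`, `best_validation_step`, and `generalization_gap` are hypothetical, and only a handful of rows are copied from the table for brevity.

```python
# A few rows copied from the table above: (step, training_loss, validation_loss).
# Step 0 has no training loss ("No log"), so it is stored as None.
LOG_ROWS = [
    (0, None, 1.3956),
    (5, 1.6219, 1.3079),
    (10, 1.2932, 1.2378),
    (15, 0.797, 1.2940),
    (20, 0.6825, 1.4640),
    (50, 0.0343, 2.2738),
    (100, 0.0325, 2.3693),
    (155, 0.0282, 2.4186),
]


def best_validation_step(rows):
    """Return the (step, validation_loss) pair with the lowest validation loss."""
    step, _train, val = min(rows, key=lambda r: r[2])
    return step, val


def generalization_gap(rows):
    """Return (step, validation_loss - training_loss) for rows that log both."""
    return [(s, round(v - t, 4)) for s, t, v in rows if t is not None]


if __name__ == "__main__":
    print("lowest validation loss:", best_validation_step(LOG_ROWS))
    for step, gap in generalization_gap(LOG_ROWS):
        print(f"step {step}: gap {gap}")
```

On these rows the lowest validation loss is at step 10, and the gap between validation and training loss grows from negative at step 5 to more than 2 by the end of the run. Pasting in the full table would give the same minimum, since every later row has a higher validation loss.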