collapse_gemma-2-2b_hs2_replace_iter5_sftsd2

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.5841	0.0316	5	1.3064	246064
1.2139	0.0632	10	1.2259	490344
0.8575	0.0948	15	1.2862	743304
0.6223	0.1264	20	1.4173	991544
0.4134	0.1580	25	1.5797	1239400
0.2181	0.1896	30	1.7675	1492208
0.1602	0.2212	35	1.9274	1742536
0.1222	0.2528	40	1.9930	1993464
0.0567	0.2844	45	2.1636	2246064
0.0601	0.3160	50	2.2179	2495240
0.0426	0.3476	55	2.2534	2753624
0.0355	0.3791	60	2.3865	3000912
0.0353	0.4107	65	2.3864	3253912
0.029	0.4423	70	2.4098	3501280
0.028	0.4739	75	2.4119	3748336
0.0282	0.5055	80	2.4352	3992216
0.0297	0.5371	85	2.4314	4238048
0.0282	0.5687	90	2.4459	4485664
0.0294	0.6003	95	2.4529	4736648
0.0266	0.6319	100	2.4423	4994408
0.0264	0.6635	105	2.4515	5241848
0.0302	0.6951	110	2.4784	5488272
0.0283	0.7267	115	2.4612	5735720
0.0491	0.7583	120	2.4475	5982808
0.0284	0.7899	125	2.4495	6233656
0.0299	0.8215	130	2.4624	6483624
0.0279	0.8531	135	2.4608	6732040
0.0282	0.8847	140	2.4580	6974112
0.0258	0.9163	145	2.4557	7221264
0.0277	0.9479	150	2.4502	7469904
0.0273	0.9795	155	2.4400	7716824