augmented_step_val_25_gemma-2-2b_hs2_iter1_sftsd2

This model is a fine-tuned version of jkazdan/step_val_25_gemma-2-2b_hs2_iter1_sftsd2 on an unknown dataset. At the last logged evaluation (step 140, epoch 0.9659) it reaches a validation loss of 1.4771 after 7,640,632 input tokens.

Model description

More information needed

The hyperparameters used during training are not listed. The table below logs training and validation loss over the course of training:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.0950	0
1.4558	0.0345	5	1.0942	274624
1.2848	0.0690	10	1.1065	546200
1.0788	0.1035	15	1.1339	817584
0.9149	0.1380	20	1.1915	1088176
0.8855	0.1725	25	1.2506	1358336
0.8151	0.2070	30	1.3419	1637992
0.5913	0.2415	35	1.3767	1911376
0.5641	0.2760	40	1.4619	2181176
0.5135	0.3105	45	1.4701	2462856
0.335	0.3450	50	1.4866	2737752
0.332	0.3795	55	1.5121	3012656
0.3655	0.4140	60	1.4798	3279744
0.249	0.4485	65	1.4564	3547808
0.2495	0.4830	70	1.4986	3817328
0.2821	0.5175	75	1.4208	4097184
0.1291	0.5520	80	1.4710	4367848
0.2026	0.5865	85	1.4296	4640592
0.2365	0.6210	90	1.5041	4922032
0.1523	0.6555	95	1.4437	5193088
0.1677	0.6900	100	1.4660	5464864
0.2233	0.7245	105	1.4473	5739032
0.1273	0.7589	110	1.4308	6012736
0.1756	0.7934	115	1.4913	6274808
0.1822	0.8279	120	1.4676	6548312
0.1255	0.8624	125	1.4698	6821112
0.1072	0.8969	130	1.4484	7098736
0.1329	0.9314	135	1.4401	7369552
0.104	0.9659	140	1.4771	7640632
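
The table shows training loss falling steadily while validation loss is lowest at step 5 and rises afterwards. As a minimal sketch of how that can be checked, the snippet below copies a subset of rows from the table and finds the step with the lowest validation loss, along with the gap between validation and training loss. The names `ROWS`, `best_checkpoint`, and `generalization_gap` are hypothetical helpers introduced for illustration; they are not part of the training code.

```python
# A subset of rows copied from the table above: (step, training_loss, validation_loss).
# Training loss is None at step 0, where the table shows "No log".
ROWS = [
    (0, None, 1.0950),
    (5, 1.4558, 1.0942),
    (10, 1.2848, 1.1065),
    (20, 0.9149, 1.1915),
    (50, 0.335, 1.4866),
    (100, 0.1677, 1.4660),
    (140, 0.104, 1.4771),
]


def best_checkpoint(rows):
    """Return the (step, validation_loss) pair with the lowest validation loss."""
    step, _, val = min(rows, key=lambda r: r[2])
    return step, val


def generalization_gap(rows):
    """Return validation minus training loss per step, skipping rows without a training loss."""
    return {step: round(val - train, 4) for step, train, val in rows if train is not None}


if __name__ == "__main__":
    print(best_checkpoint(ROWS))
    print(generalization_gap(ROWS))
```

A widening positive gap alongside a rising validation loss is commonly read as a sign of overfitting, so an earlier checkpoint may be worth comparing against the final one, assuming earlier checkpoints were saved.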