
collapse_gemma-2-2b_hs2_replace_iter3_sftsd1

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.8911
  • Num Input Tokens Seen: 8330720
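
Since the card gives no usage details, below is a minimal inference sketch assuming the standard transformers causal-LM API; the prompt, dtype, and device placement are illustrative assumptions, not documented on this card.

```python
# Minimal sketch: load the checkpoint and generate a short completion.
# Assumes transformers >= 4.42 (required for Gemma 2) and accelerate
# installed for device_map="auto". Prompt is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jkazdan/collapse_gemma-2-2b_hs2_replace_iter3_sftsd1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights are published in BF16
    device_map="auto",
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```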

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 8e-06
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 1
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
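
The sketch below (not the authors' script) shows how the hyperparameters above might map onto transformers.TrainingArguments; output_dir is an assumed name, and the dataset and Trainer setup are omitted because the card does not document them.

```python
# Hypothetical mapping of the listed hyperparameters onto TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="collapse_gemma-2-2b_hs2_replace_iter3_sftsd1",  # assumed name
    learning_rate=8e-6,
    per_device_train_batch_size=8,   # 8 x 16 accumulation steps = 128 total
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=16,
    seed=1,
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.05,
    num_train_epochs=1,
    adam_beta1=0.9,                  # Adam betas/epsilon as listed above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```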

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log        | 0      | 0    | 1.3956          | 0                 |
| 1.4905        | 0.0322 | 5    | 1.3077          | 270832            |
| 1.2602        | 0.0643 | 10   | 1.2233          | 538040            |
| 1.1284        | 0.0965 | 15   | 1.2281          | 810568            |
| 0.7483        | 0.1287 | 20   | 1.2969          | 1081856           |
| 0.6342        | 0.1608 | 25   | 1.4400          | 1347624           |
| 0.4624        | 0.1930 | 30   | 1.6103          | 1613304           |
| 0.3721        | 0.2252 | 35   | 1.7194          | 1880416           |
| 0.2581        | 0.2573 | 40   | 1.7768          | 2149880           |
| 0.1611        | 0.2895 | 45   | 1.8426          | 2416712           |
| 0.1031        | 0.3217 | 50   | 1.9013          | 2681168           |
| 0.1092        | 0.3538 | 55   | 1.9516          | 2946912           |
| 0.1202        | 0.3860 | 60   | 1.9557          | 3214960           |
| 0.0956        | 0.4182 | 65   | 1.9342          | 3484184           |
| 0.0726        | 0.4503 | 70   | 1.8705          | 3756200           |
| 0.0687        | 0.4825 | 75   | 1.8882          | 4021312           |
| 0.0399        | 0.5147 | 80   | 1.8351          | 4291144           |
| 0.0562        | 0.5468 | 85   | 1.8887          | 4554544           |
| 0.0621        | 0.5790 | 90   | 1.8666          | 4829952           |
| 0.0416        | 0.6112 | 95   | 1.7668          | 5092984           |
| 0.0435        | 0.6433 | 100  | 1.8431          | 5361048           |
| 0.0669        | 0.6755 | 105  | 1.8500          | 5629424           |
| 0.064         | 0.7077 | 110  | 1.7670          | 5901224           |
| 0.0491        | 0.7398 | 115  | 1.7368          | 6163240           |
| 0.0455        | 0.7720 | 120  | 1.8418          | 6432208           |
| 0.0378        | 0.8042 | 125  | 1.8950          | 6704256           |
| 0.0423        | 0.8363 | 130  | 1.8546          | 6975512           |
| 0.08          | 0.8685 | 135  | 1.8218          | 7243344           |
| 0.061         | 0.9007 | 140  | 1.8678          | 7510512           |
| 0.0408        | 0.9329 | 145  | 1.9605          | 7787288           |
| 0.0359        | 0.9650 | 150  | 1.9672          | 8053856           |
| 0.0587        | 0.9972 | 155  | 1.8911          | 8330720           |

Framework versions

  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1