hbertv1-massive-logit_KD-tiny

This model is a fine-tuned version of gokuls/model_v1_complete_training_wt_init_48_tiny_freeze_new on the massive dataset. Its loss and accuracy on the evaluation set after each epoch are listed in the table at the end of this card.

Model description

More information needed

Training results

Training loss, validation loss, and validation accuracy were recorded at the end of each epoch (180 steps per epoch) over 37 epochs:

Training Loss	Epoch	Step	Validation Loss	Accuracy
4.0471	1.0	180	3.2580	0.2258
2.9727	2.0	360	2.3478	0.3778
2.3183	3.0	540	1.8643	0.5081
1.9162	4.0	720	1.5331	0.6375
1.6284	5.0	900	1.3079	0.6931
1.4163	6.0	1080	1.1495	0.7241
1.2630	7.0	1260	1.0287	0.7437
1.1491	8.0	1440	0.9566	0.7575
1.0652	9.0	1620	0.8881	0.7644
0.9661	10.0	1800	0.8426	0.7801
0.9077	11.0	1980	0.7980	0.7796
0.8466	12.0	2160	0.7675	0.7919
0.7996	13.0	2340	0.7422	0.7934
0.7605	14.0	2520	0.7323	0.7954
0.7156	15.0	2700	0.6864	0.8067
0.6867	16.0	2880	0.6730	0.8131
0.6493	17.0	3060	0.6548	0.8160
0.6245	18.0	3240	0.6495	0.8136
0.6038	19.0	3420	0.6282	0.8224
0.5700	20.0	3600	0.6123	0.8224
0.5560	21.0	3780	0.6020	0.8308
0.5334	22.0	3960	0.5943	0.8298
0.5101	23.0	4140	0.5778	0.8323
0.4948	24.0	4320	0.5740	0.8337
0.4824	25.0	4500	0.5772	0.8337
0.4728	26.0	4680	0.5712	0.8342
0.4596	27.0	4860	0.5691	0.8337
0.4436	28.0	5040	0.5670	0.8396
0.4367	29.0	5220	0.5542	0.8367
0.4249	30.0	5400	0.5512	0.8406
0.4117	31.0	5580	0.5450	0.8387
0.4051	32.0	5760	0.5468	0.8465
0.4000	33.0	5940	0.5464	0.8401
0.3939	34.0	6120	0.5451	0.8446
0.3801	35.0	6300	0.5387	0.8441
0.3708	36.0	6480	0.5353	0.8421
0.3686	37.0	6660	0.5320	0.8455
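
The table can be summarized programmatically to see where accuracy peaks and where validation loss bottoms out. The following is a minimal sketch, assuming the rows are copied verbatim from the table above; the names `ROWS` and `parse_rows` are illustrative and are not part of the model's training code.

```python
# Rows copied from the table above, in column order:
# training loss, epoch, step, validation loss, accuracy
ROWS = """
4.0471 1.0 180 3.2580 0.2258
2.9727 2.0 360 2.3478 0.3778
2.3183 3.0 540 1.8643 0.5081
1.9162 4.0 720 1.5331 0.6375
1.6284 5.0 900 1.3079 0.6931
1.4163 6.0 1080 1.1495 0.7241
1.2630 7.0 1260 1.0287 0.7437
1.1491 8.0 1440 0.9566 0.7575
1.0652 9.0 1620 0.8881 0.7644
0.9661 10.0 1800 0.8426 0.7801
0.9077 11.0 1980 0.7980 0.7796
0.8466 12.0 2160 0.7675 0.7919
0.7996 13.0 2340 0.7422 0.7934
0.7605 14.0 2520 0.7323 0.7954
0.7156 15.0 2700 0.6864 0.8067
0.6867 16.0 2880 0.6730 0.8131
0.6493 17.0 3060 0.6548 0.8160
0.6245 18.0 3240 0.6495 0.8136
0.6038 19.0 3420 0.6282 0.8224
0.5700 20.0 3600 0.6123 0.8224
0.5560 21.0 3780 0.6020 0.8308
0.5334 22.0 3960 0.5943 0.8298
0.5101 23.0 4140 0.5778 0.8323
0.4948 24.0 4320 0.5740 0.8337
0.4824 25.0 4500 0.5772 0.8337
0.4728 26.0 4680 0.5712 0.8342
0.4596 27.0 4860 0.5691 0.8337
0.4436 28.0 5040 0.5670 0.8396
0.4367 29.0 5220 0.5542 0.8367
0.4249 30.0 5400 0.5512 0.8406
0.4117 31.0 5580 0.5450 0.8387
0.4051 32.0 5760 0.5468 0.8465
0.4000 33.0 5940 0.5464 0.8401
0.3939 34.0 6120 0.5451 0.8446
0.3801 35.0 6300 0.5387 0.8441
0.3708 36.0 6480 0.5353 0.8421
0.3686 37.0 6660 0.5320 0.8455
"""


def parse_rows(text):
    """Turn whitespace-separated table rows into a list of dicts."""
    records = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, accuracy = line.split()
        records.append({
            "train_loss": float(train_loss),
            "epoch": int(float(epoch)),
            "step": int(step),
            "val_loss": float(val_loss),
            "accuracy": float(accuracy),
        })
    return records


records = parse_rows(ROWS)

# Epoch with the highest validation accuracy and epoch with the lowest validation loss.
best_acc = max(records, key=lambda r: r["accuracy"])
best_loss = min(records, key=lambda r: r["val_loss"])

# Optimizer steps between two consecutive evaluations.
steps_per_epoch = records[1]["step"] - records[0]["step"]

print(f"epochs logged:        {len(records)}")
print(f"steps per epoch:      {steps_per_epoch}")
print(f"best accuracy:        {best_acc['accuracy']:.4f} (epoch {best_acc['epoch']})")
print(f"lowest val. loss:     {best_loss['val_loss']:.4f} (epoch {best_loss['epoch']})")
```

On these rows the accuracy peak (0.8465, epoch 32) and the lowest validation loss (0.5320, epoch 37) fall on different epochs, so which checkpoint counts as "best" depends on the metric used for selection.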