hbertv1-massive-logit_KD-tiny_ffn_0.5

This model is a fine-tuned version of gokuls/model_v1_complete_training_wt_init_48_tiny_freeze_new_ffn_0.5 on the massive dataset. It achieves the following results on the evaluation set (these values match the epoch 43 row of the training results):

  • Loss: 0.6585
  • Accuracy: 0.8308

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 33
  • distributed_type: multi-GPU
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
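
As a rough illustration of what the linear scheduler implies for the learning rate over the run, the sketch below computes the rate at a given optimizer step. It assumes zero warmup steps (none are listed above) and a schedule that spans all 50 epochs at 180 steps per epoch (the step count per epoch is read off the training results table). The function name linear_lr and the constant names are hypothetical, not part of the original training script.

```python
def linear_lr(step: int, base_lr: float = 5e-05,
              total_steps: int = 9000, warmup_steps: int = 0) -> float:
    """Learning rate after `step` updates: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


STEPS_PER_EPOCH = 180   # assumption: taken from the Step column of the results table
NUM_EPOCHS = 50         # num_epochs from the hyperparameter list
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 9000 under these assumptions

# Print the learning rate at the start, the midpoint, the last logged epoch, and the end.
for epoch in (0, 25, 48, 50):
    step = epoch * STEPS_PER_EPOCH
    print(epoch, step, linear_lr(step, total_steps=TOTAL_STEPS))
```

Under these assumptions the rate falls from 5e-05 at step 0 to roughly 2e-06 by step 8640 (epoch 48, the last logged row) and reaches zero at step 9000.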

Training results

Training Loss  Epoch  Step  Validation Loss  Accuracy
4.3023         1.0    180   3.8354           0.1766
3.7037         2.0    360   3.2686           0.2027
3.2011         3.0    540   2.8012           0.2966
2.774          4.0    720   2.4055           0.3802
2.4069         5.0    900   2.0833           0.4747
2.1164         6.0    1080  1.8300           0.5588
1.8907         7.0    1260  1.6351           0.6252
1.71           8.0    1440  1.4792           0.6621
1.5648         9.0    1620  1.3605           0.6936
1.4399         10.0   1800  1.2607           0.7103
1.3436         11.0   1980  1.1872           0.7201
1.266          12.0   2160  1.1295           0.7285
1.1934         13.0   2340  1.0829           0.7359
1.1413         14.0   2520  1.0428           0.7472
1.0807         15.0   2700  0.9984           0.7585
1.0382         16.0   2880  0.9693           0.7600
0.9982         17.0   3060  0.9439           0.7673
0.9626         18.0   3240  0.9207           0.7723
0.9299         19.0   3420  0.8887           0.7796
0.8828         20.0   3600  0.8686           0.7796
0.8593         21.0   3780  0.8537           0.7905
0.8329         22.0   3960  0.8250           0.7934
0.8043         23.0   4140  0.8098           0.7959
0.7764         24.0   4320  0.7990           0.8008
0.7569         25.0   4500  0.7823           0.8067
0.7372         26.0   4680  0.7749           0.8023
0.7182         27.0   4860  0.7640           0.8101
0.6987         28.0   5040  0.7509           0.8106
0.6842         29.0   5220  0.7386           0.8146
0.6673         30.0   5400  0.7305           0.8146
0.6509         31.0   5580  0.7196           0.8214
0.6382         32.0   5760  0.7120           0.8170
0.6301         33.0   5940  0.7134           0.8190
0.6139         34.0   6120  0.7062           0.8200
0.6076         35.0   6300  0.6928           0.8205
0.5919         36.0   6480  0.6838           0.8244
0.5792         37.0   6660  0.6819           0.8264
0.5739         38.0   6840  0.6780           0.8210
0.5698         39.0   7020  0.6684           0.8283
0.5602         40.0   7200  0.6692           0.8249
0.5534         41.0   7380  0.6644           0.8298
0.5429         42.0   7560  0.6599           0.8278
0.5423         43.0   7740  0.6585           0.8308
0.5356         44.0   7920  0.6569           0.8293
0.5374         45.0   8100  0.6565           0.8293
0.5327         46.0   8280  0.6540           0.8273
0.5324         47.0   8460  0.6523           0.8273
0.5281         48.0   8640  0.6519           0.8283
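
The evaluation results reported at the top of this card (loss 0.6585, accuracy 0.8308) match the epoch 43 row, which has the highest accuracy in the log, rather than the final row, which has the lowest validation loss. The sketch below shows that comparison on the last eight rows, copied from the table; every earlier row has lower accuracy and higher validation loss. Whether the training script actually selected its checkpoint by accuracy is an assumption, and the names EvalRow and ROWS are hypothetical.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    epoch: int
    step: int
    val_loss: float
    accuracy: float


# Last eight rows of the training results table (epochs 41 to 48).
ROWS = [
    EvalRow(41, 7380, 0.6644, 0.8298),
    EvalRow(42, 7560, 0.6599, 0.8278),
    EvalRow(43, 7740, 0.6585, 0.8308),
    EvalRow(44, 7920, 0.6569, 0.8293),
    EvalRow(45, 8100, 0.6565, 0.8293),
    EvalRow(46, 8280, 0.6540, 0.8273),
    EvalRow(47, 8460, 0.6523, 0.8273),
    EvalRow(48, 8640, 0.6519, 0.8283),
]

# Two common selection criteria can pick different checkpoints.
best_by_accuracy = max(ROWS, key=lambda r: r.accuracy)
best_by_loss = min(ROWS, key=lambda r: r.val_loss)

print("best accuracy:", best_by_accuracy)
print("lowest validation loss:", best_by_loss)
```

On these rows the two criteria disagree: accuracy peaks at epoch 43, while validation loss keeps decreasing through epoch 48.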

Framework versions

  • Transformers 4.35.2
  • Pytorch 1.14.0a0+410ce96
  • Datasets 2.15.0
  • Tokenizers 0.15.0

Model size: 4.21M params (Safetensors, tensor type F32)