
hbertv1-massive-intermediate_KD_new_2

This model is a fine-tuned version of gokuls/HBERTv1_48_L10_H768_A12 on the MASSIVE dataset. It achieves the following results on the evaluation set:

  • Loss: 1.3102
  • Accuracy: 0.8342
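
The model name suggests intermediate-layer knowledge distillation ("intermediate_KD"), but the card does not document the recipe. Below is a minimal pure-Python sketch of one common formulation, given as an assumption rather than this model's confirmed objective: a task cross-entropy on the student's logits plus an MSE term pulling student hidden states toward the teacher's, mixed by a hypothetical weight `alpha`.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(student_logits, label):
    """Standard task loss on the student's intent logits."""
    return -math.log(softmax(student_logits)[label])

def hidden_mse(student_hidden, teacher_hidden):
    """Intermediate-layer term: MSE between student and teacher hidden states."""
    n = len(student_hidden)
    return sum((s - t) ** 2 for s, t in zip(student_hidden, teacher_hidden)) / n

def kd_loss(student_logits, label, student_hidden, teacher_hidden, alpha=0.5):
    # alpha is a hypothetical mixing weight; the real value is not documented here
    return cross_entropy(student_logits, label) + alpha * hidden_mse(
        student_hidden, teacher_hidden
    )

# Toy example: 3 intent classes, 4-dim hidden states
loss = kd_loss([2.0, 0.5, -1.0], 0, [0.1, 0.2, 0.3, 0.4], [0.0, 0.2, 0.4, 0.4])
print(round(loss, 4))
```

In practice both terms would be computed with tensors over a batch; this scalar version only shows the shape of the objective.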

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 33
  • distributed_type: multi-GPU
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
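
With a linear scheduler and 9,000 total optimizer steps (180 steps/epoch × 50 epochs, per the results table below), the learning rate decays from 5e-05 toward 0. A minimal sketch of that decay, assuming zero warmup steps (the card does not state a warmup value):

```python
def linear_lr(step, base_lr=5e-05, total_steps=9000, warmup_steps=0):
    """Linear schedule: ramp up over warmup_steps, then decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate at the start, midpoint, and end of training
print(linear_lr(0))      # 5e-05
print(linear_lr(4500))   # 2.5e-05 (half of the base rate)
print(linear_lr(9000))   # 0.0
```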

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 4.5836 | 1.0 | 180 | 3.4660 | 0.2710 |
| 3.38 | 2.0 | 360 | 2.7802 | 0.4324 |
| 2.7571 | 3.0 | 540 | 2.3906 | 0.5991 |
| 2.3743 | 4.0 | 720 | 2.1148 | 0.7029 |
| 2.1481 | 5.0 | 900 | 2.0007 | 0.7245 |
| 1.9762 | 6.0 | 1080 | 1.9660 | 0.7467 |
| 1.8702 | 7.0 | 1260 | 1.8680 | 0.7619 |
| 1.759 | 8.0 | 1440 | 1.8192 | 0.7806 |
| 1.6949 | 9.0 | 1620 | 1.7677 | 0.7949 |
| 1.6253 | 10.0 | 1800 | 1.7452 | 0.7885 |
| 1.5849 | 11.0 | 1980 | 1.7075 | 0.8023 |
| 1.5239 | 12.0 | 2160 | 1.6915 | 0.7939 |
| 1.4768 | 13.0 | 2340 | 1.6821 | 0.8067 |
| 1.4474 | 14.0 | 2520 | 1.7201 | 0.7944 |
| 1.424 | 15.0 | 2700 | 1.6538 | 0.8096 |
| 1.3839 | 16.0 | 2880 | 1.5979 | 0.8141 |
| 1.3537 | 17.0 | 3060 | 1.6254 | 0.8062 |
| 1.3422 | 18.0 | 3240 | 1.6386 | 0.8077 |
| 1.3166 | 19.0 | 3420 | 1.6048 | 0.8141 |
| 1.2923 | 20.0 | 3600 | 1.5927 | 0.8146 |
| 1.2722 | 21.0 | 3780 | 1.5544 | 0.8180 |
| 1.2513 | 22.0 | 3960 | 1.5904 | 0.8077 |
| 1.2286 | 23.0 | 4140 | 1.5506 | 0.8195 |
| 1.2056 | 24.0 | 4320 | 1.5547 | 0.8146 |
| 1.1941 | 25.0 | 4500 | 1.5258 | 0.8224 |
| 1.1701 | 26.0 | 4680 | 1.4975 | 0.8224 |
| 1.1582 | 27.0 | 4860 | 1.4945 | 0.8200 |
| 1.1367 | 28.0 | 5040 | 1.4888 | 0.8219 |
| 1.127 | 29.0 | 5220 | 1.4596 | 0.8254 |
| 1.1126 | 30.0 | 5400 | 1.4686 | 0.8175 |
| 1.0922 | 31.0 | 5580 | 1.4934 | 0.8200 |
| 1.0809 | 32.0 | 5760 | 1.4370 | 0.8249 |
| 1.0715 | 33.0 | 5940 | 1.4305 | 0.8234 |
| 1.0572 | 34.0 | 6120 | 1.4255 | 0.8273 |
| 1.0429 | 35.0 | 6300 | 1.4042 | 0.8249 |
| 1.0375 | 36.0 | 6480 | 1.4004 | 0.8190 |
| 1.0242 | 37.0 | 6660 | 1.3849 | 0.8269 |
| 1.0132 | 38.0 | 6840 | 1.3777 | 0.8288 |
| 1.0085 | 39.0 | 7020 | 1.3731 | 0.8273 |
| 0.9964 | 40.0 | 7200 | 1.3647 | 0.8278 |
| 0.9867 | 41.0 | 7380 | 1.3655 | 0.8239 |
| 0.9787 | 42.0 | 7560 | 1.3542 | 0.8293 |
| 0.9692 | 43.0 | 7740 | 1.3449 | 0.8278 |
| 0.9646 | 44.0 | 7920 | 1.3402 | 0.8283 |
| 0.959 | 45.0 | 8100 | 1.3360 | 0.8288 |
| 0.9482 | 46.0 | 8280 | 1.3289 | 0.8303 |
| 0.9503 | 47.0 | 8460 | 1.3173 | 0.8328 |
| 0.9428 | 48.0 | 8640 | 1.3152 | 0.8333 |
| 0.9416 | 49.0 | 8820 | 1.3102 | 0.8342 |
| 0.9348 | 50.0 | 9000 | 1.3133 | 0.8328 |
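
Note that the headline numbers (loss 1.3102, accuracy 0.8342) come from epoch 49, which has the lowest validation loss; the final epoch-50 checkpoint is slightly worse. A quick check over the last few table rows:

```python
# (epoch, validation loss, accuracy) for the last five epochs, from the table above
rows = [
    (46, 1.3289, 0.8303),
    (47, 1.3173, 0.8328),
    (48, 1.3152, 0.8333),
    (49, 1.3102, 0.8342),
    (50, 1.3133, 0.8328),
]

best = min(rows, key=lambda r: r[1])  # row with the lowest validation loss
print(best)  # (49, 1.3102, 0.8342) -- matches the reported evaluation results
```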

Framework versions

  • Transformers 4.35.2
  • Pytorch 1.14.0a0+410ce96
  • Datasets 2.15.0
  • Tokenizers 0.15.0