---
base_model: gokuls/model_v1_complete_training_wt_init_48_tiny_freeze_new_ffn_0.5
tags:
  - generated_from_trainer
datasets:
  - massive
metrics:
  - accuracy
model-index:
  - name: hbertv1-massive-logit_KD-tiny_ffn_0.5
    results:
      - task:
          name: Text Classification
          type: text-classification
        dataset:
          name: massive
          type: massive
          config: en-US
          split: validation
          args: en-US
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.8407919331037875
---

# hbertv1-massive-logit_KD-tiny_ffn_0.5

This model is a fine-tuned version of [gokuls/model_v1_complete_training_wt_init_48_tiny_freeze_new_ffn_0.5](https://huggingface.co/gokuls/model_v1_complete_training_wt_init_48_tiny_freeze_new_ffn_0.5) on the massive dataset. It achieves the following results on the evaluation set:

- Loss: 0.6585
- Accuracy: 0.8308
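
The card does not include usage code; below is a minimal inference sketch, assuming the checkpoint is published under the model-index name `gokuls/hbertv1-massive-logit_KD-tiny_ffn_0.5` and that it loads through the standard `text-classification` pipeline (a custom hbert architecture could require additional loading steps not shown here).

```python
from transformers import pipeline

# Assumed repo id, taken from the model-index name in the metadata above.
classifier = pipeline(
    "text-classification",
    model="gokuls/hbertv1-massive-logit_KD-tiny_ffn_0.5",
)

# MASSIVE is an intent-classification benchmark, so inputs are short utterances.
print(classifier("wake me up at nine am on friday"))
```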

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
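
The card leaves this section unfilled, but the metadata above points to the MASSIVE dataset (config `en-US`), with accuracy reported on the validation split. A hedged sketch of loading that split, assuming the card's `massive` id refers to `AmazonScience/massive` on the Hub:

```python
from datasets import load_dataset

# Assumption: "massive" in the metadata is the AmazonScience/massive dataset.
massive = load_dataset("AmazonScience/massive", "en-US")

# The model-index reports accuracy on the validation split.
eval_split = massive["validation"]
print(eval_split[0])  # fields include "utt" (utterance) and "intent"
```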

## Training procedure
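
The distillation recipe itself is not documented here, but the model name indicates logit-level knowledge distillation (`logit_KD`). For reference, a minimal sketch of a standard logit-distillation objective; the temperature `T`, blend weight `alpha`, and the choice of teacher are assumptions, not details taken from this card:

```python
import torch
import torch.nn.functional as F

def logit_kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Standard logit distillation: KL(student || teacher) at temperature T,
    blended with cross-entropy on the gold labels. T and alpha are assumed."""
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradient magnitude matches the unsoftened loss
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce
```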

### Training hyperparameters

The following hyperparameters were used during training (a hedged `TrainingArguments` sketch follows the list):

- learning_rate: 5e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 33
- distributed_type: multi-GPU
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
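
A sketch of these settings expressed as Transformers `TrainingArguments`; the output directory is a placeholder, the per-epoch evaluation is inferred from the results table below, and the multi-GPU launch (e.g. via `torchrun` or `accelerate`) is not shown:

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="hbertv1-massive-logit_KD-tiny_ffn_0.5",
    learning_rate=5e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=33,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    evaluation_strategy="epoch",  # assumption: the log shows one eval per epoch
)
```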

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 4.3023 | 1.0 | 180 | 3.8354 | 0.1766 |
| 3.7037 | 2.0 | 360 | 3.2686 | 0.2027 |
| 3.2011 | 3.0 | 540 | 2.8012 | 0.2966 |
| 2.774 | 4.0 | 720 | 2.4055 | 0.3802 |
| 2.4069 | 5.0 | 900 | 2.0833 | 0.4747 |
| 2.1164 | 6.0 | 1080 | 1.8300 | 0.5588 |
| 1.8907 | 7.0 | 1260 | 1.6351 | 0.6252 |
| 1.71 | 8.0 | 1440 | 1.4792 | 0.6621 |
| 1.5648 | 9.0 | 1620 | 1.3605 | 0.6936 |
| 1.4399 | 10.0 | 1800 | 1.2607 | 0.7103 |
| 1.3436 | 11.0 | 1980 | 1.1872 | 0.7201 |
| 1.266 | 12.0 | 2160 | 1.1295 | 0.7285 |
| 1.1934 | 13.0 | 2340 | 1.0829 | 0.7359 |
| 1.1413 | 14.0 | 2520 | 1.0428 | 0.7472 |
| 1.0807 | 15.0 | 2700 | 0.9984 | 0.7585 |
| 1.0382 | 16.0 | 2880 | 0.9693 | 0.7600 |
| 0.9982 | 17.0 | 3060 | 0.9439 | 0.7673 |
| 0.9626 | 18.0 | 3240 | 0.9207 | 0.7723 |
| 0.9299 | 19.0 | 3420 | 0.8887 | 0.7796 |
| 0.8828 | 20.0 | 3600 | 0.8686 | 0.7796 |
| 0.8593 | 21.0 | 3780 | 0.8537 | 0.7905 |
| 0.8329 | 22.0 | 3960 | 0.8250 | 0.7934 |
| 0.8043 | 23.0 | 4140 | 0.8098 | 0.7959 |
| 0.7764 | 24.0 | 4320 | 0.7990 | 0.8008 |
| 0.7569 | 25.0 | 4500 | 0.7823 | 0.8067 |
| 0.7372 | 26.0 | 4680 | 0.7749 | 0.8023 |
| 0.7182 | 27.0 | 4860 | 0.7640 | 0.8101 |
| 0.6987 | 28.0 | 5040 | 0.7509 | 0.8106 |
| 0.6842 | 29.0 | 5220 | 0.7386 | 0.8146 |
| 0.6673 | 30.0 | 5400 | 0.7305 | 0.8146 |
| 0.6509 | 31.0 | 5580 | 0.7196 | 0.8214 |
| 0.6382 | 32.0 | 5760 | 0.7120 | 0.8170 |
| 0.6301 | 33.0 | 5940 | 0.7134 | 0.8190 |
| 0.6139 | 34.0 | 6120 | 0.7062 | 0.8200 |
| 0.6076 | 35.0 | 6300 | 0.6928 | 0.8205 |
| 0.5919 | 36.0 | 6480 | 0.6838 | 0.8244 |
| 0.5792 | 37.0 | 6660 | 0.6819 | 0.8264 |
| 0.5739 | 38.0 | 6840 | 0.6780 | 0.8210 |
| 0.5698 | 39.0 | 7020 | 0.6684 | 0.8283 |
| 0.5602 | 40.0 | 7200 | 0.6692 | 0.8249 |
| 0.5534 | 41.0 | 7380 | 0.6644 | 0.8298 |
| 0.5429 | 42.0 | 7560 | 0.6599 | 0.8278 |
| 0.5423 | 43.0 | 7740 | 0.6585 | 0.8308 |
| 0.5356 | 44.0 | 7920 | 0.6569 | 0.8293 |
| 0.5374 | 45.0 | 8100 | 0.6565 | 0.8293 |
| 0.5327 | 46.0 | 8280 | 0.6540 | 0.8273 |
| 0.5324 | 47.0 | 8460 | 0.6523 | 0.8273 |
| 0.5281 | 48.0 | 8640 | 0.6519 | 0.8283 |

### Framework versions

- Transformers 4.35.2
- Pytorch 1.14.0a0+410ce96
- Datasets 2.15.0
- Tokenizers 0.15.0