---
base_model: gokuls/model_v1_complete_training_wt_init_48_tiny_freeze_new_ffn_0.5
tags:
  - generated_from_trainer
datasets:
  - massive
metrics:
  - accuracy
model-index:
  - name: hbertv1-massive-logit_KD-tiny_ffn_0.5
    results:
      - task:
          name: Text Classification
          type: text-classification
        dataset:
          name: massive
          type: massive
          config: en-US
          split: validation
          args: en-US
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.8407919331037875
---

# hbertv1-massive-logit_KD-tiny_ffn_0.5

This model is a fine-tuned version of [gokuls/model_v1_complete_training_wt_init_48_tiny_freeze_new_ffn_0.5](https://huggingface.co/gokuls/model_v1_complete_training_wt_init_48_tiny_freeze_new_ffn_0.5) on the massive dataset. It achieves the following results on the evaluation set:

- Loss: 0.6585
- Accuracy: 0.8308
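
The card does not include usage code; below is a minimal inference sketch, assuming the checkpoint is published under the model-index name `gokuls/hbertv1-massive-logit_KD-tiny_ffn_0.5` and that it loads through the standard `text-classification` pipeline (a custom hbert architecture could require additional loading steps not shown here).

```python
from transformers import pipeline

# Assumed repo id, taken from the model-index name in the metadata above.
classifier = pipeline(
    "text-classification",
    model="gokuls/hbertv1-massive-logit_KD-tiny_ffn_0.5",
)

# MASSIVE is an intent-classification benchmark, so inputs are short utterances.
print(classifier("wake me up at nine am on friday"))
```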

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
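
The card leaves this section unfilled, but the metadata above points to the MASSIVE dataset (config `en-US`), with accuracy reported on the validation split. A hedged sketch of loading that split, assuming the card's `massive` id refers to `AmazonScience/massive` on the Hub:

```python
from datasets import load_dataset

# Assumption: "massive" in the metadata is the AmazonScience/massive dataset.
massive = load_dataset("AmazonScience/massive", "en-US")

# The model-index reports accuracy on the validation split.
eval_split = massive["validation"]
print(eval_split[0])  # fields include "utt" (utterance) and "intent"
```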

## Training procedure
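
The distillation recipe itself is not documented here, but the model name indicates logit-level knowledge distillation (`logit_KD`). For reference, a minimal sketch of a standard logit-distillation objective; the temperature `T`, blend weight `alpha`, and the choice of teacher are assumptions, not details taken from this card:

```python
import torch
import torch.nn.functional as F

def logit_kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Standard logit distillation: KL(student || teacher) at temperature T,
    blended with cross-entropy on the gold labels. T and alpha are assumed."""
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradient magnitude matches the unsoftened loss
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce
```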

### Training hyperparameters

The following hyperparameters were used during training (a hedged `TrainingArguments` sketch follows the list):

- learning_rate: 5e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 33
- distributed_type: multi-GPU
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
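
A sketch of these settings expressed as Transformers `TrainingArguments`; the output directory is a placeholder, the per-epoch evaluation is inferred from the results table below, and the multi-GPU launch (e.g. via `torchrun` or `accelerate`) is not shown:

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="hbertv1-massive-logit_KD-tiny_ffn_0.5",
    learning_rate=5e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=33,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    evaluation_strategy="epoch",  # assumption: the log shows one eval per epoch
)
```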

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 4.3023 | 1.0 | 180 | 3.8354 | 0.1766 |
| 3.7037 | 2.0 | 360 | 3.2686 | 0.2027 |
| 3.2011 | 3.0 | 540 | 2.8012 | 0.2966 |
| 2.774 | 4.0 | 720 | 2.4055 | 0.3802 |
| 2.4069 | 5.0 | 900 | 2.0833 | 0.4747 |
| 2.1164 | 6.0 | 1080 | 1.8300 | 0.5588 |
| 1.8907 | 7.0 | 1260 | 1.6351 | 0.6252 |
| 1.71 | 8.0 | 1440 | 1.4792 | 0.6621 |
| 1.5648 | 9.0 | 1620 | 1.3605 | 0.6936 |
| 1.4399 | 10.0 | 1800 | 1.2607 | 0.7103 |
| 1.3436 | 11.0 | 1980 | 1.1872 | 0.7201 |
| 1.266 | 12.0 | 2160 | 1.1295 | 0.7285 |
| 1.1934 | 13.0 | 2340 | 1.0829 | 0.7359 |
| 1.1413 | 14.0 | 2520 | 1.0428 | 0.7472 |
| 1.0807 | 15.0 | 2700 | 0.9984 | 0.7585 |
| 1.0382 | 16.0 | 2880 | 0.9693 | 0.7600 |
| 0.9982 | 17.0 | 3060 | 0.9439 | 0.7673 |
| 0.9626 | 18.0 | 3240 | 0.9207 | 0.7723 |
| 0.9299 | 19.0 | 3420 | 0.8887 | 0.7796 |
| 0.8828 | 20.0 | 3600 | 0.8686 | 0.7796 |
| 0.8593 | 21.0 | 3780 | 0.8537 | 0.7905 |
| 0.8329 | 22.0 | 3960 | 0.8250 | 0.7934 |
| 0.8043 | 23.0 | 4140 | 0.8098 | 0.7959 |
| 0.7764 | 24.0 | 4320 | 0.7990 | 0.8008 |
| 0.7569 | 25.0 | 4500 | 0.7823 | 0.8067 |
| 0.7372 | 26.0 | 4680 | 0.7749 | 0.8023 |
| 0.7182 | 27.0 | 4860 | 0.7640 | 0.8101 |
| 0.6987 | 28.0 | 5040 | 0.7509 | 0.8106 |
| 0.6842 | 29.0 | 5220 | 0.7386 | 0.8146 |
| 0.6673 | 30.0 | 5400 | 0.7305 | 0.8146 |
| 0.6509 | 31.0 | 5580 | 0.7196 | 0.8214 |
| 0.6382 | 32.0 | 5760 | 0.7120 | 0.8170 |
| 0.6301 | 33.0 | 5940 | 0.7134 | 0.8190 |
| 0.6139 | 34.0 | 6120 | 0.7062 | 0.8200 |
| 0.6076 | 35.0 | 6300 | 0.6928 | 0.8205 |
| 0.5919 | 36.0 | 6480 | 0.6838 | 0.8244 |
| 0.5792 | 37.0 | 6660 | 0.6819 | 0.8264 |
| 0.5739 | 38.0 | 6840 | 0.6780 | 0.8210 |
| 0.5698 | 39.0 | 7020 | 0.6684 | 0.8283 |
| 0.5602 | 40.0 | 7200 | 0.6692 | 0.8249 |
| 0.5534 | 41.0 | 7380 | 0.6644 | 0.8298 |
| 0.5429 | 42.0 | 7560 | 0.6599 | 0.8278 |
| 0.5423 | 43.0 | 7740 | 0.6585 | 0.8308 |
| 0.5356 | 44.0 | 7920 | 0.6569 | 0.8293 |
| 0.5374 | 45.0 | 8100 | 0.6565 | 0.8293 |
| 0.5327 | 46.0 | 8280 | 0.6540 | 0.8273 |
| 0.5324 | 47.0 | 8460 | 0.6523 | 0.8273 |
| 0.5281 | 48.0 | 8640 | 0.6519 | 0.8283 |

### Framework versions

- Transformers 4.35.2
- Pytorch 1.14.0a0+410ce96
- Datasets 2.15.0
- Tokenizers 0.15.0