+---
+tags:
+- generated_from_trainer
+datasets:
+- glue
+metrics:
+- spearmanr
+model-index:
+- name: hBERTv1_new_pretrain_w_init_48_stsb
+  results:
+  - task:
+      name: Text Classification
+      type: text-classification
+    dataset:
+      name: glue
+      type: glue
+      config: stsb
+      split: validation
+      args: stsb
+    metrics:
+    - name: Spearmanr
+      type: spearmanr
+      value: 0.750517182024731
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# hBERTv1_new_pretrain_w_init_48_stsb
+This model is a fine-tuned version of [gokuls/bert_12_layer_model_v1_complete_training_new_wt_init_48](https://huggingface.co/gokuls/bert_12_layer_model_v1_complete_training_new_wt_init_48) on the STS-B task of the GLUE dataset.
+It achieves the following results on the evaluation set (the values match the final logged epoch, 14, in the training results):
+- Loss: 1.0470
+- Pearson: 0.7517
+- Spearmanr: 0.7505
+- Combined Score: 0.7511
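The reported Combined Score appears consistent with the arithmetic mean of the Pearson and Spearman correlations. The sketch below illustrates both correlations and that averaging in plain Python. The helper names (`pearson`, `ranks`, `spearman`, `combined_score`) and the toy data are illustrative assumptions, not code from the training run.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def combined_score(pearson_value, spearman_value):
    """Assumed definition: mean of the two correlation metrics."""
    return mean([pearson_value, spearman_value])


# Toy similarity scores (made up for illustration, not model outputs).
gold = [0.0, 1.5, 2.0, 3.2, 5.0]
pred = [0.4, 1.0, 2.6, 2.9, 4.1]
print(pearson(gold, pred), spearman(gold, pred))

# Averaging the rounded values reported above.
print(f"{combined_score(0.7517, 0.7505):.4f}")
```

Because the card reports rounded values, averaging them may differ from a logged Combined Score in the last digit for some epochs.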
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
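+The model was fine-tuned and evaluated on the `stsb` configuration of the GLUE dataset; the reported metrics are computed on the validation split.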
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 4e-05
+- train_batch_size: 128
+- eval_batch_size: 128
+- seed: 10
+- distributed_type: multi-GPU
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 50
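As a rough sketch, the hyperparameters listed above could be written as keyword arguments for `transformers.TrainingArguments`. Two things here are assumptions: the mapping of the card's `train_batch_size` and `eval_batch_size` onto the per-device arguments, and the `output_dir` value, which is hypothetical. The multi-GPU launch setup is not recorded in the card and is not reproduced here.

```python
# Hypothetical mapping of the card's hyperparameters onto keyword names
# accepted by transformers.TrainingArguments (not the author's script).
training_kwargs = {
    "output_dir": "hBERTv1_new_pretrain_w_init_48_stsb",  # assumed path
    "learning_rate": 4e-05,
    # Assumption: the card's batch sizes are per-device values.
    "per_device_train_batch_size": 128,
    "per_device_eval_batch_size": 128,
    "seed": 10,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 50,
}

# With transformers installed, these could be passed as:
#   from transformers import TrainingArguments
#   args = TrainingArguments(**training_kwargs)
for name, value in training_kwargs.items():
    print(f"{name} = {value}")
```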
+### Training results
+Although `num_epochs` was set to 50, results are logged only through epoch 14.
+| Training Loss | Epoch | Step | Validation Loss | Pearson | Spearmanr | Combined Score |
+|:-------------:|:-----:|:----:|:---------------:|:-------:|:---------:|:--------------:|
+| 2.5456        | 1.0   | 45   | 2.2706          | 0.1246  | 0.1141    | 0.1194         |
+| 2.0514        | 2.0   | 90   | 2.0613          | 0.5266  | 0.5198    | 0.5232         |
+| 1.3837        | 3.0   | 135  | 1.1984          | 0.6853  | 0.6942    | 0.6897         |
+| 1.0297        | 4.0   | 180  | 1.6176          | 0.6869  | 0.6961    | 0.6915         |
+| 0.8064        | 5.0   | 225  | 1.1444          | 0.7476  | 0.7445    | 0.7460         |
+| 0.6040        | 6.0   | 270  | 1.2754          | 0.7422  | 0.7450    | 0.7436         |
+| 0.4818        | 7.0   | 315  | 1.1407          | 0.7687  | 0.7673    | 0.7680         |
+| 0.3905        | 8.0   | 360  | 1.1860          | 0.7560  | 0.7604    | 0.7582         |
+| 0.3476        | 9.0   | 405  | 0.9800          | 0.7515  | 0.7472    | 0.7493         |
+| 0.2819        | 10.0  | 450  | 1.0156          | 0.7521  | 0.7507    | 0.7514         |
+| 0.2418        | 11.0  | 495  | 1.0174          | 0.7516  | 0.7480    | 0.7498         |
+| 0.2068        | 12.0  | 540  | 1.2367          | 0.7530  | 0.7523    | 0.7527         |
+| 0.1863        | 13.0  | 585  | 1.0073          | 0.7491  | 0.7468    | 0.7480         |
+| 0.1929        | 14.0  | 630  | 1.0470          | 0.7517  | 0.7505    | 0.7511         |
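The table above can be scanned programmatically to see which epochs scored best. The sketch below copies the epoch, validation loss, and Combined Score columns from the table into a list and picks the row with the lowest validation loss and the row with the highest Combined Score. The variable names are illustrative.

```python
# (epoch, validation_loss, combined_score), copied from the table above.
rows = [
    (1, 2.2706, 0.1194),
    (2, 2.0613, 0.5232),
    (3, 1.1984, 0.6897),
    (4, 1.6176, 0.6915),
    (5, 1.1444, 0.7460),
    (6, 1.2754, 0.7436),
    (7, 1.1407, 0.7680),
    (8, 1.1860, 0.7582),
    (9, 0.9800, 0.7493),
    (10, 1.0156, 0.7514),
    (11, 1.0174, 0.7498),
    (12, 1.2367, 0.7527),
    (13, 1.0073, 0.7480),
    (14, 1.0470, 0.7511),
]

# Row with the lowest validation loss.
best_by_loss = min(rows, key=lambda r: r[1])
# Row with the highest Combined Score.
best_by_score = max(rows, key=lambda r: r[2])

print("lowest validation loss:", best_by_loss)
print("highest combined score:", best_by_score)
```

In this log the lowest validation loss (epoch 9) and the highest Combined Score (epoch 7) fall on different epochs, and neither is the final epoch reported at the top of the card.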
+### Framework versions
+- Transformers 4.29.2
+- Pytorch 1.14.0a0+410ce96
+- Datasets 2.12.0
+- Tokenizers 0.13.3