gokuls/hBERTv1_new_pretrain_w_init_48_ver2_stsb

+---
+base_model: gokuls/bert_12_layer_model_v1_complete_training_new_wt_init_48
+tags:
+- generated_from_trainer
+datasets:
+- glue
+metrics:
+- spearmanr
+model-index:
+- name: hBERTv1_new_pretrain_w_init_48_ver2_stsb
+  results:
+  - task:
+      name: Text Classification
+      type: text-classification
+    dataset:
+      name: glue
+      type: glue
+      config: stsb
+      split: validation
+      args: stsb
+    metrics:
+    - name: Spearmanr
+      type: spearmanr
+      value: 0.20585571946536435
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# hBERTv1_new_pretrain_w_init_48_ver2_stsb
+This model is a fine-tuned version of [gokuls/bert_12_layer_model_v1_complete_training_new_wt_init_48](https://huggingface.co/gokuls/bert_12_layer_model_v1_complete_training_new_wt_init_48) on the STS-B task of the GLUE dataset.
+It achieves the following results on the evaluation set:
+- Loss: 2.3776
+- Pearson: 0.1911
+- Spearmanr: 0.2059
+- Combined Score: 0.1985
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 4e-05
+- train_batch_size: 64
+- eval_batch_size: 64
+- seed: 10
+- distributed_type: multi-GPU
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 15
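The `linear` scheduler entry above can be illustrated with a small sketch. It assumes no warmup (the card lists none) and a planned total of 90 × 15 = 1350 optimizer steps, taking the 90 steps per epoch from the training results table; `linear_lr` is a hypothetical helper, not part of the training code.

```python
def linear_lr(step: int,
              base_lr: float = 4e-05,
              total_steps: int = 90 * 15,  # assumption: 90 steps/epoch, 15 epochs
              warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Step 990 is the last step shown in the results table (epoch 11).
    for step in (0, 450, 990, 1350):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate at step 990 would be roughly a quarter of its initial value, since 360 of the 1350 planned steps would remain.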
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Pearson | Spearmanr | Combined Score |
+|:-------------:|:-----:|:----:|:---------------:|:-------:|:---------:|:--------------:|
+| 2.3716        | 1.0   | 90   | 2.4198          | 0.1235  | 0.0756    | 0.0995         |
+| 2.1648        | 2.0   | 180  | 2.4218          | 0.0592  | 0.0606    | 0.0599         |
+| 2.1915        | 3.0   | 270  | 2.5305          | 0.1143  | 0.0959    | 0.1051         |
+| 2.1855        | 4.0   | 360  | 2.4912          | 0.1118  | 0.0969    | 0.1043         |
+| 2.1858        | 5.0   | 450  | 2.3539          | 0.1130  | 0.1043    | 0.1087         |
+| 2.1818        | 6.0   | 540  | 2.2509          | 0.1285  | 0.1247    | 0.1266         |
+| 2.2562        | 7.0   | 630  | 2.3302          | 0.1043  | 0.0974    | 0.1009         |
+| 2.2299        | 8.0   | 720  | 2.3749          | 0.1984  | 0.1422    | 0.1703         |
+| 2.0676        | 9.0   | 810  | 2.3883          | 0.1300  | 0.1329    | 0.1314         |
+| 1.9260        | 10.0  | 900  | 2.5884          | 0.1259  | 0.1233    | 0.1246         |
+| 1.7701        | 11.0  | 990  | 2.3776          | 0.1911  | 0.2059    | 0.1985         |
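The card does not define Combined Score. The sketch below checks one plausible reading, namely that it is the unweighted mean of Pearson and Spearman, against the rows of the table above. The helper names are hypothetical, and the tolerance allows for the table values being rounded to four decimals.

```python
from statistics import mean

# (epoch, pearson, spearmanr, combined_score) copied from the results table.
ROWS = [
    (1, 0.1235, 0.0756, 0.0995),
    (2, 0.0592, 0.0606, 0.0599),
    (3, 0.1143, 0.0959, 0.1051),
    (4, 0.1118, 0.0969, 0.1043),
    (5, 0.1130, 0.1043, 0.1087),
    (6, 0.1285, 0.1247, 0.1266),
    (7, 0.1043, 0.0974, 0.1009),
    (8, 0.1984, 0.1422, 0.1703),
    (9, 0.1300, 0.1329, 0.1314),
    (10, 0.1259, 0.1233, 0.1246),
    (11, 0.1911, 0.2059, 0.1985),
]


def combined_score(pearson: float, spearmanr: float) -> float:
    """Assumed definition: unweighted mean of the two correlations."""
    return mean([pearson, spearmanr])


def max_deviation(rows) -> float:
    """Largest gap between the reported and the recomputed combined score."""
    return max(abs(combined_score(p, s) - c) for _, p, s, c in rows)


def best_epoch(rows) -> int:
    """Epoch with the highest reported combined score."""
    return max(rows, key=lambda row: row[3])[0]


if __name__ == "__main__":
    print("max deviation:", max_deviation(ROWS))
    print("best epoch by combined score:", best_epoch(ROWS))
```

If the assumption holds, every deviation stays within rounding error of the four-decimal values shown in the table.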
+### Framework versions
+- Transformers 4.34.0
+- Pytorch 1.14.0a0+410ce96
+- Datasets 2.14.5
+- Tokenizers 0.14.1

pytorch_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ee127366f4863e1f38b99b44a5cf38cd60bf9cabf16f89400c806c7f601faedf
+oid sha256:08a841606bf0b4aad4062e90c7b00be9ce5564154e227e0f1c7fb05342d36860
 size 481096157
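
The pointer above records only the SHA-256 digest and the byte size of the real weights file. Only the `oid` differs between the old and new pointers, so the weights changed while the file size stayed the same. As a minimal sketch, a downloaded `pytorch_model.bin` could be checked against the new pointer as follows; `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers built on the standard library, and the local file path is an assumption.

```python
import hashlib
from pathlib import Path

NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:08a841606bf0b4aad4062e90c7b00be9ce5564154e227e0f1c7fb05342d36860
size 481096157
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Hash in 1 MiB chunks so large weight files need not fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(NEW_POINTER)
    local_file = Path("pytorch_model.bin")  # assumed local download location
    if local_file.exists():
        print("matches pointer:", matches_pointer(local_file, pointer))
```

Checking the size first is a cheap way to reject an incomplete download before hashing the whole file.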