pszemraj/paraphrase-MiniLM-L12-v2-CoLA

+---
+license: apache-2.0
+tags:
+- generated_from_trainer
+datasets:
+- glue
+metrics:
+- matthews_correlation
+model-index:
+- name: paraphrase-MiniLM-L12-v2-CoLA
+  results:
+  - task:
+      name: Text Classification
+      type: text-classification
+    dataset:
+      name: glue
+      type: glue
+      config: cola
+      split: validation
+      args: cola
+    metrics:
+    - name: Matthews Correlation
+      type: matthews_correlation
+      value: 0.49464326454019025
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# paraphrase-MiniLM-L12-v2-CoLA
+This model is a fine-tuned version of [sentence-transformers/paraphrase-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-MiniLM-L12-v2) on the CoLA (`cola`) configuration of the GLUE dataset.
+After the final training epoch (16), it achieves the following results on the evaluation set (the `validation` split):
+- Loss: 0.9375
+- Matthews Correlation: 0.4946
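For readers unfamiliar with the metric, the following is a minimal sketch of how a binary Matthews correlation can be computed from confusion-matrix counts. It is not the evaluation code used for this model; the function name and the toy labels are illustrative assumptions, and returning 0.0 when the denominator is zero is one common convention.

```python
import math

def matthews_corrcoef(y_true, y_pred):
    """Binary Matthews correlation for 0/1 labels (illustrative helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # convention when a row or column of the confusion matrix is empty
    return (tp * tn - fp * fn) / denom

# Toy labels, NOT taken from CoLA: 1 = acceptable, 0 = unacceptable.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]
score = matthews_corrcoef(y_true, y_pred)
print(round(score, 4))
```

The score ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), which makes it less sensitive to class imbalance than plain accuracy.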
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
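+The model was fine-tuned on the `cola` configuration of the `glue` dataset and evaluated on its `validation` split. More information needed.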
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 8e-05
+- train_batch_size: 64
+- eval_batch_size: 16
+- seed: 30198
+- distributed_type: multi-GPU
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 128
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.03
+- num_epochs: 16.0
+- mixed_precision_training: Native AMP
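To make the listed schedule concrete, here is a rough sketch of a cosine learning-rate schedule with linear warmup, using the values above. It is a plain-Python approximation, not the Trainer's own code: the function name `lr_at` is hypothetical, the total of 1072 steps is taken from the last row of the results table, and rounding the warmup length up is an assumption.

```python
import math

BASE_LR = 8e-05
WARMUP_RATIO = 0.03
TOTAL_STEPS = 1072  # last step in the results table (16 epochs x 67 steps)
WARMUP_STEPS = math.ceil(WARMUP_RATIO * TOTAL_STEPS)  # assumption: rounded up

# The listed batch settings are consistent with each other: 64 * 2 = 128.
TRAIN_BATCH_SIZE = 64
GRAD_ACCUM_STEPS = 2
EFFECTIVE_BATCH = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

def lr_at(step):
    """Learning rate at an optimizer step: linear warmup, then half-cosine decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

for step in (0, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the rate climbs from zero to 8e-05 over the first few dozen steps and then decays smoothly to zero by the final step.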
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Matthews Correlation |
+|:-------------:|:-----:|:----:|:---------------:|:--------------------:|
+| 0.5747        | 1.0   | 67   | 0.5394          | 0.3455               |
+| 0.5025        | 2.0   | 134  | 0.4999          | 0.4270               |
+| 0.3698        | 3.0   | 201  | 0.4636          | 0.5057               |
+| 0.2969        | 4.0   | 268  | 0.5309          | 0.4751               |
+| 0.2275        | 5.0   | 335  | 0.6238          | 0.4775               |
+| 0.1859        | 6.0   | 402  | 0.6315          | 0.4867               |
+| 0.1517        | 7.0   | 469  | 0.7783          | 0.4695               |
+| 0.1016        | 8.0   | 536  | 0.6762          | 0.4901               |
+| 0.1017        | 9.0   | 603  | 0.7412          | 0.5046               |
+| 0.0898        | 10.0  | 670  | 0.7719          | 0.4877               |
+| 0.0527        | 11.0  | 737  | 0.8627          | 0.4955               |
+| 0.0582        | 12.0  | 804  | 0.8986          | 0.4738               |
+| 0.074         | 13.0  | 871  | 0.9469          | 0.4942               |
+| 0.0508        | 14.0  | 938  | 0.9436          | 0.4918               |
+| 0.024         | 15.0  | 1005 | 0.9391          | 0.4919               |
+| 0.0458        | 16.0  | 1072 | 0.9375          | 0.4946               |
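The table can also be read programmatically. The sketch below copies the per-epoch validation numbers from the table and picks out the best epoch by each criterion; the variable names are illustrative, and nothing here implies that a particular checkpoint was the one published.

```python
# (epoch, validation_loss, matthews_correlation), copied from the table above
results = [
    (1, 0.5394, 0.3455), (2, 0.4999, 0.4270), (3, 0.4636, 0.5057),
    (4, 0.5309, 0.4751), (5, 0.6238, 0.4775), (6, 0.6315, 0.4867),
    (7, 0.7783, 0.4695), (8, 0.6762, 0.4901), (9, 0.7412, 0.5046),
    (10, 0.7719, 0.4877), (11, 0.8627, 0.4955), (12, 0.8986, 0.4738),
    (13, 0.9469, 0.4942), (14, 0.9436, 0.4918), (15, 0.9391, 0.4919),
    (16, 0.9375, 0.4946),
]

best_by_mcc = max(results, key=lambda row: row[2])   # highest Matthews correlation
best_by_loss = min(results, key=lambda row: row[1])  # lowest validation loss
final = results[-1]                                  # values reported at the top of the card

print("best MCC:", best_by_mcc)
print("lowest validation loss:", best_by_loss)
print("final epoch:", final)
```

Both criteria point to epoch 3, after which validation loss rises while training loss keeps falling. This pattern may indicate overfitting in later epochs, although the Matthews correlation stays within a fairly narrow band.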
+### Framework versions
+- Transformers 4.27.0.dev0
+- PyTorch 1.13.1+cu117
+- Datasets 2.8.0
+- Tokenizers 0.13.1