rule_learning_1mm_many_negatives_spanpred_margin_avg
This model is a fine-tuned version of enoriega/rule_softmatching on the enoriega/odinsynth_dataset dataset. It achieves the following results on the evaluation set:
- Loss: 0.2421
- Margin Accuracy: 0.8897
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2000
- total_train_batch_size: 8000 (train_batch_size 4 × gradient_accumulation_steps 2000)
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
- mixed_precision_training: Native AMP
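The listed values can be collected into a plain dictionary to make the batch-size arithmetic explicit. This is a minimal sketch, not the author's training script: the keys use the parameter names of `transformers.TrainingArguments`, which is an assumption about how the run was configured, and `effective_batch_size` is a hypothetical helper that assumes a single device.

```python
# Hyperparameters copied from the list above. The key names follow
# transformers.TrainingArguments; using that class is an assumption,
# since the card does not include the training script.
hyperparameters = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "gradient_accumulation_steps": 2000,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3.0,
    "fp16": True,  # "Native AMP" mixed precision
}


def effective_batch_size(per_device, accumulation_steps, num_devices=1):
    """Examples consumed per optimizer step (hypothetical helper).

    num_devices=1 is an assumption; the card does not state the device count.
    """
    return per_device * accumulation_steps * num_devices


total_train_batch_size = effective_batch_size(
    hyperparameters["per_device_train_batch_size"],
    hyperparameters["gradient_accumulation_steps"],
)
print(total_train_batch_size)  # 8000
```

With these values, each optimizer step accumulates gradients over 2000 micro-batches of 4 examples, which reproduces the reported total of 8000.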
Training results
Training Loss | Epoch | Step | Validation Loss | Margin Accuracy |
---|---|---|---|---|
0.3867 | 0.16 | 20 | 0.4023 | 0.8187 |
0.3506 | 0.32 | 40 | 0.3381 | 0.8523 |
0.3195 | 0.48 | 60 | 0.3096 | 0.8613 |
0.3052 | 0.64 | 80 | 0.2957 | 0.8640 |
0.2859 | 0.8 | 100 | 0.2922 | 0.8679 |
0.297 | 0.96 | 120 | 0.2871 | 0.8688 |
0.2717 | 1.12 | 140 | 0.2761 | 0.8732 |
0.2671 | 1.28 | 160 | 0.2751 | 0.8743 |
0.2677 | 1.44 | 180 | 0.2678 | 0.8757 |
0.2693 | 1.6 | 200 | 0.2627 | 0.8771 |
0.2675 | 1.76 | 220 | 0.2573 | 0.8813 |
0.2732 | 1.92 | 240 | 0.2546 | 0.8858 |
0.246 | 2.08 | 260 | 0.2478 | 0.8869 |
0.2355 | 2.24 | 280 | 0.2463 | 0.8871 |
0.2528 | 2.4 | 300 | 0.2449 | 0.8886 |
0.2512 | 2.56 | 320 | 0.2443 | 0.8892 |
0.2527 | 2.72 | 340 | 0.2441 | 0.8893 |
0.2346 | 2.88 | 360 | 0.2424 | 0.8895 |
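The Epoch and Step columns above imply a fixed number of optimizer steps per epoch, from which the approximate run length can be estimated. The sketch below is an inference from the table, not something the card states: it assumes a single device, an effective batch of 8000, and a linear schedule with no warmup (warmup is not listed among the hyperparameters). `linear_lr` is a hypothetical helper.

```python
# (step, epoch) pairs copied from rows of the table above.
logged = [(20, 0.16), (100, 0.8), (200, 1.6), (360, 2.88)]

# Every row should agree on the number of optimizer steps per epoch.
estimates = {round(step / epoch) for step, epoch in logged}
if len(estimates) != 1:
    raise ValueError(f"inconsistent steps per epoch: {estimates}")
steps_per_epoch = estimates.pop()

effective_batch = 4 * 2000  # train_batch_size * gradient_accumulation_steps
examples_per_epoch = steps_per_epoch * effective_batch
total_steps = round(steps_per_epoch * 3.0)  # num_epochs = 3.0

print(steps_per_epoch)     # 125
print(examples_per_epoch)  # 1000000
print(total_steps)         # 375


def linear_lr(step, base_lr=5e-05, total=total_steps):
    """Linear decay from base_lr to 0 with no warmup (assumed, hypothetical)."""
    return base_lr * max(0.0, 1.0 - step / total)


# Learning rate at the last logged evaluation step, under the assumptions above.
print(linear_lr(360))
```

Under these assumptions an epoch covers roughly one million examples, which would be consistent with the `1mm` in the model name, although the card does not confirm that reading. The last logged row (step 360) falls 15 steps short of the estimated 375 total.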
Framework versions
- Transformers 4.19.2
- Pytorch 1.11.0
- Datasets 2.2.1
- Tokenizers 0.12.1