# roberta-large_ALL_BCE_translated_data_multihead_19_shuffled_special_tokens_final

This model is a fine-tuned version of [FacebookAI/roberta-large](https://huggingface.co/FacebookAI/roberta-large) on an unnamed dataset (the dataset is not recorded in the card metadata). It achieves the following results on the evaluation set:
- Loss: 0.4623

Macro F1 at each global decision threshold (all 19 heads share one cutoff; the sweep runs from 0.10 to 0.95 in steps of 0.05):

Global threshold | Macro F1
---|---
0.10 | 0.1265
0.15 | 0.1685
0.20 | 0.2066
0.25 | 0.2414
0.30 | 0.2734
0.35 | 0.3042
0.40 | 0.3334
0.45 | 0.3608
0.50 | 0.3883
0.55 | 0.4138
0.60 | 0.4399
0.65 | 0.4650
0.70 | 0.4898
0.75 | 0.5148
0.80 | 0.5380
0.85 | 0.5589
0.90 | 0.5675
0.95 | 0.5240

Decision threshold selected for each of the 19 heads and the F1 that head reaches at it:

Head | Threshold | F1
---|---|---
0 | 0.90 | 0.4528
1 | 0.85 | 0.4559
2 | 0.90 | 0.5380
3 | 0.95 | 0.6832
4 | 0.85 | 0.5859
5 | 0.85 | 0.5963
6 | 0.90 | 0.6048
7 | 0.90 | 0.5750
8 | 0.90 | 0.6180
9 | 0.80 | 0.6267
10 | 0.95 | 0.7001
11 | 0.85 | 0.6508
12 | 0.90 | 0.4796
13 | 0.95 | 0.3492
14 | 0.90 | 0.5842
15 | 0.90 | 0.5477
16 | 0.90 | 0.5863
17 | 0.95 | 0.7601
18 | 0.95 | 0.5831

- Max F1: 0.5675 (the best macro F1 in the global-threshold sweep above, reached at 0.90)
- Mean F1: 0.5778 (the mean of the 19 per-head F1 scores at their selected thresholds; see the threshold-search sketch below)
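
All 19 selected thresholds sit on the same 0.05 grid as the global sweep, which suggests they were chosen by sweeping that grid per head and keeping the cutoff with the best F1 for that head. A minimal sketch of such a search (function and variable names are assumptions, not taken from the training code):

```python
import numpy as np
from sklearn.metrics import f1_score

def select_per_head_thresholds(probs: np.ndarray, labels: np.ndarray):
    """Pick the best decision threshold for each head independently.

    probs, labels: arrays of shape (n_samples, n_heads); probs holds
    sigmoid outputs, labels holds 0/1 ground truth.
    """
    grid = np.round(np.arange(0.10, 1.00, 0.05), 2)  # 0.10, 0.15, ..., 0.95
    thresholds, per_head_f1 = [], []
    for h in range(probs.shape[1]):
        scores = [f1_score(labels[:, h], (probs[:, h] >= t).astype(int))
                  for t in grid]
        best = int(np.argmax(scores))
        thresholds.append(float(grid[best]))  # e.g. the per-head thresholds above
        per_head_f1.append(scores[best])      # e.g. the per-head F1 scores above
    return thresholds, per_head_f1
```

Under this scheme the mean of the per-head F1 scores (0.5778, the Mean F1 line) edges out the best single global threshold (0.5675 at 0.90, the Max F1 line).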

## Model description

More information needed
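
The card itself does not describe the architecture. The model name (`BCE`, `multihead_19`) and the 19 per-head scores above, however, point to a roberta-large encoder with 19 binary classification heads trained with binary cross-entropy. A minimal sketch under that assumption (the pooling choice and head layout are guesses, not the authors' code):

```python
import torch
import torch.nn as nn
from transformers import RobertaModel

class MultiHeadRoberta(nn.Module):
    """roberta-large encoder with one binary logit per label (19 heads)."""

    def __init__(self, num_heads: int = 19):
        super().__init__()
        self.encoder = RobertaModel.from_pretrained("FacebookAI/roberta-large")
        hidden = self.encoder.config.hidden_size  # 1024 for roberta-large
        self.heads = nn.ModuleList(nn.Linear(hidden, 1) for _ in range(num_heads))
        self.loss_fn = nn.BCEWithLogitsLoss()  # the "BCE" in the model name

    def forward(self, input_ids, attention_mask, labels=None):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        pooled = out.last_hidden_state[:, 0]  # <s> token representation
        logits = torch.cat([head(pooled) for head in self.heads], dim=-1)
        loss = self.loss_fn(logits, labels.float()) if labels is not None else None
        return loss, logits
```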

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 8
- eval_batch_size: 8
- seed: 2024
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3
- mixed_precision_training: Native AMP
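
For reference, these settings map onto `transformers.TrainingArguments` roughly as shown below. This is a sketch rather than the authors' training script, and the `output_dir` is a placeholder; the Adam betas and epsilon listed above are the Trainer defaults, so they need no explicit arguments:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="multihead_19_checkpoints",  # placeholder name
    learning_rate=5e-6,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=2024,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=3,
    fp16=True,  # "Native AMP" mixed precision
)
```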

### Training results

Key to the columns: F1@t is the macro F1 when every head uses the global threshold t; Thr. h is the threshold selected for head h; F1 h is head h's F1 at that threshold.

Training Loss | Epoch | Step | Validation Loss | F1@0.10 | F1@0.15 | F1@0.20 | F1@0.25 | F1@0.30 | F1@0.35 | F1@0.40 | F1@0.45 | F1@0.50 | F1@0.55 | F1@0.60 | F1@0.65 | F1@0.70 | F1@0.75 | F1@0.80 | F1@0.85 | F1@0.90 | F1@0.95 | Thr. 0 | Thr. 1 | Thr. 2 | Thr. 3 | Thr. 4 | Thr. 5 | Thr. 6 | Thr. 7 | Thr. 8 | Thr. 9 | Thr. 10 | Thr. 11 | Thr. 12 | Thr. 13 | Thr. 14 | Thr. 15 | Thr. 16 | Thr. 17 | Thr. 18 | F1 0 | F1 1 | F1 2 | F1 3 | F1 4 | F1 5 | F1 6 | F1 7 | F1 8 | F1 9 | F1 10 | F1 11 | F1 12 | F1 13 | F1 14 | F1 15 | F1 16 | F1 17 | F1 18 | Max F1 | Mean F1
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
1.0816 | 1.0 | 7458 | 0.7590 | 0.0866 | 0.1168 | 0.1480 | 0.1784 | 0.2078 | 0.2358 | 0.2614 | 0.2862 | 0.3099 | 0.3346 | 0.3563 | 0.3779 | 0.3945 | 0.4110 | 0.4162 | 0.4117 | 0.3749 | 0.2469 | 0.75 | 0.75 | 0.8 | 0.8 | 0.75 | 0.85 | 0.85 | 0.8 | 0.85 | 0.75 | 0.9 | 0.8 | 0.85 | 0.75 | 0.9 | 0.85 | 0.75 | 0.9 | 0.8 | 0.2220 | 0.2864 | 0.3679 | 0.4359 | 0.4629 | 0.4666 | 0.5148 | 0.3813 | 0.4581 | 0.5339 | 0.5427 | 0.5521 | 0.3009 | 0.2251 | 0.4403 | 0.3585 | 0.4618 | 0.6699 | 0.4129 | 0.4162 | 0.4260 |
0.7345 | 2.0 | 14916 | 0.5285 | 0.1075 | 0.1419 | 0.1746 | 0.2053 | 0.2342 | 0.2613 | 0.2880 | 0.3140 | 0.3397 | 0.3643 | 0.3895 | 0.4136 | 0.4384 | 0.4637 | 0.4882 | 0.5089 | 0.5194 | 0.4806 | 0.9 | 0.85 | 0.9 | 0.95 | 0.85 | 0.9 | 0.9 | 0.9 | 0.9 | 0.8 | 0.95 | 0.9 | 0.9 | 0.95 | 0.9 | 0.95 | 0.9 | 0.95 | 0.95 | 0.3721 | 0.4047 | 0.4824 | 0.6195 | 0.5458 | 0.5504 | 0.5722 | 0.5126 | 0.5609 | 0.6075 | 0.6467 | 0.6181 | 0.4059 | 0.3256 | 0.5522 | 0.4702 | 0.5553 | 0.7329 | 0.5234 | 0.5194 | 0.5294 |
0.5782 | 3.0 | 22374 | 0.4623 | 0.1265 | 0.1685 | 0.2066 | 0.2414 | 0.2734 | 0.3042 | 0.3334 | 0.3608 | 0.3883 | 0.4138 | 0.4399 | 0.4650 | 0.4898 | 0.5148 | 0.5380 | 0.5589 | 0.5675 | 0.5240 | 0.9 | 0.85 | 0.9 | 0.95 | 0.85 | 0.85 | 0.9 | 0.9 | 0.9 | 0.8 | 0.95 | 0.85 | 0.9 | 0.95 | 0.9 | 0.9 | 0.9 | 0.95 | 0.95 | 0.4528 | 0.4559 | 0.5380 | 0.6832 | 0.5859 | 0.5963 | 0.6048 | 0.5750 | 0.6180 | 0.6267 | 0.7001 | 0.6508 | 0.4796 | 0.3492 | 0.5842 | 0.5477 | 0.5863 | 0.7601 | 0.5831 | 0.5675 | 0.5778 |

### Framework versions

- Transformers 4.39.3
- Pytorch 2.2.2+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2