# bart-base-wsd-finetuned-cve-reason
This model is a fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base). The fine-tuning dataset is not specified. It achieves the following results on the evaluation set:
- Loss: 0.3236
- Rouge1: 90.5086
- Rouge2: 86.7313
- RougeL: 90.5004
- RougeLsum: 90.4025
- Gen Len: 8.5902
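The model can be used as a standard BART sequence-to-sequence generator. A minimal usage sketch (the repo id and generation settings below are assumptions, not values from this card; the evaluation Gen Len of ~8.6 tokens suggests short target sequences):

```python
# Assumed hub id for this card; replace with the actual repo path.
MODEL_ID = "bart-base-wsd-finetuned-cve-reason"

# Assumed generation settings: outputs are short (eval Gen Len ~8.6 tokens),
# so a small max_length is a reasonable choice.
GEN_KWARGS = {"max_length": 16, "num_beams": 4}

def generate_reason(text: str) -> str:
    """Generate a short 'reason' string for a CVE description."""
    # Imported lazily so the sketch can be read without transformers installed.
    from transformers import pipeline
    generator = pipeline("text2text-generation", model=MODEL_ID)
    return generator(text, **GEN_KWARGS)[0]["generated_text"]

# Example call (requires the model weights to be available):
# generate_reason("A buffer overflow in ... allows remote attackers to ...")
```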
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 200
- mixed_precision_training: Native AMP
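The linear scheduler above decays the learning rate from 2e-05 toward zero over the course of training. A minimal pure-Python sketch of that decay (warmup steps assumed to be zero, since none are listed):

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 2e-5) -> float:
    """Linearly decay base_lr to 0 over total_steps (no warmup assumed)."""
    remaining = max(0.0, (total_steps - step) / total_steps)
    return base_lr * remaining

# With 56 optimizer steps per epoch (see the results table) and 200 epochs,
# training would span 56 * 200 = 11200 steps.
total = 56 * 200
print(linear_lr(0, total))      # base_lr at the start
print(linear_lr(total, total))  # 0.0 at the end
```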
### Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
---|---|---|---|---|---|---|---|---|
No log | 1.0 | 56 | 0.5785 | 70.2552 | 61.7586 | 70.3201 | 70.3702 | 8.0328 |
No log | 2.0 | 112 | 0.4143 | 85.2974 | 79.9312 | 85.3423 | 85.3688 | 8.4295 |
No log | 3.0 | 168 | 0.3903 | 85.4657 | 78.0399 | 85.0825 | 85.0315 | 8.518 |
No log | 4.0 | 224 | 0.3799 | 82.3413 | 78.0306 | 82.3002 | 82.1323 | 8.3213 |
No log | 5.0 | 280 | 0.3536 | 86.8229 | 81.6826 | 86.6938 | 86.7128 | 8.5246 |
No log | 6.0 | 336 | 0.3583 | 88.3834 | 83.6765 | 88.3687 | 88.3368 | 8.4164 |
No log | 7.0 | 392 | 0.3474 | 87.6783 | 84.0721 | 87.6311 | 87.5552 | 8.4885 |
No log | 8.0 | 448 | 0.3674 | 88.1823 | 83.7787 | 88.1658 | 88.0453 | 8.6656 |
0.3758 | 9.0 | 504 | 0.3357 | 89.3687 | 85.4151 | 89.2735 | 89.1779 | 8.5377 |
0.3758 | 10.0 | 560 | 0.3666 | 89.2611 | 85.8911 | 89.3461 | 89.2438 | 8.7902 |
0.3758 | 11.0 | 616 | 0.3650 | 88.4002 | 84.0876 | 88.4319 | 88.3324 | 8.7639 |
0.3758 | 12.0 | 672 | 0.3381 | 89.8928 | 86.2751 | 89.9706 | 89.891 | 8.741 |
0.3758 | 13.0 | 728 | 0.3236 | 90.5086 | 86.7313 | 90.5004 | 90.4025 | 8.5902 |
0.3758 | 14.0 | 784 | 0.3577 | 89.6929 | 85.2464 | 89.4044 | 89.2693 | 8.5115 |
0.3758 | 15.0 | 840 | 0.3414 | 87.0953 | 83.2736 | 86.9541 | 87.0706 | 8.5902 |
0.3758 | 16.0 | 896 | 0.3636 | 89.0054 | 85.0881 | 89.0154 | 88.8735 | 8.6885 |
0.3758 | 17.0 | 952 | 0.3596 | 89.6327 | 86.0865 | 89.6939 | 89.624 | 8.7049 |
0.1003 | 18.0 | 1008 | 0.3286 | 89.5349 | 85.7598 | 89.5881 | 89.5125 | 8.5934 |
0.1003 | 19.0 | 1064 | 0.3573 | 89.3753 | 85.6797 | 89.3238 | 89.1992 | 8.6361 |
0.1003 | 20.0 | 1120 | 0.3589 | 90.3086 | 86.7555 | 90.2283 | 90.1314 | 8.6492 |
0.1003 | 21.0 | 1176 | 0.3500 | 89.9113 | 84.7301 | 89.8777 | 89.8271 | 8.5246 |
0.1003 | 22.0 | 1232 | 0.3738 | 90.6328 | 86.8572 | 90.653 | 90.5831 | 8.6492 |
0.1003 | 23.0 | 1288 | 0.3446 | 90.8409 | 86.7153 | 90.8496 | 90.8431 | 8.5279 |
### Framework versions
- Transformers 4.42.3
- Pytorch 2.3.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1