---
license: mit
base_model: xlm-roberta-base
tags:
  - generated_from_trainer
model-index:
  - name: xlm-roberta-base-finetuned-ANAD-mlm-0.15-base-25OCT
    results: []
---

# xlm-roberta-base-finetuned-ANAD-mlm-0.15-base-25OCT

This model is a fine-tuned version of [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) on an unspecified dataset (the auto-generated card records it as `None`; the model name suggests masked-language-model training on ANAD with a 0.15 masking probability). It achieves the following results on the evaluation set:

- Loss: 1.5193
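The card does not document usage, so the following is a minimal fill-mask sketch; the repo id is an assumption inferred from the uploader and model name, not confirmed by the card.

```python
from transformers import pipeline

# Assumed repo id, inferred from the uploader and model name.
fill_mask = pipeline(
    "fill-mask",
    model="malduwais/xlm-roberta-base-finetuned-ANAD-mlm-0.15-base-25OCT",
)

# XLM-RoBERTa uses "<mask>" as its mask token.
print(fill_mask("The capital of France is <mask>."))
```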

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged sketch of the equivalent `TrainingArguments` follows the list):

- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 32
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10
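A minimal sketch of these settings expressed as Transformers `TrainingArguments`; the `output_dir` is an assumption, since the card does not include the training script:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="xlm-roberta-base-finetuned-ANAD-mlm-0.15-base-25OCT",  # assumed
    learning_rate=5e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=32,  # effective train batch size: 8 * 32 = 256
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=10,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
)
```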

### Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| No log        | 0.0941 | 100   | 1.9039          |
| No log        | 0.1881 | 200   | 1.8793          |
| No log        | 0.2822 | 300   | 1.8643          |
| No log        | 0.3763 | 400   | 1.8479          |
| 2.0696        | 0.4703 | 500   | 1.8380          |
| 2.0696        | 0.5644 | 600   | 1.8336          |
| 2.0696        | 0.6585 | 700   | 1.8226          |
| 2.0696        | 0.7525 | 800   | 1.8231          |
| 2.0696        | 0.8466 | 900   | 1.8136          |
| 2.0049        | 0.9407 | 1000  | 1.8161          |
| 2.0049        | 1.0347 | 1100  | 1.8056          |
| 2.0049        | 1.1288 | 1200  | 1.7934          |
| 2.0049        | 1.2229 | 1300  | 1.7887          |
| 2.0049        | 1.3169 | 1400  | 1.7749          |
| 1.9612        | 1.4110 | 1500  | 1.7726          |
| 1.9612        | 1.5051 | 1600  | 1.7679          |
| 1.9612        | 1.5992 | 1700  | 1.7543          |
| 1.9612        | 1.6932 | 1800  | 1.7473          |
| 1.9612        | 1.7873 | 1900  | 1.7413          |
| 1.911         | 1.8814 | 2000  | 1.7334          |
| 1.911         | 1.9754 | 2100  | 1.7302          |
| 1.911         | 2.0695 | 2200  | 1.7172          |
| 1.911         | 2.1636 | 2300  | 1.7187          |
| 1.911         | 2.2576 | 2400  | 1.7076          |
| 1.8628        | 2.3517 | 2500  | 1.7011          |
| 1.8628        | 2.4458 | 2600  | 1.7001          |
| 1.8628        | 2.5398 | 2700  | 1.6929          |
| 1.8628        | 2.6339 | 2800  | 1.6929          |
| 1.8628        | 2.7280 | 2900  | 1.6848          |
| 1.8328        | 2.8220 | 3000  | 1.6804          |
| 1.8328        | 2.9161 | 3100  | 1.6762          |
| 1.8328        | 3.0102 | 3200  | 1.6759          |
| 1.8328        | 3.1042 | 3300  | 1.6715          |
| 1.8328        | 3.1983 | 3400  | 1.6653          |
| 1.8018        | 3.2924 | 3500  | 1.6590          |
| 1.8018        | 3.3864 | 3600  | 1.6519          |
| 1.8018        | 3.4805 | 3700  | 1.6493          |
| 1.8018        | 3.5746 | 3800  | 1.6458          |
| 1.8018        | 3.6686 | 3900  | 1.6415          |
| 1.7708        | 3.7627 | 4000  | 1.6397          |
| 1.7708        | 3.8568 | 4100  | 1.6345          |
| 1.7708        | 3.9508 | 4200  | 1.6351          |
| 1.7708        | 4.0449 | 4300  | 1.6324          |
| 1.7708        | 4.1390 | 4400  | 1.6271          |
| 1.7501        | 4.2331 | 4500  | 1.6253          |
| 1.7501        | 4.3271 | 4600  | 1.6248          |
| 1.7501        | 4.4212 | 4700  | 1.6153          |
| 1.7501        | 4.5153 | 4800  | 1.6191          |
| 1.7501        | 4.6093 | 4900  | 1.6135          |
| 1.7283        | 4.7034 | 5000  | 1.6087          |
| 1.7283        | 4.7975 | 5100  | 1.6072          |
| 1.7283        | 4.8915 | 5200  | 1.5991          |
| 1.7283        | 4.9856 | 5300  | 1.6026          |
| 1.7283        | 5.0797 | 5400  | 1.5989          |
| 1.7105        | 5.1737 | 5500  | 1.6011          |
| 1.7105        | 5.2678 | 5600  | 1.5958          |
| 1.7105        | 5.3619 | 5700  | 1.5894          |
| 1.7105        | 5.4559 | 5800  | 1.5871          |
| 1.7105        | 5.5500 | 5900  | 1.5865          |
| 1.6816        | 5.6441 | 6000  | 1.5871          |
| 1.6816        | 5.7381 | 6100  | 1.5840          |
| 1.6816        | 5.8322 | 6200  | 1.5842          |
| 1.6816        | 5.9263 | 6300  | 1.5772          |
| 1.6816        | 6.0203 | 6400  | 1.5769          |
| 1.6745        | 6.1144 | 6500  | 1.5740          |
| 1.6745        | 6.2085 | 6600  | 1.5690          |
| 1.6745        | 6.3025 | 6700  | 1.5700          |
| 1.6745        | 6.3966 | 6800  | 1.5704          |
| 1.6745        | 6.4907 | 6900  | 1.5667          |
| 1.6639        | 6.5847 | 7000  | 1.5653          |
| 1.6639        | 6.6788 | 7100  | 1.5647          |
| 1.6639        | 6.7729 | 7200  | 1.5625          |
| 1.6639        | 6.8670 | 7300  | 1.5572          |
| 1.6639        | 6.9610 | 7400  | 1.5551          |
| 1.6509        | 7.0551 | 7500  | 1.5533          |
| 1.6509        | 7.1492 | 7600  | 1.5522          |
| 1.6509        | 7.2432 | 7700  | 1.5509          |
| 1.6509        | 7.3373 | 7800  | 1.5468          |
| 1.6509        | 7.4314 | 7900  | 1.5488          |
| 1.6344        | 7.5254 | 8000  | 1.5459          |
| 1.6344        | 7.6195 | 8100  | 1.5463          |
| 1.6344        | 7.7136 | 8200  | 1.5452          |
| 1.6344        | 7.8076 | 8300  | 1.5407          |
| 1.6344        | 7.9017 | 8400  | 1.5416          |
| 1.6281        | 7.9958 | 8500  | 1.5400          |
| 1.6281        | 8.0898 | 8600  | 1.5372          |
| 1.6281        | 8.1839 | 8700  | 1.5350          |
| 1.6281        | 8.2780 | 8800  | 1.5341          |
| 1.6281        | 8.3720 | 8900  | 1.5345          |
| 1.6132        | 8.4661 | 9000  | 1.5325          |
| 1.6132        | 8.5602 | 9100  | 1.5293          |
| 1.6132        | 8.6542 | 9200  | 1.5288          |
| 1.6132        | 8.7483 | 9300  | 1.5280          |
| 1.6132        | 8.8424 | 9400  | 1.5287          |
| 1.6123        | 8.9364 | 9500  | 1.5272          |
| 1.6123        | 9.0305 | 9600  | 1.5255          |
| 1.6123        | 9.1246 | 9700  | 1.5251          |
| 1.6123        | 9.2186 | 9800  | 1.5233          |
| 1.6123        | 9.3127 | 9900  | 1.5221          |
| 1.5993        | 9.4068 | 10000 | 1.5223          |
| 1.5993        | 9.5009 | 10100 | 1.5216          |
| 1.5993        | 9.5949 | 10200 | 1.5215          |
| 1.5993        | 9.6890 | 10300 | 1.5207          |
| 1.5993        | 9.7831 | 10400 | 1.5204          |
| 1.5959        | 9.8771 | 10500 | 1.5198          |
| 1.5959        | 9.9712 | 10600 | 1.5193          |
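Since the validation loss is a mean cross-entropy over masked tokens, it maps directly to perplexity; a quick check of the final value (illustrative, not from the card):

```python
import math

eval_loss = 1.5193            # final validation loss from the table above
perplexity = math.exp(eval_loss)
print(f"{perplexity:.2f}")    # ≈ 4.57
```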

### Framework versions

- Transformers 4.43.4
- Pytorch 2.1.1+cu121
- Datasets 3.0.2
- Tokenizers 0.19.1
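To approximate this environment, the listed versions can be pinned; a sketch (the CUDA 12.1 PyTorch wheel is fetched from the official PyTorch index):

```bash
pip install "transformers==4.43.4" "datasets==3.0.2" "tokenizers==0.19.1"
pip install "torch==2.1.1" --index-url https://download.pytorch.org/whl/cu121
```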