---
license: mit
base_model: facebook/xlm-roberta-xl
tags:
  - generated_from_trainer
metrics:
  - precision
  - recall
  - f1
  - accuracy
model-index:
  - name: xlm-roberta-xl-final-lora500
    results: []
---

# xlm-roberta-xl-final-lora500

This model is a fine-tuned version of [facebook/xlm-roberta-xl](https://huggingface.co/facebook/xlm-roberta-xl) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 1.5378
- Precision: 0.9334
- Recall: 0.9341
- F1: 0.9337
- Accuracy: 0.9421
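
The metric set (precision, recall, F1, accuracy) and the `lora500` suffix suggest a token-classification model fine-tuned with LoRA adapters, though neither is confirmed by this card. A minimal inference sketch under those assumptions (the repo id, the use of PEFT, and `num_labels` are all unverified guesses):

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification
from peft import PeftModel

base_id = "facebook/xlm-roberta-xl"
adapter_id = "jamesngai/xlm-roberta-xl-final-lora500"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
# num_labels must match the (unknown) training label set; 2 is a placeholder.
model = AutoModelForTokenClassification.from_pretrained(base_id, num_labels=2)
model = PeftModel.from_pretrained(model, adapter_id)  # attach LoRA weights
model.eval()

inputs = tokenizer("Hugging Face est basé à New York.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
pred_ids = logits.argmax(dim=-1)[0]  # one predicted label id per subword token
```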

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 40
- num_epochs: 40
- mixed_precision_training: Native AMP
- label_smoothing_factor: 0.2
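
A minimal sketch of how the listed settings map onto `transformers.TrainingArguments`; only the hyperparameters above come from this card, while `output_dir` and the evaluation strategy are assumptions (per-epoch evaluation is inferred from the results table below):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="xlm-roberta-xl-final-lora500",  # assumed
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=40,
    num_train_epochs=40,
    fp16=True,                    # "Native AMP" mixed precision
    label_smoothing_factor=0.2,
    evaluation_strategy="epoch",  # assumed: metrics are reported per epoch
)
```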

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:------:|:------:|:--------:|
| 2.6949        | 1.0   | 250   | 1.9846          | 0.7571    | 0.8198 | 0.7872 | 0.8265   |
| 1.8141        | 2.0   | 500   | 1.6856          | 0.8709    | 0.8824 | 0.8766 | 0.8938   |
| 1.6277        | 3.0   | 750   | 1.6081          | 0.8881    | 0.9011 | 0.8945 | 0.9122   |
| 1.5464        | 4.0   | 1000  | 1.5735          | 0.9004    | 0.9064 | 0.9034 | 0.9201   |
| 1.4908        | 5.0   | 1250  | 1.5482          | 0.9111    | 0.9145 | 0.9128 | 0.9274   |
| 1.4599        | 6.0   | 1500  | 1.5386          | 0.9096    | 0.9175 | 0.9135 | 0.9282   |
| 1.4382        | 7.0   | 1750  | 1.5396          | 0.9175    | 0.9204 | 0.9189 | 0.9292   |
| 1.422         | 8.0   | 2000  | 1.5394          | 0.9163    | 0.9212 | 0.9188 | 0.9305   |
| 1.4053        | 9.0   | 2250  | 1.5354          | 0.9240    | 0.9223 | 0.9231 | 0.9335   |
| 1.3949        | 10.0  | 2500  | 1.5424          | 0.9155    | 0.9230 | 0.9192 | 0.9308   |
| 1.3858        | 11.0  | 2750  | 1.5405          | 0.9202    | 0.9248 | 0.9225 | 0.9313   |
| 1.379         | 12.0  | 3000  | 1.5364          | 0.9186    | 0.9263 | 0.9224 | 0.9339   |
| 1.3715        | 13.0  | 3250  | 1.5310          | 0.9263    | 0.9275 | 0.9269 | 0.9373   |
| 1.3647        | 14.0  | 3500  | 1.5321          | 0.9221    | 0.9273 | 0.9247 | 0.9355   |
| 1.3592        | 15.0  | 3750  | 1.5347          | 0.9277    | 0.9261 | 0.9269 | 0.9372   |
| 1.3564        | 16.0  | 4000  | 1.5323          | 0.9229    | 0.9269 | 0.9249 | 0.9371   |
| 1.3524        | 17.0  | 4250  | 1.5339          | 0.9232    | 0.9248 | 0.9240 | 0.9347   |
| 1.3512        | 18.0  | 4500  | 1.5425          | 0.9262    | 0.9284 | 0.9273 | 0.9370   |
| 1.3482        | 19.0  | 4750  | 1.5387          | 0.9238    | 0.9299 | 0.9268 | 0.9362   |
| 1.3437        | 20.0  | 5000  | 1.5334          | 0.9267    | 0.9324 | 0.9295 | 0.9389   |
| 1.3414        | 21.0  | 5250  | 1.5379          | 0.9302    | 0.9283 | 0.9292 | 0.9394   |
| 1.3408        | 22.0  | 5500  | 1.5394          | 0.9256    | 0.9291 | 0.9273 | 0.9381   |
| 1.3401        | 23.0  | 5750  | 1.5376          | 0.9320    | 0.9301 | 0.9310 | 0.9391   |
| 1.3388        | 24.0  | 6000  | 1.5381          | 0.9300    | 0.9300 | 0.9300 | 0.9383   |
| 1.3379        | 25.0  | 6250  | 1.5402          | 0.9247    | 0.9309 | 0.9278 | 0.9380   |
| 1.3361        | 26.0  | 6500  | 1.5415          | 0.9303    | 0.9275 | 0.9289 | 0.9383   |
| 1.3349        | 27.0  | 6750  | 1.5391          | 0.9305    | 0.9300 | 0.9302 | 0.9402   |
| 1.3338        | 28.0  | 7000  | 1.5379          | 0.9296    | 0.9290 | 0.9293 | 0.9392   |
| 1.3337        | 29.0  | 7250  | 1.5438          | 0.9286    | 0.9309 | 0.9297 | 0.9388   |
| 1.3329        | 30.0  | 7500  | 1.5388          | 0.9325    | 0.9310 | 0.9318 | 0.9410   |
| 1.3321        | 31.0  | 7750  | 1.5443          | 0.9319    | 0.9314 | 0.9317 | 0.9408   |
| 1.3319        | 32.0  | 8000  | 1.5413          | 0.9317    | 0.9334 | 0.9325 | 0.9415   |
| 1.3313        | 33.0  | 8250  | 1.5428          | 0.9329    | 0.9332 | 0.9331 | 0.9413   |
| 1.3309        | 34.0  | 8500  | 1.5452          | 0.9288    | 0.9317 | 0.9302 | 0.9396   |
| 1.3308        | 35.0  | 8750  | 1.5382          | 0.9307    | 0.9324 | 0.9315 | 0.9410   |
| 1.3307        | 36.0  | 9000  | 1.5370          | 0.9314    | 0.9334 | 0.9324 | 0.9413   |
| 1.33          | 37.0  | 9250  | 1.5391          | 0.9321    | 0.9328 | 0.9325 | 0.9414   |
| 1.3297        | 38.0  | 9500  | 1.5386          | 0.9330    | 0.9335 | 0.9333 | 0.9414   |
| 1.3293        | 39.0  | 9750  | 1.5378          | 0.9336    | 0.9343 | 0.9340 | 0.9420   |
| 1.3294        | 40.0  | 10000 | 1.5378          | 0.9334    | 0.9341 | 0.9337 | 0.9421   |

### Framework versions

- Transformers 4.35.2
- Pytorch 2.0.1+cu118
- Datasets 2.15.0
- Tokenizers 0.15.0