---
license: mit
base_model: roberta-large
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: roberta-large-sst-2-64-13-smoothed
  results: []
---

# roberta-large-sst-2-64-13-smoothed

This model is a fine-tuned version of [roberta-large](https://huggingface.co/roberta-large) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.5741
- Accuracy: 0.9375

## Model description

More information needed

## Intended uses & limitations

More information needed
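
Pending a fuller description, here is a minimal usage sketch for binary sentiment classification. The repository id `simonycl/roberta-large-sst-2-64-13-smoothed` and the label names are assumptions inferred from the card name, not confirmed by the card:

```python
# Hypothetical usage sketch; the repo id is inferred from the card name.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="simonycl/roberta-large-sst-2-64-13-smoothed",
)
print(classifier("A thoroughly enjoyable film."))
# e.g. [{'label': 'LABEL_1', 'score': ...}] -- the label-to-sentiment
# mapping is not documented in this card.
```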

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 50
- num_epochs: 75
- label_smoothing_factor: 0.45
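
For reference, this is a minimal sketch of how the values above map onto Hugging Face `TrainingArguments`. The output directory and any dataset/Trainer wiring are assumptions, since the actual training script is not included in this card:

```python
# Sketch only: the hyperparameter values come from the list above;
# output_dir is an assumed placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="roberta-large-sst-2-64-13-smoothed",  # assumed name
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=50,
    num_train_epochs=75,
    label_smoothing_factor=0.45,
)
```

Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer default optimizer configuration, so it needs no explicit argument here.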

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 4    | 0.6932          | 0.5      |
| No log        | 2.0   | 8    | 0.6930          | 0.5      |
| 0.6986        | 3.0   | 12   | 0.6928          | 0.5078   |
| 0.6986        | 4.0   | 16   | 0.6926          | 0.5078   |
| 0.7049        | 5.0   | 20   | 0.6926          | 0.5      |
| 0.7049        | 6.0   | 24   | 0.6924          | 0.5      |
| 0.7049        | 7.0   | 28   | 0.6922          | 0.5      |
| 0.6928        | 8.0   | 32   | 0.6918          | 0.5234   |
| 0.6928        | 9.0   | 36   | 0.6912          | 0.5312   |
| 0.6889        | 10.0  | 40   | 0.6905          | 0.5625   |
| 0.6889        | 11.0  | 44   | 0.6895          | 0.5078   |
| 0.6889        | 12.0  | 48   | 0.6880          | 0.5781   |
| 0.6855        | 13.0  | 52   | 0.6823          | 0.6875   |
| 0.6855        | 14.0  | 56   | 0.6590          | 0.8281   |
| 0.6346        | 15.0  | 60   | 0.6187          | 0.8672   |
| 0.6346        | 16.0  | 64   | 0.6192          | 0.8281   |
| 0.6346        | 17.0  | 68   | 0.5983          | 0.9062   |
| 0.5877        | 18.0  | 72   | 0.6030          | 0.875    |
| 0.5877        | 19.0  | 76   | 0.5942          | 0.9141   |
| 0.564         | 20.0  | 80   | 0.5918          | 0.8984   |
| 0.564         | 21.0  | 84   | 0.5860          | 0.9141   |
| 0.564         | 22.0  | 88   | 0.5761          | 0.9375   |
| 0.5505        | 23.0  | 92   | 0.5854          | 0.9297   |
| 0.5505        | 24.0  | 96   | 0.5750          | 0.9141   |
| 0.5462        | 25.0  | 100  | 0.5776          | 0.9141   |
| 0.5462        | 26.0  | 104  | 0.5713          | 0.9453   |
| 0.5462        | 27.0  | 108  | 0.5731          | 0.9375   |
| 0.5414        | 28.0  | 112  | 0.5770          | 0.9297   |
| 0.5414        | 29.0  | 116  | 0.5789          | 0.9141   |
| 0.5382        | 30.0  | 120  | 0.5871          | 0.9062   |
| 0.5382        | 31.0  | 124  | 0.5810          | 0.9141   |
| 0.5382        | 32.0  | 128  | 0.5765          | 0.9297   |
| 0.5383        | 33.0  | 132  | 0.5769          | 0.9297   |
| 0.5383        | 34.0  | 136  | 0.5718          | 0.9453   |
| 0.5385        | 35.0  | 140  | 0.5704          | 0.9453   |
| 0.5385        | 36.0  | 144  | 0.5728          | 0.9453   |
| 0.5385        | 37.0  | 148  | 0.5737          | 0.9297   |
| 0.5381        | 38.0  | 152  | 0.5749          | 0.9375   |
| 0.5381        | 39.0  | 156  | 0.5754          | 0.9375   |
| 0.5389        | 40.0  | 160  | 0.5742          | 0.9375   |
| 0.5389        | 41.0  | 164  | 0.5723          | 0.9375   |
| 0.5389        | 42.0  | 168  | 0.5720          | 0.9375   |
| 0.5372        | 43.0  | 172  | 0.5694          | 0.9453   |
| 0.5372        | 44.0  | 176  | 0.5723          | 0.9375   |
| 0.5384        | 45.0  | 180  | 0.5766          | 0.9375   |
| 0.5384        | 46.0  | 184  | 0.5715          | 0.9375   |
| 0.5384        | 47.0  | 188  | 0.5696          | 0.9453   |
| 0.5379        | 48.0  | 192  | 0.5709          | 0.9453   |
| 0.5379        | 49.0  | 196  | 0.5720          | 0.9453   |
| 0.5372        | 50.0  | 200  | 0.5717          | 0.9453   |
| 0.5372        | 51.0  | 204  | 0.5706          | 0.9453   |
| 0.5372        | 52.0  | 208  | 0.5697          | 0.9453   |
| 0.5371        | 53.0  | 212  | 0.5700          | 0.9453   |
| 0.5371        | 54.0  | 216  | 0.5706          | 0.9453   |
| 0.5368        | 55.0  | 220  | 0.5697          | 0.9453   |
| 0.5368        | 56.0  | 224  | 0.5702          | 0.9453   |
| 0.5368        | 57.0  | 228  | 0.5719          | 0.9453   |
| 0.5371        | 58.0  | 232  | 0.5728          | 0.9453   |
| 0.5371        | 59.0  | 236  | 0.5729          | 0.9375   |
| 0.5371        | 60.0  | 240  | 0.5734          | 0.9375   |
| 0.5371        | 61.0  | 244  | 0.5736          | 0.9375   |
| 0.5371        | 62.0  | 248  | 0.5745          | 0.9375   |
| 0.5369        | 63.0  | 252  | 0.5760          | 0.9375   |
| 0.5369        | 64.0  | 256  | 0.5772          | 0.9375   |
| 0.5365        | 65.0  | 260  | 0.5771          | 0.9375   |
| 0.5365        | 66.0  | 264  | 0.5763          | 0.9375   |
| 0.5365        | 67.0  | 268  | 0.5759          | 0.9375   |
| 0.5365        | 68.0  | 272  | 0.5753          | 0.9375   |
| 0.5365        | 69.0  | 276  | 0.5751          | 0.9375   |
| 0.5369        | 70.0  | 280  | 0.5746          | 0.9375   |
| 0.5369        | 71.0  | 284  | 0.5741          | 0.9375   |
| 0.5369        | 72.0  | 288  | 0.5742          | 0.9375   |
| 0.5367        | 73.0  | 292  | 0.5742          | 0.9375   |
| 0.5367        | 74.0  | 296  | 0.5741          | 0.9375   |
| 0.5368        | 75.0  | 300  | 0.5741          | 0.9375   |
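
Note that the validation loss plateaus near 0.57 even as accuracy reaches 0.94. With a label_smoothing_factor of 0.45 this is expected: the smoothed cross-entropy has a positive floor. A worked bound, assuming the standard smoothing formulation in which the true class receives target probability $1-\epsilon+\epsilon/K$:

$$
L_{\min} = -\left(1-\epsilon+\tfrac{\epsilon}{K}\right)\log\left(1-\epsilon+\tfrac{\epsilon}{K}\right) - (K-1)\,\tfrac{\epsilon}{K}\log\tfrac{\epsilon}{K} \approx 0.533 \qquad (\epsilon = 0.45,\ K = 2)
$$

so even a perfect classifier could not push the validation loss much below 0.53 under this configuration.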

### Framework versions

- Transformers 4.32.0.dev0
- Pytorch 2.0.1+cu118
- Datasets 2.4.0
- Tokenizers 0.13.3