metadata
license: mit
base_model: roberta-base
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: roberta-base-sst-2-16-13-smoothed
    results: []

roberta-base-sst-2-16-13-smoothed

This model is a fine-tuned version of roberta-base. The training dataset was not recorded when this card was generated. After the final epoch (75), it achieves the following results on the evaluation set:

  • Loss: 0.5750
  • Accuracy: 0.9688
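
The accuracy figure is easier to interpret once the size of the evaluation set is known. The card does not state it, but the accuracies reported under Training results all look like multiples of 1/32. The sketch below uses a hypothetical helper, `smallest_consistent_size`, which is not part of the training code. It searches for the smallest set size that reproduces a few of the reported values after rounding to four decimals.

```python
def smallest_consistent_size(accuracies, max_size=1000, decimals=4):
    """Return the smallest n such that every accuracy equals some k/n after rounding."""
    # Half a unit in the last reported digit, plus slack for float error.
    tolerance = 0.5 * 10 ** (-decimals) + 1e-9
    for n in range(1, max_size + 1):
        if all(abs(round(a * n) / n - a) <= tolerance for a in accuracies):
            return n
    return None  # no size up to max_size fits the reported values


# A few accuracies copied from the training results table.
reported = [0.5, 0.5312, 0.625, 0.9688]
print(smallest_consistent_size(reported))
```

For these values the search returns 32. If the evaluation set does hold 32 examples, 0.9688 corresponds to 31 correct predictions, and a single example moves accuracy by about 3 points.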

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • num_epochs: 75
  • label_smoothing_factor: 0.45
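
To make the scheduler settings concrete, here is a minimal pure-Python sketch of a linear schedule with warmup. It assumes the usual shape (linear ramp from 0 to the peak, then linear decay to 0) and a total of 75 optimizer steps, which is read off the training results table (one step per epoch) rather than stated in the hyperparameters. The function name `linear_schedule_lr` is hypothetical.

```python
def linear_schedule_lr(step, base_lr=1e-5, warmup_steps=50, total_steps=75):
    """Learning rate after `step` scheduler steps under linear warmup and decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the schedule at a few points across the run.
for step in (0, 15, 25, 50, 60, 75):
    print(step, linear_schedule_lr(step))
```

Under these assumptions warmup covers two thirds of the run: the learning rate only reaches 1e-05 at step 50 and then decays over the last 25 steps. That is consistent with the slow start visible in the training results, though the card itself does not draw this connection.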

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| No log | 1.0 | 1 | 0.6919 | 0.5 |
| No log | 2.0 | 2 | 0.6919 | 0.5 |
| No log | 3.0 | 3 | 0.6919 | 0.5 |
| No log | 4.0 | 4 | 0.6919 | 0.5 |
| No log | 5.0 | 5 | 0.6919 | 0.5 |
| No log | 6.0 | 6 | 0.6918 | 0.5 |
| No log | 7.0 | 7 | 0.6918 | 0.5 |
| No log | 8.0 | 8 | 0.6918 | 0.5 |
| No log | 9.0 | 9 | 0.6918 | 0.5 |
| 0.6949 | 10.0 | 10 | 0.6917 | 0.5 |
| 0.6949 | 11.0 | 11 | 0.6917 | 0.5 |
| 0.6949 | 12.0 | 12 | 0.6916 | 0.5 |
| 0.6949 | 13.0 | 13 | 0.6916 | 0.5 |
| 0.6949 | 14.0 | 14 | 0.6915 | 0.5 |
| 0.6949 | 15.0 | 15 | 0.6914 | 0.5 |
| 0.6949 | 16.0 | 16 | 0.6914 | 0.5312 |
| 0.6949 | 17.0 | 17 | 0.6913 | 0.5312 |
| 0.6949 | 18.0 | 18 | 0.6912 | 0.5312 |
| 0.6949 | 19.0 | 19 | 0.6911 | 0.625 |
| 0.6926 | 20.0 | 20 | 0.6910 | 0.625 |
| 0.6926 | 21.0 | 21 | 0.6909 | 0.6562 |
| 0.6926 | 22.0 | 22 | 0.6907 | 0.6875 |
| 0.6926 | 23.0 | 23 | 0.6906 | 0.6875 |
| 0.6926 | 24.0 | 24 | 0.6904 | 0.6875 |
| 0.6926 | 25.0 | 25 | 0.6902 | 0.75 |
| 0.6926 | 26.0 | 26 | 0.6899 | 0.75 |
| 0.6926 | 27.0 | 27 | 0.6896 | 0.75 |
| 0.6926 | 28.0 | 28 | 0.6893 | 0.7188 |
| 0.6926 | 29.0 | 29 | 0.6890 | 0.6875 |
| 0.687 | 30.0 | 30 | 0.6885 | 0.6875 |
| 0.687 | 31.0 | 31 | 0.6880 | 0.7188 |
| 0.687 | 32.0 | 32 | 0.6874 | 0.7188 |
| 0.687 | 33.0 | 33 | 0.6866 | 0.7188 |
| 0.687 | 34.0 | 34 | 0.6857 | 0.7188 |
| 0.687 | 35.0 | 35 | 0.6846 | 0.75 |
| 0.687 | 36.0 | 36 | 0.6832 | 0.75 |
| 0.687 | 37.0 | 37 | 0.6814 | 0.7812 |
| 0.687 | 38.0 | 38 | 0.6791 | 0.7812 |
| 0.687 | 39.0 | 39 | 0.6761 | 0.875 |
| 0.6732 | 40.0 | 40 | 0.6721 | 0.9062 |
| 0.6732 | 41.0 | 41 | 0.6670 | 0.9062 |
| 0.6732 | 42.0 | 42 | 0.6601 | 0.9062 |
| 0.6732 | 43.0 | 43 | 0.6510 | 0.875 |
| 0.6732 | 44.0 | 44 | 0.6392 | 0.875 |
| 0.6732 | 45.0 | 45 | 0.6248 | 0.875 |
| 0.6732 | 46.0 | 46 | 0.6098 | 0.875 |
| 0.6732 | 47.0 | 47 | 0.5961 | 0.875 |
| 0.6732 | 48.0 | 48 | 0.5884 | 0.9375 |
| 0.6732 | 49.0 | 49 | 0.5833 | 0.9375 |
| 0.5913 | 50.0 | 50 | 0.5795 | 0.9062 |
| 0.5913 | 51.0 | 51 | 0.5851 | 0.9062 |
| 0.5913 | 52.0 | 52 | 0.5985 | 0.875 |
| 0.5913 | 53.0 | 53 | 0.6110 | 0.8125 |
| 0.5913 | 54.0 | 54 | 0.6092 | 0.8438 |
| 0.5913 | 55.0 | 55 | 0.6007 | 0.8438 |
| 0.5913 | 56.0 | 56 | 0.5904 | 0.875 |
| 0.5913 | 57.0 | 57 | 0.5846 | 0.9062 |
| 0.5913 | 58.0 | 58 | 0.5829 | 0.9062 |
| 0.5913 | 59.0 | 59 | 0.5843 | 0.9062 |
| 0.544 | 60.0 | 60 | 0.5900 | 0.8438 |
| 0.544 | 61.0 | 61 | 0.5970 | 0.8438 |
| 0.544 | 62.0 | 62 | 0.6026 | 0.8438 |
| 0.544 | 63.0 | 63 | 0.6030 | 0.8438 |
| 0.544 | 64.0 | 64 | 0.5980 | 0.8438 |
| 0.544 | 65.0 | 65 | 0.5901 | 0.8438 |
| 0.544 | 66.0 | 66 | 0.5843 | 0.875 |
| 0.544 | 67.0 | 67 | 0.5800 | 0.9062 |
| 0.544 | 68.0 | 68 | 0.5779 | 0.9375 |
| 0.544 | 69.0 | 69 | 0.5765 | 0.9375 |
| 0.5383 | 70.0 | 70 | 0.5758 | 0.9688 |
| 0.5383 | 71.0 | 71 | 0.5754 | 0.9688 |
| 0.5383 | 72.0 | 72 | 0.5752 | 0.9688 |
| 0.5383 | 73.0 | 73 | 0.5751 | 0.9688 |
| 0.5383 | 74.0 | 74 | 0.5750 | 0.9688 |
| 0.5383 | 75.0 | 75 | 0.5750 | 0.9688 |
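
The losses above stay well above zero even at high accuracy, which is expected with a label_smoothing_factor of 0.45. The sketch below works out the arithmetic under two assumptions that the card does not confirm: the task has two classes, and smoothing spreads the factor uniformly over all classes, so the true class gets a target of 1 - 0.45 + 0.45/2. The helper names are hypothetical.

```python
import math


def smoothed_targets(num_classes, smoothing):
    """Target distribution after uniform label smoothing; index 0 is the true class."""
    off_value = smoothing / num_classes
    targets = [off_value] * num_classes
    targets[0] += 1.0 - smoothing
    return targets


def cross_entropy(targets, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(targets, probs))


targets = smoothed_targets(num_classes=2, smoothing=0.45)
# Cross-entropy is minimised when the prediction equals the target distribution.
floor = cross_entropy(targets, targets)
# Loss of a model that predicts both classes with equal probability.
chance = cross_entropy(targets, [0.5, 0.5])
print(round(floor, 4), round(chance, 4))
```

Under these assumptions the lowest reachable loss is about 0.533 and a uniform prediction scores ln 2, about 0.693. Both figures bracket the table: validation loss starts near 0.692, and the final training loss of 0.5383 sits just above the floor. A loss around 0.575 is therefore not by itself a sign of a poorly fitted classifier.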

Framework versions

  • Transformers 4.32.0.dev0
  • PyTorch 2.0.1+cu118
  • Datasets 2.4.0
  • Tokenizers 0.13.3