---
license: mit
base_model: roberta-large
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: roberta-large-sst-2-32-13
  results: []
---

# roberta-large-sst-2-32-13

This model is a fine-tuned version of [roberta-large](https://huggingface.co/roberta-large) on an unknown dataset (the model name suggests a 32-example SST-2 subset, but the dataset is not recorded in this card). It achieves the following results on the evaluation set:

- Loss: 0.4497
- Accuracy: 0.9375
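
As a hedged usage sketch: the repo id `simonycl/roberta-large-sst-2-32-13` and the binary sentiment labels are assumptions inferred from the model name, not stated in this card. With those assumptions, the checkpoint can be loaded through the standard `transformers` text-classification pipeline:

```python
from transformers import pipeline

# Assumed repo id, inferred from the model name above.
model_id = "simonycl/roberta-large-sst-2-32-13"

# Loads the fine-tuned weights and tokenizer from the Hub.
classifier = pipeline("text-classification", model=model_id)

# Binary sentiment prediction (label names depend on the checkpoint's config).
print(classifier("A thoroughly enjoyable film."))
```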

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

- learning_rate: 1e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 150
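
A minimal sketch of how these values map onto `transformers.TrainingArguments`; everything not in the list above (the output path, the evaluation cadence) is an assumption:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="roberta-large-sst-2-32-13",  # assumed output path
    learning_rate=1e-05,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer default.
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=150,
    evaluation_strategy="epoch",  # assumed: the results table reports one eval per epoch
)
```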

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 2    | 0.6944          | 0.5      |
| No log        | 2.0   | 4    | 0.6944          | 0.5      |
| No log        | 3.0   | 6    | 0.6944          | 0.5      |
| No log        | 4.0   | 8    | 0.6944          | 0.5      |
| 0.7018        | 5.0   | 10   | 0.6944          | 0.5      |
| 0.7018        | 6.0   | 12   | 0.6943          | 0.5      |
| 0.7018        | 7.0   | 14   | 0.6943          | 0.5      |
| 0.7018        | 8.0   | 16   | 0.6942          | 0.5      |
| 0.7018        | 9.0   | 18   | 0.6941          | 0.5      |
| 0.7003        | 10.0  | 20   | 0.6940          | 0.5      |
| 0.7003        | 11.0  | 22   | 0.6939          | 0.5      |
| 0.7003        | 12.0  | 24   | 0.6938          | 0.5      |
| 0.7003        | 13.0  | 26   | 0.6937          | 0.5      |
| 0.7003        | 14.0  | 28   | 0.6936          | 0.5      |
| 0.6964        | 15.0  | 30   | 0.6934          | 0.5      |
| 0.6964        | 16.0  | 32   | 0.6934          | 0.5      |
| 0.6964        | 17.0  | 34   | 0.6933          | 0.5      |
| 0.6964        | 18.0  | 36   | 0.6932          | 0.5      |
| 0.6964        | 19.0  | 38   | 0.6931          | 0.5      |
| 0.7001        | 20.0  | 40   | 0.6931          | 0.5      |
| 0.7001        | 21.0  | 42   | 0.6931          | 0.5      |
| 0.7001        | 22.0  | 44   | 0.6931          | 0.5      |
| 0.7001        | 23.0  | 46   | 0.6931          | 0.5      |
| 0.7001        | 24.0  | 48   | 0.6931          | 0.5      |
| 0.6924        | 25.0  | 50   | 0.6931          | 0.5      |
| 0.6924        | 26.0  | 52   | 0.6931          | 0.5      |
| 0.6924        | 27.0  | 54   | 0.6931          | 0.5      |
| 0.6924        | 28.0  | 56   | 0.6930          | 0.5      |
| 0.6924        | 29.0  | 58   | 0.6930          | 0.5      |
| 0.6985        | 30.0  | 60   | 0.6930          | 0.5      |
| 0.6985        | 31.0  | 62   | 0.6930          | 0.5      |
| 0.6985        | 32.0  | 64   | 0.6929          | 0.5      |
| 0.6985        | 33.0  | 66   | 0.6927          | 0.5      |
| 0.6985        | 34.0  | 68   | 0.6925          | 0.5      |
| 0.6968        | 35.0  | 70   | 0.6924          | 0.5      |
| 0.6968        | 36.0  | 72   | 0.6923          | 0.5      |
| 0.6968        | 37.0  | 74   | 0.6922          | 0.5      |
| 0.6968        | 38.0  | 76   | 0.6922          | 0.5      |
| 0.6968        | 39.0  | 78   | 0.6920          | 0.5      |
| 0.6822        | 40.0  | 80   | 0.6917          | 0.5      |
| 0.6822        | 41.0  | 82   | 0.6916          | 0.5      |
| 0.6822        | 42.0  | 84   | 0.6913          | 0.5      |
| 0.6822        | 43.0  | 86   | 0.6911          | 0.5      |
| 0.6822        | 44.0  | 88   | 0.6910          | 0.5      |
| 0.6907        | 45.0  | 90   | 0.6908          | 0.5      |
| 0.6907        | 46.0  | 92   | 0.6906          | 0.5      |
| 0.6907        | 47.0  | 94   | 0.6905          | 0.5      |
| 0.6907        | 48.0  | 96   | 0.6902          | 0.5156   |
| 0.6907        | 49.0  | 98   | 0.6898          | 0.5625   |
| 0.6822        | 50.0  | 100  | 0.6892          | 0.5469   |
| 0.6822        | 51.0  | 102  | 0.6887          | 0.5938   |
| 0.6822        | 52.0  | 104  | 0.6881          | 0.5938   |
| 0.6822        | 53.0  | 106  | 0.6874          | 0.6094   |
| 0.6822        | 54.0  | 108  | 0.6868          | 0.6094   |
| 0.6744        | 55.0  | 110  | 0.6862          | 0.5938   |
| 0.6744        | 56.0  | 112  | 0.6859          | 0.5312   |
| 0.6744        | 57.0  | 114  | 0.6856          | 0.5469   |
| 0.6744        | 58.0  | 116  | 0.6873          | 0.5469   |
| 0.6744        | 59.0  | 118  | 0.6910          | 0.5469   |
| 0.6401        | 60.0  | 120  | 0.6938          | 0.5469   |
| 0.6401        | 61.0  | 122  | 0.6911          | 0.5625   |
| 0.6401        | 62.0  | 124  | 0.6835          | 0.5625   |
| 0.6401        | 63.0  | 126  | 0.6765          | 0.5781   |
| 0.6401        | 64.0  | 128  | 0.6689          | 0.5781   |
| 0.5823        | 65.0  | 130  | 0.6597          | 0.6094   |
| 0.5823        | 66.0  | 132  | 0.6514          | 0.625    |
| 0.5823        | 67.0  | 134  | 0.6459          | 0.6406   |
| 0.5823        | 68.0  | 136  | 0.6372          | 0.6562   |
| 0.5823        | 69.0  | 138  | 0.6274          | 0.6562   |
| 0.5265        | 70.0  | 140  | 0.6163          | 0.6875   |
| 0.5265        | 71.0  | 142  | 0.6018          | 0.7188   |
| 0.5265        | 72.0  | 144  | 0.5853          | 0.7812   |
| 0.5265        | 73.0  | 146  | 0.5600          | 0.7812   |
| 0.5265        | 74.0  | 148  | 0.5138          | 0.8125   |
| 0.4305        | 75.0  | 150  | 0.4514          | 0.8594   |
| 0.4305        | 76.0  | 152  | 0.3753          | 0.9219   |
| 0.4305        | 77.0  | 154  | 0.3197          | 0.9375   |
| 0.4305        | 78.0  | 156  | 0.2687          | 0.9375   |
| 0.4305        | 79.0  | 158  | 0.2246          | 0.9531   |
| 0.2335        | 80.0  | 160  | 0.2019          | 0.9219   |
| 0.2335        | 81.0  | 162  | 0.1977          | 0.9219   |
| 0.2335        | 82.0  | 164  | 0.1741          | 0.9375   |
| 0.2335        | 83.0  | 166  | 0.1468          | 0.9375   |
| 0.2335        | 84.0  | 168  | 0.1355          | 0.9688   |
| 0.0918        | 85.0  | 170  | 0.1447          | 0.9688   |
| 0.0918        | 86.0  | 172  | 0.1628          | 0.9688   |
| 0.0918        | 87.0  | 174  | 0.2077          | 0.9531   |
| 0.0918        | 88.0  | 176  | 0.2623          | 0.9375   |
| 0.0918        | 89.0  | 178  | 0.2854          | 0.9375   |
| 0.0132        | 90.0  | 180  | 0.3076          | 0.9375   |
| 0.0132        | 91.0  | 182  | 0.2989          | 0.9375   |
| 0.0132        | 92.0  | 184  | 0.2839          | 0.9531   |
| 0.0132        | 93.0  | 186  | 0.2756          | 0.9531   |
| 0.0132        | 94.0  | 188  | 0.2669          | 0.9531   |
| 0.0035        | 95.0  | 190  | 0.2414          | 0.9531   |
| 0.0035        | 96.0  | 192  | 0.2353          | 0.9375   |
| 0.0035        | 97.0  | 194  | 0.2482          | 0.9531   |
| 0.0035        | 98.0  | 196  | 0.2578          | 0.9375   |
| 0.0035        | 99.0  | 198  | 0.2755          | 0.9375   |
| 0.0013        | 100.0 | 200  | 0.2956          | 0.9375   |
| 0.0013        | 101.0 | 202  | 0.3133          | 0.9531   |
| 0.0013        | 102.0 | 204  | 0.3293          | 0.9531   |
| 0.0013        | 103.0 | 206  | 0.3417          | 0.9531   |
| 0.0013        | 104.0 | 208  | 0.3510          | 0.9531   |
| 0.0005        | 105.0 | 210  | 0.3616          | 0.9531   |
| 0.0005        | 106.0 | 212  | 0.3694          | 0.9531   |
| 0.0005        | 107.0 | 214  | 0.3754          | 0.9531   |
| 0.0005        | 108.0 | 216  | 0.3806          | 0.9531   |
| 0.0005        | 109.0 | 218  | 0.3850          | 0.9531   |
| 0.0004        | 110.0 | 220  | 0.3890          | 0.9531   |
| 0.0004        | 111.0 | 222  | 0.3924          | 0.9531   |
| 0.0004        | 112.0 | 224  | 0.3956          | 0.9531   |
| 0.0004        | 113.0 | 226  | 0.3986          | 0.9531   |
| 0.0004        | 114.0 | 228  | 0.4011          | 0.9531   |
| 0.0003        | 115.0 | 230  | 0.4034          | 0.9531   |
| 0.0003        | 116.0 | 232  | 0.4056          | 0.9531   |
| 0.0003        | 117.0 | 234  | 0.4076          | 0.9531   |
| 0.0003        | 118.0 | 236  | 0.4118          | 0.9531   |
| 0.0003        | 119.0 | 238  | 0.4199          | 0.9531   |
| 0.0003        | 120.0 | 240  | 0.4298          | 0.9375   |
| 0.0003        | 121.0 | 242  | 0.4401          | 0.9375   |
| 0.0003        | 122.0 | 244  | 0.4495          | 0.9375   |
| 0.0003        | 123.0 | 246  | 0.4602          | 0.9375   |
| 0.0003        | 124.0 | 248  | 0.4687          | 0.9375   |
| 0.0003        | 125.0 | 250  | 0.4755          | 0.9375   |
| 0.0003        | 126.0 | 252  | 0.4813          | 0.9375   |
| 0.0003        | 127.0 | 254  | 0.4855          | 0.9375   |
| 0.0003        | 128.0 | 256  | 0.4896          | 0.9375   |
| 0.0003        | 129.0 | 258  | 0.4940          | 0.9375   |
| 0.0002        | 130.0 | 260  | 0.4967          | 0.9375   |
| 0.0002        | 131.0 | 262  | 0.4963          | 0.9375   |
| 0.0002        | 132.0 | 264  | 0.4903          | 0.9375   |
| 0.0002        | 133.0 | 266  | 0.4861          | 0.9375   |
| 0.0002        | 134.0 | 268  | 0.4831          | 0.9375   |
| 0.0003        | 135.0 | 270  | 0.4804          | 0.9375   |
| 0.0003        | 136.0 | 272  | 0.4780          | 0.9375   |
| 0.0003        | 137.0 | 274  | 0.4761          | 0.9375   |
| 0.0003        | 138.0 | 276  | 0.4721          | 0.9375   |
| 0.0003        | 139.0 | 278  | 0.4686          | 0.9375   |
| 0.0002        | 140.0 | 280  | 0.4646          | 0.9375   |
| 0.0002        | 141.0 | 282  | 0.4593          | 0.9375   |
| 0.0002        | 142.0 | 284  | 0.4542          | 0.9375   |
| 0.0002        | 143.0 | 286  | 0.4495          | 0.9375   |
| 0.0002        | 144.0 | 288  | 0.4472          | 0.9375   |
| 0.0002        | 145.0 | 290  | 0.4465          | 0.9375   |
| 0.0002        | 146.0 | 292  | 0.4467          | 0.9375   |
| 0.0002        | 147.0 | 294  | 0.4469          | 0.9375   |
| 0.0002        | 148.0 | 296  | 0.4474          | 0.9375   |
| 0.0002        | 149.0 | 298  | 0.4483          | 0.9375   |
| 0.0002        | 150.0 | 300  | 0.4497          | 0.9375   |
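
The accuracy column in a table like this is typically produced by a `compute_metrics` callback passed to the `Trainer`. A minimal sketch of such a callback follows; the use of the `evaluate` library here is an assumption, not something recorded in this card:

```python
import numpy as np
import evaluate

# Assumed metric implementation; the card only records that accuracy was tracked.
accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)  # class with the highest logit
    return accuracy.compute(predictions=predictions, references=labels)
```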

### Framework versions

- Transformers 4.32.0.dev0
- Pytorch 2.0.1+cu118
- Datasets 2.4.0
- Tokenizers 0.13.3