
bert-large-uncased-sst-2-64-13-smoothed

This model is a fine-tuned version of bert-large-uncased on an unspecified dataset (the model name suggests a small SST-2 subset trained with label smoothing). It achieves the following results on the evaluation set:

  • Loss: 0.6024
  • Accuracy: 0.8438
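
A minimal inference sketch, assuming the checkpoint is published as simonycl/bert-large-uncased-sst-2-64-13-smoothed with a standard two-label sequence-classification head (as the SST-2-style name suggests):

```python
from transformers import pipeline

# Hypothetical usage sketch; the exact label names depend on how the
# fine-tuned config maps class ids to labels.
classifier = pipeline(
    "text-classification",
    model="simonycl/bert-large-uncased-sst-2-64-13-smoothed",
)
print(classifier("A gripping, beautifully shot film."))
```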

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • num_epochs: 75
  • label_smoothing_factor: 0.45
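
These settings map directly onto Hugging Face TrainingArguments; a minimal sketch under that assumption (output_dir is a placeholder, and the Adam betas/epsilon listed above are the Trainer's optimizer defaults, so they need no explicit arguments):

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters as TrainingArguments.
# output_dir is a placeholder; everything else mirrors the list above.
training_args = TrainingArguments(
    output_dir="bert-large-uncased-sst-2-64-13-smoothed",
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=50,
    num_train_epochs=75,
    label_smoothing_factor=0.45,
)
```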

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 4 | 0.8114 | 0.5078 |
| No log | 2.0 | 8 | 0.7930 | 0.5078 |
| 0.8117 | 3.0 | 12 | 0.7630 | 0.5 |
| 0.8117 | 4.0 | 16 | 0.7257 | 0.5078 |
| 0.7546 | 5.0 | 20 | 0.6872 | 0.5938 |
| 0.7546 | 6.0 | 24 | 0.6706 | 0.6875 |
| 0.7546 | 7.0 | 28 | 0.6589 | 0.7578 |
| 0.6762 | 8.0 | 32 | 0.6473 | 0.7734 |
| 0.6762 | 9.0 | 36 | 0.6369 | 0.7812 |
| 0.6014 | 10.0 | 40 | 0.6282 | 0.7969 |
| 0.6014 | 11.0 | 44 | 0.6232 | 0.8125 |
| 0.6014 | 12.0 | 48 | 0.6226 | 0.8281 |
| 0.5545 | 13.0 | 52 | 0.6205 | 0.8281 |
| 0.5545 | 14.0 | 56 | 0.6191 | 0.7969 |
| 0.5486 | 15.0 | 60 | 0.6288 | 0.8047 |
| 0.5486 | 16.0 | 64 | 0.6184 | 0.8438 |
| 0.5486 | 17.0 | 68 | 0.6241 | 0.8203 |
| 0.5451 | 18.0 | 72 | 0.6098 | 0.8438 |
| 0.5451 | 19.0 | 76 | 0.6090 | 0.875 |
| 0.5418 | 20.0 | 80 | 0.6094 | 0.8672 |
| 0.5418 | 21.0 | 84 | 0.6092 | 0.8594 |
| 0.5418 | 22.0 | 88 | 0.6102 | 0.8594 |
| 0.5414 | 23.0 | 92 | 0.6107 | 0.8594 |
| 0.5414 | 24.0 | 96 | 0.6106 | 0.8281 |
| 0.5394 | 25.0 | 100 | 0.6104 | 0.8359 |
| 0.5394 | 26.0 | 104 | 0.6107 | 0.8359 |
| 0.5394 | 27.0 | 108 | 0.6125 | 0.8359 |
| 0.539 | 28.0 | 112 | 0.6144 | 0.8359 |
| 0.539 | 29.0 | 116 | 0.6139 | 0.8359 |
| 0.5398 | 30.0 | 120 | 0.6149 | 0.8281 |
| 0.5398 | 31.0 | 124 | 0.6174 | 0.8438 |
| 0.5398 | 32.0 | 128 | 0.6216 | 0.8359 |
| 0.5387 | 33.0 | 132 | 0.6200 | 0.8359 |
| 0.5387 | 34.0 | 136 | 0.6151 | 0.8438 |
| 0.5396 | 35.0 | 140 | 0.6138 | 0.8438 |
| 0.5396 | 36.0 | 144 | 0.6140 | 0.8438 |
| 0.5396 | 37.0 | 148 | 0.6147 | 0.8281 |
| 0.5388 | 38.0 | 152 | 0.6111 | 0.8516 |
| 0.5388 | 39.0 | 156 | 0.6097 | 0.8516 |
| 0.5391 | 40.0 | 160 | 0.6088 | 0.8594 |
| 0.5391 | 41.0 | 164 | 0.6090 | 0.8438 |
| 0.5391 | 42.0 | 168 | 0.6109 | 0.8438 |
| 0.5388 | 43.0 | 172 | 0.6102 | 0.8438 |
| 0.5388 | 44.0 | 176 | 0.6088 | 0.8438 |
| 0.5385 | 45.0 | 180 | 0.6091 | 0.8438 |
| 0.5385 | 46.0 | 184 | 0.6127 | 0.8438 |
| 0.5385 | 47.0 | 188 | 0.6167 | 0.8203 |
| 0.5391 | 48.0 | 192 | 0.6143 | 0.8359 |
| 0.5391 | 49.0 | 196 | 0.6071 | 0.8516 |
| 0.5387 | 50.0 | 200 | 0.6061 | 0.8516 |
| 0.5387 | 51.0 | 204 | 0.6054 | 0.8438 |
| 0.5387 | 52.0 | 208 | 0.6037 | 0.8516 |
| 0.5385 | 53.0 | 212 | 0.6019 | 0.8516 |
| 0.5385 | 54.0 | 216 | 0.6008 | 0.8438 |
| 0.5379 | 55.0 | 220 | 0.5998 | 0.8516 |
| 0.5379 | 56.0 | 224 | 0.5992 | 0.8516 |
| 0.5379 | 57.0 | 228 | 0.6001 | 0.8516 |
| 0.5382 | 58.0 | 232 | 0.6026 | 0.8438 |
| 0.5382 | 59.0 | 236 | 0.6039 | 0.8438 |
| 0.5381 | 60.0 | 240 | 0.6043 | 0.8438 |
| 0.5381 | 61.0 | 244 | 0.6032 | 0.8438 |
| 0.5381 | 62.0 | 248 | 0.6030 | 0.8438 |
| 0.5389 | 63.0 | 252 | 0.6023 | 0.8438 |
| 0.5389 | 64.0 | 256 | 0.6019 | 0.8438 |
| 0.5378 | 65.0 | 260 | 0.6024 | 0.8438 |
| 0.5378 | 66.0 | 264 | 0.6025 | 0.8438 |
| 0.5378 | 67.0 | 268 | 0.6020 | 0.8438 |
| 0.5374 | 68.0 | 272 | 0.6016 | 0.8438 |
| 0.5374 | 69.0 | 276 | 0.6017 | 0.8438 |
| 0.5378 | 70.0 | 280 | 0.6023 | 0.8438 |
| 0.5378 | 71.0 | 284 | 0.6025 | 0.8438 |
| 0.5378 | 72.0 | 288 | 0.6024 | 0.8438 |
| 0.5372 | 73.0 | 292 | 0.6023 | 0.8438 |
| 0.5372 | 74.0 | 296 | 0.6024 | 0.8438 |
| 0.5377 | 75.0 | 300 | 0.6024 | 0.8438 |
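
Note that with label_smoothing_factor 0.45 on a two-class task, the smoothed cross-entropy has a nonzero floor, which is why the training loss plateaus near 0.538 rather than approaching zero. A back-of-the-envelope check, assuming the usual Trainer smoothing formulation (the true class receives probability 1 - ε + ε/K and each other class ε/K):

```python
import math

eps, K = 0.45, 2
q_true = 1 - eps + eps / K   # 0.775
q_other = eps / K            # 0.225

# The loss floor is the entropy of the smoothed target distribution,
# reached when the model's predictions match it exactly.
floor = -(q_true * math.log(q_true) + q_other * math.log(q_other))
print(f"loss floor ≈ {floor:.4f}")  # ≈ 0.5332
```

The computed floor of about 0.533 sits just below the observed plateau, consistent with a model that has essentially fit the smoothed targets.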

Framework versions

  • Transformers 4.32.0.dev0
  • Pytorch 2.0.1+cu118
  • Datasets 2.4.0
  • Tokenizers 0.13.3