metadata
library_name: transformers
license: mit
base_model: microsoft/git-base
tags:
  - generated_from_trainer
model-index:
  - name: git-base-bdd100k
    results: []

git-base-bdd100k

This model is a fine-tuned version of microsoft/git-base on an unknown dataset. At the final logged training step (step 500, epoch 90.9), it achieves the following results on the evaluation set:

  • Loss: 0.4505
  • Wer Score: 2.0146
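A word error rate (WER) above 1 can look like a mistake, but it is possible: WER divides the number of word-level substitutions, deletions, and insertions by the number of reference words, so a prediction much longer than its reference scores above 1. This card does not say how the metric was computed, so the following is only a minimal sketch that assumes a plain word-level edit distance over whitespace-split tokens. The function name and the example strings are hypothetical and are not taken from the evaluation set.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up strings for illustration only.
reference = "a car turns left"
hypothesis = "a car turns left at the busy intersection near the old bridge"
print(word_error_rate(reference, hypothesis))  # 8 insertions / 4 reference words = 2.0
```

Under this definition, a score near 2 means the predictions contain roughly twice as many word errors as there are reference words, which often points to over-long or repetitive generations.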

Model description

More information needed

Intended uses & limitations

More information needed
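Until the author documents intended usage, the following is a minimal image-to-text sketch based on the standard Transformers API for GIT models (`AutoProcessor`, `AutoModelForCausalLM`, `generate`). The function name `caption_image` is hypothetical. The default `model_id` is the base model named in this card; to try the fine-tuned weights, pass this repository's Hub id instead. It is an assumption that the fine-tuned repository ships its own processor files; if it does not, load the processor from microsoft/git-base.

```python
def caption_image(image_path: str,
                  model_id: str = "microsoft/git-base",
                  max_length: int = 50) -> str:
    """Generate a text description for one image with a GIT checkpoint."""
    # Heavy dependencies are imported here so the function can be defined
    # even where Pillow, PyTorch, and Transformers are not installed.
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # GIT conditions generation on the image alone when no text prompt is given.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    generated_ids = model.generate(pixel_values=pixel_values, max_length=max_length)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

A call would look like `caption_image("frame.jpg", model_id="<this-repo-id>")`, where both the file name and the placeholder id are illustrative.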

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 20
  • eval_batch_size: 20
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 40
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
  • mixed_precision_training: Native AMP
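Two of the values above are derived from the others, and a short sketch makes the arithmetic explicit. It assumes a single device and no warmup, neither of which is stated in this card, and it treats the 500 steps logged in the training results as the full schedule. The constant and function names are hypothetical.

```python
LEARNING_RATE = 5e-5
TRAIN_BATCH_SIZE = 20
GRADIENT_ACCUMULATION_STEPS = 2
TOTAL_STEPS = 500   # last step logged in the training results
NUM_DEVICES = 1     # assumption: device count is not stated in the card
WARMUP_STEPS = 0    # assumption: no warmup setting is listed


def effective_batch_size(per_device: int, accumulation: int, devices: int = NUM_DEVICES) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_device * accumulation * devices


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))  # 40
print(linear_lr(0), linear_lr(250), linear_lr(500))
```

With these assumptions the effective batch size matches the reported total_train_batch_size of 40, and the learning rate falls from 5e-05 at step 0 to zero at step 500.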

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer Score |
|:-------------:|:-----:|:----:|:---------------:|:---------:|
| 10.7599 | 0.9091 | 5 | 9.0648 | 7.4053 |
| 7.8749 | 2.0 | 11 | 7.9846 | 5.4869 |
| 8.5483 | 2.9091 | 16 | 7.3958 | 5.9978 |
| 6.5899 | 4.0 | 22 | 6.8122 | 7.3117 |
| 7.3362 | 4.9091 | 27 | 6.3566 | 5.4206 |
| 5.6682 | 6.0 | 33 | 5.8240 | 2.6977 |
| 6.2659 | 6.9091 | 38 | 5.3805 | 2.3248 |
| 4.7822 | 8.0 | 44 | 4.8517 | 2.4497 |
| 5.2042 | 8.9091 | 49 | 4.4194 | 2.3350 |
| 3.9022 | 10.0 | 55 | 3.9023 | 2.0637 |
| 4.1582 | 10.9091 | 60 | 3.4813 | 2.4832 |
| 3.0413 | 12.0 | 66 | 2.9854 | 2.5313 |
| 3.1438 | 12.9091 | 71 | 2.5871 | 2.4395 |
| 2.2196 | 14.0 | 77 | 2.1313 | 2.5160 |
| 2.199 | 14.9091 | 82 | 1.7799 | 2.4064 |
| 1.4819 | 16.0 | 88 | 1.4052 | 2.3929 |
| 1.3977 | 16.9091 | 93 | 1.1385 | 2.4009 |
| 0.9006 | 18.0 | 99 | 0.8846 | 2.3711 |
| 0.8222 | 18.9091 | 104 | 0.7261 | 2.5171 |
| 0.5272 | 20.0 | 110 | 0.5892 | 2.5583 |
| 0.4908 | 20.9091 | 115 | 0.5160 | 2.5098 |
| 0.3346 | 22.0 | 121 | 0.4587 | 2.3434 |
| 0.3306 | 22.9091 | 126 | 0.4197 | 2.3015 |
| 0.2313 | 24.0 | 132 | 0.3966 | 2.0754 |
| 0.237 | 24.9091 | 137 | 0.3828 | 2.2418 |
| 0.1691 | 26.0 | 143 | 0.3792 | 1.7196 |
| 0.1745 | 26.9091 | 148 | 0.3729 | 2.2782 |
| 0.1261 | 28.0 | 154 | 0.3665 | 1.8682 |
| 0.1294 | 28.9091 | 159 | 0.3745 | 1.8237 |
| 0.0916 | 30.0 | 165 | 0.3762 | 2.3332 |
| 0.0944 | 30.9091 | 170 | 0.3758 | 1.9060 |
| 0.0682 | 32.0 | 176 | 0.3796 | 2.1471 |
| 0.0703 | 32.9091 | 181 | 0.3846 | 1.8350 |
| 0.0512 | 34.0 | 187 | 0.3891 | 2.0670 |
| 0.0537 | 34.9091 | 192 | 0.3909 | 2.0998 |
| 0.0392 | 36.0 | 198 | 0.3944 | 2.2658 |
| 0.0418 | 36.9091 | 203 | 0.3999 | 2.1865 |
| 0.0314 | 38.0 | 209 | 0.3970 | 2.2338 |
| 0.0344 | 38.9091 | 214 | 0.4057 | 2.0838 |
| 0.0252 | 40.0 | 220 | 0.4073 | 2.2542 |
| 0.0285 | 40.9091 | 225 | 0.4079 | 2.2538 |
| 0.022 | 42.0 | 231 | 0.4121 | 2.0579 |
| 0.0237 | 42.9091 | 236 | 0.4097 | 2.1475 |
| 0.0182 | 44.0 | 242 | 0.4185 | 2.1577 |
| 0.0203 | 44.9091 | 247 | 0.4151 | 2.2378 |
| 0.0157 | 46.0 | 253 | 0.4212 | 2.0703 |
| 0.0177 | 46.9091 | 258 | 0.4212 | 2.0237 |
| 0.0136 | 48.0 | 264 | 0.4208 | 1.9676 |
| 0.0155 | 48.9091 | 269 | 0.4229 | 2.0262 |
| 0.0123 | 50.0 | 275 | 0.4253 | 2.0612 |
| 0.0144 | 50.9091 | 280 | 0.4284 | 2.0663 |
| 0.0112 | 52.0 | 286 | 0.4315 | 2.0706 |
| 0.0129 | 52.9091 | 291 | 0.4301 | 2.0568 |
| 0.0107 | 54.0 | 297 | 0.4301 | 2.0087 |
| 0.0121 | 54.9091 | 302 | 0.4311 | 2.0022 |
| 0.0095 | 56.0 | 308 | 0.4313 | 1.9996 |
| 0.0109 | 56.9091 | 313 | 0.4333 | 2.0546 |
| 0.0086 | 58.0 | 319 | 0.4338 | 2.0787 |
| 0.0102 | 58.9091 | 324 | 0.4359 | 2.0091 |
| 0.0082 | 60.0 | 330 | 0.4369 | 2.0430 |
| 0.0095 | 60.9091 | 335 | 0.4366 | 1.9592 |
| 0.0076 | 62.0 | 341 | 0.4388 | 1.9905 |
| 0.0089 | 62.9091 | 346 | 0.4395 | 2.0295 |
| 0.0072 | 64.0 | 352 | 0.4404 | 2.0200 |
| 0.0084 | 64.9091 | 357 | 0.4393 | 2.0641 |
| 0.0067 | 66.0 | 363 | 0.4408 | 2.0798 |
| 0.0078 | 66.9091 | 368 | 0.4422 | 2.0601 |
| 0.0063 | 68.0 | 374 | 0.4420 | 2.0408 |
| 0.0076 | 68.9091 | 379 | 0.4427 | 2.0273 |
| 0.0063 | 70.0 | 385 | 0.4438 | 2.0306 |
| 0.0072 | 70.9091 | 390 | 0.4436 | 2.0462 |
| 0.006 | 72.0 | 396 | 0.4456 | 2.0160 |
| 0.007 | 72.9091 | 401 | 0.4472 | 2.0382 |
| 0.0057 | 74.0 | 407 | 0.4466 | 2.0532 |
| 0.0066 | 74.9091 | 412 | 0.4459 | 2.0612 |
| 0.0055 | 76.0 | 418 | 0.4469 | 2.0229 |
| 0.0065 | 76.9091 | 423 | 0.4474 | 1.9632 |
| 0.0054 | 78.0 | 429 | 0.4481 | 1.9519 |
| 0.0064 | 78.9091 | 434 | 0.4475 | 1.9836 |
| 0.0052 | 80.0 | 440 | 0.4475 | 2.0149 |
| 0.0062 | 80.9091 | 445 | 0.4482 | 2.0197 |
| 0.0052 | 82.0 | 451 | 0.4490 | 2.0208 |
| 0.0061 | 82.9091 | 456 | 0.4496 | 2.0324 |
| 0.0049 | 84.0 | 462 | 0.4498 | 2.0240 |
| 0.006 | 84.9091 | 467 | 0.4496 | 2.0168 |
| 0.0049 | 86.0 | 473 | 0.4499 | 2.0 |
| 0.0059 | 86.9091 | 478 | 0.4505 | 1.9822 |
| 0.005 | 88.0 | 484 | 0.4506 | 1.9978 |
| 0.0058 | 88.9091 | 489 | 0.4505 | 2.0117 |
| 0.0049 | 90.0 | 495 | 0.4505 | 2.0135 |
| 0.0053 | 90.9091 | 500 | 0.4505 | 2.0146 |
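
The log shows validation loss reaching its minimum well before the end of training and then drifting upward while training loss keeps falling, a pattern usually read as overfitting. The sketch below picks out the best rows by each metric. It uses only a handful of rows copied from the table above; the `EvalRow` type and `best_by` helper are hypothetical, and the card does not say whether checkpoints from these earlier steps were saved.

```python
from typing import Callable, NamedTuple, Sequence


class EvalRow(NamedTuple):
    step: int
    epoch: float
    train_loss: float
    val_loss: float
    wer: float


# A subset of rows copied from the training results table.
ROWS = [
    EvalRow(143, 26.0, 0.1691, 0.3792, 1.7196),
    EvalRow(148, 26.9091, 0.1745, 0.3729, 2.2782),
    EvalRow(154, 28.0, 0.1261, 0.3665, 1.8682),
    EvalRow(159, 28.9091, 0.1294, 0.3745, 1.8237),
    EvalRow(500, 90.9091, 0.0053, 0.4505, 2.0146),
]


def best_by(rows: Sequence[EvalRow], key: Callable[[EvalRow], float]) -> EvalRow:
    """Return the row with the smallest value of the chosen metric."""
    return min(rows, key=key)


best_loss = best_by(ROWS, lambda row: row.val_loss)
best_wer = best_by(ROWS, lambda row: row.wer)
final = ROWS[-1]

print(f"lowest validation loss: {best_loss.val_loss} at step {best_loss.step}")
print(f"lowest WER: {best_wer.wer} at step {best_wer.step}")
print(f"final step {final.step}: validation loss {final.val_loss}, WER {final.wer}")
```

On these rows the lowest validation loss (0.3665) occurs at step 154 and the lowest WER (1.7196) at step 143, both ahead of the final step's 0.4505 and 2.0146.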

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.1.1+cu121
  • Datasets 3.0.2
  • Tokenizers 0.20.1
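
To reproduce the environment, it can help to compare installed packages against the versions listed above. The following is a minimal sketch using the standard library's `importlib.metadata`. The `PINNED` mapping and the `version_mismatches` helper are hypothetical names, and the only translation applied is that PyTorch is distributed under the package name `torch`.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by distribution name.
PINNED = {
    "transformers": "4.45.2",
    "torch": "2.1.1+cu121",
    "datasets": "3.0.2",
    "tokenizers": "0.20.1",
}


def version_mismatches(pinned, lookup=version):
    """Map each package whose installed version differs to (expected, installed).

    A package that is not installed is reported with installed=None.
    """
    mismatches = {}
    for package, expected in pinned.items():
        try:
            installed = lookup(package)
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[package] = (expected, installed)
    return mismatches


for name, (expected, installed) in version_mismatches(PINNED).items():
    print(f"{name}: expected {expected}, found {installed}")
```

An empty result means the local environment matches the listed versions exactly; a different build tag (for example a CPU-only PyTorch wheel) is reported as a mismatch.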