git-base-instagram-captions

This model is a fine-tuned version of microsoft/git-base on an unknown dataset. It achieves the following results on the evaluation set at the final logged training step (step 1120):

  • Loss: 2.8334
  • Wer Score: 13.4415
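
A WER score far above 1 can look like a reporting error. Assuming the metric is the standard word error rate (word-level edit distance divided by the number of reference words), a value above 1 only means that the generated caption needs more edits than the reference has words, for example when the model produces much longer text than the reference. The sketch below illustrates that definition in plain Python; the function name `word_error_rate` and the sample captions are hypothetical and are not taken from the training run.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical captions: a short reference and a much longer prediction.
short_ref = "sunset vibes"
long_hyp = "sunset vibes at the beach with my best friends forever"
print(word_error_rate(short_ref, long_hyp))  # 8 insertions / 2 reference words = 4.0
```

Because the denominator is the reference length, short reference captions paired with long generations can push the score well above 1, which may be what the value reported here reflects.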

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
  • mixed_precision_training: Native AMP
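
As a rough illustration of how the list above could map onto a training configuration, the sketch below collects the values as keyword arguments named after the fields of `transformers.TrainingArguments`. Several points are assumptions: `output_dir` is a placeholder, "Native AMP" is taken to mean `fp16=True` (it could also have been bf16), and the run is assumed to have used a single device, which is consistent with 4 × 4 = 16 for the total train batch size.

```python
# Keyword arguments mirroring the hyperparameters listed above.
# Names follow transformers.TrainingArguments; values come from the list.
training_kwargs = {
    "output_dir": "git-base-instagram-captions",  # placeholder path (assumption)
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "gradient_accumulation_steps": 4,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 20,
    "fp16": True,  # assumed meaning of "Native AMP"; needs a CUDA device
}


def effective_batch_size(kwargs: dict, num_devices: int = 1) -> int:
    """Samples consumed per optimizer step (single device assumed by default)."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_arguments(kwargs: dict):
    """Create TrainingArguments; imported lazily so the sketch runs without transformers."""
    from transformers import TrainingArguments
    return TrainingArguments(**kwargs)


print(effective_batch_size(training_kwargs))  # 16, matching total_train_batch_size
```

Calling `build_training_arguments(training_kwargs)` would need `transformers` installed and, with `fp16` enabled, a GPU; the original training script may have set further options that the card does not list.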

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer Score |
|---|---|---|---|---|
| 4.3503 | 0.3556 | 20 | 4.0231 | 13.3087 |
| 3.2554 | 0.7111 | 40 | 3.0955 | 14.7696 |
| 2.7452 | 1.0667 | 60 | 2.5707 | 11.9736 |
| 1.916 | 1.4222 | 80 | 2.4606 | 14.1495 |
| 2.0349 | 1.7778 | 100 | 2.4337 | 15.4433 |
| 1.7522 | 2.1333 | 120 | 2.4274 | 13.5215 |
| 1.5496 | 2.4889 | 140 | 2.4298 | 13.0932 |
| 1.554 | 2.8444 | 160 | 2.4232 | 14.4336 |
| 1.3561 | 3.2 | 180 | 2.4537 | 13.2498 |
| 1.2081 | 3.5556 | 200 | 2.4563 | 14.3843 |
| 1.1593 | 3.9111 | 220 | 2.4378 | 14.3703 |
| 1.0503 | 4.2667 | 240 | 2.4849 | 14.6614 |
| 1.0292 | 4.6222 | 260 | 2.4937 | 14.5066 |
| 1.092 | 4.9778 | 280 | 2.4853 | 14.2252 |
| 0.8704 | 5.3333 | 300 | 2.5228 | 13.2709 |
| 0.8933 | 5.6889 | 320 | 2.5363 | 14.3369 |
| 0.8494 | 6.0444 | 340 | 2.5329 | 16.8566 |
| 0.7886 | 6.4 | 360 | 2.5424 | 13.8083 |
| 0.8338 | 6.7556 | 380 | 2.5585 | 13.6931 |
| 0.7756 | 7.1111 | 400 | 2.5856 | 14.4222 |
| 0.6575 | 7.4667 | 420 | 2.5890 | 13.6350 |
| 0.6793 | 7.8222 | 440 | 2.5833 | 14.1258 |
| 0.6439 | 8.1778 | 460 | 2.6046 | 15.0616 |
| 0.597 | 8.5333 | 480 | 2.6183 | 13.7511 |
| 0.6157 | 8.8889 | 500 | 2.6166 | 15.7696 |
| 0.6132 | 9.2444 | 520 | 2.6466 | 17.8857 |
| 0.5666 | 9.6 | 540 | 2.6472 | 16.8452 |
| 0.5691 | 9.9556 | 560 | 2.6456 | 13.7449 |
| 0.529 | 10.3111 | 580 | 2.6676 | 14.5945 |
| 0.5286 | 10.6667 | 600 | 2.6727 | 13.7344 |
| 0.5339 | 11.0222 | 620 | 2.6706 | 14.9833 |
| 0.4967 | 11.3778 | 640 | 2.6928 | 13.9587 |
| 0.459 | 11.7333 | 660 | 2.7035 | 14.5708 |
| 0.5117 | 12.0889 | 680 | 2.7064 | 14.3026 |
| 0.4424 | 12.4444 | 700 | 2.7304 | 14.0836 |
| 0.4829 | 12.8 | 720 | 2.7175 | 14.7573 |
| 0.4308 | 13.1556 | 740 | 2.7129 | 14.5928 |
| 0.3871 | 13.5111 | 760 | 2.7333 | 14.1328 |
| 0.4238 | 13.8667 | 780 | 2.7319 | 14.1759 |
| 0.4141 | 14.2222 | 800 | 2.7605 | 14.0580 |
| 0.3435 | 14.5778 | 820 | 2.7604 | 13.6966 |
| 0.3807 | 14.9333 | 840 | 2.7634 | 13.7836 |
| 0.3293 | 15.2889 | 860 | 2.7802 | 13.9120 |
| 0.3244 | 15.6444 | 880 | 2.7760 | 13.5101 |
| 0.3289 | 16.0 | 900 | 2.7884 | 13.5409 |
| 0.2768 | 16.3556 | 920 | 2.7946 | 13.7801 |
| 0.2825 | 16.7111 | 940 | 2.7988 | 14.1662 |
| 0.3059 | 17.0667 | 960 | 2.8092 | 13.6667 |
| 0.2279 | 17.4222 | 980 | 2.8253 | 13.7704 |
| 0.2283 | 17.7778 | 1000 | 2.8236 | 13.7256 |
| 0.2293 | 18.1333 | 1020 | 2.8174 | 13.1741 |
| 0.1934 | 18.4889 | 1040 | 2.8220 | 13.5479 |
| 0.1927 | 18.8444 | 1060 | 2.8225 | 13.6324 |
| 0.1794 | 19.2 | 1080 | 2.8343 | 13.4195 |
| 0.169 | 19.5556 | 1100 | 2.8345 | 13.3967 |
| 0.1724 | 19.9111 | 1120 | 2.8334 | 13.4415 |
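
The log above can be read for more than the final row. Validation loss reaches its lowest value (2.4232) at step 160 and rises afterwards while training loss keeps falling, which is the usual signature of overfitting. The sketch below copies a handful of rows from the table and derives the best checkpoint by validation loss, the number of optimizer steps per epoch, and a rough training-set size. The size estimate assumes a single device and the total train batch size of 16 listed under the hyperparameters; the variable names are illustrative only.

```python
# (training_loss, epoch, step, validation_loss, wer) for a few rows of the table
rows = [
    (4.3503, 0.3556, 20, 4.0231, 13.3087),
    (2.7452, 1.0667, 60, 2.5707, 11.9736),
    (1.554, 2.8444, 160, 2.4232, 14.4336),
    (0.5691, 9.9556, 560, 2.6456, 13.7449),
    (0.3289, 16.0, 900, 2.7884, 13.5409),
    (0.1724, 19.9111, 1120, 2.8334, 13.4415),
]

# Checkpoint with the lowest validation loss among the sampled rows.
best = min(rows, key=lambda r: r[3])
last = rows[-1]

# Optimizer steps per epoch, inferred from the final (epoch, step) pair.
steps_per_epoch = last[2] / last[1]

# Rough training-set size, assuming 16 samples per optimizer step.
approx_train_samples = round(steps_per_epoch * 16)

# Gap between validation and training loss at the end of training.
final_gap = last[3] - last[0]

print(f"best validation loss {best[3]} at step {best[2]}")
print(f"steps per epoch: {steps_per_epoch:.2f}")
print(f"approximate training samples: {approx_train_samples}")
print(f"final validation - training loss gap: {final_gap:.4f}")
```

Under these assumptions the training set comes out at roughly 900 examples, and a checkpoint from around step 160 may generalize better than the final one if validation loss is the selection criterion. Note that the WER column does not follow the same trend; its lowest value in the table (11.9736) occurs at step 60.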

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size: 177M params (Safetensors, tensor type F32)

Model tree for Niharika1603/git-base-instagram-captions

  • Base model: microsoft/git-base
  • Finetuned: this model