
modernisa-v2-byt5-base-lr0.0001

This model is a fine-tuned version of google/byt5-base on an unknown dataset. It achieves the following results on the evaluation set at the final logged checkpoint (step 57000, epoch 4.96):

  • Loss: 0.4744
  • Bleu: 30.8745
  • Wer: 47.8194
  • Cer: 34.4895
  • Gen Len: 18.5499
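For readers unfamiliar with the Wer and Cer figures, the sketch below shows the usual definition of both metrics: the Levenshtein (edit) distance between reference and hypothesis, counted over words for WER and over characters for CER, divided by the reference length. It assumes the scores above are percentages and that a single sentence pair is scored; the evaluation script actually used for this model is not given in the card, and it may aggregate over the corpus differently. The example strings are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference item
                curr[j - 1] + 1,     # insert a hypothesis item
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, over whitespace-separated tokens."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate in percent, spaces included."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical sentence pair, used only to exercise the functions.
    reference = "the cat sat on the mat"
    hypothesis = "the cat sit on mat"
    print(f"WER: {wer(reference, hypothesis):.2f}")
    print(f"CER: {cer(reference, hypothesis):.2f}")
```

Because both rates divide by the reference length, either one can exceed 100 when the hypothesis is much longer than the reference.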

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5.0
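As a rough illustration of what lr_scheduler_type: linear implies, the sketch below computes the learning rate at a given step under optional linear warmup followed by linear decay to zero. Two values are assumptions rather than facts from the card: warmup_steps is taken as 0 because no warmup is listed, and the total of 57,500 steps is inferred from the results log (about 11,500 steps per epoch over 5 epochs). The function name linear_lr is hypothetical.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 1e-4,
              warmup_steps: int = 0) -> float:
    """Learning rate with linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: ~11,500 optimizer steps per epoch x 5 epochs (inferred, not stated).
TOTAL_STEPS = 57_500

if __name__ == "__main__":
    for step in (0, 11_500, 28_750, 57_000, 57_500):
        print(f"step {step:>6}: lr = {linear_lr(step, TOTAL_STEPS):.2e}")
```

If the run used a single device and no gradient accumulation (neither is stated), 11,500 steps per epoch at train_batch_size 16 would correspond to roughly 184,000 training examples.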

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Wer | Cer | Gen Len |
|---|---|---|---|---|---|---|---|
| 0.2696 | 0.09 | 1000 | 0.3027 | 27.8571 | 49.5134 | 34.4149 | 18.5 |
| 0.2518 | 0.17 | 2000 | 0.2857 | 29.2213 | 49.1981 | 34.6336 | 18.5371 |
| 0.2343 | 0.26 | 3000 | 0.2730 | 29.5067 | 49.117 | 34.9795 | 18.5537 |
| 0.2292 | 0.35 | 4000 | 0.2690 | 29.884 | 48.7025 | 34.8015 | 18.5516 |
| 0.2243 | 0.44 | 5000 | 0.2647 | 29.9577 | 48.8466 | 34.7218 | 18.5477 |
| 0.2112 | 0.52 | 6000 | 0.2636 | 30.3115 | 48.3871 | 34.4895 | 18.5477 |
| 0.2118 | 0.61 | 7000 | 0.2555 | 30.6364 | 48.3961 | 34.7455 | 18.5413 |
| 0.205 | 0.7 | 8000 | 0.2508 | 31.0881 | 47.468 | 34.0759 | 18.5269 |
| 0.2049 | 0.78 | 9000 | 0.2471 | 31.1481 | 47.5942 | 34.4133 | 18.5503 |
| 0.2005 | 0.87 | 10000 | 0.2468 | 30.9375 | 47.6392 | 34.281 | 18.5405 |
| 0.1999 | 0.96 | 11000 | 0.2431 | 30.9692 | 47.7023 | 34.4183 | 18.5405 |
| 0.161 | 1.04 | 12000 | 0.2491 | 31.2337 | 47.3238 | 34.1878 | 18.5298 |
| 0.1601 | 1.13 | 13000 | 0.2496 | 31.4422 | 47.3689 | 34.1657 | 18.5371 |
| 0.1606 | 1.22 | 14000 | 0.2459 | 31.4582 | 47.3329 | 34.2386 | 18.5405 |
| 0.1594 | 1.31 | 15000 | 0.2466 | 31.386 | 47.1166 | 34.2912 | 18.5375 |
| 0.1617 | 1.39 | 16000 | 0.2412 | 31.6546 | 46.8373 | 34.0149 | 18.5294 |
| 0.1582 | 1.48 | 17000 | 0.2461 | 31.2924 | 47.4139 | 34.2573 | 18.5503 |
| 0.1572 | 1.57 | 18000 | 0.2425 | 31.1484 | 47.45 | 34.3675 | 18.5499 |
| 0.1565 | 1.65 | 19000 | 0.2424 | 31.6967 | 46.9724 | 34.1047 | 18.5388 |
| 0.1585 | 1.74 | 20000 | 0.2382 | 31.9026 | 47.0175 | 34.281 | 18.558 |
| 0.1522 | 1.83 | 21000 | 0.2365 | 32.1619 | 46.5219 | 33.9369 | 18.5311 |
| 0.156 | 1.92 | 22000 | 0.2381 | 31.7762 | 46.7922 | 33.9572 | 18.5401 |
| 0.1538 | 2.0 | 23000 | 0.2402 | 31.8785 | 46.8012 | 34.2319 | 18.5516 |
| 0.1083 | 2.09 | 24000 | 0.2654 | 31.9905 | 46.603 | 34.0098 | 18.5384 |
| 0.1086 | 2.18 | 25000 | 0.2618 | 31.6257 | 46.9995 | 34.2607 | 18.5409 |
| 0.1092 | 2.26 | 26000 | 0.2658 | 31.4886 | 47.1436 | 34.337 | 18.5422 |
| 0.1086 | 2.35 | 27000 | 0.2666 | 31.8448 | 46.6751 | 34.1217 | 18.5375 |
| 0.1098 | 2.44 | 28000 | 0.2659 | 31.709 | 46.8913 | 34.1946 | 18.5452 |
| 0.1117 | 2.52 | 29000 | 0.2649 | 31.8114 | 46.8913 | 34.1708 | 18.5431 |
| 0.1094 | 2.61 | 30000 | 0.2656 | 31.6955 | 46.8643 | 34.1606 | 18.5375 |
| 0.1077 | 2.7 | 31000 | 0.2637 | 31.5495 | 46.8823 | 34.0064 | 18.5448 |
| 0.1088 | 2.79 | 32000 | 0.2669 | 32.0837 | 46.612 | 33.9504 | 18.5413 |
| 0.1087 | 2.87 | 33000 | 0.2646 | 31.5549 | 47.0806 | 34.2149 | 18.5286 |
| 0.1077 | 2.96 | 34000 | 0.2630 | 32.1129 | 46.4318 | 33.9403 | 18.5452 |
| 0.0652 | 3.05 | 35000 | 0.3360 | 31.3861 | 47.1977 | 34.1149 | 18.5396 |
| 0.0662 | 3.13 | 36000 | 0.3401 | 31.2372 | 47.3869 | 34.203 | 18.552 |
| 0.0666 | 3.22 | 37000 | 0.3389 | 31.3462 | 47.2968 | 34.1759 | 18.5469 |
| 0.0648 | 3.31 | 38000 | 0.3339 | 30.835 | 47.6753 | 34.381 | 18.552 |
| 0.0654 | 3.4 | 39000 | 0.3395 | 31.0958 | 47.7203 | 34.4692 | 18.5524 |
| 0.0663 | 3.48 | 40000 | 0.3318 | 31.126 | 47.5942 | 34.4539 | 18.5499 |
| 0.0648 | 3.57 | 41000 | 0.3397 | 31.0295 | 47.5852 | 34.3539 | 18.5477 |
| 0.0635 | 3.66 | 42000 | 0.3414 | 31.1287 | 47.5491 | 34.4285 | 18.5494 |
| 0.0656 | 3.74 | 43000 | 0.3394 | 30.9225 | 47.6392 | 34.4285 | 18.5563 |
| 0.0625 | 3.83 | 44000 | 0.3420 | 31.2435 | 47.2968 | 34.1674 | 18.5439 |
| 0.0636 | 3.92 | 45000 | 0.3448 | 31.0688 | 47.6843 | 34.3743 | 18.5439 |
| 0.0586 | 4.0 | 46000 | 0.3675 | 31.2353 | 47.441 | 34.2963 | 18.549 |
| 0.0298 | 4.09 | 47000 | 0.4566 | 30.698 | 47.8555 | 34.4319 | 18.5512 |
| 0.0301 | 4.18 | 48000 | 0.4724 | 30.7773 | 47.8374 | 34.3861 | 18.5507 |
| 0.0311 | 4.27 | 49000 | 0.4640 | 31.0878 | 47.6212 | 34.3861 | 18.5503 |
| 0.03 | 4.35 | 50000 | 0.4654 | 30.8319 | 47.8915 | 34.459 | 18.5529 |
| 0.0302 | 4.44 | 51000 | 0.4665 | 30.9236 | 47.9276 | 34.4997 | 18.552 |
| 0.029 | 4.53 | 52000 | 0.4757 | 30.8307 | 47.9456 | 34.4997 | 18.5482 |
| 0.0301 | 4.61 | 53000 | 0.4672 | 30.7983 | 47.9456 | 34.5218 | 18.5473 |
| 0.0294 | 4.7 | 54000 | 0.4715 | 30.8924 | 47.7564 | 34.4353 | 18.5529 |
| 0.0288 | 4.79 | 55000 | 0.4752 | 30.7372 | 47.7924 | 34.4675 | 18.5524 |
| 0.0289 | 4.88 | 56000 | 0.4744 | 30.8554 | 47.8555 | 34.459 | 18.5516 |
| 0.0288 | 4.96 | 57000 | 0.4744 | 30.8745 | 47.8194 | 34.4895 | 18.5499 |
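In the log above, validation loss reaches its minimum at step 21000 (0.2365) and then rises to 0.4744 by step 57000 while training loss keeps falling, a pattern consistent with overfitting in the later epochs. The sketch below shows one way to pick a checkpoint from such a log by different criteria. It uses only five rows copied from the table, and the EvalRow type is a hypothetical helper. The card does not say which checkpoint the published weights correspond to.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    bleu: float
    wer: float
    cer: float


# A few rows copied from the results table (not the full log).
ROWS = [
    EvalRow(11000, 0.2431, 30.9692, 47.7023, 34.4183),
    EvalRow(21000, 0.2365, 32.1619, 46.5219, 33.9369),
    EvalRow(34000, 0.2630, 32.1129, 46.4318, 33.9403),
    EvalRow(46000, 0.3675, 31.2353, 47.441, 34.2963),
    EvalRow(57000, 0.4744, 30.8745, 47.8194, 34.4895),
]

# Lower is better for loss, WER, and CER; higher is better for BLEU.
best_by_loss = min(ROWS, key=lambda r: r.val_loss)
best_by_bleu = max(ROWS, key=lambda r: r.bleu)
best_by_wer = min(ROWS, key=lambda r: r.wer)
best_by_cer = min(ROWS, key=lambda r: r.cer)

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss.step)
    print("highest BLEU:", best_by_bleu.step)
    print("lowest WER:", best_by_wer.step)
    print("lowest CER:", best_by_cer.step)
```

The criteria do not fully agree: among these rows, loss, BLEU, and CER favor step 21000, while WER is marginally lower at step 34000.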

Framework versions

  • Transformers 4.30.0.dev0
  • Pytorch 1.13.0+cu117
  • Datasets 2.12.0
  • Tokenizers 0.11.0

Model size: 582M params (Safetensors, tensor type F32)

Model tree for versae/modernisa-v2-byt5-base-lr0.0001

Base model: google/byt5-base