# t5-base-p-l-akk-en-20241125-151008
This model was trained from scratch on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.4584
## Model description
More information needed
## Intended uses & limitations
More information needed
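The card does not document intended uses. As a minimal sketch (assuming the checkpoint is published under the name shown in the title and, as the `akk-en` suffix suggests, targets Akkadian-to-English translation), it can be loaded with the standard Transformers seq2seq classes:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical repository id; replace with the actual Hub path of this checkpoint.
model_id = "t5-base-p-l-akk-en-20241125-151008"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The source/target languages are an assumption based on the model name (akk-en);
# the card itself does not document the task or any required input prefix.
inputs = tokenizer("transliterated Akkadian text goes here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```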
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a hedged configuration sketch follows the list):
- learning_rate: 0.0001
- train_batch_size: 200
- eval_batch_size: 200
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2000
- num_epochs: 200
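
The training script itself is not included in this card. The sketch below shows one way these values could map onto `Seq2SeqTrainingArguments` in Transformers 4.45; `output_dir`, the evaluation strategy, and the optimizer handling are assumptions, not documented settings.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: reproduces the hyperparameters listed above. Values not listed in
# the card (output_dir, eval/save strategy, number of devices) are placeholders.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-base-p-l-akk-en-20241125-151008",
    learning_rate=1e-4,
    per_device_train_batch_size=200,
    per_device_eval_batch_size=200,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=2000,
    num_train_epochs=200,
    eval_strategy="epoch",  # validation loss is reported once per epoch in the table below
)
# The optimizer noted above (Adam, betas=(0.9, 0.999), epsilon=1e-8) matches the
# Trainer's default AdamW configuration, so no explicit optimizer override is set here.
```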
### Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
0.9667 | 1.0 | 10362 | 0.9112 |
0.8355 | 2.0 | 20724 | 0.7909 |
0.772 | 3.0 | 31086 | 0.7263 |
0.7326 | 4.0 | 41448 | 0.6910 |
0.7033 | 5.0 | 51810 | 0.6666 |
0.6787 | 6.0 | 62172 | 0.6455 |
0.6633 | 7.0 | 72534 | 0.6329 |
0.652 | 8.0 | 82896 | 0.6206 |
0.6408 | 9.0 | 93258 | 0.6073 |
0.6315 | 10.0 | 103620 | 0.6015 |
0.6161 | 11.0 | 113982 | 0.5914 |
0.6211 | 12.0 | 124344 | 0.5857 |
0.6053 | 13.0 | 134706 | 0.5766 |
0.6043 | 14.0 | 145068 | 0.5727 |
0.5954 | 15.0 | 155430 | 0.5681 |
0.59 | 16.0 | 165792 | 0.5649 |
0.5844 | 17.0 | 176154 | 0.5628 |
0.579 | 18.0 | 186516 | 0.5564 |
0.5792 | 19.0 | 196878 | 0.5493 |
0.5739 | 20.0 | 207240 | 0.5479 |
0.567 | 21.0 | 217602 | 0.5435 |
0.5626 | 22.0 | 227964 | 0.5406 |
0.5591 | 23.0 | 238326 | 0.5375 |
0.5508 | 24.0 | 248688 | 0.5356 |
0.5548 | 25.0 | 259050 | 0.5329 |
0.5512 | 26.0 | 269412 | 0.5299 |
0.5473 | 27.0 | 279774 | 0.5267 |
0.5413 | 28.0 | 290136 | 0.5243 |
0.5433 | 29.0 | 300498 | 0.5246 |
0.5378 | 30.0 | 310860 | 0.5209 |
0.5375 | 31.0 | 321222 | 0.5206 |
0.5363 | 32.0 | 331584 | 0.5178 |
0.528 | 33.0 | 341946 | 0.5143 |
0.532 | 34.0 | 352308 | 0.5121 |
0.5279 | 35.0 | 362670 | 0.5137 |
0.5265 | 36.0 | 373032 | 0.5080 |
0.5231 | 37.0 | 383394 | 0.5077 |
0.5187 | 38.0 | 393756 | 0.5082 |
0.5191 | 39.0 | 404118 | 0.5047 |
0.5159 | 40.0 | 414480 | 0.5029 |
0.5159 | 41.0 | 424842 | 0.5014 |
0.5131 | 42.0 | 435204 | 0.4998 |
0.5137 | 43.0 | 445566 | 0.4973 |
0.5128 | 44.0 | 455928 | 0.4972 |
0.5101 | 45.0 | 466290 | 0.4985 |
0.505 | 46.0 | 476652 | 0.4969 |
0.5014 | 47.0 | 487014 | 0.4964 |
0.4988 | 48.0 | 497376 | 0.4938 |
0.5051 | 49.0 | 507738 | 0.4898 |
0.4974 | 50.0 | 518100 | 0.4928 |
0.4999 | 51.0 | 528462 | 0.4904 |
0.4973 | 52.0 | 538824 | 0.4884 |
0.4973 | 53.0 | 549186 | 0.4877 |
0.4913 | 54.0 | 559548 | 0.4879 |
0.4968 | 55.0 | 569910 | 0.4846 |
0.4916 | 56.0 | 580272 | 0.4838 |
0.4938 | 57.0 | 590634 | 0.4833 |
0.4866 | 58.0 | 600996 | 0.4819 |
0.4871 | 59.0 | 611358 | 0.4818 |
0.4837 | 60.0 | 621720 | 0.4792 |
0.4855 | 61.0 | 632082 | 0.4783 |
0.4828 | 62.0 | 642444 | 0.4781 |
0.4789 | 63.0 | 652806 | 0.4780 |
0.4781 | 64.0 | 663168 | 0.4785 |
0.4803 | 65.0 | 673530 | 0.4767 |
0.4791 | 66.0 | 683892 | 0.4755 |
0.4783 | 67.0 | 694254 | 0.4743 |
0.4772 | 68.0 | 704616 | 0.4739 |
0.4757 | 69.0 | 714978 | 0.4730 |
0.4708 | 70.0 | 725340 | 0.4711 |
0.4698 | 71.0 | 735702 | 0.4717 |
0.4719 | 72.0 | 746064 | 0.4733 |
0.4708 | 73.0 | 756426 | 0.4703 |
0.4717 | 74.0 | 766788 | 0.4700 |
0.4714 | 75.0 | 777150 | 0.4677 |
0.4641 | 76.0 | 787512 | 0.4688 |
0.4642 | 77.0 | 797874 | 0.4678 |
0.4656 | 78.0 | 808236 | 0.4666 |
0.4625 | 79.0 | 818598 | 0.4661 |
0.4623 | 80.0 | 828960 | 0.4664 |
0.4619 | 81.0 | 839322 | 0.4657 |
0.4574 | 82.0 | 849684 | 0.4635 |
0.4562 | 83.0 | 860046 | 0.4628 |
0.4593 | 84.0 | 870408 | 0.4613 |
0.4583 | 85.0 | 880770 | 0.4600 |
0.4573 | 86.0 | 891132 | 0.4598 |
0.4518 | 87.0 | 901494 | 0.4564 |
0.4599 | 88.0 | 911856 | 0.4577 |
0.4545 | 89.0 | 922218 | 0.4594 |
0.4534 | 90.0 | 932580 | 0.4564 |
0.449 | 91.0 | 942942 | 0.4564 |
0.4523 | 92.0 | 953304 | 0.4584 |
### Framework versions
- Transformers 4.45.2
- Pytorch 2.6.0.dev20241022+cu124
- Datasets 3.0.1
- Tokenizers 0.20.1
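
A small sanity check for reproducing this environment (version strings taken from the list above; note that the PyTorch build is a CUDA 12.4 nightly, which generally has to be installed from the PyTorch nightly index):

```python
import datasets
import tokenizers
import torch
import transformers

# Compare the local environment against the versions recorded in this card.
expected = {
    "transformers": "4.45.2",
    "datasets": "3.0.1",
    "tokenizers": "0.20.1",
    "torch": "2.6.0.dev20241022+cu124",  # nightly build with CUDA 12.4
}
found = {
    "transformers": transformers.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
    "torch": torch.__version__,
}
for name, version in expected.items():
    status = "OK" if found[name] == version else f"mismatch (found {found[name]})"
    print(f"{name} {version}: {status}")
```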