license: apache-2.0
base_model: t5-small
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: t5-small-finetuned-Informal_Text-to-Formal_Text
    results: []

t5-small-finetuned-Informal_Text-to-Formal_Text

This model is a fine-tuned version of t5-small. The training dataset is not specified (the auto-generated card records it as "None"). It achieves the following results on the evaluation set:

  • Loss: 2.1514
  • Bleu: 0.4495
  • Gen Len: 16.2667
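The Gen Len values reported here and in the training log all look like multiples of 1/15, which would be consistent with an evaluation set of 15 examples (or a multiple of 15). The card does not state the evaluation set size, so this is only an inference from the rounded numbers. A minimal sketch of the check, with a hypothetical helper name:

```python
def smallest_consistent_sample_count(means, max_n=100, tol=5e-5):
    """Return the smallest n for which every mean could be k/n rounded to 4 decimals."""
    for n in range(1, max_n + 1):
        if all(abs(round(m * n) / n - m) < tol for m in means):
            return n
    return None


# A selection of Gen Len values copied from the evaluation log in this card.
gen_lens = [16.6, 17.2667, 16.8667, 16.1333, 15.5333, 16.3333, 15.8, 16.2667]

print(smallest_consistent_sample_count(gen_lens))  # 15
```

With so few evaluation examples, a single changed prediction can shift BLEU and Gen Len noticeably, which may explain the step-like jumps in the log.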

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 300
  • mixed_precision_training: Native AMP
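Combined with the Step column of the training log (3 optimizer steps per epoch), these settings imply 900 optimizer steps in total. The sketch below shows what a linear schedule would look like under the assumption of zero warmup steps (the card lists no warmup setting), and what training-set sizes are compatible with 3 steps per epoch at batch size 16, assuming a single device, no gradient accumulation, and a kept final partial batch. The function names are illustrative, not taken from the training script.

```python
LEARNING_RATE = 2e-5      # from the card
TRAIN_BATCH_SIZE = 16     # from the card
NUM_EPOCHS = 300          # from the card
STEPS_PER_EPOCH = 3       # read off the Step column of the training log (3, 6, 9, ...)
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 900


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup, then linear decay to zero; warmup_steps=0 is an assumption."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def train_set_size_bounds(steps_per_epoch=STEPS_PER_EPOCH, batch_size=TRAIN_BATCH_SIZE):
    """Smallest and largest N with ceil(N / batch_size) == steps_per_epoch."""
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size


# Learning rate at the start, midpoint, and end of training.
for step in (0, 450, 900):
    print(step, linear_lr(step))

print(train_set_size_bounds())  # (33, 48)
```

Under these assumptions the training set would hold between 33 and 48 examples, so 300 epochs means each example is seen 300 times.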

Training results

Training Loss Epoch Step Validation Loss Bleu Gen Len
No log 1.0 3 3.5743 0.1145 16.6
No log 2.0 6 3.5283 0.1145 16.6
No log 3.0 9 3.4859 0.1175 16.6
No log 4.0 12 3.4466 0.1175 16.6
No log 5.0 15 3.4214 0.1175 16.6
No log 6.0 18 3.3876 0.1175 16.6
No log 7.0 21 3.3581 0.1175 16.6
No log 8.0 24 3.3310 0.1129 17.2667
No log 9.0 27 3.3066 0.12 17.2667
No log 10.0 30 3.2823 0.1505 17.2667
No log 11.0 33 3.2576 0.1705 16.8667
No log 12.0 36 3.2336 0.1705 16.8667
No log 13.0 39 3.2109 0.1705 16.8667
No log 14.0 42 3.1902 0.1897 16.9333
No log 15.0 45 3.1704 0.1897 16.9333
No log 16.0 48 3.1519 0.1385 16.1333
No log 17.0 51 3.1350 0.1385 16.1333
No log 18.0 54 3.1178 0.1385 16.1333
No log 19.0 57 3.1011 0.1385 16.1333
No log 20.0 60 3.0855 0.1385 16.1333
No log 21.0 63 3.0693 0.1177 16.1333
No log 22.0 66 3.0527 0.1177 16.1333
No log 23.0 69 3.0359 0.1232 15.5333
No log 24.0 72 3.0191 0.0854 15.5333
No log 25.0 75 3.0031 0.0854 15.5333
No log 26.0 78 2.9889 0.0854 15.5333
No log 27.0 81 2.9742 0.1027 15.5333
No log 28.0 84 2.9586 0.1148 15.5333
No log 29.0 87 2.9438 0.1148 15.5333
No log 30.0 90 2.9302 0.1148 15.5333
No log 31.0 93 2.9169 0.0876 15.5333
No log 32.0 96 2.9043 0.0876 15.5333
No log 33.0 99 2.8911 0.0885 15.5333
No log 34.0 102 2.8775 0.0885 15.5333
No log 35.0 105 2.8648 0.1275 16.3333
No log 36.0 108 2.8530 0.1736 16.3333
No log 37.0 111 2.8417 0.172 16.3333
No log 38.0 114 2.8300 0.1671 16.3333
No log 39.0 117 2.8178 0.1671 16.3333
No log 40.0 120 2.8065 0.1671 16.3333
No log 41.0 123 2.7955 0.1671 16.3333
No log 42.0 126 2.7849 0.2144 16.3333
No log 43.0 129 2.7741 0.2287 16.3333
No log 44.0 132 2.7643 0.2287 16.3333
No log 45.0 135 2.7545 0.2287 16.3333
No log 46.0 138 2.7456 0.2287 16.3333
No log 47.0 141 2.7370 0.2547 16.3333
No log 48.0 144 2.7284 0.2476 16.3333
No log 49.0 147 2.7204 0.2493 16.3333
No log 50.0 150 2.7122 0.3029 15.8
No log 51.0 153 2.7035 0.3117 15.8
No log 52.0 156 2.6946 0.3117 15.8
No log 53.0 159 2.6857 0.3245 15.8
No log 54.0 162 2.6773 0.3245 15.8
No log 55.0 165 2.6701 0.3245 15.8
No log 56.0 168 2.6620 0.3726 16.3333
No log 57.0 171 2.6551 0.3755 16.3333
No log 58.0 174 2.6480 0.3755 16.3333
No log 59.0 177 2.6419 0.3755 16.3333
No log 60.0 180 2.6358 0.3755 16.3333
No log 61.0 183 2.6290 0.4711 17.0667
No log 62.0 186 2.6217 0.4701 16.8
No log 63.0 189 2.6150 0.4701 16.8
No log 64.0 192 2.6076 0.4701 16.8
No log 65.0 195 2.6009 0.5002 17.0667
No log 66.0 198 2.5941 0.4558 16.8667
No log 67.0 201 2.5881 0.4586 16.8667
No log 68.0 204 2.5820 0.4441 16.8667
No log 69.0 207 2.5777 0.4441 16.8667
No log 70.0 210 2.5732 0.4441 16.8667
No log 71.0 213 2.5664 0.4441 16.8667
No log 72.0 216 2.5602 0.487 17.0667
No log 73.0 219 2.5539 0.487 17.0667
No log 74.0 222 2.5477 0.487 17.0667
No log 75.0 225 2.5413 0.487 17.0667
No log 76.0 228 2.5356 0.4581 16.8
No log 77.0 231 2.5288 0.4792 17.0
No log 78.0 234 2.5237 0.4441 16.8667
No log 79.0 237 2.5180 0.3405 16.8667
No log 80.0 240 2.5115 0.3405 16.8667
No log 81.0 243 2.5055 0.3405 16.8667
No log 82.0 246 2.4995 0.3405 16.8667
No log 83.0 249 2.4940 0.3405 16.8667
No log 84.0 252 2.4895 0.3405 16.8667
No log 85.0 255 2.4859 0.5174 16.8667
No log 86.0 258 2.4817 0.5185 16.8667
No log 87.0 261 2.4772 0.5185 16.8667
No log 88.0 264 2.4735 0.5185 16.8667
No log 89.0 267 2.4698 0.5185 16.8667
No log 90.0 270 2.4658 0.5185 16.8667
No log 91.0 273 2.4615 0.5185 16.8667
No log 92.0 276 2.4573 0.5224 16.8667
No log 93.0 279 2.4524 0.3979 16.8667
No log 94.0 282 2.4477 0.3979 16.8667
No log 95.0 285 2.4418 0.3979 16.8667
No log 96.0 288 2.4367 0.3971 16.8667
No log 97.0 291 2.4320 0.3878 16.8667
No log 98.0 294 2.4285 0.3878 16.8667
No log 99.0 297 2.4254 0.513 16.8667
No log 100.0 300 2.4213 0.513 16.8667
No log 101.0 303 2.4163 0.5002 16.8667
No log 102.0 306 2.4118 0.5002 16.8667
No log 103.0 309 2.4075 0.4991 16.8667
No log 104.0 312 2.4036 0.4991 16.8667
No log 105.0 315 2.3989 0.4991 16.8667
No log 106.0 318 2.3945 0.4991 16.8667
No log 107.0 321 2.3919 0.4991 16.8667
No log 108.0 324 2.3884 0.4991 16.8667
No log 109.0 327 2.3853 0.4991 16.8667
No log 110.0 330 2.3818 0.4991 16.8667
No log 111.0 333 2.3781 0.4721 16.8667
No log 112.0 336 2.3748 0.4721 16.8667
No log 113.0 339 2.3718 0.4721 16.8667
No log 114.0 342 2.3688 0.4721 16.8667
No log 115.0 345 2.3656 0.4721 16.8667
No log 116.0 348 2.3619 0.4781 16.9333
No log 117.0 351 2.3589 0.507 16.9333
No log 118.0 354 2.3559 0.5092 16.9333
No log 119.0 357 2.3521 0.5092 16.9333
No log 120.0 360 2.3495 0.4745 16.6
No log 121.0 363 2.3462 0.4745 16.6
No log 122.0 366 2.3432 0.4745 16.6
No log 123.0 369 2.3398 0.4833 16.5333
No log 124.0 372 2.3375 0.4833 16.5333
No log 125.0 375 2.3348 0.4833 16.5333
No log 126.0 378 2.3320 0.4853 16.5333
No log 127.0 381 2.3292 0.4739 16.5333
No log 128.0 384 2.3260 0.4707 16.4
No log 129.0 387 2.3235 0.4596 16.4
No log 130.0 390 2.3207 0.4596 16.4
No log 131.0 393 2.3185 0.4596 16.4
No log 132.0 396 2.3160 0.4596 16.4
No log 133.0 399 2.3133 0.4357 16.2
No log 134.0 402 2.3108 0.4357 16.2
No log 135.0 405 2.3084 0.4357 16.2
No log 136.0 408 2.3062 0.4357 16.2
No log 137.0 411 2.3048 0.4357 16.2
No log 138.0 414 2.3029 0.4357 16.2
No log 139.0 417 2.3002 0.4357 16.2
No log 140.0 420 2.2969 0.4357 16.2
No log 141.0 423 2.2941 0.4357 16.4
No log 142.0 426 2.2911 0.4357 16.4
No log 143.0 429 2.2889 0.4357 16.4
No log 144.0 432 2.2870 0.4357 16.4
No log 145.0 435 2.2850 0.4357 16.4
No log 146.0 438 2.2829 0.4357 16.4
No log 147.0 441 2.2802 0.4357 16.4
No log 148.0 444 2.2778 0.4357 16.4
No log 149.0 447 2.2760 0.4357 16.4
No log 150.0 450 2.2744 0.4357 16.4
No log 151.0 453 2.2723 0.4357 16.4
No log 152.0 456 2.2701 0.4571 16.5333
No log 153.0 459 2.2672 0.4571 16.5333
No log 154.0 462 2.2658 0.4571 16.5333
No log 155.0 465 2.2636 0.4571 16.5333
No log 156.0 468 2.2624 0.4571 16.5333
No log 157.0 471 2.2608 0.4571 16.5333
No log 158.0 474 2.2589 0.4571 16.5333
No log 159.0 477 2.2575 0.4571 16.5333
No log 160.0 480 2.2555 0.4571 16.5333
No log 161.0 483 2.2535 0.4571 16.5333
No log 162.0 486 2.2514 0.4571 16.5333
No log 163.0 489 2.2497 0.4571 16.5333
No log 164.0 492 2.2480 0.4379 16.4
No log 165.0 495 2.2461 0.4379 16.4
No log 166.0 498 2.2444 0.4379 16.4
2.3355 167.0 501 2.2431 0.4379 16.4
2.3355 168.0 504 2.2417 0.4339 16.4
2.3355 169.0 507 2.2402 0.4339 16.4
2.3355 170.0 510 2.2392 0.4339 16.4
2.3355 171.0 513 2.2386 0.4339 16.4
2.3355 172.0 516 2.2375 0.4339 16.4
2.3355 173.0 519 2.2357 0.4339 16.4
2.3355 174.0 522 2.2338 0.4339 16.4
2.3355 175.0 525 2.2322 0.4339 16.4
2.3355 176.0 528 2.2302 0.4348 16.4
2.3355 177.0 531 2.2286 0.4348 16.4
2.3355 178.0 534 2.2275 0.4339 16.4
2.3355 179.0 537 2.2257 0.4339 16.4
2.3355 180.0 540 2.2242 0.4339 16.4
2.3355 181.0 543 2.2230 0.4339 16.4
2.3355 182.0 546 2.2218 0.4339 16.4
2.3355 183.0 549 2.2194 0.4348 16.4
2.3355 184.0 552 2.2173 0.4348 16.4
2.3355 185.0 555 2.2154 0.4348 16.4
2.3355 186.0 558 2.2139 0.4348 16.4
2.3355 187.0 561 2.2124 0.4348 16.4
2.3355 188.0 564 2.2111 0.4348 16.4
2.3355 189.0 567 2.2101 0.4348 16.4
2.3355 190.0 570 2.2088 0.4357 16.4
2.3355 191.0 573 2.2088 0.4348 16.4
2.3355 192.0 576 2.2078 0.4597 16.4
2.3355 193.0 579 2.2067 0.4597 16.4
2.3355 194.0 582 2.2051 0.4597 16.4
2.3355 195.0 585 2.2037 0.4597 16.4
2.3355 196.0 588 2.2026 0.4597 16.4
2.3355 197.0 591 2.2019 0.4597 16.4
2.3355 198.0 594 2.2008 0.4597 16.4
2.3355 199.0 597 2.1999 0.4514 16.4
2.3355 200.0 600 2.1983 0.4524 16.4
2.3355 201.0 603 2.1969 0.4524 16.4
2.3355 202.0 606 2.1950 0.4524 16.4
2.3355 203.0 609 2.1934 0.4524 16.4
2.3355 204.0 612 2.1922 0.4524 16.4
2.3355 205.0 615 2.1911 0.4524 16.4
2.3355 206.0 618 2.1900 0.4524 16.4
2.3355 207.0 621 2.1888 0.4524 16.4
2.3355 208.0 624 2.1878 0.4524 16.4
2.3355 209.0 627 2.1869 0.4524 16.4
2.3355 210.0 630 2.1862 0.4524 16.4
2.3355 211.0 633 2.1854 0.4524 16.4
2.3355 212.0 636 2.1844 0.4524 16.4
2.3355 213.0 639 2.1839 0.4473 16.4
2.3355 214.0 642 2.1828 0.4473 16.4
2.3355 215.0 645 2.1818 0.4473 16.4
2.3355 216.0 648 2.1805 0.4473 16.4
2.3355 217.0 651 2.1796 0.4473 16.4
2.3355 218.0 654 2.1788 0.4473 16.4
2.3355 219.0 657 2.1782 0.4473 16.4
2.3355 220.0 660 2.1774 0.4473 16.4
2.3355 221.0 663 2.1769 0.4473 16.4
2.3355 222.0 666 2.1766 0.4473 16.4
2.3355 223.0 669 2.1761 0.4473 16.4
2.3355 224.0 672 2.1757 0.4473 16.4
2.3355 225.0 675 2.1751 0.4473 16.4
2.3355 226.0 678 2.1746 0.4473 16.4
2.3355 227.0 681 2.1739 0.4473 16.4
2.3355 228.0 684 2.1735 0.4473 16.4
2.3355 229.0 687 2.1735 0.4473 16.4
2.3355 230.0 690 2.1729 0.4473 16.4
2.3355 231.0 693 2.1727 0.4473 16.4
2.3355 232.0 696 2.1717 0.4473 16.4
2.3355 233.0 699 2.1717 0.4473 16.4
2.3355 234.0 702 2.1711 0.4473 16.4
2.3355 235.0 705 2.1705 0.4473 16.4
2.3355 236.0 708 2.1699 0.4473 16.4
2.3355 237.0 711 2.1692 0.441 16.3333
2.3355 238.0 714 2.1688 0.441 16.3333
2.3355 239.0 717 2.1682 0.441 16.3333
2.3355 240.0 720 2.1677 0.441 16.3333
2.3355 241.0 723 2.1680 0.4382 16.4
2.3355 242.0 726 2.1669 0.441 16.2667
2.3355 243.0 729 2.1659 0.441 16.2667
2.3355 244.0 732 2.1651 0.441 16.2667
2.3355 245.0 735 2.1646 0.441 16.2667
2.3355 246.0 738 2.1640 0.441 16.2667
2.3355 247.0 741 2.1635 0.441 16.2667
2.3355 248.0 744 2.1631 0.441 16.2667
2.3355 249.0 747 2.1628 0.441 16.2667
2.3355 250.0 750 2.1622 0.441 16.2667
2.3355 251.0 753 2.1618 0.441 16.2667
2.3355 252.0 756 2.1612 0.441 16.2667
2.3355 253.0 759 2.1608 0.441 16.2667
2.3355 254.0 762 2.1605 0.441 16.2667
2.3355 255.0 765 2.1603 0.441 16.2667
2.3355 256.0 768 2.1600 0.441 16.2667
2.3355 257.0 771 2.1597 0.441 16.2667
2.3355 258.0 774 2.1597 0.441 16.2667
2.3355 259.0 777 2.1596 0.441 16.2667
2.3355 260.0 780 2.1594 0.441 16.2667
2.3355 261.0 783 2.1591 0.441 16.2667
2.3355 262.0 786 2.1586 0.441 16.2667
2.3355 263.0 789 2.1581 0.441 16.2667
2.3355 264.0 792 2.1578 0.441 16.2667
2.3355 265.0 795 2.1574 0.441 16.2667
2.3355 266.0 798 2.1571 0.441 16.2667
2.3355 267.0 801 2.1568 0.4495 16.2667
2.3355 268.0 804 2.1565 0.4495 16.2667
2.3355 269.0 807 2.1562 0.4495 16.2667
2.3355 270.0 810 2.1558 0.4495 16.2667
2.3355 271.0 813 2.1555 0.4495 16.2667
2.3355 272.0 816 2.1554 0.4495 16.2667
2.3355 273.0 819 2.1551 0.4495 16.2667
2.3355 274.0 822 2.1549 0.4495 16.2667
2.3355 275.0 825 2.1547 0.4495 16.2667
2.3355 276.0 828 2.1544 0.4495 16.2667
2.3355 277.0 831 2.1541 0.4495 16.2667
2.3355 278.0 834 2.1537 0.4495 16.2667
2.3355 279.0 837 2.1534 0.4495 16.2667
2.3355 280.0 840 2.1532 0.4495 16.2667
2.3355 281.0 843 2.1531 0.4495 16.2667
2.3355 282.0 846 2.1529 0.4495 16.2667
2.3355 283.0 849 2.1526 0.4495 16.2667
2.3355 284.0 852 2.1525 0.4495 16.2667
2.3355 285.0 855 2.1524 0.4495 16.2667
2.3355 286.0 858 2.1523 0.4495 16.2667
2.3355 287.0 861 2.1522 0.4495 16.2667
2.3355 288.0 864 2.1521 0.4495 16.2667
2.3355 289.0 867 2.1521 0.4495 16.2667
2.3355 290.0 870 2.1519 0.4495 16.2667
2.3355 291.0 873 2.1518 0.4495 16.2667
2.3355 292.0 876 2.1518 0.4495 16.2667
2.3355 293.0 879 2.1516 0.4495 16.2667
2.3355 294.0 882 2.1517 0.4495 16.2667
2.3355 295.0 885 2.1515 0.4495 16.2667
2.3355 296.0 888 2.1516 0.4495 16.2667
2.3355 297.0 891 2.1514 0.4495 16.2667
2.3355 298.0 894 2.1515 0.4495 16.2667
2.3355 299.0 897 2.1515 0.4495 16.2667
2.3355 300.0 900 2.1514 0.4495 16.2667
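Two things in the table above are easy to miss. First, "No log" means no training loss had been recorded yet; the value 2.3355 first appears at step 501, which is consistent with a logging interval of 500 steps (an assumption, since the card does not list the logging settings). Second, validation loss keeps falling until the end, but BLEU peaks at 0.5224 in epoch 92 and finishes at 0.4495. A minimal sketch of parsing the rows to check this, using a hypothetical `parse_row` helper and only a few rows copied from the table:

```python
def parse_row(line):
    """Parse one whitespace-separated log row into a dict; 'No log' becomes None."""
    parts = line.split()
    if parts[:2] == ["No", "log"]:
        train_loss, rest = None, parts[2:]
    else:
        train_loss, rest = float(parts[0]), parts[1:]
    epoch, step, val_loss, bleu, gen_len = rest
    return {
        "train_loss": train_loss,
        "epoch": int(float(epoch)),
        "step": int(step),
        "val_loss": float(val_loss),
        "bleu": float(bleu),
        "gen_len": float(gen_len),
    }


# A few rows copied verbatim from the table; paste the full table to analyse all 300.
SAMPLE_ROWS = """\
No log 1.0 3 3.5743 0.1145 16.6
No log 92.0 276 2.4573 0.5224 16.8667
2.3355 167.0 501 2.2431 0.4379 16.4
2.3355 300.0 900 2.1514 0.4495 16.2667
"""

rows = [parse_row(line) for line in SAMPLE_ROWS.splitlines() if line.strip()]
best_bleu = max(rows, key=lambda r: r["bleu"])
lowest_loss = min(rows, key=lambda r: r["val_loss"])

print(best_bleu["epoch"], best_bleu["bleu"])          # 92 0.5224
print(lowest_loss["epoch"], lowest_loss["val_loss"])  # 300 2.1514
```

If BLEU is the metric of interest, selecting the checkpoint by BLEU rather than by final epoch may be worth considering, though with an evaluation set this small the difference could be noise.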

Framework versions

  • Transformers 4.40.1
  • Pytorch 2.2.1+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
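To reproduce results it can help to confirm that the local environment matches these versions. The sketch below compares installed packages against the versions listed above using the standard library; `compare_versions` is a hypothetical helper, and `torch` is assumed to be the installed distribution name for PyTorch.

```python
from importlib import metadata

# Versions listed in the card; "torch" is the PyPI distribution name for PyTorch.
PINNED = {
    "transformers": "4.40.1",
    "torch": "2.2.1+cu121",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}


def compare_versions(pinned=PINNED):
    """Map each package name to (installed_or_None, pinned, matches)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, wanted, installed == wanted)
    return report


for name, (installed, wanted, ok) in compare_versions().items():
    print(f"{name}: installed={installed} card={wanted} match={ok}")
```

A mismatch does not necessarily break the model, but it is a reasonable first thing to check if outputs differ from the reported metrics.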