
mt5-base-p-l-akk-en-20240709-215100

This model is a fine-tuned version of google/mt5-base. The training dataset is not specified (the auto-generated card records it as None). It achieves the following results on the evaluation set:

  • Loss: 0.1533

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 4e-05
  • train_batch_size: 12
  • eval_batch_size: 12
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
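
To make the linear scheduler concrete, the sketch below computes the learning rate at a given optimizer step. It is an illustration under stated assumptions, not the training script: the card lists no warmup setting, so warmup is taken as zero, and the figure of 3772 steps per epoch is inferred from the Epoch and Step columns of the training results (for example, step 37500 at epoch 9.9417). The names `linear_lr`, `STEPS_PER_EPOCH` and `WARMUP_STEPS` are hypothetical.

```python
BASE_LR = 4e-05          # learning_rate from the card
NUM_EPOCHS = 10          # num_epochs from the card
STEPS_PER_EPOCH = 3772   # assumption: inferred from the Epoch/Step columns
WARMUP_STEPS = 0         # assumption: the card lists no warmup setting

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    # Print the rate at the start, an early step, the midpoint, and the end.
    for step in (0, 500, TOTAL_STEPS // 2, 37500, TOTAL_STEPS):
        print(f"step {step:>6}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate falls to half its initial value at the midpoint of training and is close to zero by the last logged step, which is consistent with the validation loss flattening out over the final epochs.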

Training results

Training Loss  Epoch   Step   Validation Loss
30.893         0.1326  500    5.1945
2.9823         0.2651  1000   0.6672
0.6668         0.3977  1500   0.5089
0.4944         0.5302  2000   0.3100
0.3002         0.6628  2500   0.2663
0.2813         0.7953  3000   0.2493
0.273          0.9279  3500   0.2369
0.2544         1.0604  4000   0.2304
0.2445         1.1930  4500   0.2241
0.2365         1.3256  5000   0.2190
0.2305         1.4581  5500   0.2140
0.2318         1.5907  6000   0.2108
0.2166         1.7232  6500   0.2060
0.2195         1.8558  7000   0.2029
0.2125         1.9883  7500   0.2000
0.2091         2.1209  8000   0.1963
0.2092         2.2534  8500   0.1938
0.2032         2.3860  9000   0.1915
0.2018         2.5186  9500   0.1892
0.2017         2.6511  10000  0.1870
0.1961         2.7837  10500  0.1855
0.2009         2.9162  11000  0.1841
0.1956         3.0488  11500  0.1828
0.1915         3.1813  12000  0.1807
0.1892         3.3139  12500  0.1790
0.1908         3.4464  13000  0.1773
0.1834         3.5790  13500  0.1763
0.1832         3.7116  14000  0.1744
0.189          3.8441  14500  0.1734
0.1848         3.9767  15000  0.1724
0.1838         4.1092  15500  0.1715
0.177          4.2418  16000  0.1703
0.1808         4.3743  16500  0.1692
0.183          4.5069  17000  0.1680
0.1753         4.6394  17500  0.1675
0.1724         4.7720  18000  0.1666
0.1782         4.9046  18500  0.1656
0.1799         5.0371  19000  0.1653
0.1725         5.1697  19500  0.1647
0.17           5.3022  20000  0.1635
0.1722         5.4348  20500  0.1630
0.1697         5.5673  21000  0.1625
0.1719         5.6999  21500  0.1620
0.1709         5.8324  22000  0.1611
0.1727         5.9650  22500  0.1604
0.1721         6.0976  23000  0.1598
0.1681         6.2301  23500  0.1602
0.1699         6.3627  24000  0.1596
0.1639         6.4952  24500  0.1588
0.1646         6.6278  25000  0.1584
0.1691         6.7603  25500  0.1582
0.1653         6.8929  26000  0.1574
0.1648         7.0255  26500  0.1572
0.1669         7.1580  27000  0.1569
0.16           7.2906  27500  0.1568
0.1622         7.4231  28000  0.1562
0.1644         7.5557  28500  0.1561
0.1674         7.6882  29000  0.1557
0.1628         7.8208  29500  0.1552
0.1619         7.9533  30000  0.1551
0.1636         8.0859  30500  0.1549
0.1629         8.2185  31000  0.1546
0.1632         8.3510  31500  0.1545
0.1641         8.4836  32000  0.1543
0.1592         8.6161  32500  0.1541
0.1573         8.7487  33000  0.1539
0.1607         8.8812  33500  0.1540
0.1651         9.0138  34000  0.1537
0.1551         9.1463  34500  0.1537
0.1621         9.2789  35000  0.1536
0.166          9.4115  35500  0.1534
0.1575         9.5440  36000  0.1534
0.1607         9.6766  36500  0.1534
0.1627         9.8091  37000  0.1533
0.1608         9.9417  37500  0.1533
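
One way to read the validation losses above is as perplexities. The sketch below assumes the reported loss is the mean per-token cross-entropy in natural-log units, which is how the Transformers Trainer usually reports it for sequence-to-sequence models; the card itself does not state this. The helper name `perplexity` is hypothetical.

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


if __name__ == "__main__":
    # Validation losses copied from the first and last rows of the table.
    for label, loss in (("step 500", 5.1945), ("step 37500", 0.1533)):
        print(f"{label}: loss = {loss:.4f}, perplexity = {perplexity(loss):.4f}")
```

If the assumption holds, the final validation loss of 0.1533 corresponds to a perplexity of roughly 1.17. Perplexity measures token-level fit only and is not a substitute for translation metrics such as BLEU or chrF, which the card does not report.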

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.5.0.dev20240625
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Safetensors

  • Model size: 604M params
  • Tensor type: F32

Model tree for Thalesian/mt5-base-p-l-akk-en-20240709-215100

  • Base model: google/mt5-base
  • Finetuned (158): this model