Yova/baseline

@@ -13,8 +13,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6800
-- Exact Match: 0.165
 ## Model description
@@ -39,26 +39,117 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 10
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Exact Match |
-|:-------------:|:-----:|:----:|:---------------:|:-----------:|
-| 2.1419        | 1.0   | 313  | 1.3197          | 0.01        |
-| 1.1804        | 2.0   | 626  | 1.0833          | 0.023       |
-| 0.8888        | 3.0   | 939  | 0.9529          | 0.062       |
-| 0.7167        | 4.0   | 1252 | 0.9488          | 0.052       |
-| 0.5985        | 5.0   | 1565 | 0.8626          | 0.09        |
-| 0.514         | 6.0   | 1878 | 0.7982          | 0.115       |
-| 0.439         | 7.0   | 2191 | 0.7420          | 0.144       |
-| 0.3851        | 8.0   | 2504 | 0.6876          | 0.15        |
-| 0.3434        | 9.0   | 2817 | 0.6794          | 0.178       |
-| 0.3174        | 10.0  | 3130 | 0.6800          | 0.165       |
 ### Framework versions
 - Transformers 4.35.2
 - Pytorch 2.1.0+cu118
 - Tokenizers 0.15.0

 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.6338
+- Exact Match: 0.142
 ## Model description
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 100
 ### Training results
+| Training Loss | Epoch | Step  | Validation Loss | Exact Match |
+|:-------------:|:-----:|:-----:|:---------------:|:-----------:|
+| 2.6989        | 1.0   | 313   | 1.8586          | 0.0         |
+| 1.834         | 2.0   | 626   | 1.5284          | 0.003       |
+| 1.517         | 3.0   | 939   | 1.3632          | 0.005       |
+| 1.2977        | 4.0   | 1252  | 1.2077          | 0.021       |
+| 1.124         | 5.0   | 1565  | 1.1030          | 0.037       |
+| 0.9885        | 6.0   | 1878  | 1.0607          | 0.05        |
+| 0.8762        | 7.0   | 2191  | 1.0329          | 0.047       |
+| 0.7698        | 8.0   | 2504  | 1.0087          | 0.063       |
+| 0.6983        | 9.0   | 2817  | 0.9963          | 0.046       |
+| 0.6297        | 10.0  | 3130  | 0.9754          | 0.076       |
+| 0.5719        | 11.0  | 3443  | 0.9907          | 0.075       |
+| 0.5247        | 12.0  | 3756  | 0.9777          | 0.069       |
+| 0.4776        | 13.0  | 4069  | 0.9766          | 0.055       |
+| 0.442         | 14.0  | 4382  | 0.9953          | 0.091       |
+| 0.4081        | 15.0  | 4695  | 1.0005          | 0.098       |
+| 0.3783        | 16.0  | 5008  | 1.0274          | 0.093       |
+| 0.3545        | 17.0  | 5321  | 1.0516          | 0.087       |
+| 0.3243        | 18.0  | 5634  | 1.0339          | 0.09        |
+| 0.3045        | 19.0  | 5947  | 1.0419          | 0.078       |
+| 0.2841        | 20.0  | 6260  | 1.0640          | 0.087       |
+| 0.2692        | 21.0  | 6573  | 1.0839          | 0.105       |
+| 0.2543        | 22.0  | 6886  | 1.1608          | 0.064       |
+| 0.2346        | 23.0  | 7199  | 1.1046          | 0.113       |
+| 0.2245        | 24.0  | 7512  | 1.1569          | 0.128       |
+| 0.2135        | 25.0  | 7825  | 1.1242          | 0.108       |
+| 0.2029        | 26.0  | 8138  | 1.1436          | 0.118       |
+| 0.1902        | 27.0  | 8451  | 1.2023          | 0.095       |
+| 0.1832        | 28.0  | 8764  | 1.1556          | 0.115       |
+| 0.171         | 29.0  | 9077  | 1.2068          | 0.094       |
+| 0.1639        | 30.0  | 9390  | 1.2101          | 0.151       |
+| 0.1581        | 31.0  | 9703  | 1.2299          | 0.112       |
+| 0.1504        | 32.0  | 10016 | 1.3153          | 0.1         |
+| 0.1463        | 33.0  | 10329 | 1.2785          | 0.091       |
+| 0.1405        | 34.0  | 10642 | 1.2662          | 0.111       |
+| 0.1349        | 35.0  | 10955 | 1.2805          | 0.134       |
+| 0.1291        | 36.0  | 11268 | 1.2516          | 0.137       |
+| 0.126         | 37.0  | 11581 | 1.3312          | 0.141       |
+| 0.1204        | 38.0  | 11894 | 1.2776          | 0.116       |
+| 0.1163        | 39.0  | 12207 | 1.3203          | 0.11        |
+| 0.114         | 40.0  | 12520 | 1.3212          | 0.129       |
+| 0.1056        | 41.0  | 12833 | 1.3291          | 0.127       |
+| 0.1033        | 42.0  | 13146 | 1.3010          | 0.125       |
+| 0.1034        | 43.0  | 13459 | 1.3206          | 0.135       |
+| 0.098         | 44.0  | 13772 | 1.3879          | 0.127       |
+| 0.0951        | 45.0  | 14085 | 1.3693          | 0.111       |
+| 0.089         | 46.0  | 14398 | 1.4261          | 0.124       |
+| 0.0913        | 47.0  | 14711 | 1.3644          | 0.122       |
+| 0.0863        | 48.0  | 15024 | 1.4392          | 0.108       |
+| 0.0809        | 49.0  | 15337 | 1.3726          | 0.098       |
+| 0.0795        | 50.0  | 15650 | 1.3791          | 0.084       |
+| 0.0763        | 51.0  | 15963 | 1.3911          | 0.134       |
+| 0.0768        | 52.0  | 16276 | 1.4202          | 0.104       |
+| 0.076         | 53.0  | 16589 | 1.4594          | 0.122       |
+| 0.0734        | 54.0  | 16902 | 1.4541          | 0.129       |
+| 0.0714        | 55.0  | 17215 | 1.4032          | 0.133       |
+| 0.0696        | 56.0  | 17528 | 1.4467          | 0.128       |
+| 0.0674        | 57.0  | 17841 | 1.4952          | 0.103       |
+| 0.0657        | 58.0  | 18154 | 1.4582          | 0.14        |
+| 0.0658        | 59.0  | 18467 | 1.4619          | 0.121       |
+| 0.061         | 60.0  | 18780 | 1.5447          | 0.111       |
+| 0.0609        | 61.0  | 19093 | 1.4233          | 0.16        |
+| 0.0596        | 62.0  | 19406 | 1.4705          | 0.134       |
+| 0.058         | 63.0  | 19719 | 1.4721          | 0.144       |
+| 0.0555        | 64.0  | 20032 | 1.4377          | 0.156       |
+| 0.0532        | 65.0  | 20345 | 1.5016          | 0.125       |
+| 0.0559        | 66.0  | 20658 | 1.5405          | 0.156       |
+| 0.0517        | 67.0  | 20971 | 1.5166          | 0.133       |
+| 0.0499        | 68.0  | 21284 | 1.4787          | 0.139       |
+| 0.0477        | 69.0  | 21597 | 1.5063          | 0.124       |
+| 0.0491        | 70.0  | 21910 | 1.5287          | 0.147       |
+| 0.0464        | 71.0  | 22223 | 1.5428          | 0.131       |
+| 0.0456        | 72.0  | 22536 | 1.5434          | 0.132       |
+| 0.0449        | 73.0  | 22849 | 1.5364          | 0.116       |
+| 0.0432        | 74.0  | 23162 | 1.5830          | 0.12        |
+| 0.042         | 75.0  | 23475 | 1.5508          | 0.113       |
+| 0.0403        | 76.0  | 23788 | 1.5146          | 0.134       |
+| 0.0398        | 77.0  | 24101 | 1.5955          | 0.111       |
+| 0.0412        | 78.0  | 24414 | 1.5759          | 0.132       |
+| 0.0391        | 79.0  | 24727 | 1.5588          | 0.136       |
+| 0.0383        | 80.0  | 25040 | 1.5580          | 0.141       |
+| 0.0366        | 81.0  | 25353 | 1.5895          | 0.143       |
+| 0.0365        | 82.0  | 25666 | 1.5637          | 0.148       |
+| 0.035         | 83.0  | 25979 | 1.6012          | 0.155       |
+| 0.0359        | 84.0  | 26292 | 1.6130          | 0.118       |
+| 0.0343        | 85.0  | 26605 | 1.6038          | 0.158       |
+| 0.0333        | 86.0  | 26918 | 1.6300          | 0.124       |
+| 0.0318        | 87.0  | 27231 | 1.6259          | 0.145       |
+| 0.0309        | 88.0  | 27544 | 1.6178          | 0.139       |
+| 0.0303        | 89.0  | 27857 | 1.6166          | 0.143       |
+| 0.0302        | 90.0  | 28170 | 1.6394          | 0.141       |
+| 0.0293        | 91.0  | 28483 | 1.6408          | 0.154       |
+| 0.0281        | 92.0  | 28796 | 1.6424          | 0.13        |
+| 0.0288        | 93.0  | 29109 | 1.6426          | 0.136       |
+| 0.0272        | 94.0  | 29422 | 1.6477          | 0.131       |
+| 0.0278        | 95.0  | 29735 | 1.6288          | 0.142       |
+| 0.0264        | 96.0  | 30048 | 1.6251          | 0.142       |
+| 0.0268        | 97.0  | 30361 | 1.6340          | 0.142       |
+| 0.0255        | 98.0  | 30674 | 1.6353          | 0.145       |
+| 0.0263        | 99.0  | 30987 | 1.6333          | 0.143       |
+| 0.0259        | 100.0 | 31300 | 1.6338          | 0.142       |
 ### Framework versions
 - Transformers 4.35.2
 - Pytorch 2.1.0+cu118
+- Datasets 2.15.0
 - Tokenizers 0.15.0
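
Note on the new table: training loss keeps falling across all 100 epochs, while validation loss bottoms out early and then rises, so the final-epoch figures reported at the top of the card are not the best ones in the run. As a minimal sketch (not part of the commit), the snippet below picks the best epoch per metric from a handful of rows copied from the table. The helper name `best_by` and the choice of rows are illustrative assumptions.

```python
# Rows copied from the 100-epoch "Training results" table:
# (epoch, training_loss, validation_loss, exact_match)
rows = [
    (1, 2.6989, 1.8586, 0.0),
    (10, 0.6297, 0.9754, 0.076),
    (30, 0.1639, 1.2101, 0.151),
    (61, 0.0609, 1.4233, 0.16),
    (85, 0.0343, 1.6038, 0.158),
    (100, 0.0259, 1.6338, 0.142),
]


def best_by(rows, column, lower_is_better):
    """Return the row with the best value in the given column (hypothetical helper)."""
    pick = min if lower_is_better else max
    return pick(rows, key=lambda row: row[column])


best_loss_row = best_by(rows, column=2, lower_is_better=True)
best_em_row = best_by(rows, column=3, lower_is_better=False)
final_row = rows[-1]

print("lowest validation loss at epoch", best_loss_row[0], "->", best_loss_row[2])
print("highest exact match at epoch", best_em_row[0], "->", best_em_row[3])
print("final epoch", final_row[0], "->", final_row[2], final_row[3])
```

If the run was produced with the Transformers `Trainer`, the `load_best_model_at_end` and `metric_for_best_model` training arguments exist for this purpose; whether they were set for this run is not stated in the card.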

generation_config.json

@@ -1,6 +1,7 @@
 {
   "decoder_start_token_id": 259,
   "eos_token_id": 1,
   "pad_token_id": 0,
   "transformers_version": "4.35.2"
 }

 {
   "decoder_start_token_id": 259,
   "eos_token_id": 1,
+  "max_new_tokens": 20,
   "pad_token_id": 0,
   "transformers_version": "4.35.2"
 }
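
The only change in this file is the added `max_new_tokens` key, which in Transformers limits how many tokens a `generate` call may produce beyond the prompt. As a small sketch (not part of the commit), the two versions can be compared key by key with the standard library; the variable names are illustrative.

```python
import json

# Old and new contents of generation_config.json, copied from the diff.
old_cfg = json.loads("""
{
  "decoder_start_token_id": 259,
  "eos_token_id": 1,
  "pad_token_id": 0,
  "transformers_version": "4.35.2"
}
""")
new_cfg = json.loads("""
{
  "decoder_start_token_id": 259,
  "eos_token_id": 1,
  "max_new_tokens": 20,
  "pad_token_id": 0,
  "transformers_version": "4.35.2"
}
""")

# Keys present only in the new version, only in the old one, or in both with different values.
added = {k: new_cfg[k] for k in new_cfg.keys() - old_cfg.keys()}
removed = {k: old_cfg[k] for k in old_cfg.keys() - new_cfg.keys()}
changed = {
    k: (old_cfg[k], new_cfg[k])
    for k in old_cfg.keys() & new_cfg.keys()
    if old_cfg[k] != new_cfg[k]
}

print("added:", added)
print("removed:", removed)
print("changed:", changed)
```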

model.safetensors

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ad9a0a7b27ded206f1f66be5ddf8061ce20fde4766abf3d47d07e2d5b5bd1005
 size 10059592

 version https://git-lfs.github.com/spec/v1
+oid sha256:9359a363befd957da7b42b05a48bf15d7ebfc123ecab83e890d068125fcff0ff
 size 10059592