Yova/baseline

@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.4307
 - Exact Match: 0.0
 ## Model description
@@ -34,41 +34,37 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.001
-- train_batch_size: 100
 - eval_batch_size: 8
 - seed: 42
-- gradient_accumulation_steps: 4
-- total_train_batch_size: 400
-- optimizer: Adam with betas=(0.98,0.999) and epsilon=1e-08
-- lr_scheduler_type: inverse_sqrt
-- lr_scheduler_warmup_steps: 4000
 - num_epochs: 20
-- label_smoothing_factor: 0.1
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Exact Match |
 |:-------------:|:-----:|:----:|:---------------:|:-----------:|
-| 6.3427        | 1.0   | 25   | 5.7961          | 0.0         |
-| 5.7757        | 2.0   | 50   | 4.8370          | 0.0         |
-| 5.0702        | 3.0   | 75   | 4.3284          | 0.0         |
-| 4.6064        | 4.0   | 100  | 4.0447          | 0.0         |
-| 4.3415        | 5.0   | 125  | 3.8892          | 0.0         |
-| 4.1863        | 6.0   | 150  | 3.7803          | 0.0         |
-| 4.0684        | 7.0   | 175  | 3.6724          | 0.0         |
-| 3.9449        | 8.0   | 200  | 3.5356          | 0.0         |
-| 3.7922        | 9.0   | 225  | 3.3716          | 0.0         |
-| 3.6343        | 10.0  | 250  | 3.2232          | 0.0         |
-| 3.4833        | 11.0  | 275  | 3.0938          | 0.0         |
-| 3.3374        | 12.0  | 300  | 2.9880          | 0.0         |
-| 3.2006        | 13.0  | 325  | 2.8960          | 0.0         |
-| 3.0796        | 14.0  | 350  | 2.8199          | 0.0         |
-| 2.9739        | 15.0  | 375  | 2.7270          | 0.0         |
-| 2.8824        | 16.0  | 400  | 2.6369          | 0.0         |
-| 2.8           | 17.0  | 425  | 2.5811          | 0.0         |
-| 2.7323        | 18.0  | 450  | 2.5184          | 0.0         |
-| 2.6596        | 19.0  | 475  | 2.4946          | 0.0         |
-| 2.5931        | 20.0  | 500  | 2.4307          | 0.0         |
 ### Framework versions
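
The removed hyperparameters imply some arithmetic that is easy to check by hand. The following is a minimal sketch, not the training code: it assumes a single device, and it assumes `inverse_sqrt` means a linear warmup followed by decay proportional to 1/sqrt(step), which is the usual shape of that schedule. `inverse_sqrt_lr` is a hypothetical helper written for illustration, not a library function.

```python
import math

# Values copied from the removed hyperparameter lines
LEARNING_RATE = 0.001
TRAIN_BATCH_SIZE = 100
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 4000
FINAL_STEP = 500  # last step in the removed training table


def inverse_sqrt_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS):
    """Assumed schedule shape: linear warmup, then decay as sqrt(warmup / step)."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * math.sqrt(warmup / step)


# Effective batch size per optimizer step (single device assumed)
total_train_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Learning rate reached at the last recorded step of the old run
final_lr = inverse_sqrt_lr(FINAL_STEP)

print(total_train_batch_size)
print(final_lr)
```

Under these assumptions the old run stopped at step 500, well inside the 4000-step warmup, so its learning rate would have reached only about 1.25e-4, one eighth of the configured 0.001. That may partly explain why the old run's training loss was still above 2.5 after 20 epochs.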

 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.7391
 - Exact Match: 0.0
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 0.001
+- train_batch_size: 64
 - eval_batch_size: 8
 - seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
 - num_epochs: 20
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Exact Match |
 |:-------------:|:-----:|:----:|:---------------:|:-----------:|
+| 2.0707        | 1.0   | 157  | 1.4872          | 0.003       |
+| 1.0627        | 2.0   | 314  | 1.2868          | 0.007       |
+| 0.8185        | 3.0   | 471  | 1.2750          | 0.0         |
+| 0.6815        | 4.0   | 628  | 1.4788          | 0.0         |
+| 0.6021        | 5.0   | 785  | 1.3314          | 0.003       |
+| 0.5331        | 6.0   | 942  | 1.4174          | 0.0         |
+| 0.481         | 7.0   | 1099 | 1.5780          | 0.0         |
+| 0.442         | 8.0   | 1256 | 1.4797          | 0.0         |
+| 0.4027        | 9.0   | 1413 | 1.5298          | 0.0         |
+| 0.3629        | 10.0  | 1570 | 1.5906          | 0.0         |
+| 0.3313        | 11.0  | 1727 | 1.6076          | 0.0         |
+| 0.3031        | 12.0  | 1884 | 1.7169          | 0.0         |
+| 0.269         | 13.0  | 2041 | 1.5874          | 0.0         |
+| 0.2433        | 14.0  | 2198 | 1.7706          | 0.0         |
+| 0.2079        | 15.0  | 2355 | 1.6666          | 0.0         |
+| 0.1772        | 16.0  | 2512 | 1.6823          | 0.0         |
+| 0.1549        | 17.0  | 2669 | 1.7645          | 0.0         |
+| 0.1345        | 18.0  | 2826 | 1.7436          | 0.0         |
+| 0.124         | 19.0  | 2983 | 1.7963          | 0.0         |
+| 0.1166        | 20.0  | 3140 | 1.7391          | 0.0         |
 ### Framework versions
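
The added training table can be summarized with a short script. This sketch only restates numbers from the table above; the variable names are illustrative.

```python
# (epoch, validation_loss) pairs copied from the added training table
val_loss = [
    (1, 1.4872), (2, 1.2868), (3, 1.2750), (4, 1.4788), (5, 1.3314),
    (6, 1.4174), (7, 1.5780), (8, 1.4797), (9, 1.5298), (10, 1.5906),
    (11, 1.6076), (12, 1.7169), (13, 1.5874), (14, 1.7706), (15, 1.6666),
    (16, 1.6823), (17, 1.7645), (18, 1.7436), (19, 1.7963), (20, 1.7391),
]

# Epoch with the lowest validation loss
best_epoch, best_loss = min(val_loss, key=lambda pair: pair[1])

# Loss reported for the final checkpoint
final_epoch, final_loss = val_loss[-1]

# How far the final checkpoint is from the best one
gap = final_loss - best_loss

print(best_epoch, best_loss, final_epoch, final_loss)
```

Validation loss is lowest at epoch 3 (1.2750) and drifts upward afterwards, while training loss keeps falling to 0.1166. That pattern is typically a sign of overfitting, so the reported final loss of 1.7391 is likely not the best checkpoint of this run. Exact Match stays at or near 0.0 throughout, in both the old and the new run.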

generation_config.json

@@ -2,7 +2,6 @@
   "decoder_start_token_id": 259,
   "eos_token_id": 1,
   "max_new_tokens": 20,
-  "num_beams": 5,
   "pad_token_id": 0,
   "transformers_version": "4.35.2"
 }
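
The effect of this hunk can be checked by comparing the two versions as dictionaries. This is a sketch using only the keys visible in the hunk; `old_config` and `new_config` are illustrative names, and the file may contain nothing else.

```python
# Keys and values as shown in the hunk, before and after the change
old_config = {
    "decoder_start_token_id": 259,
    "eos_token_id": 1,
    "max_new_tokens": 20,
    "num_beams": 5,
    "pad_token_id": 0,
    "transformers_version": "4.35.2",
}
new_config = {
    "decoder_start_token_id": 259,
    "eos_token_id": 1,
    "max_new_tokens": 20,
    "pad_token_id": 0,
    "transformers_version": "4.35.2",
}

# Classify every difference between the two versions
removed = {k: v for k, v in old_config.items() if k not in new_config}
added = {k: v for k, v in new_config.items() if k not in old_config}
changed = {
    k: (old_config[k], new_config[k])
    for k in old_config.keys() & new_config.keys()
    if old_config[k] != new_config[k]
}

print(removed, added, changed)
```

The only difference is the removal of `num_beams`. Transformers' `GenerationConfig` defaults `num_beams` to 1, so unless the caller passes a value explicitly, generation with the new config would no longer use beam search.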


model.safetensors

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f833b0041de4fd8abd5a25e09c5ab6b1f6d844090494de955b8ea1aa36cc8c2f
+oid sha256:9d40e6ab7ebe39f319d71473c6cfb45f2201045e08ffd62fadc62b6f6b74a8f6
 size 20676648
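
A downloaded `model.safetensors` can be checked against the pointer shown here. This is a minimal sketch, assuming the pointer consists of the three `key value` lines above; `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers, not part of Git LFS or any library.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(
        line.strip().split(" ", 1)
        for line in text.splitlines()
        if line.strip()
    )
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if the bytes have the size and hash the pointer records."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


# Pointer contents after the change, copied from the diff
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:9d40e6ab7ebe39f319d71473c6cfb45f2201045e08ffd62fadc62b6f6b74a8f6
size 20676648
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])
```

To verify a local copy, read the file as bytes and pass it to `matches_pointer`. The size is identical before and after the change (20676648 bytes) while the hash differs, which is consistent with the same architecture being retrained to different weights.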