Alfahluzi/bert2bert-dropout-0.3-lr-5e-05-ds-canonical: train with xtreme data, 5 epochs

Files changed (4)

README.md CHANGED

@@ -15,10 +15,10 @@ should probably proofread and complete it, then remove this comment. -->
 This model was trained from scratch on the id_liputan6 dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.2773
-- Rouge2 Precision: 0.2517
-- Rouge2 Recall: 0.2518
-- Rouge2 Fmeasure: 0.2517
 ## Model description
@@ -38,24 +38,28 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
-- train_batch_size: 48
-- eval_batch_size: 48
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 1
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Rouge2 Precision | Rouge2 Recall | Rouge2 Fmeasure |
 |:-------------:|:-----:|:----:|:---------------:|:----------------:|:-------------:|:---------------:|
-| 1.3309        | 1.0   | 4040 | 2.2773          | 0.2517           | 0.2518        | 0.2517          |
 ### Framework versions
-- Transformers 4.37.0
-- Pytorch 2.1.2
-- Datasets 2.16.1
-- Tokenizers 0.15.1

 This model was trained from scratch on the id_liputan6 dataset.
 It achieves the following results on the evaluation set:
+- Loss: 2.4528
+- Rouge2 Precision: 0.1519
+- Rouge2 Recall: 0.1497
+- Rouge2 Fmeasure: 0.1496
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 5
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Rouge2 Precision | Rouge2 Recall | Rouge2 Fmeasure |
 |:-------------:|:-----:|:----:|:---------------:|:----------------:|:-------------:|:---------------:|
+| 2.6029        | 1.0   | 619  | 2.2821          | 0.1605           | 0.1573        | 0.1576          |
+| 2.2629        | 2.0   | 1238 | 2.3146          | 0.1566           | 0.1588        | 0.1565          |
+| 2.028         | 3.0   | 1857 | 2.3690          | 0.1548           | 0.1513        | 0.1519          |
+| 1.8267        | 4.0   | 2476 | 2.4174          | 0.1527           | 0.1491        | 0.1497          |
+| 1.5451        | 5.0   | 3095 | 2.4528          | 0.1519           | 0.1497        | 0.1496          |
 ### Framework versions
+- Transformers 4.37.2
+- Pytorch 2.1.0+cu121
+- Datasets 2.17.1
+- Tokenizers 0.15.2
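
Reviewer note: the added results table shows training loss falling while validation loss rises each epoch, which is easy to miss in a diff. The following is a minimal sketch in plain Python, with values copied from the added table rows. The variable names are illustrative and not part of the training script, and the example-count estimate assumes a single device with no gradient accumulation, neither of which is stated in the README.

```python
# Rows copied from the added "Training results" table:
# (training_loss, epoch, step, validation_loss, rouge2_fmeasure)
rows = [
    (2.6029, 1, 619, 2.2821, 0.1576),
    (2.2629, 2, 1238, 2.3146, 0.1565),
    (2.028, 3, 1857, 2.3690, 0.1519),
    (1.8267, 4, 2476, 2.4174, 0.1497),
    (1.5451, 5, 3095, 2.4528, 0.1496),
]

# Epoch with the lowest validation loss.
best = min(rows, key=lambda r: r[3])
best_epoch = best[1]

# Check whether the two loss curves move in opposite directions.
train_losses = [r[0] for r in rows]
val_losses = [r[3] for r in rows]
train_falling = all(a > b for a, b in zip(train_losses, train_losses[1:]))
val_rising = all(a < b for a, b in zip(val_losses, val_losses[1:]))

# Upper bound on examples seen per epoch.
# ASSUMPTION: single device, no gradient accumulation (not stated in the README).
train_batch_size = 8
steps_per_epoch = rows[0][2]
max_examples_per_epoch = steps_per_epoch * train_batch_size

print(f"best epoch by validation loss: {best_epoch}")
print(f"training loss falling: {train_falling}, validation loss rising: {val_rising}")
print(f"at most {max_examples_per_epoch} examples per epoch")
```

The headline metrics in the updated README (Loss 2.4528, Rouge2 Fmeasure 0.1496) match the epoch 5 row, not the lowest-loss epoch.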

generation_config.json CHANGED

@@ -7,8 +7,7 @@
   "max_length": 100,
   "min_length": 10,
   "no_repeat_ngram_size": 3,
-  "num_beams": 10,
   "pad_token_id": 0,
-  "temperature": 0.5,
-  "transformers_version": "4.37.0"
 }

   "max_length": 100,
   "min_length": 10,
   "no_repeat_ngram_size": 3,
+  "num_beams": 4,
   "pad_token_id": 0,
+  "transformers_version": "4.37.2"
 }
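
Reviewer note: to summarize what changed in generation_config.json, here is a small sketch that compares the keys visible in the two sides of the hunk. Only keys shown in the diff are included; anything above line 7 of the file lies outside the hunk and is not reproduced. The names `old`, `new`, `removed`, `added`, and `changed` are illustrative.

```python
# Keys visible in the removed/context lines of the hunk (old side).
old = {
    "max_length": 100,
    "min_length": 10,
    "no_repeat_ngram_size": 3,
    "num_beams": 10,
    "pad_token_id": 0,
    "temperature": 0.5,
    "transformers_version": "4.37.0",
}

# Keys visible in the added/context lines of the hunk (new side).
new = {
    "max_length": 100,
    "min_length": 10,
    "no_repeat_ngram_size": 3,
    "num_beams": 4,
    "pad_token_id": 0,
    "transformers_version": "4.37.2",
}

removed = sorted(old.keys() - new.keys())
added = sorted(new.keys() - old.keys())
changed = {
    key: (old[key], new[key])
    for key in sorted(old.keys() & new.keys())
    if old[key] != new[key]
}

print("removed:", removed)
print("added:", added)
print("changed:", changed)
```

In transformers, `temperature` only takes effect when sampling is enabled (`do_sample=True`); whether that flag was set is not visible in this hunk, so the practical impact of dropping it cannot be judged from the diff alone.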

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2ba086cd9c0bdfeac7a17bf77a8a6e3c6edb8eb096dd1a39313d20675658e83d
 size 998132132

 version https://git-lfs.github.com/spec/v1
+oid sha256:477588e80d9813c4f5428a189c3650d51255f0fb0bf2b07493ff9f73fedcc2bd
 size 998132132

runs/Feb28_17-13-49_89c71633e4f4/events.out.tfevents.1709140430.89c71633e4f4.1578.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f5b208088506b1e67fb4b15def780bab49dc1cc97c420f09ff848526bfc71b70
-size 10907

 version https://git-lfs.github.com/spec/v1
+oid sha256:aa6f6f5571ebdaa99b1b463a162b2a787af67737f1c74e6ad2e9df005e27f5fa
+size 12469