RMWeerasinghe/long-t5-tglobal-base-boardpapers-4096

@@ -9,7 +9,6 @@ metrics:
 model-index:
 - name: long-t5-tglobal-base-boardpapers-4096
   results: []
-pipeline_tag: summarization
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -19,11 +18,11 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [RMWeerasinghe/long-t5-tglobal-base-finetuned-govReport-4096](https://huggingface.co/RMWeerasinghe/long-t5-tglobal-base-finetuned-govReport-4096) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.5617
-- Rouge1: 0.0743
-- Rouge2: 0.0398
-- Rougel: 0.0589
-- Rougelsum: 0.0703
 ## Model description
@@ -50,32 +49,39 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 30
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
-| No log        | 0.67  | 1    | 0.6654          | 0.0514 | 0.0197 | 0.0386 | 0.0477    |
-| No log        | 2.0   | 3    | 0.6378          | 0.0667 | 0.0309 | 0.0512 | 0.0596    |
-| No log        | 2.67  | 4    | 0.6293          | 0.0646 | 0.0274 | 0.0515 | 0.0619    |
-| No log        | 4.0   | 6    | 0.6128          | 0.0706 | 0.0377 | 0.0566 | 0.067     |
-| No log        | 4.67  | 7    | 0.6049          | 0.0706 | 0.0377 | 0.0566 | 0.067     |
-| No log        | 6.0   | 9    | 0.5935          | 0.0706 | 0.0377 | 0.0566 | 0.067     |
-| No log        | 6.67  | 10   | 0.5891          | 0.0718 | 0.0385 | 0.0578 | 0.067     |
-| No log        | 8.0   | 12   | 0.5815          | 0.0743 | 0.0398 | 0.0589 | 0.0703    |
-| No log        | 8.67  | 13   | 0.5785          | 0.0743 | 0.0398 | 0.0589 | 0.0703    |
-| No log        | 10.0  | 15   | 0.5742          | 0.0743 | 0.0398 | 0.0589 | 0.0703    |
-| No log        | 10.67 | 16   | 0.5724          | 0.0743 | 0.0398 | 0.0589 | 0.0703    |
-| No log        | 12.0  | 18   | 0.5694          | 0.0743 | 0.0398 | 0.0589 | 0.0703    |
-| No log        | 12.67 | 19   | 0.5681          | 0.0743 | 0.0398 | 0.0589 | 0.0703    |
-| 0.7929        | 14.0  | 21   | 0.5661          | 0.0743 | 0.0398 | 0.0589 | 0.0703    |
-| 0.7929        | 14.67 | 22   | 0.5652          | 0.0743 | 0.0398 | 0.0589 | 0.0703    |
-| 0.7929        | 16.0  | 24   | 0.5636          | 0.0743 | 0.0398 | 0.0589 | 0.0703    |
-| 0.7929        | 16.67 | 25   | 0.5630          | 0.0743 | 0.0398 | 0.0589 | 0.0703    |
-| 0.7929        | 18.0  | 27   | 0.5621          | 0.0743 | 0.0398 | 0.0589 | 0.0703    |
-| 0.7929        | 18.67 | 28   | 0.5619          | 0.0743 | 0.0398 | 0.0589 | 0.0703    |
-| 0.7929        | 20.0  | 30   | 0.5617          | 0.0743 | 0.0398 | 0.0589 | 0.0703    |
 ### Framework versions
@@ -83,4 +89,4 @@ The following hyperparameters were used during training:
 - Transformers 4.37.0
 - Pytorch 2.1.2
 - Datasets 2.17.0
-- Tokenizers 0.15.1

 model-index:
 - name: long-t5-tglobal-base-boardpapers-4096
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 This model is a fine-tuned version of [RMWeerasinghe/long-t5-tglobal-base-finetuned-govReport-4096](https://huggingface.co/RMWeerasinghe/long-t5-tglobal-base-finetuned-govReport-4096) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.5356
+- Rouge1: 0.0844
+- Rouge2: 0.0543
+- Rougel: 0.0716
+- Rougelsum: 0.0842
 ## Model description
 - total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 40
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
+| No log        | 0.67  | 1    | 0.6583          | 0.0647 | 0.03   | 0.0504 | 0.0595    |
+| No log        | 2.0   | 3    | 0.6232          | 0.067  | 0.036  | 0.0527 | 0.0643    |
+| No log        | 2.67  | 4    | 0.6134          | 0.067  | 0.036  | 0.0527 | 0.0643    |
+| No log        | 4.0   | 6    | 0.5971          | 0.0742 | 0.0426 | 0.0654 | 0.0735    |
+| No log        | 4.67  | 7    | 0.5897          | 0.0765 | 0.0462 | 0.0654 | 0.0762    |
+| No log        | 6.0   | 9    | 0.5777          | 0.0803 | 0.0486 | 0.0665 | 0.0802    |
+| No log        | 6.67  | 10   | 0.5729          | 0.0813 | 0.0498 | 0.0677 | 0.0801    |
+| No log        | 8.0   | 12   | 0.5652          | 0.0813 | 0.0498 | 0.0677 | 0.0801    |
+| No log        | 8.67  | 13   | 0.5622          | 0.0823 | 0.0544 | 0.0685 | 0.0811    |
+| No log        | 10.0  | 15   | 0.5575          | 0.0823 | 0.0544 | 0.0685 | 0.0811    |
+| No log        | 10.67 | 16   | 0.5559          | 0.0823 | 0.0544 | 0.0685 | 0.0811    |
+| No log        | 12.0  | 18   | 0.5528          | 0.0823 | 0.0544 | 0.0685 | 0.0811    |
+| No log        | 12.67 | 19   | 0.5513          | 0.0823 | 0.0544 | 0.0685 | 0.0811    |
+| 0.7235        | 14.0  | 21   | 0.5488          | 0.0823 | 0.0544 | 0.0685 | 0.0811    |
+| 0.7235        | 14.67 | 22   | 0.5476          | 0.0811 | 0.0544 | 0.0674 | 0.0794    |
+| 0.7235        | 16.0  | 24   | 0.5451          | 0.086  | 0.0574 | 0.074  | 0.0841    |
+| 0.7235        | 16.67 | 25   | 0.5438          | 0.086  | 0.0574 | 0.074  | 0.0841    |
+| 0.7235        | 18.0  | 27   | 0.5420          | 0.086  | 0.0574 | 0.074  | 0.0841    |
+| 0.7235        | 18.67 | 28   | 0.5412          | 0.086  | 0.0574 | 0.074  | 0.0841    |
+| 0.7235        | 20.0  | 30   | 0.5397          | 0.086  | 0.0574 | 0.074  | 0.0841    |
+| 0.7235        | 20.67 | 31   | 0.5390          | 0.086  | 0.0574 | 0.074  | 0.0841    |
+| 0.7235        | 22.0  | 33   | 0.5377          | 0.0844 | 0.0543 | 0.0716 | 0.0842    |
+| 0.7235        | 22.67 | 34   | 0.5372          | 0.0844 | 0.0543 | 0.0716 | 0.0842    |
+| 0.7235        | 24.0  | 36   | 0.5363          | 0.0844 | 0.0543 | 0.0716 | 0.0842    |
+| 0.7235        | 24.67 | 37   | 0.5360          | 0.0844 | 0.0543 | 0.0716 | 0.0842    |
+| 0.7235        | 26.0  | 39   | 0.5357          | 0.0844 | 0.0543 | 0.0716 | 0.0842    |
+| 0.6478        | 26.67 | 40   | 0.5356          | 0.0844 | 0.0543 | 0.0716 | 0.0842    |
 ### Framework versions
 - Transformers 4.37.0
 - Pytorch 2.1.2
 - Datasets 2.17.0
+- Tokenizers 0.15.1
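
The removed and added evaluation summaries in this diff can be compared directly. The snippet below is a minimal sketch that copies the two sets of values from the `-` and `+` lines above and reports the absolute and relative change per metric. The names `before`, `after`, and `metric_changes` are illustrative and do not come from the repository.

```python
# Evaluation summaries copied from the removed (-) and added (+) lines of the diff.
before = {"Loss": 0.5617, "Rouge1": 0.0743, "Rouge2": 0.0398,
          "Rougel": 0.0589, "Rougelsum": 0.0703}
after = {"Loss": 0.5356, "Rouge1": 0.0844, "Rouge2": 0.0543,
         "Rougel": 0.0716, "Rougelsum": 0.0842}


def metric_changes(old, new):
    """Return {metric: (absolute change, relative change in percent)}."""
    changes = {}
    for name, old_value in old.items():
        delta = new[name] - old_value
        changes[name] = (round(delta, 4), round(100.0 * delta / old_value, 1))
    return changes


if __name__ == "__main__":
    for name, (delta, pct) in metric_changes(before, after).items():
        print(f"{name:<10} {delta:+.4f} ({pct:+.1f}%)")
```

On these numbers the validation loss drops by roughly 4.6% and every ROUGE score rises, although all ROUGE values stay below 0.09.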

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6ecdae2e04691f3beb40df306c530950a4454e32459ae9a6067150aca8e8b73e
 size 990386200

 version https://git-lfs.github.com/spec/v1
+oid sha256:0545d0943680974a6a7e6c6ff55f3bb2490f6d823804e593581e311fcf727117
 size 990386200

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:78e7af93fe1786412c9c5d0c4ae83624c02f3d1d9ead6b1478d334990a5d0abc
 size 4856

 version https://git-lfs.github.com/spec/v1
+oid sha256:3501f87da90202598dbc3f9cd899a33e587c8550eb3df15c2f091a45fb2ddbee
 size 4856
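
For both `model.safetensors` and `training_args.bin`, only the `oid sha256:` line changes while `size` stays the same, so the digest is the only thing in these pointers that distinguishes the old file from the new one. The following is a minimal sketch of checking a locally downloaded file against a pointer of this form. It assumes the real file is already on disk, and `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, not part of Git LFS or any Hugging Face library.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def matches_pointer(file_path, pointer_text):
    """Return True if the file's size and SHA-256 digest match the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algorithm, _, expected_digest = fields["oid"].partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algorithm}")
    path = Path(file_path)
    # Cheap check first: a size mismatch rules the file out without hashing.
    if path.stat().st_size != int(fields["size"]):
        return False
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Hash in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_digest
```

Because the old and new `model.safetensors` share the size 990386200, the size check alone cannot tell them apart, and the digest comparison does the real work.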