ubaada committed
Commit 67b19bd
1 parent: c46c078

ubaada/pegasus-x-large-booksum-16k

Files changed (6):
  1. README.md +17 -40
  2. config.json +1 -1
  3. model.safetensors +1 -1
  4. tokenizer.json +4 -2
  5. tokenizer_config.json +8 -1
  6. training_args.bin +2 -2
README.md CHANGED
@@ -1,38 +1,12 @@
 ---
-base_model: google/pegasus-x-large
+base_model: ubaada/pegasus-x-large-booksum-16k
 tags:
-- summarization
-- summary
-- booksum
-- long-document
-- long-form
-datasets:
-- ubaada/booksum-complete-cleaned
-language:
-- en
-pipeline_tag: summarization
+- generated_from_trainer
 metrics:
-- rouge
+- rouge
 model-index:
-- name: ubaada/pegasus-x-large-booksum-16k
-  results:
-  - task:
-      type: summarization
-      name: Summarization
-    dataset:
-      name: BookSum
-      type: ubaada/booksum-complete-cleaned
-      config: ubaada--booksum
-      split: test
-    metrics:
-    - type: rouge
-      value: 30.947853
-      name: ROUGE-1
-      verified: false
-    - type: rouge
-      value: 5.568146
-      name: ROUGE-2
-      verified: false
+- name: pegasus-x-large-booksum-16k
+  results: []
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -40,12 +14,12 @@ should probably proofread and complete it, then remove this comment. -->
 
 # pegasus-x-large-booksum-16k
 
-This model is a fine-tuned version of [google/pegasus-x-large](https://huggingface.co/google/pegasus-x-large) on [ubaada/booksum-complete-cleaned](https://huggingface.co/datasets/ubaada/booksum-complete-cleaned). It was trained on the 'train' split of chapters sub-dataset.
+This model is a fine-tuned version of [ubaada/pegasus-x-large-booksum-16k](https://huggingface.co/ubaada/pegasus-x-large-booksum-16k) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.9677
-- Rouge1: 0.3504
-- Rouge2: 0.0525
-- Rougel: 0.1398
+- Loss: 1.9879
+- Rouge1: 0.2983
+- Rouge2: 0.0463
+- Rougel: 0.1367
 
 ## Model description
 
@@ -64,12 +38,15 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 8e-05
+- learning_rate: 4e-05
 - train_batch_size: 8
 - eval_batch_size: 1
 - seed: 42
+- distributed_type: multi-GPU
+- num_devices: 2
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 32
+- total_train_batch_size: 64
+- total_eval_batch_size: 2
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 1
@@ -78,7 +55,7 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel |
 |:-------------:|:------:|:----:|:---------------:|:------:|:------:|:------:|
-| 1.417 | 0.9996 | 628 | 1.9677 | 0.3504 | 0.0525 | 0.1398 |
+| 1.3846 | 0.9992 | 314 | 1.9879 | 0.2983 | 0.0463 | 0.1367 |
 
 
 ### Framework versions
@@ -86,4 +63,4 @@ The following hyperparameters were used during training:
 - Transformers 4.40.2
 - Pytorch 2.2.0
 - Datasets 2.19.1
-- Tokenizers 0.19.1
+- Tokenizers 0.19.1
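A note on the new hyperparameters: they are internally consistent, since total_train_batch_size 64 = train_batch_size 8 × num_devices 2 × gradient_accumulation_steps 4, and total_eval_batch_size 2 = eval_batch_size 1 × num_devices 2. The card itself carries no usage snippet; a minimal sketch for trying this checkpoint with the transformers summarization pipeline might look like the following (the model ID comes from this repo; generation settings are illustrative assumptions, and note the tokenizer changes further down in this commit before relying on long inputs):

```python
# Minimal usage sketch for this checkpoint. The model ID is this repo;
# max_new_tokens and truncation are illustrative choices, not values
# documented in the card.
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="ubaada/pegasus-x-large-booksum-16k",
)

chapter = "..."  # a long input, e.g. a book chapter
result = summarizer(chapter, max_new_tokens=256, truncation=True)
print(result[0]["summary_text"])
```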
config.json CHANGED
@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "google/pegasus-x-large",
+  "_name_or_path": "ubaada/pegasus-x-large-booksum-16k",
   "activation_dropout": 0.1,
   "activation_function": "relu",
   "add_bias_logits": false,
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:05bc6703ce0f79f18f042278837400e453556b0eb53294f71494d559215d3ec4
+oid sha256:cf3259de6f0e881dfe94b5afa80f69b4c39f1896ae8744ea7a32db06b5ddb85a
 size 2274730128
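The weights file keeps the same size (2,274,730,128 bytes) but a new content hash: retrained weights, same architecture. To check a local download against the LFS pointer above, one could recompute the oid (a sketch; assumes the file sits in the working directory):

```python
import hashlib

# Stream the file so the ~2.3 GB safetensors blob never sits in memory.
h = hashlib.sha256()
with open("model.safetensors", "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        h.update(chunk)

# Should print the post-commit oid:
# cf3259de6f0e881dfe94b5afa80f69b4c39f1896ae8744ea7a32db06b5ddb85a
print(h.hexdigest())
```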
tokenizer.json CHANGED
@@ -2,12 +2,14 @@
   "version": "1.0",
   "truncation": {
     "direction": "Right",
-    "max_length": 16384,
+    "max_length": 1024,
     "strategy": "LongestFirst",
     "stride": 0
   },
   "padding": {
-    "strategy": "BatchLongest",
+    "strategy": {
+      "Fixed": 1024
+    },
     "direction": "Right",
     "pad_to_multiple_of": null,
     "pad_id": 0,
tokenizer_config.json CHANGED
@@ -958,10 +958,17 @@
   "full_tokenizer_file": null,
   "mask_token": "<mask_2>",
   "mask_token_sent": "<mask_1>",
-  "model_max_length": 16384,
+  "max_length": 16384,
+  "model_max_length": 12288,
   "offset": 103,
+  "pad_to_multiple_of": null,
   "pad_token": "<pad>",
+  "pad_token_type_id": 0,
+  "padding_side": "right",
   "sp_model_kwargs": {},
+  "stride": 0,
   "tokenizer_class": "PegasusTokenizer",
+  "truncation_side": "right",
+  "truncation_strategy": "longest_first",
   "unk_token": "<unk>"
 }
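Similarly, the slow-tokenizer config now carries serialized call state (`max_length`, `stride`, `truncation_strategy`, and the padding keys) and drops `model_max_length` from 16384 to 12288. If the full 16k window is wanted, it can be restored at load time (a sketch; the override values are assumptions based on the diff above):

```python
from transformers import AutoTokenizer

# Restore the 16384-token window that the old config advertised.
tok = AutoTokenizer.from_pretrained(
    "ubaada/pegasus-x-large-booksum-16k",
    model_max_length=16384,
)

text = "..."  # a long document
ids = tok(text, truncation=True, max_length=16384)["input_ids"]
```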
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8d4c0874fe606601da864bbc94c7fad30e9f3c27fcb820c0b4b500abef9ac29b
-size 6712
+oid sha256:c04a4fc24c48fd0c95b4c7009572759f86af67b73d4e2c597e72560a7a1f9afc
+size 6776
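`training_args.bin` is a pickled `TrainingArguments` object, hence the small size change when the arguments changed. It can be inspected locally (a sketch; unpickling requires a compatible transformers install, and on newer PyTorch you need `weights_only=False`):

```python
import torch

# Unpickles the TrainingArguments saved by the Trainer for this run.
args = torch.load("training_args.bin", weights_only=False)
print(args.learning_rate,
      args.per_device_train_batch_size,
      args.gradient_accumulation_steps)
```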