samu committed (verified)
Commit 2b9248a · 1 Parent(s): 36d1a0e

Training complete

README.md CHANGED
@@ -1,4 +1,5 @@
 ---
+library_name: transformers
 license: apache-2.0
 base_model: google-t5/t5-small
 tags:
@@ -18,9 +19,9 @@ should probably proofread and complete it, then remove this comment. -->

 This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.9789
-- Bleu: 26.7998
-- Gen Len: 14.732
+- Loss: 0.6851
+- Bleu: 71.9442
+- Gen Len: 14.3679

 ## Model description

@@ -39,54 +40,44 @@ More information needed
 ### Training hyperparameters

 The following hyperparameters were used during training:
-- learning_rate: 2e-05
+- learning_rate: 0.0008
 - train_batch_size: 16
 - eval_batch_size: 16
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 30
+- num_epochs: 20
 - mixed_precision_training: Native AMP

 ### Training results

 | Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
 |:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
-| 3.2669 | 1.0 | 1497 | 2.5511 | 0.5792 | 17.0373 |
-| 2.6176 | 2.0 | 2994 | 2.1280 | 2.8948 | 15.7297 |
-| 2.2969 | 3.0 | 4491 | 1.8747 | 5.9112 | 15.356 |
-| 2.0747 | 4.0 | 5988 | 1.6995 | 8.9433 | 15.1953 |
-| 1.9119 | 5.0 | 7485 | 1.5705 | 11.4531 | 15.1271 |
-| 1.7829 | 6.0 | 8982 | 1.4784 | 13.6106 | 14.8861 |
-| 1.6811 | 7.0 | 10479 | 1.3997 | 15.2789 | 14.9873 |
-| 1.6227 | 8.0 | 11976 | 1.3385 | 16.6919 | 14.9495 |
-| 1.5424 | 9.0 | 13473 | 1.2955 | 18.2538 | 14.786 |
-| 1.4964 | 10.0 | 14970 | 1.2492 | 19.328 | 14.9093 |
-| 1.441 | 11.0 | 16467 | 1.2104 | 20.1677 | 14.9034 |
-| 1.396 | 12.0 | 17964 | 1.1784 | 21.2916 | 15.0312 |
-| 1.3596 | 13.0 | 19461 | 1.1498 | 21.7787 | 14.84 |
-| 1.3364 | 14.0 | 20958 | 1.1255 | 22.5973 | 14.8233 |
-| 1.2971 | 15.0 | 22455 | 1.1044 | 23.0769 | 14.8186 |
-| 1.2657 | 16.0 | 23952 | 1.0858 | 23.6644 | 14.7918 |
-| 1.2573 | 17.0 | 25449 | 1.0718 | 24.3479 | 14.7274 |
-| 1.2421 | 18.0 | 26946 | 1.0536 | 24.8945 | 14.7964 |
-| 1.2192 | 19.0 | 28443 | 1.0415 | 25.0471 | 14.7531 |
-| 1.2196 | 20.0 | 29940 | 1.0278 | 25.4159 | 14.8381 |
-| 1.1853 | 21.0 | 31437 | 1.0194 | 25.5894 | 14.7703 |
-| 1.1863 | 22.0 | 32934 | 1.0119 | 25.9677 | 14.6953 |
-| 1.1765 | 23.0 | 34431 | 1.0007 | 26.1163 | 14.7952 |
-| 1.1631 | 24.0 | 35928 | 0.9963 | 26.223 | 14.7399 |
-| 1.1578 | 25.0 | 37425 | 0.9916 | 26.3776 | 14.6981 |
-| 1.1416 | 26.0 | 38922 | 0.9859 | 26.5215 | 14.7603 |
-| 1.1411 | 27.0 | 40419 | 0.9842 | 26.7362 | 14.7282 |
-| 1.1327 | 28.0 | 41916 | 0.9808 | 26.8666 | 14.7357 |
-| 1.1194 | 29.0 | 43413 | 0.9801 | 26.839 | 14.7264 |
-| 1.1229 | 30.0 | 44910 | 0.9789 | 26.7998 | 14.732 |
+| 1.2594 | 1.0 | 1497 | 0.8236 | 59.41 | 14.2172 |
+| 0.7848 | 2.0 | 2994 | 0.6581 | 64.4839 | 14.219 |
+| 0.6172 | 3.0 | 4491 | 0.5897 | 66.4564 | 14.2357 |
+| 0.5151 | 4.0 | 5988 | 0.5619 | 68.0986 | 14.4905 |
+| 0.4457 | 5.0 | 7485 | 0.5477 | 69.2175 | 14.4141 |
+| 0.3938 | 6.0 | 8982 | 0.5413 | 70.0663 | 14.4059 |
+| 0.3555 | 7.0 | 10479 | 0.5338 | 70.1734 | 14.4734 |
+| 0.3154 | 8.0 | 11976 | 0.5485 | 70.3692 | 14.3035 |
+| 0.2837 | 9.0 | 13473 | 0.5454 | 70.7837 | 14.4556 |
+| 0.2507 | 10.0 | 14970 | 0.5616 | 70.976 | 14.3807 |
+| 0.2265 | 11.0 | 16467 | 0.5728 | 71.2008 | 14.3692 |
+| 0.2041 | 12.0 | 17964 | 0.5808 | 71.4766 | 14.362 |
+| 0.1848 | 13.0 | 19461 | 0.5981 | 71.3804 | 14.3114 |
+| 0.1715 | 14.0 | 20958 | 0.6122 | 71.43 | 14.4295 |
+| 0.1547 | 15.0 | 22455 | 0.6309 | 71.753 | 14.351 |
+| 0.1417 | 16.0 | 23952 | 0.6411 | 71.7608 | 14.3513 |
+| 0.1267 | 17.0 | 25449 | 0.6612 | 71.93 | 14.4243 |
+| 0.1208 | 18.0 | 26946 | 0.6662 | 71.8591 | 14.3486 |
+| 0.1076 | 19.0 | 28443 | 0.6799 | 72.0417 | 14.3862 |
+| 0.1046 | 20.0 | 29940 | 0.6851 | 71.9442 | 14.3679 |


 ### Framework versions

-- Transformers 4.44.0
+- Transformers 4.44.2
 - Pytorch 2.3.1+cu121
 - Datasets 2.21.0
 - Tokenizers 0.19.1
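For reference, the updated hyperparameters map directly onto a `Seq2SeqTrainingArguments` configuration. A minimal sketch, assuming the standard `Seq2SeqTrainer` workflow; the `output_dir` is a placeholder, and the dataset and preprocessing are not specified in the card:

```python
# Sketch of the hyperparameters above as Seq2SeqTrainingArguments
# (transformers 4.44). Only the listed values come from the README;
# the output_dir and evaluation cadence are assumptions.
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="t5-small-finetuned",   # placeholder name
    learning_rate=8e-4,                # 0.0008, up from 2e-05 in the previous run
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",        # the Adam betas/epsilon listed are the defaults
    num_train_epochs=20,               # down from 30 in the previous run
    fp16=True,                         # "Native AMP" mixed precision (needs a GPU)
    eval_strategy="epoch",             # matches the per-epoch rows in the results table
    predict_with_generate=True,        # lets Bleu/Gen Len be computed from generated text
)
```

The BLEU jump from about 26.8 to 71.9 coincides with the 40x larger learning rate and the shorter 20-epoch schedule.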
generation_config.json CHANGED
@@ -2,5 +2,5 @@
   "decoder_start_token_id": 0,
   "eos_token_id": 1,
   "pad_token_id": 0,
-  "transformers_version": "4.44.0"
+  "transformers_version": "4.44.2"
 }
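The token ids pinned in `generation_config.json` become the defaults that `generate()` picks up when the checkpoint is loaded. A quick sketch; the model id below uses the base checkpoint as a stand-in, since this repo's published id isn't shown here:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "google-t5/t5-small"  # stand-in: substitute this repo's model id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# generation_config.json ships with the checkpoint, so these defaults apply:
# decoder_start_token_id=0, eos_token_id=1, pad_token_id=0
print(model.generation_config)

# Example input only; the task prefix for this model is not stated in the card.
inputs = tokenizer("Hello, world!", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```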
runs/Aug23_01-11-55_9a99a79954bc/events.out.tfevents.1724375517.9a99a79954bc.263.0 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b535d1b0ddfa5332da73ef5a457d015b3859f57f60711e214364e7dc855897a8
-size 25606
+oid sha256:3b28aaf92bdce543a8b31b835d492b6ef91f60a5edefdf0520008164eba883b7
+size 26343
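Note: the events file is tracked with Git LFS, so the commit rewrites only the pointer (the sha256 `oid` and the byte `size`); the TensorBoard log itself is stored in LFS.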