---
license: apache-2.0
base_model: google/flan-t5-small
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: flant5-small
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# flant5-small

This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2540
- Rouge1: 39.4088
- Rouge2: 17.6509
- Rougel: 34.241
- Rougelsum: 36.3257
- Gen Len: 19.97
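
The card does not record how these scores were computed. They are consistent with the Hugging Face `evaluate` library's `rouge` metric reported as F-measures scaled by 100 (an assumption; Gen Len would then be the mean length of the generated sequences). A minimal sketch of that convention:

```python
# Sketch of the assumed metric convention: `evaluate`'s rouge returns
# fractions in [0, 1]; the scores above look like those values x100.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the cat sat on the mat"],       # decoded model outputs
    references=["a cat was sitting on the mat"],  # gold summaries
    use_stemmer=True,
)
print({k: round(v * 100, 4) for k, v in scores.items()})
```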

## Model description

More information needed

## Intended uses & limitations

More information needed
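
No usage example is provided; below is a minimal inference sketch, assuming the checkpoint was fine-tuned for summarization (suggested by the ROUGE metrics and the ~20-token generation length) and published under a placeholder repo id:

```python
# Minimal inference sketch. "your-username/flant5-small" is a placeholder
# repo id, and the "summarize:" prompt is an assumption inferred from the
# ROUGE metrics above; adjust both for the actual checkpoint and task.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "your-username/flant5-small"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "summarize: The quick brown fox jumped over the lazy dog ..."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
outputs = model.generate(**inputs, max_new_tokens=20)  # Gen Len above is ~20
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```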

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
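
The exact training script is not included in the card; the list above maps onto `Seq2SeqTrainingArguments` roughly as sketched below. Note that the total train batch size of 8 is simply train_batch_size 4 × gradient_accumulation_steps 2, and the Adam settings listed are the Transformers defaults:

```python
# Sketch of Seq2SeqTrainingArguments matching the hyperparameters above.
# output_dir and predict_with_generate are assumptions; the original
# training script is not part of this card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flant5-small",       # assumed to match the model name
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,   # effective train batch size: 4 x 2 = 8
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    predict_with_generate=True,      # assumed, so ROUGE can be computed
    # Adam betas=(0.9, 0.999) and epsilon=1e-8 are the library defaults.
)
```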

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 0.3676        | 1.0   | 1557  | 0.2753          | 36.6587 | 14.0807 | 31.2838 | 33.1779   | 19.942  |
| 0.3135        | 2.0   | 3115  | 0.2658          | 37.8343 | 15.2461 | 32.2261 | 34.1553   | 19.97   |
| 0.2992        | 3.0   | 4672  | 0.2596          | 38.3851 | 15.6982 | 32.5124 | 34.5772   | 19.942  |
| 0.2888        | 4.0   | 6230  | 0.2559          | 37.6648 | 15.1146 | 32.1953 | 34.2139   | 19.94   |
| 0.281         | 5.0   | 7787  | 0.2549          | 38.3654 | 15.8444 | 32.775  | 34.9156   | 19.952  |
| 0.2734        | 6.0   | 9345  | 0.2533          | 38.7474 | 16.0237 | 33.155  | 35.3048   | 19.954  |
| 0.2679        | 7.0   | 10902 | 0.2529          | 38.7094 | 16.1904 | 33.2149 | 35.2449   | 19.96   |
| 0.2619        | 8.0   | 12460 | 0.2528          | 39.034  | 16.4682 | 33.7757 | 35.82     | 19.968  |
| 0.2576        | 9.0   | 14017 | 0.2528          | 38.769  | 16.5015 | 33.3685 | 35.4211   | 19.948  |
| 0.253         | 10.0  | 15575 | 0.2523          | 38.5811 | 16.3423 | 33.2559 | 35.2143   | 19.956  |
| 0.2494        | 11.0  | 17132 | 0.2516          | 38.7084 | 16.5171 | 33.4486 | 35.5503   | 19.958  |
| 0.2456        | 12.0  | 18690 | 0.2514          | 38.3763 | 16.2338 | 33.1431 | 34.8647   | 19.964  |
| 0.2419        | 13.0  | 20247 | 0.2520          | 38.455  | 16.2491 | 32.9546 | 35.0263   | 19.972  |
| 0.2388        | 14.0  | 21805 | 0.2514          | 38.9372 | 17.1821 | 33.6449 | 35.5621   | 19.97   |
| 0.2363        | 15.0  | 23362 | 0.2530          | 38.9104 | 16.742  | 33.5194 | 35.3391   | 19.976  |
| 0.2336        | 16.0  | 24920 | 0.2519          | 38.8698 | 16.9396 | 33.7987 | 35.6173   | 19.958  |
| 0.2313        | 17.0  | 26477 | 0.2518          | 38.8774 | 17.0545 | 33.7151 | 35.6844   | 19.97   |
| 0.229         | 18.0  | 28035 | 0.2518          | 38.7073 | 16.7039 | 33.4976 | 35.4177   | 19.964  |
| 0.2272        | 19.0  | 29592 | 0.2522          | 39.0868 | 16.948  | 33.8953 | 35.8788   | 19.964  |
| 0.2252        | 20.0  | 31150 | 0.2527          | 38.7854 | 16.9882 | 33.8017 | 35.6314   | 19.968  |
| 0.2234        | 21.0  | 32707 | 0.2527          | 38.9196 | 17.1419 | 33.9139 | 35.8599   | 19.97   |
| 0.2217        | 22.0  | 34265 | 0.2532          | 38.9227 | 17.0561 | 33.8032 | 35.6876   | 19.968  |
| 0.2206        | 23.0  | 35822 | 0.2521          | 39.5234 | 17.6253 | 34.2157 | 36.2645   | 19.962  |
| 0.2198        | 24.0  | 37380 | 0.2532          | 39.6108 | 17.8336 | 34.3222 | 36.3369   | 19.964  |
| 0.2184        | 25.0  | 38937 | 0.2533          | 39.3052 | 17.2967 | 33.9684 | 36.0207   | 19.972  |
| 0.2173        | 26.0  | 40495 | 0.2536          | 39.019  | 17.3083 | 34.0561 | 35.9826   | 19.972  |
| 0.2166        | 27.0  | 42052 | 0.2532          | 39.2553 | 17.6306 | 34.1763 | 36.1479   | 19.974  |
| 0.2159        | 28.0  | 43610 | 0.2539          | 39.3659 | 17.6526 | 34.276  | 36.2856   | 19.972  |
| 0.2154        | 29.0  | 45167 | 0.2543          | 39.3868 | 17.5653 | 34.2637 | 36.2704   | 19.974  |
| 0.2152        | 29.99 | 46710 | 0.2540          | 39.4088 | 17.6509 | 34.241  | 36.3257   | 19.97   |
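
Per-epoch tables like this are typically produced by passing a `compute_metrics` function to `Seq2SeqTrainer`. The function actually used here is not recorded; a sketch that would yield these columns (ROUGE × 100 plus mean generation length) might look like:

```python
# Sketch of a compute_metrics function yielding the columns above.
# The actual function used for this run is not recorded in the card.
import numpy as np
import evaluate
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
rouge = evaluate.load("rouge")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Labels are padded with -100 for the loss; restore pad ids to decode.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    result = rouge.compute(predictions=decoded_preds,
                           references=decoded_labels, use_stemmer=True)
    result = {k: round(v * 100, 4) for k, v in result.items()}
    # Gen Len: mean count of non-pad tokens in the generated ids.
    result["gen_len"] = float(np.mean(
        [np.count_nonzero(p != tokenizer.pad_token_id) for p in preds]))
    return result
```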


### Framework versions

- Transformers 4.36.1
- Pytorch 2.1.2
- Datasets 2.19.2
- Tokenizers 0.15.2