---
license: apache-2.0
base_model: google/flan-t5-large
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: t5-flan-t5-xl-fine-tuning-for-translation
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# HeavenlyJoe/flan-t5-large-eng-tgl-translation

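This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large). The training dataset was not recorded by the Trainer; the model name indicates English-to-Tagalog translation.
It achieves the following results on the evaluation set at the final logged checkpoint (step 550):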
- Loss: 1.3378
- Bleu: 0.4953
- Gen Len: 19.0

## Model description

More information needed

## Intended uses & limitations

More information needed
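
Until this section is filled in, the following is a minimal inference sketch rather than documented usage. It assumes the card title is also the Hub repository id, and it assumes a T5-style task prefix (`translate English to Tagalog: `); the card does not state the prompt format used in fine-tuning, so the prefix is hypothetical and may need changing. `max_new_tokens` is set explicitly because the constant Gen Len of 19.0 in the results may mean evaluation outputs were capped by the default generation length; that is an inference, not something the card states.

```python
MODEL_ID = "HeavenlyJoe/flan-t5-large-eng-tgl-translation"  # assumed repo id, taken from the card title
PROMPT_PREFIX = "translate English to Tagalog: "  # assumption: prompt format is not documented


def build_prompt(text: str) -> str:
    """Prepend the (assumed) task prefix to the source sentence."""
    return PROMPT_PREFIX + text.strip()


def translate(text: str, max_new_tokens: int = 64) -> str:
    """Translate one English sentence; downloads the model on first call."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `translate("Good morning, how are you?")` would return the model's Tagalog output as a string.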

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 4e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
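
As a rough sketch, the list above could be expressed as `Seq2SeqTrainingArguments` for the Transformers version listed under framework versions (4.35.2). Several things here are assumptions, not taken from the card: that the batch size of 32 is per device on a single device, that evaluation ran every 25 steps (inferred from the step column of the results table), that `predict_with_generate` was enabled to compute BLEU, and the `output_dir` name, which is hypothetical.

```python
# Values copied from the hyperparameter list; keys are the matching
# TrainingArguments field names.
HYPERPARAMS = {
    "learning_rate": 4e-05,
    "per_device_train_batch_size": 32,  # assumption: single device
    "per_device_eval_batch_size": 32,   # assumption: single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
}


def build_training_args(output_dir: str = "flan-t5-large-eng-tgl"):
    """Build training arguments; output_dir is a hypothetical name."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        evaluation_strategy="steps",  # assumption; named eval_strategy in newer releases
        eval_steps=25,                # assumption, inferred from the results table
        predict_with_generate=True,   # assumption: needed for BLEU and Gen Len
        **HYPERPARAMS,
    )
```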

### Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
| 1.9527        | 0.44  | 25   | 1.5761          | 0.2146 | 19.0    |
| 1.8866        | 0.88  | 50   | 1.5303          | 0.293  | 19.0    |
| 1.8045        | 1.32  | 75   | 1.5092          | 0.2499 | 19.0    |
| 1.7596        | 1.75  | 100  | 1.4840          | 0.3498 | 19.0    |
| 1.7354        | 2.19  | 125  | 1.4628          | 0.3282 | 19.0    |
| 1.6866        | 2.63  | 150  | 1.4437          | 0.3205 | 19.0    |
| 1.6605        | 3.07  | 175  | 1.4275          | 0.3781 | 19.0    |
| 1.6157        | 3.51  | 200  | 1.4177          | 0.3805 | 19.0    |
| 1.6237        | 3.95  | 225  | 1.4007          | 0.398  | 19.0    |
| 1.5948        | 4.39  | 250  | 1.3954          | 0.4022 | 19.0    |
| 1.5555        | 4.82  | 275  | 1.3866          | 0.3854 | 19.0    |
| 1.5388        | 5.26  | 300  | 1.3761          | 0.4105 | 19.0    |
| 1.5448        | 5.7   | 325  | 1.3712          | 0.4339 | 19.0    |
| 1.5149        | 6.14  | 350  | 1.3635          | 0.4342 | 19.0    |
| 1.5104        | 6.58  | 375  | 1.3566          | 0.459  | 19.0    |
| 1.4955        | 7.02  | 400  | 1.3525          | 0.4888 | 19.0    |
| 1.467         | 7.46  | 425  | 1.3491          | 0.4723 | 19.0    |
| 1.4872        | 7.89  | 450  | 1.3440          | 0.491  | 19.0    |
| 1.4766        | 8.33  | 475  | 1.3423          | 0.5183 | 19.0    |
| 1.4553        | 8.77  | 500  | 1.3404          | 0.5026 | 19.0    |
| 1.464         | 9.21  | 525  | 1.3384          | 0.4979 | 19.0    |
| 1.454         | 9.65  | 550  | 1.3378          | 0.4953 | 19.0    |
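
The table can be checked with a few lines of arithmetic. The sketch below copies the step, epoch, validation loss and BLEU columns and finds the checkpoint with the highest BLEU, the checkpoint with the lowest validation loss, and an estimate of steps per epoch. The training-set size bound assumes a single device and no gradient accumulation, neither of which the card confirms.

```python
# (step, epoch, validation_loss, bleu), copied from the results table.
ROWS = [
    (25, 0.44, 1.5761, 0.2146), (50, 0.88, 1.5303, 0.293),
    (75, 1.32, 1.5092, 0.2499), (100, 1.75, 1.4840, 0.3498),
    (125, 2.19, 1.4628, 0.3282), (150, 2.63, 1.4437, 0.3205),
    (175, 3.07, 1.4275, 0.3781), (200, 3.51, 1.4177, 0.3805),
    (225, 3.95, 1.4007, 0.398), (250, 4.39, 1.3954, 0.4022),
    (275, 4.82, 1.3866, 0.3854), (300, 5.26, 1.3761, 0.4105),
    (325, 5.7, 1.3712, 0.4339), (350, 6.14, 1.3635, 0.4342),
    (375, 6.58, 1.3566, 0.459), (400, 7.02, 1.3525, 0.4888),
    (425, 7.46, 1.3491, 0.4723), (450, 7.89, 1.3440, 0.491),
    (475, 8.33, 1.3423, 0.5183), (500, 8.77, 1.3404, 0.5026),
    (525, 9.21, 1.3384, 0.4979), (550, 9.65, 1.3378, 0.4953),
]
TRAIN_BATCH_SIZE = 32  # from the hyperparameter list

best_bleu_row = max(ROWS, key=lambda r: r[3])    # highest BLEU
lowest_loss_row = min(ROWS, key=lambda r: r[2])  # lowest validation loss

# Steps per epoch, estimated from the last row (step / fractional epoch).
last_step, last_epoch = ROWS[-1][0], ROWS[-1][1]
steps_per_epoch = round(last_step / last_epoch)

# Upper bound on training examples, assuming one device and no accumulation.
max_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

print("best BLEU:", best_bleu_row[3], "at step", best_bleu_row[0])
print("lowest validation loss:", lowest_loss_row[2], "at step", lowest_loss_row[0])
print("steps per epoch:", steps_per_epoch, "max examples:", max_train_examples)
```

On these rows, BLEU peaks at 0.5183 at step 475, while the lowest validation loss (1.3378) is at step 550, the checkpoint whose figures are reported at the top of the card. The estimate of 57 steps per epoch implies at most about 1,824 training examples under the stated assumptions.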


### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.0+cu118
- Datasets 2.15.0
- Tokenizers 0.15.0