---
license: apache-2.0
tags:
- generated_from_trainer
metrics:
- bleu
- wer
- cer
base_model: google/byt5-base
model-index:
- name: modernisa-v2-byt5-base-lr0.0001
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# modernisa-v2-byt5-base-lr0.0001

This model is a fine-tuned version of [google/byt5-base](https://huggingface.co/google/byt5-base) on an unknown dataset.
At the final checkpoint (step 57000, epoch 4.96), it achieves the following results on the evaluation set:
- Loss: 0.4744
- Bleu: 30.8745
- Wer: 47.8194
- Cer: 34.4895
- Gen Len: 18.5499
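
The card does not say which implementation produced the Wer and Cer figures. As a rough illustration only, the sketch below computes a word or character error rate as edit distance divided by reference length, scaled to a percentage to match the 0 to 100 range reported above. The function names and the example strings are hypothetical and are not taken from the training code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, using a rolling DP row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, unit="word"):
    """Corpus-level error rate in percent; unit is "word" (WER) or "char" (CER)."""
    total_edits = 0
    total_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_units = ref.split() if unit == "word" else list(ref)
        hyp_units = hyp.split() if unit == "word" else list(hyp)
        total_edits += edit_distance(ref_units, hyp_units)
        total_len += len(ref_units)
    if total_len == 0:
        raise ValueError("references contain no units to score against")
    return 100.0 * total_edits / total_len


# Hypothetical example: one substituted word out of three.
example_wer = error_rate(["a b c"], ["a x c"], unit="word")
example_cer = error_rate(["abc"], ["abd"], unit="char")
```

Because the denominator is the reference length, a single wrong character makes a whole word count as an error, which is one reason a word-level rate is usually higher than the character-level rate on the same output.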

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5.0
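
To make the `linear` scheduler setting concrete, here is a minimal sketch of a linear decay from the listed learning rate down to zero. It assumes no warmup, since no warmup setting is listed above, and it uses a placeholder for the total number of optimizer steps, which the card does not state. `linear_lr` and `HYPOTHETICAL_TOTAL_STEPS` are illustrative names, not part of the training code.

```python
BASE_LR = 1e-4  # learning_rate from the list above

# Placeholder: the card does not state the total step count.
# The last logged step is 57000 at epoch 4.96, so the true value is a bit higher.
HYPOTHETICAL_TOTAL_STEPS = 57_500


def linear_lr(step, total_steps, base_lr=BASE_LR, warmup_steps=0):
    """Learning rate at a given step under linear warmup followed by linear decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 23_000, 46_000, HYPOTHETICAL_TOTAL_STEPS):
        print(step, linear_lr(step, HYPOTHETICAL_TOTAL_STEPS))
```

Under these assumptions the learning rate is already well below its starting value by the later epochs, so the rising validation loss in the results table coincides with progressively smaller updates.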

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Wer     | Cer     | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:-------:|
| 0.2696        | 0.09  | 1000  | 0.3027          | 27.8571 | 49.5134 | 34.4149 | 18.5    |
| 0.2518        | 0.17  | 2000  | 0.2857          | 29.2213 | 49.1981 | 34.6336 | 18.5371 |
| 0.2343        | 0.26  | 3000  | 0.2730          | 29.5067 | 49.117  | 34.9795 | 18.5537 |
| 0.2292        | 0.35  | 4000  | 0.2690          | 29.884  | 48.7025 | 34.8015 | 18.5516 |
| 0.2243        | 0.44  | 5000  | 0.2647          | 29.9577 | 48.8466 | 34.7218 | 18.5477 |
| 0.2112        | 0.52  | 6000  | 0.2636          | 30.3115 | 48.3871 | 34.4895 | 18.5477 |
| 0.2118        | 0.61  | 7000  | 0.2555          | 30.6364 | 48.3961 | 34.7455 | 18.5413 |
| 0.205         | 0.7   | 8000  | 0.2508          | 31.0881 | 47.468  | 34.0759 | 18.5269 |
| 0.2049        | 0.78  | 9000  | 0.2471          | 31.1481 | 47.5942 | 34.4133 | 18.5503 |
| 0.2005        | 0.87  | 10000 | 0.2468          | 30.9375 | 47.6392 | 34.281  | 18.5405 |
| 0.1999        | 0.96  | 11000 | 0.2431          | 30.9692 | 47.7023 | 34.4183 | 18.5405 |
| 0.161         | 1.04  | 12000 | 0.2491          | 31.2337 | 47.3238 | 34.1878 | 18.5298 |
| 0.1601        | 1.13  | 13000 | 0.2496          | 31.4422 | 47.3689 | 34.1657 | 18.5371 |
| 0.1606        | 1.22  | 14000 | 0.2459          | 31.4582 | 47.3329 | 34.2386 | 18.5405 |
| 0.1594        | 1.31  | 15000 | 0.2466          | 31.386  | 47.1166 | 34.2912 | 18.5375 |
| 0.1617        | 1.39  | 16000 | 0.2412          | 31.6546 | 46.8373 | 34.0149 | 18.5294 |
| 0.1582        | 1.48  | 17000 | 0.2461          | 31.2924 | 47.4139 | 34.2573 | 18.5503 |
| 0.1572        | 1.57  | 18000 | 0.2425          | 31.1484 | 47.45   | 34.3675 | 18.5499 |
| 0.1565        | 1.65  | 19000 | 0.2424          | 31.6967 | 46.9724 | 34.1047 | 18.5388 |
| 0.1585        | 1.74  | 20000 | 0.2382          | 31.9026 | 47.0175 | 34.281  | 18.558  |
| 0.1522        | 1.83  | 21000 | 0.2365          | 32.1619 | 46.5219 | 33.9369 | 18.5311 |
| 0.156         | 1.92  | 22000 | 0.2381          | 31.7762 | 46.7922 | 33.9572 | 18.5401 |
| 0.1538        | 2.0   | 23000 | 0.2402          | 31.8785 | 46.8012 | 34.2319 | 18.5516 |
| 0.1083        | 2.09  | 24000 | 0.2654          | 31.9905 | 46.603  | 34.0098 | 18.5384 |
| 0.1086        | 2.18  | 25000 | 0.2618          | 31.6257 | 46.9995 | 34.2607 | 18.5409 |
| 0.1092        | 2.26  | 26000 | 0.2658          | 31.4886 | 47.1436 | 34.337  | 18.5422 |
| 0.1086        | 2.35  | 27000 | 0.2666          | 31.8448 | 46.6751 | 34.1217 | 18.5375 |
| 0.1098        | 2.44  | 28000 | 0.2659          | 31.709  | 46.8913 | 34.1946 | 18.5452 |
| 0.1117        | 2.52  | 29000 | 0.2649          | 31.8114 | 46.8913 | 34.1708 | 18.5431 |
| 0.1094        | 2.61  | 30000 | 0.2656          | 31.6955 | 46.8643 | 34.1606 | 18.5375 |
| 0.1077        | 2.7   | 31000 | 0.2637          | 31.5495 | 46.8823 | 34.0064 | 18.5448 |
| 0.1088        | 2.79  | 32000 | 0.2669          | 32.0837 | 46.612  | 33.9504 | 18.5413 |
| 0.1087        | 2.87  | 33000 | 0.2646          | 31.5549 | 47.0806 | 34.2149 | 18.5286 |
| 0.1077        | 2.96  | 34000 | 0.2630          | 32.1129 | 46.4318 | 33.9403 | 18.5452 |
| 0.0652        | 3.05  | 35000 | 0.3360          | 31.3861 | 47.1977 | 34.1149 | 18.5396 |
| 0.0662        | 3.13  | 36000 | 0.3401          | 31.2372 | 47.3869 | 34.203  | 18.552  |
| 0.0666        | 3.22  | 37000 | 0.3389          | 31.3462 | 47.2968 | 34.1759 | 18.5469 |
| 0.0648        | 3.31  | 38000 | 0.3339          | 30.835  | 47.6753 | 34.381  | 18.552  |
| 0.0654        | 3.4   | 39000 | 0.3395          | 31.0958 | 47.7203 | 34.4692 | 18.5524 |
| 0.0663        | 3.48  | 40000 | 0.3318          | 31.126  | 47.5942 | 34.4539 | 18.5499 |
| 0.0648        | 3.57  | 41000 | 0.3397          | 31.0295 | 47.5852 | 34.3539 | 18.5477 |
| 0.0635        | 3.66  | 42000 | 0.3414          | 31.1287 | 47.5491 | 34.4285 | 18.5494 |
| 0.0656        | 3.74  | 43000 | 0.3394          | 30.9225 | 47.6392 | 34.4285 | 18.5563 |
| 0.0625        | 3.83  | 44000 | 0.3420          | 31.2435 | 47.2968 | 34.1674 | 18.5439 |
| 0.0636        | 3.92  | 45000 | 0.3448          | 31.0688 | 47.6843 | 34.3743 | 18.5439 |
| 0.0586        | 4.0   | 46000 | 0.3675          | 31.2353 | 47.441  | 34.2963 | 18.549  |
| 0.0298        | 4.09  | 47000 | 0.4566          | 30.698  | 47.8555 | 34.4319 | 18.5512 |
| 0.0301        | 4.18  | 48000 | 0.4724          | 30.7773 | 47.8374 | 34.3861 | 18.5507 |
| 0.0311        | 4.27  | 49000 | 0.4640          | 31.0878 | 47.6212 | 34.3861 | 18.5503 |
| 0.03          | 4.35  | 50000 | 0.4654          | 30.8319 | 47.8915 | 34.459  | 18.5529 |
| 0.0302        | 4.44  | 51000 | 0.4665          | 30.9236 | 47.9276 | 34.4997 | 18.552  |
| 0.029         | 4.53  | 52000 | 0.4757          | 30.8307 | 47.9456 | 34.4997 | 18.5482 |
| 0.0301        | 4.61  | 53000 | 0.4672          | 30.7983 | 47.9456 | 34.5218 | 18.5473 |
| 0.0294        | 4.7   | 54000 | 0.4715          | 30.8924 | 47.7564 | 34.4353 | 18.5529 |
| 0.0288        | 4.79  | 55000 | 0.4752          | 30.7372 | 47.7924 | 34.4675 | 18.5524 |
| 0.0289        | 4.88  | 56000 | 0.4744          | 30.8554 | 47.8555 | 34.459  | 18.5516 |
| 0.0288        | 4.96  | 57000 | 0.4744          | 30.8745 | 47.8194 | 34.4895 | 18.5499 |
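
The table shows training loss falling steadily while validation loss bottoms out around the end of the second epoch and then climbs. The sketch below picks the best checkpoint per metric from a handful of rows copied from the table; the row selection and the variable names are illustrative. Over the full table, the lowest validation loss and the highest Bleu both occur at step 21000, and the lowest Wer at step 34000.

```python
# (step, validation_loss, bleu, wer) for selected rows of the table above.
rows = [
    (1000, 0.3027, 27.8571, 49.5134),
    (11000, 0.2431, 30.9692, 47.7023),
    (21000, 0.2365, 32.1619, 46.5219),
    (34000, 0.2630, 32.1129, 46.4318),
    (46000, 0.3675, 31.2353, 47.4410),
    (57000, 0.4744, 30.8745, 47.8194),
]

# Lower is better for loss and Wer; higher is better for Bleu.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_bleu = max(rows, key=lambda r: r[2])
best_by_wer = min(rows, key=lambda r: r[3])

if __name__ == "__main__":
    print("lowest validation loss at step", best_by_loss[0])
    print("highest Bleu at step", best_by_bleu[0])
    print("lowest Wer at step", best_by_wer[0])
```

The headline results at the top of this card come from the final row (step 57000), not from any of these best-scoring checkpoints.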


### Framework versions

- Transformers 4.30.0.dev0
- Pytorch 1.13.0+cu117
- Datasets 2.12.0
- Tokenizers 0.11.0