metadata

library_name: transformers
license: apache-2.0
base_model: google/byt5-small
tags:
  - generated_from_trainer
metrics:
  - wer
model-index:
  - name: byt5-small-finetuned-yiddish-experiment-11
    results: []

byt5-small-finetuned-yiddish-experiment-11

This model is a fine-tuned version of google/byt5-small on an unspecified dataset. It achieves the following results on the evaluation set, taken from the last logged checkpoint (step 1500):

Loss: 1.1805
CER (character error rate): 0.1974
WER (word error rate): 0.5776
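
The card does not say how CER and WER were computed. A minimal sketch, assuming the usual definition (Levenshtein edit distance divided by reference length, over characters for CER and over whitespace-separated words for WER), is shown below. The function names and the sample strings are hypothetical and are not taken from the evaluation data.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Character error rate: character edits / total reference characters."""
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return errors / sum(len(r) for r in references)


def wer(references, hypotheses):
    """Word error rate: word edits / total reference words."""
    errors = sum(edit_distance(r.split(), h.split())
                 for r, h in zip(references, hypotheses))
    return errors / sum(len(r.split()) for r in references)


# Hypothetical example: one wrong character in a three-word reference.
refs = ["a gutn tog"]
hyps = ["a gutn tag"]
print(cer(refs, hyps))  # 1 edit over 10 characters
print(wer(refs, hyps))  # 1 wrong word over 3 words
```

Under this definition a single wrong character makes the whole word count as an error, which is one reason a WER of 0.5776 can sit alongside a much lower CER of 0.1974.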

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 600
num_epochs: 30
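
The list above can be expressed as Trainer arguments. The following is a sketch only: it assumes the run used `Seq2SeqTrainingArguments` on a single device, and the mapping from the card's field names to argument names is my reading, not something the card states. `build_training_args` and `effective_batch_size` are hypothetical helpers.

```python
# Values copied from the hyperparameter list; the argument names follow
# transformers' TrainingArguments (assumed mapping).
TRAINING_KWARGS = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "warmup_steps": 600,
    "num_train_epochs": 30,
}


def effective_batch_size(kwargs, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return (kwargs["per_device_train_batch_size"]
            * kwargs["gradient_accumulation_steps"]
            * num_devices)


def build_training_args(output_dir):
    """Build the arguments object; requires transformers to be installed."""
    from transformers import Seq2SeqTrainingArguments
    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


print(effective_batch_size(TRAINING_KWARGS))  # 16 * 2 * 1 = 32
```

The product 16 × 2 = 32 matches the reported total_train_batch_size, which is consistent with single-device training.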

Training results

Training Loss	Epoch	Step	Validation Loss	CER	WER
9.199	1.8868	100	10.8497	0.2853	0.7144
9.1103	3.7736	200	10.3073	0.2701	0.6874
8.5159	5.6604	300	9.1649	0.2579	0.6603
7.6411	7.5472	400	7.9065	0.2445	0.6396
6.8548	9.4340	500	6.6809	0.2340	0.6237
6.1063	11.3208	600	5.4130	0.2272	0.6142
4.7529	13.2075	700	4.1840	0.2224	0.6126
3.7885	15.0943	800	3.1426	0.2183	0.6110
2.9438	16.9811	900	2.1589	0.2141	0.6038
2.1457	18.8679	1000	1.4059	0.2101	0.5951
1.6163	20.7547	1100	1.2903	0.2053	0.5863
1.3877	22.6415	1200	1.2429	0.2024	0.5855
1.3156	24.5283	1300	1.2100	0.1984	0.5784
1.2623	26.4151	1400	1.1897	0.1981	0.5776
1.2381	28.3019	1500	1.1805	0.1974	0.5776
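
The Epoch and Step columns make it possible to reconstruct the learning-rate schedule. The sketch below derives the number of optimizer steps per epoch from the first table row, then evaluates a linear-warmup plus half-cosine-decay curve. This is my understanding of what the `cosine` scheduler type does, written out by hand rather than called from the library. The function name `lr_at` is hypothetical.

```python
import math

BASE_LR = 1e-5
WARMUP_STEPS = 600
NUM_EPOCHS = 30

# First table row: step 100 corresponds to epoch 1.8868.
steps_per_epoch = round(100 / 1.8868)
total_steps = steps_per_epoch * NUM_EPOCHS


def lr_at(step):
    """Learning rate at an optimizer step: linear warmup, then cosine decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (total_steps - WARMUP_STEPS)
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


print(steps_per_epoch, total_steps)
for step in (100, 300, 600, 1000, 1500):
    print(step, lr_at(step))
```

Under these assumptions the first six rows of the table (steps 100 to 600) fall inside the warmup phase, so the model was still training below the peak learning rate of 1e-05 for more than a third of the run.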

Framework versions

Transformers 4.47.0
PyTorch 2.5.1+cu121
Datasets 2.14.4
Tokenizers 0.21.0