Whisper Medium GA-EN Speech Translation

This model is a fine-tuned version of openai/whisper-medium on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, Wikimedia, and EUbookshop datasets. It achieves the following results on the evaluation set:

Loss: 1.1121
Bleu: 36.46
Chrf: 55.74
Wer: 58.2620
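
The WER figure above is a percentage, and the training log reports values well above 100 early in training (for example 269.1130 at step 100). A minimal sketch of how that can happen, assuming the standard definition of word error rate (word-level edit distance divided by the number of reference words): insertions count as errors, so a hypothesis much longer than its reference pushes the score past 100. The card does not say which tool or text normalization produced its numbers, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substitution out of three reference words.
print(word_error_rate("the cat sat", "the dog sat"))
# Three inserted words against a one-word reference exceeds 100.
print(word_error_rate("hello", "hello there my friend"))
```

Because the denominator is the reference length only, WER has no upper bound, which is why it is best read alongside BLEU and chrF rather than on its own.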

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.02
training_steps: 10000
mixed_precision_training: Native AMP
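
As an illustration of what the scheduler settings above imply, the following sketch computes the learning rate at a given step. It assumes the usual shape of a linear schedule with warmup (a linear ramp from 0 to the peak rate, then a linear decay to 0 at the last step) and assumes the warmup length is the warmup ratio times the total number of steps. The names `lr_at`, `PEAK_LR`, `TOTAL_STEPS`, and `WARMUP_STEPS` are hypothetical and not taken from the training code.

```python
PEAK_LR = 1e-4        # learning_rate
TOTAL_STEPS = 10_000  # training_steps
WARMUP_RATIO = 0.02   # lr_scheduler_warmup_ratio

# Assumption: warmup length is ratio * total steps, i.e. 200 steps here.
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Linear decay from the peak rate down to 0 at the final step.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


for step in (0, 100, 200, 5100, 10_000):
    print(step, lr_at(step))
```

Under these assumptions the peak rate of 0.0001 is reached at step 200, and the rate has fallen back to half of that by step 5100.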

Training results

Training Loss	Epoch	Step	Bleu	Chrf	Validation Loss	Wer
2.6534	0.0138	100	1.43	15.99	2.2446	269.1130
2.4519	0.0276	200	2.13	18.36	2.1941	250.5178
2.2928	0.0414	300	7.14	25.95	2.0086	128.3656
2.233	0.0552	400	5.61	24.25	2.0239	134.0837
2.0406	0.0690	500	5.64	25.65	1.9215	183.8361
2.0273	0.0828	600	13.41	30.96	1.8556	83.7010
1.895	0.0966	700	7.02	26.82	1.8278	158.2170
1.9889	0.1103	800	12.22	31.62	1.7842	99.6398
1.8484	0.1241	900	10.97	30.45	1.7648	91.1751
1.7491	0.1379	1000	10.0	29.42	1.7498	109.0050
1.699	0.1517	1100	12.53	34.87	1.6662	109.9054
1.6959	0.1655	1200	14.54	34.8	1.6287	92.3008
1.6682	0.1793	1300	13.26	33.5	1.5800	103.0617
1.6625	0.1931	1400	19.71	37.33	1.6115	75.9118
1.5462	0.2069	1500	18.3	39.49	1.4993	93.7866
1.3834	0.2207	1600	20.32	40.87	1.4906	79.2436
1.39	0.2345	1700	17.3	38.16	1.4752	93.1562
1.5061	0.2483	1800	20.11	39.69	1.4004	81.0446
1.4125	0.2621	1900	23.82	42.67	1.3854	73.3904
1.3181	0.2759	2000	20.57	40.87	1.3979	78.8384
1.283	0.2897	2100	17.97	40.47	1.3446	88.8789
1.2061	0.3034	2200	25.12	45.42	1.3130	73.5254
1.2091	0.3172	2300	22.12	43.56	1.3274	79.8739
1.1264	0.3310	2400	22.94	45.96	1.2771	78.2080
1.0972	0.3448	2500	24.38	46.04	1.2858	75.4615
1.0822	0.3586	2600	27.39	48.34	1.2376	67.6722
1.0316	0.3724	2700	28.0	47.61	1.2461	68.5277
1.165	0.3862	2800	26.05	48.13	1.1869	71.6794
1.025	0.4	2900	27.14	47.91	1.1716	68.7528
0.8978	0.4138	3000	28.34	49.15	1.1628	65.6461
0.9146	0.4276	3100	25.81	48.42	1.1703	71.7244
0.9764	0.4414	3200	29.63	51.22	1.1526	67.3570
0.9455	0.4552	3300	25.31	49.73	1.1108	72.6249
0.9073	0.4690	3400	27.7	50.85	1.1085	72.7150
0.8596	0.4828	3500	28.34	52.39	1.0927	67.9424
0.8241	0.4966	3600	29.95	51.37	1.1026	65.2859
0.8436	0.5103	3700	27.18	51.45	1.0718	71.2292
0.8318	0.5241	3800	30.71	53.35	1.0678	64.3404
0.8262	0.5379	3900	27.05	51.94	1.0534	71.5894
0.8129	0.5517	4000	27.38	51.97	1.0491	72.1747
0.9036	0.5655	4100	14.43	40.57	1.2250	139.3066
1.0314	0.5793	4200	24.27	46.97	1.2310	75.5966
0.9209	0.5931	4300	23.55	46.04	1.2447	76.4070
0.9204	0.6069	4400	25.87	45.32	1.2891	73.0302
0.9843	0.6207	4500	27.2	46.36	1.2269	71.8145
1.0225	0.6345	4600	26.16	45.72	1.2403	69.6983
0.9773	0.6483	4700	26.37	45.62	1.2464	68.4376
0.9794	0.6621	4800	24.77	47.11	1.2461	72.0846
0.8905	0.6759	4900	24.58	46.35	1.2345	71.2742
0.8305	0.6897	5000	27.28	48.37	1.2239	68.1675
0.9019	0.7034	5100	27.04	50.28	1.1730	70.1486
0.7969	0.7172	5200	26.27	48.07	1.1807	69.0230
0.8036	0.7310	5300	23.04	48.3	1.1632	77.5326
0.8195	0.7448	5400	25.58	50.29	1.1811	76.2269
0.7697	0.7586	5500	23.99	48.91	1.1825	81.4948
0.727	0.7724	5600	23.93	49.23	1.1623	79.5137
0.8002	0.7862	5700	26.29	50.44	1.1503	75.6866
0.6909	0.8	5800	29.27	50.85	1.1338	64.0252
0.7146	0.8138	5900	28.24	50.82	1.1420	66.6367
0.7452	0.8276	6000	31.33	51.92	1.1328	62.4944
0.5989	0.8414	6100	31.1	52.15	1.1455	65.1959
0.6818	0.8552	6200	32.56	52.46	1.1112	62.1342
0.6074	0.8690	6300	33.48	53.32	1.1072	60.6033
0.5942	0.8828	6400	31.39	51.03	1.1462	62.8546
0.6341	0.8966	6500	31.55	52.15	1.1093	62.4043
0.5992	0.9103	6600	33.06	52.52	1.1215	61.4588
0.6156	0.9241	6700	32.38	52.76	1.1031	62.9446
0.6169	0.9379	6800	31.46	52.96	1.1082	64.3404
0.6543	0.9517	6900	33.49	54.02	1.0943	63.1247
0.5017	0.9655	7000	30.95	52.64	1.1141	68.6177
0.5583	0.9793	7100	34.39	54.03	1.1004	61.6839
0.5986	0.9931	7200	33.92	52.85	1.1055	62.4944
0.2443	1.0069	7300	34.86	53.01	1.1442	60.1981
0.254	1.0207	7400	33.92	53.25	1.1458	62.1792
0.2827	1.0345	7500	34.49	53.43	1.1190	60.6484
0.2326	1.0483	7600	35.47	53.53	1.1237	59.2076
0.2017	1.0621	7700	34.65	53.87	1.1179	60.0180
0.2367	1.0759	7800	34.23	53.67	1.1075	60.6484
0.2276	1.0897	7900	34.67	54.51	1.1063	60.3332
0.2087	1.1034	8000	34.44	54.07	1.1090	60.6484
0.2514	1.1172	8100	29.85	51.91	1.1199	69.6083
0.2692	1.1310	8200	28.05	51.94	1.1642	72.1747
0.2784	1.1448	8300	27.26	50.77	1.1262	74.8312
0.2539	1.1586	8400	30.7	53.1	1.1463	65.0158
0.2599	1.1724	8500	31.64	53.71	1.1255	63.2148
0.2419	1.1862	8600	33.2	54.15	1.1223	62.4043
0.2583	1.2	8700	33.98	53.65	1.1304	61.2787
0.239	1.2138	8800	34.68	54.35	1.1371	61.7740
0.2198	1.2276	8900	30.65	52.15	1.1533	72.2647
0.248	1.2414	9000	31.98	53.68	1.1266	65.4210
0.2377	1.2552	9100	30.9	53.6	1.1510	67.9424
0.2183	1.2690	9200	30.35	53.04	1.1565	73.1202
0.1999	1.2828	9300	29.48	53.0	1.1426	74.2909
0.22	1.2966	9400	31.93	53.16	1.1332	66.1414
0.2063	1.3103	9500	32.42	53.79	1.1144	63.3949
0.2054	1.3241	9600	33.64	54.69	1.1146	61.5038
0.2145	1.3379	9700	36.68	55.64	1.1123	57.5867
0.2059	1.3517	9800	36.93	56.15	1.1102	57.5416
0.2001	1.3655	9900	36.4	56.09	1.1143	57.9469
0.1973	1.3793	10000	36.46	55.74	1.1121	58.2620
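
A log like this is easier to reason about once parsed. The sketch below reads a few rows copied from the table, with values arranged in the header's column order (Training Loss, Epoch, Step, Bleu, Chrf, Validation Loss, Wer), then picks the checkpoint with the highest BLEU and the one with the lowest validation loss, and estimates the number of optimizer steps per epoch from the final row. The field names in `HEADER` and the helper `parse_log` are hypothetical, and only a sample of rows is embedded to keep the example short.

```python
import csv
import io

# Hypothetical field names, in the same order as the table header.
HEADER = ["train_loss", "epoch", "step", "bleu", "chrf", "val_loss", "wer"]

# A sample of rows from the training log, tab-separated.
LOG = """\
0.8129\t0.5517\t4000\t27.38\t51.97\t1.0491\t72.1747
0.2145\t1.3379\t9700\t36.68\t55.64\t1.1123\t57.5867
0.2059\t1.3517\t9800\t36.93\t56.15\t1.1102\t57.5416
0.2001\t1.3655\t9900\t36.4\t56.09\t1.1143\t57.9469
0.1973\t1.3793\t10000\t36.46\t55.74\t1.1121\t58.2620
"""


def parse_log(text: str) -> list[dict]:
    """Parse tab-separated log rows into dictionaries keyed by HEADER."""
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter="\t"):
        if not fields:
            continue  # skip blank lines
        row = {name: float(value) for name, value in zip(HEADER, fields)}
        row["step"] = int(row["step"])
        rows.append(row)
    return rows


rows = parse_log(LOG)
best_bleu = max(rows, key=lambda r: r["bleu"])
lowest_val_loss = min(rows, key=lambda r: r["val_loss"])

# Steps per epoch estimated from the last logged row: step / epoch.
steps_per_epoch = rows[-1]["step"] / rows[-1]["epoch"]

print("best BLEU at step", best_bleu["step"])
print("lowest validation loss at step", lowest_val_loss["step"])
print("approx. steps per epoch:", round(steps_per_epoch))
```

On these rows the two selection criteria disagree: BLEU peaks at step 9800 (36.93), while validation loss is lowest at step 4000 (1.0491), where BLEU is only 27.38. The final checkpoint at step 10000 is close to the best BLEU but is not the best by either measure.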

Framework versions

Transformers 4.41.2
Pytorch 2.2.0+cu121
Datasets 2.20.0
Tokenizers 0.19.1

ymoslem/whisper-medium-ga2en-v6.2.2-r