
# whisper-large-v3-myanmar

This model is a fine-tuned version of openai/whisper-large-v3 on the chuuhtetnaing/myanmar-speech-dataset-openslr-80 dataset. It achieves the following results on the evaluation set:

- Loss: 0.1752
- WER: 54.8976
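
WER is the word error rate on the evaluation set, expressed as a percentage. As a minimal sketch of how such a score is typically computed with the `evaluate` library (the placeholder transcripts are hypothetical, and this is not necessarily the exact evaluation script behind the number above):

```python
# Illustrative sketch: computing word error rate with the `evaluate` library.
# The transcripts below are hypothetical placeholders.
import evaluate

wer = evaluate.load("wer")
predictions = ["the model transcription"]
references = ["the reference transcript"]

# `compute` returns a fraction; multiply by 100 to match the percentage above.
score = 100 * wer.compute(predictions=predictions, references=references)
print(f"WER: {score:.4f}")
```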

## Usage

```python
from datasets import Audio, load_dataset
from transformers import pipeline

# Load a sample audio clip from the test split, resampled to 16 kHz
dataset = load_dataset("chuuhtetnaing/myanmar-speech-dataset-openslr-80")
dataset = dataset.cast_column("audio", Audio(sampling_rate=16000))
test_dataset = dataset["test"]
input_speech = test_dataset[42]["audio"]

# Build an ASR pipeline from the fine-tuned checkpoint
pipe = pipeline("automatic-speech-recognition", model="chuuhtetnaing/whisper-large-v3-myanmar")

output = pipe(input_speech, generate_kwargs={"language": "myanmar", "task": "transcribe"})
print(output["text"])  # α€€α€»α€™ α€•α€Όα€Šα€Ία€• မှာ α€•α€Šα€¬α€žα€„α€Ί တော့ စာမေးပွဲ α€€α€­α€― တပတ်တခါ α€…α€…α€Ία€α€šα€Ί
```
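
The pipeline handles feature extraction and decoding internally. For finer control over generation, the equivalent lower-level calls look roughly like this (a sketch reusing `input_speech` from the snippet above; variable names are illustrative):

```python
# Lower-level equivalent of the pipeline call above (illustrative sketch).
import torch
from transformers import WhisperForConditionalGeneration, WhisperProcessor

processor = WhisperProcessor.from_pretrained("chuuhtetnaing/whisper-large-v3-myanmar")
model = WhisperForConditionalGeneration.from_pretrained("chuuhtetnaing/whisper-large-v3-myanmar")

# Convert the raw waveform into log-mel input features.
inputs = processor(
    input_speech["array"],
    sampling_rate=input_speech["sampling_rate"],
    return_tensors="pt",
)

# Generate token ids, forcing Myanmar transcription.
with torch.no_grad():
    predicted_ids = model.generate(
        inputs.input_features, language="myanmar", task="transcribe"
    )

print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```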

## Training hyperparameters

The following hyperparameters were used during training (a sketch expressing them as training arguments follows the list):

- learning_rate: 0.0003
- train_batch_size: 20
- eval_batch_size: 20
- seed: 42
- gradient_accumulation_steps: 3
- total_train_batch_size: 60
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 200
- num_epochs: 30
- mixed_precision_training: Native AMP
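
The card does not include the actual training script, so the mapping below onto `Seq2SeqTrainingArguments` is an assumption; it is only a sketch of how the listed values would typically be expressed:

```python
# Sketch: the hyperparameters above expressed as transformers
# Seq2SeqTrainingArguments. The training script is not published,
# so this mapping (including the output_dir) is an assumption.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-large-v3-myanmar",  # hypothetical
    learning_rate=3e-4,
    per_device_train_batch_size=20,
    per_device_eval_batch_size=20,
    seed=42,
    gradient_accumulation_steps=3,   # total train batch size: 20 * 3 = 60
    lr_scheduler_type="linear",
    warmup_steps=200,
    num_train_epochs=30,
    fp16=True,                       # "Native AMP" mixed precision
)
```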

## Training results

| Training Loss | Epoch | Step | Validation Loss | WER     |
|:-------------:|:-----:|:----:|:---------------:|:-------:|
| 0.9771        | 1.0   | 42   | 0.7598          | 100.0   |
| 0.3477        | 2.0   | 84   | 0.2140          | 89.8931 |
| 0.2244        | 3.0   | 126  | 0.1816          | 79.0294 |
| 0.1287        | 4.0   | 168  | 0.1510          | 71.9947 |
| 0.1029        | 5.0   | 210  | 0.1575          | 77.8718 |
| 0.0797        | 6.0   | 252  | 0.1315          | 70.5254 |
| 0.0511        | 7.0   | 294  | 0.1143          | 70.5699 |
| 0.03          | 8.0   | 336  | 0.1154          | 68.1656 |
| 0.0211        | 9.0   | 378  | 0.1289          | 69.1897 |
| 0.0151        | 10.0  | 420  | 0.1318          | 66.7854 |
| 0.0113        | 11.0  | 462  | 0.1478          | 69.1451 |
| 0.0079        | 12.0  | 504  | 0.1484          | 66.2066 |
| 0.0053        | 13.0  | 546  | 0.1389          | 65.0935 |
| 0.0031        | 14.0  | 588  | 0.1479          | 64.3811 |
| 0.0014        | 15.0  | 630  | 0.1611          | 64.8264 |
| 0.001         | 16.0  | 672  | 0.1627          | 63.3571 |
| 0.0012        | 17.0  | 714  | 0.1546          | 65.0045 |
| 0.0006        | 18.0  | 756  | 0.1566          | 64.5147 |
| 0.0006        | 20.0  | 760  | 0.1581          | 64.6928 |
| 0.0002        | 21.0  | 798  | 0.1621          | 63.9804 |
| 0.0003        | 22.0  | 836  | 0.1664          | 60.8638 |
| 0.0002        | 23.0  | 874  | 0.1663          | 58.5040 |
| 0.0           | 24.0  | 912  | 0.1699          | 55.8326 |
| 0.0           | 25.0  | 950  | 0.1715          | 55.0312 |
| 0.0           | 26.0  | 988  | 0.1730          | 54.9866 |
| 0.0           | 27.0  | 1026 | 0.1740          | 54.8976 |
| 0.0           | 28.0  | 1064 | 0.1747          | 54.8976 |
| 0.0           | 29.0  | 1102 | 0.1751          | 54.8976 |
| 0.0           | 30.0  | 1140 | 0.1752          | 54.8976 |

## Framework versions

- Transformers 4.35.2
- PyTorch 2.1.1+cu121
- Datasets 2.14.5
- Tokenizers 0.15.1
