Model Details
- Base model: vinai/PhoWhisper-large
- This model is fine-tuned on the VietMed training set, which reduces the WER on the VietMed test set from 26.14 to 21.63.
- To reproduce this fine-tuned model, use the same tokenizer and processor as vinai/PhoWhisper-large.
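For readers unfamiliar with the metric, the WER figures above can be read as follows: word error rate is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal illustration, not the evaluation script used for this model; the function names `word_error_rate` and `relative_wer_reduction` are hypothetical, and text normalization (casing, punctuation) is assumed to have been applied beforehand.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(before: float, after: float) -> float:
    """Fraction by which the WER dropped, relative to the starting value."""
    return (before - after) / before


if __name__ == "__main__":
    # Made-up sentences for illustration only, not taken from VietMed.
    print(word_error_rate("bệnh nhân bị sốt cao", "bệnh nhân sốt cao"))
    # WER values reported in this model card.
    print(relative_wer_reduction(26.14, 21.63))
```

Applied to the numbers reported here, the drop from 26.14 to 21.63 is a relative reduction of roughly 17%.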
Model Description
- Finetuned by: Play-With-Mino
- Model type: Whisper
- Language(s) (NLP): Vietnamese
- Finetuned from model: vinai/PhoWhisper-large
How to use
import torch
from scipy.io import wavfile
from transformers import pipeline, AutoModelForSpeechSeq2Seq, AutoProcessor

# Read the WAV file; scipy returns (sampling_rate, samples).
sampling_rate, audio_array = wavfile.read("path_to_your_wav_file")
audio_input = {
    "path": "path_to_your_wav_file",
    "array": audio_array,
    "sampling_rate": sampling_rate,
}

# Use the GPU with half precision when available, otherwise CPU with float32.
device = "cuda" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

# The tokenizer and feature extractor come from the base model.
processor = AutoProcessor.from_pretrained("vinai/PhoWhisper-large")

# This repository holds a PEFT adapter, so this call assumes the peft package is installed.
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    "playwithmino/PhoWhisper-large-peft-VietMed",
    torch_dtype=torch_dtype,
    low_cpu_mem_usage=True,
)
model.to(device)

# Long recordings are split into 30-second chunks and decoded in batches.
transcriber = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    chunk_length_s=30,
    torch_dtype=torch_dtype,
    device=device,
    batch_size=32,
)
transcriptions = transcriber(audio_input)
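One detail worth checking before calling the pipeline: WAV readers such as `wavfile.read` usually return integer PCM samples, and stereo files come back as a two-dimensional array, whereas Whisper-style feature extractors are normally fed one-dimensional floating-point audio. The following is a minimal sketch of that conversion, assuming samples are laid out as `(n_samples, n_channels)` as scipy does; the helper name `to_mono_float32` is hypothetical and not part of this model's repository.

```python
import numpy as np


def to_mono_float32(samples: np.ndarray) -> np.ndarray:
    """Return a 1-D float32 array scaled to roughly [-1.0, 1.0]."""
    if samples.dtype == np.uint8:
        # 8-bit WAV data is unsigned and centred on 128.
        audio = (samples.astype(np.float32) - 128.0) / 128.0
    elif np.issubdtype(samples.dtype, np.integer):
        # Signed PCM: divide by 2**(bits - 1), e.g. 32768 for int16.
        audio = samples.astype(np.float32) / float(np.iinfo(samples.dtype).max + 1)
    else:
        # Already floating point; only the dtype is normalised.
        audio = samples.astype(np.float32)

    if audio.ndim == 2:
        # Average the channels to obtain a mono signal.
        audio = audio.mean(axis=1)
    return audio
```

The result can be passed as the `"array"` entry of the input dictionary in place of the raw samples. If the file's sampling rate differs from the 16 kHz that Whisper models expect, resampling is a separate step not covered by this sketch.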
Framework versions
- PEFT 0.13.2
- Transformers 4.36.0