Model Details

  • Base model: vinai/PhoWhisper-large
  • This model is fine-tuned on the VietMed training set, which reduces the WER on the VietMed test set from 26.14 to 21.63.
  • To reproduce this fine-tuned model, you can use the same tokenizer and processor as vinai/PhoWhisper-large.
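
For readers unfamiliar with the metric, here is a minimal sketch of how a word error rate can be computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. This is an illustration only. The card does not say which tool or text normalization produced the reported figures, so the function name `word_error_rate` and the whitespace tokenization are assumptions, not the evaluation script used for this model.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly,
    with no casing or punctuation normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = cur[j - 1] + 1  # extra word in hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


# One dropped word out of five reference words.
example_wer = word_error_rate("bệnh nhân bị sốt cao", "bệnh nhân sốt cao")
```

If the two reported values are read as percentages, the change from 26.14 to 21.63 is a relative reduction of roughly 17%.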

Model Description

  • Finetuned by: Play-With-Mino
  • Model type: Whisper
  • Language(s) (NLP): Vietnamese
  • Finetuned from model: vinai/PhoWhisper-large

How to use

# Requires: torch, transformers, peft, scipy
import torch
from scipy.io import wavfile
from transformers import pipeline, AutoModelForSpeechSeq2Seq, AutoProcessor

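sampling_rate, audio_array = wavfile.read("path_to_your_wav_file")
# scipy returns integer PCM samples for most WAV files, while the Whisper
# feature extractor expects floats in [-1, 1], so rescale signed-integer audio.
if audio_array.dtype.kind == "i":
    audio_array = audio_array.astype("float32") / (2 ** (8 * audio_array.dtype.itemsize - 1))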
audio_input = {
    "path" : "path_to_your_wav_file",
    "array" : audio_array,
    "sampling_rate" : sampling_rate
}
device = "cuda" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32
processor = AutoProcessor.from_pretrained("vinai/PhoWhisper-large")
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    "playwithmino/PhoWhisper-large-peft-VietMed", 
    torch_dtype=torch_dtype, 
    low_cpu_mem_usage=True
)
model.to(device)
transcriber = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    chunk_length_s=30,
    torch_dtype=torch_dtype,
    device=device,
    batch_size=32
)
transcriptions = transcriber(audio_input)
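
The snippet above assumes a single-channel recording. `scipy.io.wavfile.read` returns a `(samples, channels)` array for multi-channel files, and to my knowledge the speech recognition pipeline accepts only one-dimensional audio. The helper below is a small sketch of one way to downmix before building the input dictionary. `to_mono_float32` is a hypothetical name introduced here, not part of any library, and plain channel averaging is an assumption about what is appropriate for your recordings.

```python
import numpy as np


def to_mono_float32(audio_array: np.ndarray) -> np.ndarray:
    """Return a 1-D float32 signal, averaging channels if there are several.

    Assumption: multi-channel input is laid out as (samples, channels),
    which is the layout scipy.io.wavfile.read produces.
    """
    audio = np.asarray(audio_array)
    if audio.ndim == 2:
        # Average across the channel axis to obtain one channel.
        audio = audio.astype(np.float32).mean(axis=1)
    return audio.astype(np.float32)


# Tiny stereo example: two samples, two channels.
stereo = np.array([[0.0, 1.0], [0.5, 0.5]])
mono = to_mono_float32(stereo)
```

To use it with the example above, pass `to_mono_float32(audio_array)` as the `"array"` entry of the input dictionary.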

Framework versions

  • PEFT 0.13.2
  • Transformers 4.36.0