vasista22/whisper-tamil-large-v2 · IndexError: index -2 is out of bounds for dimension 0 with size 0

Dec 30, 2024

With all of the models, we often get the error below while transcribing:

/usr/local/lib/python3.10/dist-packages/transformers/models/whisper/generation_whisper.py in _prepare_decoder_input_ids(cur_bsz, init_tokens, current_segments, batch_idx_map, do_condition_on_prev_tokens, prompt_ids, generation_config, config, device, suppress_tokens, timestamp_begin, kwargs)
   1687         prev_start_of_text = getattr(generation_config, "prev_sot_token_id", None)
   1688         if prev_start_of_text is None:
-> 1689             prev_start_of_text = suppress_tokens[-2] if suppress_tokens is not None else None
   1690 
   1691         if any(do_condition_on_prev_tokens) and len(current_segments[0]) > 0:

IndexError: index -2 is out of bounds for dimension 0 with size 0

Is there any fix for this?
Input file used: .wav
Running on a T4 GPU

khp12345

Feb 11

hey there! i am getting the same error. did you come across any fixes?

lewington

21 days ago

edited 20 days ago

Here is a working version of the how-to-run code that avoids the error:

#%%
import torch
from transformers import pipeline

#%%
# path to the audio file to be transcribed
audio = "data/subtitle-samples/tamil/jai-bahim-cut.wav"
device = "cuda:0" if torch.cuda.is_available() else "cpu"

transcribe = pipeline(task="automatic-speech-recognition", model="vasista22/whisper-tamil-large-v2", chunk_length_s=30, device=device)
transcribe.model.config.forced_decoder_ids = transcribe.tokenizer.get_decoder_prompt_ids(language="ta", task="transcribe")

#%%

transcribe.generation_config.suppress_tokens = None  # the fix: replace the empty suppress_tokens with None

#%%
print('Transcription: ', transcribe(audio)["text"])

# %%

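Initially, transcribe.model.config.suppress_tokens is set to an empty array, so the suppress_tokens[-2] lookup on line 1689 of the traceback has nothing to index and raises the IndexError. That line only guards against None, not against an empty value, which is why setting suppress_tokens to None avoids the error.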
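For anyone who wants to see the failure mode in isolation, here is a minimal sketch that does not need the model or a GPU. It copies the conditional expression from the traceback into a small function and uses a plain Python list and a `SimpleNamespace` as stand-ins for the real tensor and generation config, so the exception message differs from the one above. The helper name `clear_empty_suppress_tokens` is made up for illustration and is not part of transformers; it only applies the fix when the value is actually empty.

```python
from types import SimpleNamespace


def pick_prev_sot(suppress_tokens):
    # Same expression as the failing line in the traceback:
    # only None is guarded, an empty sequence is not.
    return suppress_tokens[-2] if suppress_tokens is not None else None


def clear_empty_suppress_tokens(generation_config):
    # Hypothetical helper (not a transformers API): replace an empty
    # suppress_tokens with None and leave a populated one untouched.
    tokens = getattr(generation_config, "suppress_tokens", None)
    if tokens is not None and len(tokens) == 0:
        generation_config.suppress_tokens = None
    return generation_config


# Stand-in for the real generation config, holding the empty value
# described in the post above.
cfg = SimpleNamespace(suppress_tokens=[])

try:
    pick_prev_sot(cfg.suppress_tokens)
    failed_before_fix = False
except IndexError:
    # An empty sequence has no second-to-last element.
    failed_before_fix = True

clear_empty_suppress_tokens(cfg)
result_after_fix = pick_prev_sot(cfg.suppress_tokens)

print("failed before fix:", failed_before_fix)
print("value after fix:", result_after_fix)
```

With the real pipeline, the equivalent call would presumably be `clear_empty_suppress_tokens(transcribe.generation_config)` in place of the unconditional assignment, so that a checkpoint that ships a non-empty suppress list keeps it.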