return_timestamps error
When using the pipeline to get transcriptions with timestamps, it works fine for some audio files, but for others it returns the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-16-8cc132230b9b> in <module>
----> 1 prediction = pipe(dataset[0], return_timestamps=True)["chunks"]
/usr/local/lib/python3.8/dist-packages/transformers/pipelines/automatic_speech_recognition.py in _find_timestamp_sequence(sequences, tokenizer, feature_extractor, max_source_positions)
104 sequence = sequence.squeeze(0)
105 # get rid of the `forced_decoder_idx` that are use to parametrize the generation
--> 106 begin_idx = np.where(sequence == timestamp_begin)[0].item() if timestamp_begin in sequence else 0
107 sequence = sequence[begin_idx:]
108
ValueError: can only convert an array of size 1 to a Python scalar
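For context on what the traceback means: `ndarray.item()` only succeeds when the array contains exactly one element, so if the timestamp-begin token occurs more than once in the decoded sequence, `np.where(sequence == timestamp_begin)[0]` returns several indices and `.item()` raises exactly this error. A minimal sketch (the token ids below are made up for illustration):

```python
import numpy as np

# Hypothetical decoded token ids containing the timestamp-begin token twice
sequence = np.array([50257, 50363, 100, 200, 50363])
timestamp_begin = 50363

matches = np.where(sequence == timestamp_begin)[0]  # two hits: indices 1 and 4
try:
    matches.item()  # only valid for a size-1 array
except ValueError as e:
    print(e)  # can only convert an array of size 1 to a Python scalar
```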
Below is the code used to call the pipeline:
import numpy as np
import torch
import librosa
from transformers import pipeline

device = "cuda:0" if torch.cuda.is_available() else "cpu"
pipe = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-tiny",
    chunk_length_s=30,
    device=device,
)
filename = files[71][0]
mypath = '/content/drive/MyDrive/twitch_data/audios/prediction/'
audio, _ = librosa.load(mypath + filename, sr=16000)
my_dict = {"raw": np.array(audio), "sampling_rate": 16000}
prediction = pipe(my_dict, return_timestamps=True)["chunks"]
I'm not sure if this is a bug, or if there's something wrong with the files. Any help is appreciated!
Hey @pearlyu! Thanks for flagging this and sorry for getting back to you so late. Are you able to reproduce this bug with an audio file we have access to on our end? Either share the audio file that triggers the error, or try an audio sample from a HF dataset:
from datasets import load_dataset
librispeech = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
sample = librispeech[0]["audio"]
prediction = pipe(sample, return_timestamps=True)["chunks"]
We'd need an audio file that breaks the pipeline in order to investigate what's going on!
Hi @sanchit-gandhi, the piece of code that you shared throws the following error:
ValueError: We cannot return_timestamps yet on non-ctc models !
Could you update transformers to the latest version, please?
pip install --upgrade transformers
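After upgrading, you can confirm which version is actually installed in the runtime (a restart of the Colab runtime may be needed for the new version to take effect). A small sketch using the standard library:

```python
from importlib.metadata import PackageNotFoundError, version

# Print the installed transformers version, if any
try:
    print(version("transformers"))
except PackageNotFoundError:
    print("transformers is not installed")
```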