ONNX implementation

#12
by kirankumaram - opened

Can anyone suggest how to use the exported whisper-large model (ONNX version) for transcription or translation?

Maybe it's not exactly what you wanted, but there is an example of transcribing an audio stream on GitHub. I used the library from GitHub; for Hugging Face I couldn't find an inference example.

I got the following:

import whisper  # the openai-whisper package

# Load the checkpoint and transcribe; the result is a dict containing the text
model = whisper.load_model("medium.en")
result = model.transcribe("/path/to/file.mp3", language="en")
print(result["text"])

You can use it with the ORT pipeline: https://github.com/huggingface/optimum/pull/420#issue-1406136285
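
As a rough sketch of that route (assuming optimum[onnxruntime] is installed; the openai/whisper-large checkpoint name, the file path, and the export=True flag are illustrative, and older optimum releases spell the flag from_transformers=True):

from optimum.onnxruntime import ORTModelForSpeechSeq2Seq
from transformers import AutoProcessor, pipeline

# Convert the checkpoint to ONNX on the fly and wrap it for ONNX Runtime
model = ORTModelForSpeechSeq2Seq.from_pretrained("openai/whisper-large", export=True)
processor = AutoProcessor.from_pretrained("openai/whisper-large")

# The ORT model drops into the regular transformers ASR pipeline
asr = pipeline(
    "automatic-speech-recognition",
    model=model,
    feature_extractor=processor.feature_extractor,
    tokenizer=processor.tokenizer,
)
print(asr("/path/to/file.mp3")["text"])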

Or with ONNX Runtime directly: https://huggingface.co/docs/transformers/serialization#exporting-a-model-to-onnx (here you'll need to modify the template code snippet to pass the appropriate inputs to the ONNX model)
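
For the raw ONNX Runtime route, here is a minimal greedy-decoding sketch. It assumes the legacy transformers.onnx exporter was used with the speech2seq-lm feature, producing a single onnx/model.onnx whose inputs are input_features and decoder_input_ids and whose output is logits; the checkpoint name, file paths, and the use of librosa for audio loading are illustrative:

import librosa
import numpy as np
import onnxruntime
from transformers import AutoConfig, WhisperProcessor

processor = WhisperProcessor.from_pretrained("openai/whisper-large")
config = AutoConfig.from_pretrained("openai/whisper-large")
session = onnxruntime.InferenceSession("onnx/model.onnx")

# Whisper expects 16 kHz audio converted to log-Mel input features
audio, _ = librosa.load("/path/to/file.mp3", sr=16000)
input_features = processor(audio, sampling_rate=16000, return_tensors="np").input_features

# Greedy decoding: feed the growing decoder sequence back in at each step
tokens = [config.decoder_start_token_id]
for _ in range(config.max_length):
    logits = session.run(
        ["logits"],
        {
            "input_features": input_features,
            "decoder_input_ids": np.array([tokens], dtype=np.int64),
        },
    )[0]
    next_token = int(logits[0, -1].argmax())
    tokens.append(next_token)
    if next_token == config.eos_token_id:
        break

print(processor.decode(tokens, skip_special_tokens=True))

Note that this re-runs the full encoder-decoder at every step, so it is slow; the ORT pipeline above handles cached decoding for you.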

Hi @sanchit-gandhi.

The ORT pipeline from https://github.com/huggingface/optimum/pull/420#issue-1406136285 throws an error with the latest version of transformers (4.26.0):

[screenshot of the error attached]

Hey @kirankumaram - could you please open an issue on the optimum repo with a description of the error and a link to your Colab? https://github.com/huggingface/optimum/issues/new?assignees=&labels=bug&template=bug-report.yml
