Hardware not supported

#2
by NemesisAlm - opened

Hello.

First of all, thank you for this model. It looks amazing to have whisper large v2 in this quantized format.

I am trying to run the code provided on the model card but get an error related to the hardware: 4b quantization not yet supported on this hardware platform!
I also get this message before the error: You are using a model of type whisper to instantiate a model of type . This is not supported for all configurations of models and can yield errors.

Do you have any idea why this error occurs?

This is the code I am using:
from optimum.onnxruntime import ORTModelForSpeechSeq2Seq
from transformers import PretrainedConfig
import os

model_name = 'openai/whisper-large-v2'
model_path = './whisper-large-v2-onnx-int4'

# Load the Whisper config so the ORT wrapper knows the model architecture.
# (Note: transformers.AutoConfig would resolve the proper WhisperConfig class
# and avoid the "model of type whisper to instantiate a model of type ." warning.)
model_config = PretrainedConfig.from_pretrained(model_name)

predictions = []
references = []

# Create ONNX Runtime inference sessions for the three exported sub-models
sessions = ORTModelForSpeechSeq2Seq.load_model(
    os.path.join(model_path, 'encoder_model.onnx'),
    os.path.join(model_path, 'decoder_model.onnx'),
    os.path.join(model_path, 'decoder_with_past_model.onnx'))

model = ORTModelForSpeechSeq2Seq(sessions[0], sessions[1], model_config, model_path, sessions[2])

Thank you

I am getting the same error; I tried both the small & large models.
I also tried on amd64 & arm64 (M2) - same error.
I did a bit of searching, and it looks like there is little hardware support available at this time.

See issue https://github.com/microsoft/onnxruntime/issues/17883
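As a quick diagnostic, a minimal sketch (assuming, per the linked issue, that int4 MatMulNBits support depends on the installed onnxruntime version and hardware) is to print the onnxruntime version and the execution providers it exposes before loading the model:

```python
# Diagnostic sketch: report which onnxruntime build is installed and which
# execution providers it exposes. (Assumption: int4 / MatMulNBits support
# varies by onnxruntime version and hardware, as discussed in the linked
# GitHub issue.)
import importlib.util

spec = importlib.util.find_spec("onnxruntime")
if spec is None:
    print("onnxruntime is not installed")
else:
    import onnxruntime as ort
    print("onnxruntime version:", ort.__version__)
    print("available providers:", ort.get_available_providers())
```

If the version predates int4 kernel support, or the listed providers don't include one with int4 kernels for your platform, that would be consistent with the "not yet supported on this hardware platform" error.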
