Q4F16 model throws errors

#4
by pdufour - opened
ONNX Community org
edited Nov 28

Hi @Xenova, thanks for this. I ran some tests with the q4f16 exports and I'm hitting the same errors I had with my own export:

An uncaught WebGPU validation error was raised: Error while parsing WGSL: :51:15 error: return statement type must match its function return type, returned 'f16', expected 'f32'
              return get_xByIndices(aIndices);
              ^^^^^^


 - While validating [ShaderModuleDescriptor "Conv3DNaive"]
 - While calling [Device].CreateShaderModule([ShaderModuleDescriptor "Conv3DNaive"]).

Test Instructions:

Actual results

  • See error mentioned above

Expected results

  • Should process query

I worked around this in my ONNX export by adding this op to op_block_list (https://huggingface.co/pdufour/Qwen2-VL-2B-Instruct-ONNX-Q4-F16/blob/main/Makefile#L126), but ideally this operation should be supported by WebGPU.
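For anyone wanting to try the same op_block_list workaround, here is a rough sketch of how it could look with the `onnxconverter_common` fp16 converter. This is not necessarily what the linked Makefile does. The helper names and the file paths are made up for illustration, and using `"Conv"` as the op type is an assumption based on the `Conv3DNaive` shader name in the error. Note that a custom `op_block_list` replaces the converter's default list, so the sketch appends to `DEFAULT_OP_BLOCK_LIST` rather than passing the extra op alone.

```python
from typing import Iterable, List


def extend_block_list(default_ops: Iterable[str], extra_ops: Iterable[str]) -> List[str]:
    """Return default_ops followed by extra_ops, de-duplicated, order preserved."""
    merged: List[str] = []
    for op in list(default_ops) + list(extra_ops):
        if op not in merged:
            merged.append(op)
    return merged


def convert_keeping_ops_fp32(src_path: str, dst_path: str, extra_ops: Iterable[str]) -> None:
    """Convert an ONNX model to fp16 while leaving the given op types in fp32.

    Hypothetical helper: paths and op names are supplied by the caller.
    """
    # Imported lazily so extend_block_list can be used without these packages.
    import onnx
    from onnxconverter_common import float16

    model = onnx.load(src_path)

    # Keep the converter's default fp32 ops and add the problematic ones.
    block_list = extend_block_list(float16.DEFAULT_OP_BLOCK_LIST, extra_ops)

    model_fp16 = float16.convert_float_to_float16(
        model,
        keep_io_types=True,        # graph inputs/outputs stay fp32
        op_block_list=block_list,  # these op types are not converted to fp16
    )
    onnx.save(model_fp16, dst_path)
```

It would be called as something like `convert_keeping_ops_fp32("model.onnx", "model_fp16.onnx", ["Conv"])`, where both file names are placeholders. The converter inserts Cast nodes around the blocked ops, so the graph stays valid but those nodes run in fp32.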

It's this op here specifically that is causing the error:
image.png

ONNX Community org

Thanks for testing! cc @schmuell since this seems to be a bug with the WebGPU EP. The model runs correctly on CPU.

On that note, would you mind opening a bug report at https://github.com/microsoft/onnxruntime?
