How are the ONNX files for this model generated?

#21
by bhavikatekwani - opened

👋🏽 Hello!

I'm trying to use nomic-embed-text-v1.5 and was wondering how the ONNX files here were created?

I would like to optimize them for use with TensorRT but I am running into some issues that might be solved by understanding how you export the models.

Thanks for your help.

Nomic AI org

I believe @Xenova converted them; he may be able to share the script. I can answer any questions about errors you might be seeing, though! Are you able to post your error logs?

I used Optimum, and you can see how to do it here: https://github.com/huggingface/optimum/pull/1874 (still a WIP PR)
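For anyone landing here later, a typical Optimum export looks roughly like this — a sketch, not the exact command used for this repo (the precise flags/config are in the linked PR; `--trust-remote-code` is assumed to be needed because the model uses custom modeling code):

```shell
# Hedged sketch: export a Hub model to ONNX with Optimum's CLI.
# Exact flags for this repo may differ; see the PR linked above.
pip install "optimum[exporters]"
optimum-cli export onnx \
  --model nomic-ai/nomic-embed-text-v1.5 \
  --task feature-extraction \
  --trust-remote-code \
  onnx_out/
```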

Thank you @zpn and @Xenova !

@zpn there are no errors as of now — it's just that a straightforward ONNX-to-TensorRT conversion doesn't produce as performant a model as I expected.
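For context, the "simple conversion" route I mean is a one-shot `trtexec` build, roughly like the following — the file names, input name, and shape profiles here are illustrative assumptions, not the actual configuration:

```shell
# Hedged sketch: naive ONNX -> TensorRT engine build with trtexec.
# Dynamic-shape models need explicit min/opt/max profiles per input;
# "input_ids" and the dims below are placeholder assumptions.
trtexec --onnx=model.onnx \
        --saveEngine=model.plan \
        --fp16 \
        --minShapes=input_ids:1x1 \
        --optShapes=input_ids:8x256 \
        --maxShapes=input_ids:32x512
```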

@Xenova I'm actually the author of that PR 😄 I was asking about the conversion because the inputs and outputs differ between this repo and what I get via Optimum:

  • model.onnx as you generated it has these inputs:
    (Screenshot 2024-05-30 at 9.53.47 AM: model.onnx inputs)
  • This is what I get from Optimum (token_embeddings and sentence_embedding):
    (Screenshot 2024-05-30 at 9.54.01 AM: Optimum export outputs)

Just wanted to make sure that the PR is still correct.

Nomic AI org

Hmm, I've had mixed results with TensorRT in the past. Are you able to post the ONNX/TensorRT graph? I imagine there may be a lot of unoptimized code.

@zpn actually, the TensorRT conversion worked out fine. There may be a lot of unoptimized code, but that's possibly something that can be detected with https://github.com/daquexian/onnx-simplifier?
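For reference, onnx-simplifier is a one-liner from the CLI (file names here are placeholders):

```shell
# Hedged sketch: fold constants and strip redundant ops before the
# TensorRT build. Input/output paths are placeholders.
pip install onnxsim
onnxsim model.onnx model_simplified.onnx
```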

Nomic AI org

Thanks for the resource, I'll take a look! I'm sure there are a lot of unnecessarily expensive ops :)

zpn changed discussion status to closed

Just putting it here: another great resource for optimizing ONNX models is https://github.com/tsingmicro-toolchain/OnnxSlim
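Its CLI usage is similar to onnx-simplifier's (paths again placeholders):

```shell
# Hedged sketch: OnnxSlim's one-shot CLI.
pip install onnxslim
onnxslim model.onnx model_slim.onnx
```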
