Inferencing in Rust
Hey,
just wanted to bring this to your attention if you're interested: https://github.com/huggingface/text-embeddings-inference/issues/468.
In any case, are you aware of any existing way to run your models directly in Rust?
Hey @do-me, cool, thanks for creating that ticket! I'm not aware of a way to run our models directly in Rust, but I think the speedup would probably be much smaller than for larger models, since our biggest bottleneck is the tokenizer. Will follow that ticket though, curious to see what comes out of it!
If I remember correctly, the tokenizer is anyway already written in Rust, right?
Personally, I'm not looking for speed improvements but simply compatibility. I'm developing a tauri2 app based on your multilingual embedding model at the moment (with do-me/foursquare_places_100M), where I'm forced to run inference in Rust in the backend.
Indeed, so I think the speedups would be very marginal in our case. Compatibility is interesting though; it's not something on our roadmap at the moment since neither of us knows Rust well enough to add support for it, but it would be awesome if someone is willing to add support for it via the ticket you posted :).
Brief follow-up: I settled on https://github.com/StarlightSearch/EmbedAnything in Rust, a nice wrapper around candle. They don't seem to mention your models anywhere directly. However, considering that they support ONNX models as well as regular BERT-based models, your static models should work out of the box, right? I haven't had time to give it a spin yet, but will try in the next few days.
Cool! I did not know of this library yet. If they support ONNX it should work, though it depends a bit on what their inputs/outputs are, I think. Here's our ONNX conversion script: https://github.com/MinishLab/model2vec/blob/main/scripts/export_to_onnx.py. That might help if there are any issues.
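For anyone looking at a port, here is a rough sketch of why the tokenizer tends to dominate: assuming a static model boils down to a per-token embedding lookup followed by mean pooling, there is very little compute left after tokenization. The vocabulary, the embedding table, and the `tokenize` and `embed` helpers below are all hypothetical toy stand-ins, not the actual model2vec API or the format of the exported ONNX graph.

```python
from typing import Dict, List

# Hypothetical toy vocabulary and embedding table. A real static model
# ships its own subword tokenizer and a much larger matrix.
VOCAB: Dict[str, int] = {"[UNK]": 0, "hello": 1, "world": 2}
EMBEDDINGS: List[List[float]] = [
    [0.0, 0.0],  # [UNK]
    [1.0, 3.0],  # hello
    [3.0, 1.0],  # world
]


def tokenize(text: str) -> List[int]:
    """Stand-in whitespace tokenizer; unknown words map to [UNK]."""
    return [VOCAB.get(tok, VOCAB["[UNK]"]) for tok in text.lower().split()]


def embed(text: str) -> List[float]:
    """Look up one vector per token id and average them (mean pooling)."""
    ids = tokenize(text)
    dim = len(EMBEDDINGS[0])
    if not ids:
        # No tokens: return a zero vector rather than dividing by zero.
        return [0.0] * dim
    sums = [0.0] * dim
    for token_id in ids:
        for d, value in enumerate(EMBEDDINGS[token_id]):
            sums[d] += value
    return [s / len(ids) for s in sums]


if __name__ == "__main__":
    print(embed("hello world"))
```

If that assumption holds, the same logic in Rust would only need a tokenizer and an array lookup, so compatibility mostly comes down to matching the token ids and whatever inputs and outputs the exported graph expects.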