Hello, could you please tell me the embedding generation speed on CPU and GPU in FP16 and FP32, and how it compares with BAAI/bge-m3? Also, is the model available in ONNX format?
In FP16 it's 10 times slower than bge-m3.
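For anyone who wants to check a figure like this on their own hardware, here is a minimal timing sketch. It is not the benchmark used in this thread: `measure_throughput`, `slowdown`, and the `encode_fn` parameter are hypothetical names, and `encode_fn` stands for whatever callable turns a batch of strings into embeddings in your setup.

```python
import time
from typing import Callable, List, Sequence


def measure_throughput(
    encode_fn: Callable[[Sequence[str]], object],  # hypothetical: your model's batch encoder
    texts: List[str],
    batch_size: int = 32,
    warmup_batches: int = 1,
) -> float:
    """Return the number of texts encoded per second by encode_fn."""
    if not texts:
        raise ValueError("texts must not be empty")

    # Split the input into fixed-size batches.
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]

    # Untimed warm-up calls so one-off setup costs do not skew the result.
    for batch in batches[:warmup_batches]:
        encode_fn(batch)

    # Timed pass over every batch.
    start = time.perf_counter()
    for batch in batches:
        encode_fn(batch)
    elapsed = time.perf_counter() - start

    return len(texts) / elapsed if elapsed > 0 else float("inf")


def slowdown(baseline_rate: float, candidate_rate: float) -> float:
    """How many times slower the candidate is than the baseline."""
    return baseline_rate / candidate_rate
```

Run `measure_throughput` once per model with the same texts, batch size, device, and precision, then pass the two rates to `slowdown`. If you use the sentence-transformers library, `model.encode` can serve as `encode_fn`. On GPU, make sure the call only returns once the device has finished its work, otherwise the timing will be too optimistic.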