Alex-Libryo/BAAI-bge-m3-int8

The BAAI/bge-m3 model (dense retriever only), converted to ONNX and quantized to int8 for use with Vespa Embedding.

BAAI-bge-m3_quantized.onnx (int8 quantized)

The model was quantized with the Optimum toolkit (optimum-cli).

Tips: converting to int8 quantized ONNX

# https://github.com/vespa-engine/sample-apps/blob/master/simple-semantic-search/export_hf_model_from_hf.py
./export_hf_model_from_hf.py --hf_model BAAI/bge-m3 --output_dir bge-m3

optimum-cli onnxruntime quantize --onnx_model ./bge-m3 -o bge-m3-large_quantized --avx512_vnni
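
The quantization step above can also be run from Python. The following is a minimal sketch, not the exact procedure used to produce this model: it assumes the optimum package is installed with its onnxruntime extra and that the export directory contains a single .onnx file. The function name quantize_to_int8 is hypothetical; the dynamic (non-static), per-tensor settings are an assumption about what the CLI's --avx512_vnni flag does by default.

```python
from pathlib import Path


def quantize_to_int8(model_dir: str, output_dir: str) -> Path:
    """Dynamically quantize an exported ONNX model to int8 (AVX512-VNNI config).

    Hypothetical helper; assumed to be roughly equivalent to:
        optimum-cli onnxruntime quantize --onnx_model <model_dir> -o <output_dir> --avx512_vnni
    """
    # Imported lazily so this module can be loaded without optimum installed.
    from optimum.onnxruntime import ORTQuantizer
    from optimum.onnxruntime.configuration import AutoQuantizationConfig

    # Loads the single .onnx file found in model_dir.
    quantizer = ORTQuantizer.from_pretrained(model_dir)

    # Assumption: dynamic quantization (no calibration data), per-tensor scales.
    qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)

    # Writes <name>_quantized.onnx (default suffix) into output_dir.
    quantizer.quantize(save_dir=output_dir, quantization_config=qconfig)
    return Path(output_dir)
```

Calling quantize_to_int8("./bge-m3", "bge-m3-large_quantized") would mirror the paths used in the command above.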

License

This model is distributed under the same license as the original BAAI/bge-m3 model, the MIT License (see the LICENSE file in the project's root directory).

https://huggingface.co/BAAI/bge-m3

Attribution

All credit for this model goes to the authors of BAAI/bge-m3 and the associated researchers and organizations. When using this model, please attribute the original authors.