BAAI-bge-m3-int8
---
license: mit
---

Converted BAAI/bge-m3 model (dense retrieval only) in ONNX int8 format, for use with Vespa embedding:

  • BAAI-bge-m3_quantized.onnx (int8 quantized)
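
As a quick sanity check before deploying to Vespa, the quantized model can be run directly with onnxruntime. This is a minimal sketch, assuming the export kept the default input names (input_ids, attention_mask), that the first output is last_hidden_state, and that the dense embedding is the L2-normalized CLS vector as in the original bge-m3:

# Sanity check: embed a sentence with the quantized model via onnxruntime.
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-m3")
session = ort.InferenceSession("BAAI-bge-m3_quantized.onnx")

encoded = tokenizer("What is BGE-M3?", return_tensors="np")
last_hidden_state = session.run(None, {
    "input_ids": encoded["input_ids"].astype(np.int64),
    "attention_mask": encoded["attention_mask"].astype(np.int64),
})[0]

cls = last_hidden_state[:, 0]  # CLS token vector
embedding = cls / np.linalg.norm(cls, axis=1, keepdims=True)
print(embedding.shape)  # (1, 1024) for bge-m3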

The model was quantized using the Hugging Face Optimum toolkit.

Tip: how to convert the model to int8

# Export the model to ONNX with the helper script from the Vespa sample apps:
# https://github.com/vespa-engine/sample-apps/blob/master/simple-semantic-search/export_hf_model_from_hf.py
./export_hf_model_from_hf.py --hf_model BAAI/bge-m3 --output_dir bge-m3
# Quantize the exported model to int8, using AVX-512 VNNI kernels:
optimum-cli onnxruntime quantize --onnx_model ./bge-m3 -o bge-m3_quantized --avx512_vnni
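
The same dynamic int8 quantization can also be done from Python through Optimum's API instead of the CLI. A minimal sketch, assuming the ./bge-m3 directory from the export step contains a single model.onnx:

# Equivalent of the optimum-cli command above, via Optimum's Python API.
from optimum.onnxruntime import ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

quantizer = ORTQuantizer.from_pretrained("./bge-m3")
# Dynamic int8 quantization tuned for AVX-512 VNNI, matching --avx512_vnni.
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)
quantizer.quantize(save_dir="bge-m3_quantized", quantization_config=qconfig)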

License

This model inherits the original model's license, the MIT License (see the LICENSE file in the project's root directory).

Attribution

All credit for this model goes to the authors of BAAI/bge-m3 and the associated researchers and organizations. When using this model, please attribute the original authors.