Update README.md
README.md
CHANGED
@@ -44,9 +44,9 @@ pipeline_tag: sentence-similarity
 - [With Sentence Transformers:](#with-sentence-transformers)
 - [With Huggingface Transformers:](#with-huggingface-transformers)
 - [FAQs](#faqs)
-- [How can
-- [How do I
-- [How do I offer hybrid search to
+- [How can I reduce overall inference cost ?](#how-can-i-reduce-overall-inference-cost)
+- [How do I reduce vector storage cost?](#how-do-i-reduce-vector-storage-cost)
+- [How do I offer hybrid search to improve accuracy?](#how-do-i-offer-hybrid-search-to-improve-accuracy)
 - [Why not run MTEB?](#why-not-run-mteb)
 - [Roadmap](#roadmap)
 - [Notes on Reproducing:](#notes-on-reproducing)
@@ -143,13 +143,13 @@ for query, query_embedding in zip(queries, query_embeddings):
 
 # FAQS
 
-#### How can
+#### How can I reduce overall inference cost ?
 - You can use ONNX flavours of these models via [FlashRetrieve](https://github.com/PrithivirajDamodaran/FlashRetrieve) library.
 
-#### How do I
+#### How do I reduce vector storage cost ?
 [Use Binary and Scalar Quantisation](https://huggingface.co/blog/embedding-quantization)
 
-
+#### How do I offer hybrid search to improve accuracy ?
 MIRACL paper shows simply combining BM25 is a good starting point for a Hybrid option:
 The below numbers are with mDPR model, but miniMiracle_te_v1 should give a even better hybrid performance.
 
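The vector storage FAQ in this change points to binary and scalar quantisation without showing what either does to an embedding. The following is a minimal NumPy sketch of the general idea, not the linked blog's implementation or this model's own tooling. The 384-dimensional random vectors, the function names, and the per-dimension min/max calibration are all assumptions made for illustration.

```python
import numpy as np


def binary_quantize(embeddings: np.ndarray) -> np.ndarray:
    """Keep only the sign of each dimension and pack 8 dimensions into one byte."""
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=-1)


def scalar_quantize(embeddings: np.ndarray, calibration: np.ndarray) -> np.ndarray:
    """Map each float dimension to int8 using per-dimension ranges from a calibration set."""
    lo = calibration.min(axis=0)
    hi = calibration.max(axis=0)
    # Guard against constant dimensions, which would otherwise divide by zero.
    scale = np.where(hi > lo, (hi - lo) / 255.0, 1.0)
    buckets = np.round((np.clip(embeddings, lo, hi) - lo) / scale)  # 0..255
    return (buckets - 128).astype(np.int8)                          # -128..127


def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Count differing bits between two packed binary vectors."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())


# Hypothetical stand-in for real model output: 4 vectors of 384 float32 values.
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((4, 384)).astype(np.float32)

binary = binary_quantize(embeddings)
int8 = scalar_quantize(embeddings, calibration=embeddings)

print("float32 bytes per vector:", embeddings[0].nbytes)
print("int8 bytes per vector:   ", int8[0].nbytes)
print("binary bytes per vector: ", binary[0].nbytes)
```

Under these assumptions, int8 storage is 4x smaller than float32 and packed binary storage is 32x smaller. Binary vectors are usually compared with Hamming distance, and the top candidates are often rescored with higher-precision vectors to recover accuracy.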
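The hybrid search FAQ says that combining BM25 with a dense retriever is a good starting point, but does not spell out how the two score lists are merged. One common approach, assumed here and not necessarily the exact recipe used in the MIRACL paper, is to min-max normalise each score list and take a weighted sum. The function names, the `alpha` weight, and the document ids and scores below are hypothetical.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so BM25 and dense scores become comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally relevant.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_fuse(
    bm25_scores: Dict[str, float],
    dense_scores: Dict[str, float],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Weighted sum of normalised scores; alpha is the weight on the dense side."""
    bm25 = min_max_normalize(bm25_scores)
    dense = min_max_normalize(dense_scores)
    fused = {
        # A document missing from one retriever's results contributes 0 from that side.
        doc_id: alpha * dense.get(doc_id, 0.0) + (1 - alpha) * bm25.get(doc_id, 0.0)
        for doc_id in set(bm25) | set(dense)
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores for one query from each retriever.
bm25_scores = {"d1": 12.0, "d2": 7.5, "d3": 3.0}
dense_scores = {"d2": 0.82, "d3": 0.79, "d4": 0.40}

for doc_id, score in hybrid_fuse(bm25_scores, dense_scores):
    print(doc_id, round(score, 3))
```

In this sketch the dense scores would come from cosine similarities of miniMiracle_te_v1 embeddings and the BM25 scores from any lexical index. The value of `alpha` is a tuning knob that would need to be chosen on held-out queries.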