svilupp/onnx-cross-encoders

Text Classification

svilupp committed 30 days ago

Commit 158957f • 1 Parent(s): 385fcc5

Update README.md

Files changed (1)

README.md +47 -3

README.md CHANGED

@@ -1,3 +1,47 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+datasets:
+- microsoft/ms_marco
+language:
+- en
+pipeline_tag: text-classification
+tags:
+- onnx
+- cross-encoder
+---
+# Cross-Encoder for MS Marco - ONNX
+ONNX versions of [Sentence Transformers Cross Encoders](https://huggingface.co/cross-encoder).
+The models were trained on the [MS Marco Passage Ranking](https://github.com/microsoft/MSMARCO-Passage-Ranking) task.
+The models can be used for information retrieval: given a query, encode the query with each candidate passage (e.g. retrieved with ElasticSearch), then sort the passages by their scores in decreasing order. See [SBERT.net Retrieve & Re-rank](https://www.sbert.net/examples/applications/retrieve_rerank/README.html) for more details. The training code is available here: [SBERT.net Training MS Marco](https://github.com/UKPLab/sentence-transformers/tree/master/examples/training/ms_marco).
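The re-ranking step described above can be sketched independently of any particular model. This is a minimal sketch, assuming a hypothetical `score_fn` that returns one relevance score per (query, passage) pair. The `rerank` helper and the toy `overlap_score` scorer are illustrative names, not part of this repository; in practice the scores would come from the cross-encoder.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]


def rerank(
    query: str,
    passages: Sequence[str],
    score_fn: Callable[[List[Pair]], Sequence[float]],
) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair, then sort by score, highest first."""
    pairs = [(query, passage) for passage in passages]
    scores = score_fn(pairs)
    ranked = sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)
    return [(passage, float(score)) for passage, score in ranked]


def overlap_score(pairs: List[Pair]) -> List[float]:
    """Toy stand-in for a cross-encoder: counts words shared by query and passage."""
    return [
        float(len(set(q.lower().split()) & set(p.lower().split())))
        for q, p in pairs
    ]


candidates = ["new york museum", "berlin population count"]
results = rerank("berlin population", candidates, overlap_score)
```

Swapping `overlap_score` for a function that tokenizes the pairs and runs the ONNX session keeps the sorting logic unchanged.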
+## Models Available
+| Model Name                              | Precision | File Name                            | File Size |
+|-----------------------------------------|-----------|--------------------------------------|-----------|
+| ms-marco-MiniLM-L-4-v2 ONNX             | FP32      | ms-marco-MiniLM-L-4-v2-onnx.zip      | 70 MB     |
+| ms-marco-MiniLM-L-4-v2 ONNX (Quantized) | INT8      | ms-marco-MiniLM-L-4-v2-onnx-int8.zip | 12.8 MB   |
+| ms-marco-MiniLM-L-6-v2 ONNX             | FP32      | ms-marco-MiniLM-L-6-v2-onnx.zip      | 83.4 MB   |
+| ms-marco-MiniLM-L-6-v2 ONNX (Quantized) | INT8      | ms-marco-MiniLM-L-6-v2-onnx-int8.zip | 15.2 MB   |
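Since the models ship as zip archives, they need to be unpacked before use. The following is a minimal sketch using only the standard library. The `extract_model_archive` helper is a hypothetical name, and the internal layout of the archives is an assumption: the helper extracts into a directory named after the archive and then searches for any `.onnx` file instead of relying on a fixed path.

```python
import zipfile
from pathlib import Path
from typing import List, Optional


def extract_model_archive(zip_path: str, target_dir: Optional[str] = None) -> List[Path]:
    """Extract a model archive and return the paths of any .onnx files found.

    By default the files go into a directory named after the archive,
    e.g. "model-onnx.zip" is extracted into "model-onnx/".
    """
    archive = Path(zip_path)
    out_dir = Path(target_dir) if target_dir else archive.with_suffix("")
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(out_dir)
    # The archive layout is not assumed: search recursively for model files.
    return sorted(out_dir.rglob("*.onnx"))
```

For example, `extract_model_archive("ms-marco-MiniLM-L-4-v2-onnx.zip")` would unpack into `ms-marco-MiniLM-L-4-v2-onnx/` and list whatever `.onnx` files the archive contains.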
+## Usage with ONNX Runtime
+```python
+import onnxruntime as ort
+from transformers import AutoTokenizer
+model_path = "ms-marco-MiniLM-L-4-v2-onnx/"
+tokenizer = AutoTokenizer.from_pretrained(model_path)
+ort_sess = ort.InferenceSession(model_path + "ms-marco-MiniLM-L-4-v2.onnx")
+# Each query is paired with the passage at the same index
+queries = ['How many people live in Berlin?', 'How many people live in Berlin?']
+passages = [
+    'Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.',
+    'New York City is famous for the Metropolitan Museum of Art.',
+]
+features = tokenizer(queries, passages, padding=True, truncation=True, return_tensors="np")
+# ONNX Runtime expects a plain dict of NumPy arrays
+ort_outs = ort_sess.run(None, dict(features))
+print(ort_outs)
+```
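`ort_sess.run` returns a list of output arrays, so `ort_outs` still has to be turned into a ranking. This is a minimal sketch under one assumption about the export: the first output holds one logit per (query, passage) pair, with shape `(num_pairs, 1)`. The `rank_from_logits` helper is a hypothetical name, and the logits below are made-up values standing in for real model output.

```python
import numpy as np


def rank_from_logits(ort_outs, passages):
    """Return (passage, probability) tuples sorted from most to least relevant."""
    # Assumption: the first output holds one logit per input pair.
    logits = np.asarray(ort_outs[0], dtype=np.float64).reshape(-1)
    if logits.shape[0] != len(passages):
        raise ValueError("expected one logit per passage")
    # Sigmoid maps logits to (0, 1); it is monotonic, so the order is unchanged.
    probs = 1.0 / (1.0 + np.exp(-logits))
    order = np.argsort(-logits)
    return [(passages[i], float(probs[i])) for i in order]


# Made-up logits for illustration only, not actual model output.
example_outs = [np.array([[-4.2], [8.5]], dtype=np.float32)]
example_passages = [
    "New York City is famous for the Metropolitan Museum of Art.",
    "Berlin has a population of 3,520,031 registered inhabitants.",
]
ranked = rank_from_logits(example_outs, example_passages)
```

Applying the sigmoid is optional: sorting on the raw logits gives the same order, and the probabilities are only easier to read.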
+## Performance
+TBU...