svilupp committed on
Commit
158957f
1 Parent(s): 385fcc5

Update README.md

  1. README.md +47 -3
README.md CHANGED
@@ -1,3 +1,47 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ datasets:
4
+ - microsoft/ms_marco
5
+ language:
6
+ - en
7
+ pipeline_tag: text-classification
8
+ tags:
9
+ - onnx
10
+ - cross-encoder
11
+ ---
12
+
13
+ # Cross-Encoder for MS Marco - ONNX
14
+
15
+ ONNX versions of [Sentence Transformers Cross Encoders](https://huggingface.co/cross-encoder).
16
+
17
+ The models were trained on the [MS Marco Passage Ranking](https://github.com/microsoft/MSMARCO-Passage-Ranking) task.
18
+
19
+ The models can be used for information retrieval: given a query, encode the query with each candidate passage (e.g. retrieved with ElasticSearch), then sort the passages by score in decreasing order. See [SBERT.net Retrieve & Re-rank](https://www.sbert.net/examples/applications/retrieve_rerank/README.html) for more details. The training code is available here: [SBERT.net Training MS Marco](https://github.com/UKPLab/sentence-transformers/tree/master/examples/training/ms_marco).
20
+
21
+ ## Models Available
22
+
23
+ | Model Name | Precision | File Name | File Size |
24
+ |--------------------------------------|-----------|------------------------------------------|-----------|
25
+ | ms-marco-MiniLM-L-4-v2 ONNX | FP32 | ms-marco-MiniLM-L-4-v2-onnx.zip | 70 MB |
26
+ | ms-marco-MiniLM-L-4-v2 ONNX (Quantized) | INT8 | ms-marco-MiniLM-L-4-v2-onnx-int8.zip | 12.8 MB |
27
+ | ms-marco-MiniLM-L-6-v2 ONNX | FP32 | ms-marco-MiniLM-L-6-v2-onnx.zip | 83.4 MB |
28
+ | ms-marco-MiniLM-L-6-v2 ONNX (Quantized) | INT8 | ms-marco-MiniLM-L-6-v2-onnx-int8.zip | 15.2 MB |
29
+
30
+ ## Usage with ONNX Runtime
31
+
32
+ ```python
33
+ import onnxruntime as ort
34
+ from transformers import AutoTokenizer
35
+
36
+ model_path = "ms-marco-MiniLM-L-4-v2-onnx/"
37
+ tokenizer = AutoTokenizer.from_pretrained(model_path)  # pass the variable, not the literal string 'model_path'
38
+ ort_sess = ort.InferenceSession(model_path + "ms-marco-MiniLM-L-4-v2.onnx")
39
+
40
+ features = tokenizer(['How many people live in Berlin?', 'How many people live in Berlin?'], ['Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.', 'New York City is famous for the Metropolitan Museum of Art.'], padding=True, truncation=True, return_tensors="np")
41
+ ort_outs = ort_sess.run(None, dict(features))  # run() expects a plain dict of NumPy arrays
42
+ print(ort_outs)
43
+ ```
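The re-ranking step described above (sorting passages by decreasing score) could be sketched as follows. This is a minimal illustration, not part of the original README: `rank_passages` is a hypothetical helper, and the example logits are made-up stand-ins for `ort_outs[0]`. It assumes the model returns one relevance logit per (query, passage) pair, shaped `(n,)` or `(n, 1)`.

```python
import numpy as np

def rank_passages(passages, logits):
    """Return (passage, score) pairs sorted by descending relevance score."""
    # Assumption: one logit per (query, passage) pair, shaped (n,) or (n, 1).
    scores = np.asarray(logits, dtype=np.float32).reshape(-1)
    if scores.shape[0] != len(passages):
        raise ValueError("expected one score per passage")
    order = np.argsort(-scores, kind="stable")  # highest score first
    return [(passages[i], float(scores[i])) for i in order]

# Hypothetical logits standing in for ort_outs[0]; real values come from the model.
passages = [
    "New York City is famous for the Metropolitan Museum of Art.",
    "Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.",
]
example_logits = np.array([[-11.0], [8.5]], dtype=np.float32)

ranked = rank_passages(passages, example_logits)
for passage, score in ranked:
    print(f"{score:.2f}\t{passage}")
```

With real model output, `rank_passages(passages, ort_outs[0])` would be the likely call, provided the first output holds the logits.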
44
+
45
+ ## Performance
46
+
47
+ To be updated.