antoinelouis
committed on
Update README.md
README.md
CHANGED
@@ -204,17 +204,17 @@ print(similarity)
 - **MS MARCO**:
 We evaluate our model on the small development set of [MS MARCO](https://ir-datasets.com/msmarco-passage.html#msmarco-passage/dev/small), which consists of 6,980 queries for a corpus of 8.8M candidate passages. Below, we compared its performance with other retrieval models on the official metrics for the dataset, i.e., mean reciprocal rank at cut-off 10 (MRR@10).
 
-| | model
-
-| 1 | BM25 ([Pyserini](https://github.com/castorini/pyserini))
-| 2 | mono-mT5 ([Bonfacio et al., 2021](https://
-| 3 | mono-mMiniLM ([Bonfacio et al., 2021](https://
-| 4 | [DPR-X](https://huggingface.co/eugene-yang/dpr-xlmr-large-mtt-neuclir) ([Yang et al., 2022](https://
-| 5 | [mE5-base](https://huggingface.co/intfloat/multilingual-e5-base) ([Wang et al., 2024](https://
-| 6 | mColBERT ([Bonfacio et al., 2021](https://
-| |
-| 7 | **DPR-XM** (ours)
-| 8 | [ColBERT-XM](https://huggingface.co/antoinelouis/colbert-xm) (ours)
+| | model | Type | #Samples | #Params | en | es | fr | it | pt | id | de | ru | zh | ja | nl | vi | hi | ar | Avg. |
+|---:|:----------------------------------------------------------------------------------------------------------------------------------------|:--------------|:--------:|:-------:|-----:|-----:|-----:|-----:|-----:|-----:|-----:|-----:|-----:|-----:|-----:|-----:|-----:|-----:|-----:|
+| 1 | BM25 ([Pyserini](https://github.com/castorini/pyserini)) | lexical | - | - | 18.4 | 15.8 | 15.5 | 15.3 | 15.2 | 14.9 | 13.6 | 12.4 | 11.6 | 14.1 | 14.0 | 13.6 | 13.4 | 11.1 | 14.2 |
+| 2 | mono-mT5 ([Bonfacio et al., 2021](https://doi.org/10.48550/arXiv.2108.13897)) | cross-encoder | 12.8M | 390M | 36.6 | 31.4 | 30.2 | 30.3 | 30.2 | 29.8 | 28.9 | 26.3 | 24.9 | 26.7 | 29.2 | 25.6 | 26.6 | 23.5 | 28.6 |
+| 3 | mono-mMiniLM ([Bonfacio et al., 2021](https://doi.org/10.48550/arXiv.2108.13897)) | cross-encoder | 80.0M | 107M | 36.6 | 30.9 | 29.6 | 29.1 | 28.9 | 29.3 | 27.8 | 25.1 | 24.9 | 26.3 | 27.6 | 24.7 | 26.2 | 21.9 | 27.8 |
+| 4 | [DPR-X](https://huggingface.co/eugene-yang/dpr-xlmr-large-mtt-neuclir) ([Yang et al., 2022](https://doi.org/10.48550/arXiv.2204.11989)) | single-vector | 25.6M | 550M | 24.5 | 19.6 | 18.9 | 18.3 | 19.0 | 16.9 | 18.2 | 17.7 | 14.8 | 15.4 | 18.5 | 15.1 | 15.4 | 12.9 | 17.5 |
+| 5 | [mE5-base](https://huggingface.co/intfloat/multilingual-e5-base) ([Wang et al., 2024](https://doi.org/10.48550/arXiv.2402.05672)) | single-vector | 5.1B | 278M | 35.0 | 28.9 | 30.3 | 28.0 | 27.5 | 26.1 | 27.1 | 24.5 | 22.9 | 25.0 | 27.3 | 23.9 | 24.2 | 20.5 | 26.5 |
+| 6 | mColBERT ([Bonfacio et al., 2021](https://doi.org/10.48550/arXiv.2108.13897)) | multi-vector | 25.6M | 180M | 35.2 | 30.1 | 28.9 | 29.2 | 29.2 | 27.5 | 28.1 | 25.0 | 24.6 | 23.6 | 27.3 | 18.0 | 23.2 | 20.9 | 26.5 |
+| | | | | | | | | | | | | | | | | | | | |
+| 7 | **DPR-XM** (ours) | single-vector | 25.6M | 277M | 32.7 | 23.6 | 23.5 | 22.3 | 22.7 | 22.0 | 22.1 | 19.9 | 18.1 | 18.7 | 22.9 | 18.0 | 16.0 | 15.1 | 21.3 |
+| 8 | [ColBERT-XM](https://huggingface.co/antoinelouis/colbert-xm) (ours) | multi-vector | 6.4M | 277M | 37.2 | 28.5 | 26.9 | 26.5 | 27.6 | 26.3 | 27.0 | 25.1 | 24.6 | 24.1 | 27.5 | 22.6 | 23.8 | 19.5 | 26.2 |
 
 ***
 
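For readers unfamiliar with the metric named in the README paragraph in this hunk, here is a minimal sketch of how mean reciprocal rank at cut-off 10 (MRR@10) is typically computed. It is not the author's evaluation code: the function name `mrr_at_k`, the dictionary layout, and the toy query and passage IDs are all hypothetical. The sketch also assumes that the table reports the score multiplied by 100, which the source does not state explicitly.

```python
from typing import Dict, List, Set


def mrr_at_k(
    rankings: Dict[str, List[str]],
    relevant: Dict[str, Set[str]],
    k: int = 10,
) -> float:
    """Mean reciprocal rank at cut-off k, averaged over all queries in `rankings`.

    rankings: query id -> passage ids ordered from best to worst (hypothetical layout).
    relevant: query id -> set of passage ids judged relevant.
    """
    if not rankings:
        return 0.0
    total = 0.0
    for qid, ranked_ids in rankings.items():
        gold = relevant.get(qid, set())
        # Only the first k results count; a query with no hit in the top k adds 0.
        for rank, pid in enumerate(ranked_ids[:k], start=1):
            if pid in gold:
                total += 1.0 / rank
                break
    return total / len(rankings)


# Toy data (made up for illustration only).
toy_rankings = {"q1": ["p3", "p1", "p7"], "q2": ["p9", "p4"], "q3": ["p2"]}
toy_relevant = {"q1": {"p1"}, "q2": {"p5"}, "q3": {"p2"}}

score = mrr_at_k(toy_rankings, toy_relevant)
# Assumed convention: scale by 100 to match the style of the table.
print(f"MRR@10 = {100 * score:.1f}")
```

In the toy data, `q1` finds its relevant passage at rank 2 (reciprocal rank 0.5), `q2` finds none (0), and `q3` finds it at rank 1 (1.0), so the mean is 0.5.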