We also evaluate the zero-shot performance on datasets where documents have a longer context length, and compare against several long-context embedding models. Here we use the [LoCo benchmark](https://www.together.ai/blog/long-context-retrieval-models-with-monarch-mixer), which contains 5 datasets with long context lengths.

| Model | Used context length | Model max context length | Avg. NDCG@10 |
| --- | :---: | :---: | :---: |
| ColBERTv2 | 512 | 512 | 74.3 |
| Jina-ColBERT-v1 (truncated) | 512* | 8192 | 75.5 |
| Jina-ColBERT-v1 | 8192 | 8192 | 83.7 |
| Jina-embeddings-v2-base-en | 8192 | 8192 | **85.4** |

\* denotes that we truncate the document context length to 512 while keeping the query length at 512.
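The table reports Avg. NDCG@10, the standard ranking metric used by LoCo. As a quick reference, a minimal sketch of how NDCG@k is computed from a list of graded relevance labels in ranked order (standard formula, not code from this repository):

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k ranked results."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    """NDCG@k: DCG of the ranking divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# A ranking that places the only relevant document at rank 1 scores 1.0;
# pushing it down the list lowers the score.
print(ndcg_at_k([1, 0, 0, 0]))  # 1.0
print(ndcg_at_k([0, 1, 0, 0]) < 1.0)  # True
```

The "Avg. NDCG@10" column is this score averaged over the queries of each dataset, then over the 5 LoCo datasets.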
**To summarize, Jina-ColBERT achieves comparable performance to ColBERTv2 on all benchmarks, and outperforms ColBERTv2 on datasets where documents have longer context lengths.**