ColBERTer (Dim: 32) for Passage Retrieval

If you want to know more about our ColBERTer architecture, check out our paper: https://arxiv.org/abs/2203.13088 🎉

For more information, source code, and a minimal usage example, please visit: https://github.com/sebastian-hofstaetter/colberter

Limitations & Bias

  • The model was trained only on English text.

  • The model inherits social biases from both DistilBERT and MSMARCO.

  • The model was trained only on the relatively short passages of MSMARCO (about 60 words on average), so it might struggle with longer text.
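
One possible way to work around the passage-length limitation is to split longer documents into overlapping windows of roughly the training passage length before encoding them. The sketch below is an illustration only and is not part of the ColBERTer code base: the helper name split_into_passages and the defaults max_words=60 and overlap=10 are assumptions chosen to match the average MSMARCO passage length mentioned above.

```python
def split_into_passages(text: str, max_words: int = 60, overlap: int = 10) -> list[str]:
    """Split text into overlapping windows of at most max_words whitespace-separated words.

    Hypothetical helper; the defaults are assumptions, not values from the ColBERTer paper.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap  # how far each window advances
    passages = []
    for start in range(0, len(words), step):
        passages.append(" ".join(words[start:start + max_words]))
        # Stop once the current window already reaches the end of the text.
        if start + max_words >= len(words):
            break
    return passages


if __name__ == "__main__":
    long_text = " ".join(f"w{i}" for i in range(130))
    for passage in split_into_passages(long_text):
        print(len(passage.split()), passage[:30])
```

Each resulting passage can then be encoded and scored separately. How the per-passage scores are combined into a document score (for example, taking the maximum) is a separate design choice that this sketch does not cover.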

Citation

If you use our model checkpoint please cite our work as:

@article{Hofstaetter2022_colberter,
 author = {Sebastian Hofst{\"a}tter and Omar Khattab and Sophia Althammer and Mete Sertkan and Allan Hanbury},
 title = {Introducing Neural Bag of Whole-Words with ColBERTer: Contextualized Late Interactions using Enhanced Reduction},
 publisher = {arXiv},
 url = {https://arxiv.org/abs/2203.13088},
 doi = {10.48550/ARXIV.2203.13088},
 year = {2022},
}
