---
license: mit
---
This is a DPR model trained for NeuCLIR. It starts from an XLMR-Large language model pretrained with C3 and is trained with multilingual translate-train (MTT) on MS-MARCO English queries paired with documents translated into Chinese, Persian, and Russian.
The translated documents are available as [neuMARCO](https://ir-datasets.com/neumarco.html) on `ir-datasets`.
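
As a rough illustration of where the translated documents come from, the sketch below builds the `ir-datasets` identifiers for the three neuMARCO languages and reads a few documents. The helper names (`neumarco_id`, `iter_docs`) and the `NEUMARCO_LANGS` mapping are hypothetical and exist only for this example; the choice of the `train` split is an assumption, not a statement of the exact data used to train this model.

```python
from itertools import islice

# Hypothetical mapping for this example: language name -> neuMARCO language code.
NEUMARCO_LANGS = {"Chinese": "zh", "Persian": "fa", "Russian": "ru"}


def neumarco_id(language, split="train"):
    """Build an ir-datasets identifier such as 'neumarco/zh/train'."""
    return f"neumarco/{NEUMARCO_LANGS[language]}/{split}"


def iter_docs(language, split="train", limit=3):
    """Yield (doc_id, text) for the first `limit` translated documents.

    The import is deferred so that building identifiers does not require
    the package; the first call downloads the collection.
    """
    import ir_datasets

    dataset = ir_datasets.load(neumarco_id(language, split))
    for doc in islice(dataset.docs_iter(), limit):
        yield doc.doc_id, doc.text


if __name__ == "__main__":
    for language in NEUMARCO_LANGS:
        print(neumarco_id(language))
```

Calling `list(iter_docs("Persian"))` would then return a few `(doc_id, text)` pairs from the Persian translation, assuming `ir_datasets` is installed and the download succeeds.
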
Please cite the following papers if you use this model:
```bibtex
@inproceedings{sigir2022c3,
author = {Eugene Yang and Suraj Nair and Ramraj Chandradevan and Rebecca Iglesias-Flores and Douglas W. Oard},
title = {C3: Continued Pretraining with Contrastive Weak Supervision for Cross Language Ad-Hoc Retrieval},
booktitle = {Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) (Short Paper)},
year = {2022},
url = {https://arxiv.org/abs/2204.11989}
}
@inproceedings{ecir2023mlir,
title = {Neural Approaches to Multilingual Information Retrieval},
author = {Dawn Lawrie and Eugene Yang and Douglas W Oard and James Mayfield},
booktitle = {Proceedings of the 45th European Conference on Information Retrieval (ECIR)},
year = {2023},
url = {https://arxiv.org/abs/2209.01335}
}
```