IndicIRSuite: Multilingual Dataset and Neural Information Models for Indian Languages

Paper link: https://arxiv.org/abs/2312.09508

Dataset link: https://huggingface.co/datasets/saifulhaq9/indicmarco

Model link: https://huggingface.co/saifulhaq9/indiccolbert

Contributors & Acknowledgements

Key Contributors and Team Members: Saiful Haq, Ashutosh Sharma, Pushpak Bhattacharyya

Please cite our paper if you use our datasets or models:

@article{haq2023indicirsuite,
  title={IndicIRSuite: Multilingual Dataset and Neural Information Models for Indian Languages},
  author={Haq, Saiful and Sharma, Ashutosh and Bhattacharyya, Pushpak},
  journal={arXiv preprint arXiv:2312.09508},
  year={2023}
}

About

This repository contains multilingual ColBERT models for 11 Indian languages.

Language Code to Language Mapping

asm_Beng: Assamese Language

ben_Beng: Bengali Language

guj_Gujr: Gujarati Language

hin_Deva: Hindi Language

kan_Knda: Kannada Language

mal_Mlym: Malayalam Language

mar_Deva: Marathi Language

ory_Orya: Oriya Language

pan_Guru: Punjabi Language

tam_Taml: Tamil Language

tel_Telu: Telugu Language
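
The mapping above can be kept as a small lookup table when selecting a language programmatically. The following is a minimal sketch, not part of the released code: the names LANGUAGE_CODES, language_name, and script_of are hypothetical helpers introduced here for illustration. It assumes each code is a three-letter language tag and a four-letter script tag joined by an underscore, which is how the codes listed above appear to be formed.

```python
# Language codes copied from the mapping in this model card.
# LANGUAGE_CODES, language_name and script_of are illustrative names,
# not identifiers from the IndicIRSuite release.
LANGUAGE_CODES = {
    "asm_Beng": "Assamese",
    "ben_Beng": "Bengali",
    "guj_Gujr": "Gujarati",
    "hin_Deva": "Hindi",
    "kan_Knda": "Kannada",
    "mal_Mlym": "Malayalam",
    "mar_Deva": "Marathi",
    "ory_Orya": "Oriya",
    "pan_Guru": "Punjabi",
    "tam_Taml": "Tamil",
    "tel_Telu": "Telugu",
}


def language_name(code: str) -> str:
    """Return the language name for a code, or raise KeyError if it is not listed."""
    if code not in LANGUAGE_CODES:
        raise KeyError(f"Unknown language code: {code!r}")
    return LANGUAGE_CODES[code]


def script_of(code: str) -> str:
    """Return the script part of a code (assumed to follow the underscore)."""
    return code.split("_", 1)[1]


if __name__ == "__main__":
    for code in sorted(LANGUAGE_CODES):
        print(code, language_name(code), script_of(code))
```

Keeping the codes in one table makes it easy to validate user input before requesting a language-specific subset or model.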
