Sinhala News Media Identification
This is a text classification task built from the NSINA dataset. The dataset is released under the same license as NSINA.
Data
The data can be loaded into pandas DataFrames using the following code.
from datasets import load_dataset

# Load each split from the Hugging Face Hub and convert it to a pandas DataFrame
train = load_dataset('sinhala-nlp/NSINA-Media', split='train').to_pandas()
test = load_dataset('sinhala-nlp/NSINA-Media', split='test').to_pandas()
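Once the splits are loaded, a quick look at the class balance is often useful before training a classifier. The following is a minimal sketch, not part of the dataset's own tooling: the helper name `label_distribution` is hypothetical, and the label column name is an assumption, so check `train.columns` for the actual name before using it.

```python
from collections import Counter


def label_distribution(labels):
    """Return (label, count, share) tuples, most frequent label first.

    `labels` can be any iterable of hashable values, for example a
    pandas Series taken from one column of the loaded DataFrame.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return [(label, n, n / total) for label, n in counts.most_common()]


# Toy input standing in for a real label column.
toy_labels = ["a", "b", "a", "c", "a", "b"]
for label, n, share in label_distribution(toy_labels):
    print(f"{label}\t{n}\t{share:.1%}")

# With the real data (column name is an assumption, inspect train.columns first):
# label_distribution(train["<label column>"])
```

The helper uses only the standard library so it works on a plain list as well as on a DataFrame column.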
Citation
If you use the dataset or the models, please cite the following paper.
@inproceedings{Nsina2024,
author={Hettiarachchi, Hansi and Premasiri, Damith and Uyangodage, Lasitha and Ranasinghe, Tharindu},
title={{NSINA: A News Corpus for Sinhala}},
booktitle={The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
year={2024},
month={May},
}