This is a RoBERTa model trained on kubhist2 (https://spraakbanken.gu.se/en/resources/kubhist2, https://spraakbanken.gu.se/blogg/index.php/2019/09/15/the-kubhist-corpus-of-swedish-newspapers/). A Hugging Face version of kubhist2 is available here: https://huggingface.co/datasets/ChangeIsKey/kubhist2
This is a work in progress: the quality of the model, just like the quality of the training data, is far from great.
It is shared here with no guarantee whatsoever and will likely change. Use at your own risk.
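For readers who want to try the model, the following is a minimal sketch of masked-word prediction with the `transformers` fill-mask pipeline. This card does not state the model's Hub ID, so `MODEL_ID` is a hypothetical placeholder that must be replaced, and the helper names `mask_word` and `top_predictions` are illustrative only. The sketch assumes the checkpoint loads as a standard masked language model; the mask token is read from the tokenizer rather than hard-coded.

```python
# Hypothetical placeholder: replace with this model's actual Hub ID.
MODEL_ID = "your-namespace/your-kubhist2-roberta"


def mask_word(sentence: str, target: str, mask_token: str = "<mask>") -> str:
    """Replace the first occurrence of `target` in `sentence` with the mask token."""
    if target not in sentence:
        raise ValueError(f"{target!r} not found in sentence")
    return sentence.replace(target, mask_token, 1)


def top_predictions(sentence: str, target: str, model_id: str = MODEL_ID, k: int = 5):
    """Return the k most likely fillers for `target` as (token, score) pairs."""
    # Imported lazily so mask_word works without transformers installed.
    from transformers import pipeline

    fill = pipeline("fill-mask", model=model_id)
    # Use the tokenizer's own mask token instead of assuming "<mask>".
    masked = mask_word(sentence, target, fill.tokenizer.mask_token)
    return [(p["token_str"].strip(), p["score"]) for p in fill(masked, top_k=k)]
```

Calling `top_predictions("Konungen reste till Stockholm.", "Stockholm")` downloads the checkpoint on first use. Given the OCR noise in the training data, the predicted tokens may include misspelled or fragmentary word forms.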
Discussion of Biases
The model is trained on historical data. As such, outdated views might be present in the data.
Other Known Limitations
The data was produced by an OCR process, so the text contains errors, especially in the earlier decades.
Contact
Simon Hengchen, iguanodon.ai