
Takalani Sesame - Ndebele 🇿🇦

Model description

Takalani Sesame (named after the South African version of Sesame Street) is a project that aims to promote the use of South African languages in NLP and, in particular, to explore techniques that help low-resource languages close the performance gap with larger languages around the world.

Intended uses & limitations

How to use

from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("jannesg/takalane_nbl_roberta")

# AutoModelWithLMHead is deprecated; for this RoBERTa masked-language model,
# AutoModelForMaskedLM is the current equivalent.
model = AutoModelForMaskedLM.from_pretrained("jannesg/takalane_nbl_roberta")
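
Since this is a RoBERTa-style masked language model, one way to query it is through the fill-mask pipeline. The sketch below is a usage assumption rather than part of the original card; the input sentence is only a placeholder and should be replaced with an isiNdebele sentence containing the mask token.

from transformers import pipeline

# Load the model into a fill-mask pipeline (hypothetical usage sketch).
fill_mask = pipeline("fill-mask", model="jannesg/takalane_nbl_roberta")

# Replace the placeholder text with an isiNdebele sentence that includes the mask token.
masked_sentence = f"Your isiNdebele sentence with a {fill_mask.tokenizer.mask_token} in it."
for prediction in fill_mask(masked_sentence):
    print(prediction["token_str"], prediction["score"])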

Limitations and bias

Updates will be added continuously to improve performance. Because this is a very low-resource language, results may be poor at first.

Training data

Data collected from the Leipzig Corpora Collection: https://wortschatz.uni-leipzig.de/en
Sentences: 318M

Training procedure

No preprocessing was applied; the model was trained with standard Hugging Face hyperparameters.
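
For concreteness, the sketch below shows what a standard Hugging Face masked-language-modelling setup with default Trainer hyperparameters typically looks like. It is an illustration under stated assumptions, not the authors' exact configuration: the file name nbl_sentences.txt, the reuse of the published tokenizer, and the sequence length are all hypothetical.

from datasets import load_dataset
from transformers import (
    DataCollatorForLanguageModeling,
    RobertaConfig,
    RobertaForMaskedLM,
    RobertaTokenizerFast,
    Trainer,
    TrainingArguments,
)

# Reuse the published tokenizer; the original run would have trained its own.
tokenizer = RobertaTokenizerFast.from_pretrained("jannesg/takalane_nbl_roberta")

# Hypothetical file of raw sentences, one sentence per line.
dataset = load_dataset("text", data_files={"train": "nbl_sentences.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Fresh RoBERTa model sized to the tokenizer's vocabulary.
model = RobertaForMaskedLM(RobertaConfig(vocab_size=tokenizer.vocab_size))

# Standard masked-language-modelling objective (15% of tokens masked).
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="takalane_nbl_roberta"),  # default hyperparameters
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()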

Author

Jannes Germishuys website
