---
license: cc
language:
- ve
metrics:
- perplexity
library_name: transformers
tags:
- tshivenda
- south africa
- low-resource
- bantu
- xlm-roberta
widget:
- text: "Rabulasi wa <mask> u khou bvelela nga u lima"
- text: "Vhana vhane vha kha ḓi bva u bebwa vha kha khombo ya u <mask> nga Listeriosis"
---

# Zabantu - Tshivenda

This is a variant of [Zabantu](https://huggingface.co/dsfsi/zabantu-bantu-250m) pre-trained on a monolingual dataset of Tshivenda (ven) sentences, using a transformer network with 120 million trainable parameters.

# Usage Example(s)

```python
from transformers import pipeline

# Initialize the pipeline for masked language modeling
unmasker = pipeline('fill-mask', model='dsfsi/zabantu-ven-120m')

sample_sentences = [
    "Rabulasi wa <mask> u khou bvelela nga u lima",
    "Vhana vhane vha kha ḓi bva u bebwa vha kha khombo ya u <mask> nga Listeriosis",
]

# Perform the fill-mask task on each sentence in turn
for sentence in sample_sentences:
    results = unmasker(sentence)

    # Display the candidate tokens for the masked position
    for result in results:
        print(f"Predicted word: {result['token_str']} - Score: {result['score']}")
        print(f"Full sentence: {result['sequence']}\n")

    print("=" * 80)
```
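For a single sentence with one `<mask>`, the fill-mask pipeline returns a list of dictionaries with `score`, `token_str`, and `sequence` keys. A minimal sketch of post-processing that output is shown below. The helper name `filter_predictions`, the `min_score` threshold, and the placeholder records are assumptions made for illustration; they are not part of the model or of `transformers`, and the values are not real model output.

```python
def filter_predictions(results, min_score=0.05):
    """Keep predictions scoring at least min_score, sorted best first.

    Hypothetical helper: `results` is assumed to be a list of dicts with a
    'score' key, as returned by a fill-mask pipeline for one masked sentence.
    """
    kept = [r for r in results if r["score"] >= min_score]
    return sorted(kept, key=lambda r: r["score"], reverse=True)


# Placeholder records in the shape of fill-mask output
# (illustrative values only, NOT real predictions from the model).
example_results = [
    {"score": 0.10, "token_str": "placeholder_b", "sequence": "... placeholder_b ..."},
    {"score": 0.60, "token_str": "placeholder_a", "sequence": "... placeholder_a ..."},
    {"score": 0.01, "token_str": "placeholder_c", "sequence": "... placeholder_c ..."},
]

for r in filter_predictions(example_results, min_score=0.05):
    print(f"{r['token_str']}: {r['score']:.2f}")
```

With the real pipeline, the same helper could be applied as `filter_predictions(unmasker(sentence, top_k=10))`, where `top_k` controls how many candidates the pipeline returns. A suitable threshold depends on the use case and would need to be chosen empirically.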