---
license: cc-by-4.0
language:
- ss
- en
pipeline_tag: text2text-generation
tags:
- m2m100
- translation
- africanlp
- african
- siswati
---

# [ss-en] Siswati to English Translation Model based on M2M100 and The South African Gov-ZA multilingual corpus

This model was created from Siswati-to-English aligned sentences from [The South African Gov-ZA multilingual corpus](https://github.com/dsfsi/gov-za-multilingual).

The data set contains cabinet statements from the South African government, maintained by the Government Communication and Information System (GCIS). The data was scraped from the government's website: https://www.gov.za/cabinet-statements

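Since the model is based on M2M100, it can presumably be loaded with the standard M2M100 classes from the Hugging Face `transformers` library. The following is a minimal sketch under that assumption, not a confirmed recipe from the authors. The repository ID in `MODEL_ID` is hypothetical (this card does not state it) and must be replaced with the actual ID of this model; the helper names `load_model` and `translate` are illustrative, and the input sentence is only a placeholder.

```python
# Hypothetical repository ID: this card does not state it, so replace it
# with the real Hub ID of this model before running.
MODEL_ID = "your-namespace/ss-en-m2m100"


def load_model(model_id=MODEL_ID):
    """Load the tokenizer and model with the standard M2M100 classes."""
    # Imported here so that translate() can be reused with any compatible
    # model/tokenizer pair without importing transformers at module load.
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    tokenizer = M2M100Tokenizer.from_pretrained(model_id)
    model = M2M100ForConditionalGeneration.from_pretrained(model_id)
    return model, tokenizer


def translate(sentences, model, tokenizer, src_lang="ss", tgt_lang="en"):
    """Translate a list of sentences from src_lang to tgt_lang."""
    # M2M100 tokenizers prepend a source-language token to the input.
    tokenizer.src_lang = src_lang
    encoded = tokenizer(sentences, return_tensors="pt", padding=True)
    # Forcing the first generated token selects the target language.
    generated = model.generate(
        **encoded,
        forced_bos_token_id=tokenizer.get_lang_id(tgt_lang),
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)


# Example usage (downloads the model, so it is left commented out):
# model, tokenizer = load_model()
# print(translate(["Sawubona"], model, tokenizer))
```

The language codes `ss` and `en` match the `language` entries in this card's metadata. Because the training data consists of cabinet statements, output quality on text from other domains may differ.
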
## Authors

- Vukosi Marivate - [@vukosi](https://twitter.com/vukosi)
- Matimba Shingange
- Richard Lastrucci
- Isheanesu Joseph Dzingirai
- Jenalea Rajab

## BibTeX entry and citation info

```
@inproceedings{lastrucci-etal-2023-preparing,
    title = "Preparing the Vuk{'}uzenzele and {ZA}-gov-multilingual {S}outh {A}frican multilingual corpora",
    author = "Richard Lastrucci and Isheanesu Dzingirai and Jenalea Rajab and Andani Madodonga and Matimba Shingange and Daniel Njini and Vukosi Marivate",
    booktitle = "Proceedings of the Fourth workshop on Resources for African Indigenous Languages (RAIL 2023)",
    month = may,
    year = "2023",
    address = "Dubrovnik, Croatia",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.rail-1.3",
    pages = "18--25"
}
```

[Paper - Preparing the Vuk'uzenzele and ZA-gov-multilingual South African multilingual corpora](https://arxiv.org/abs/2303.03750)