Update README.md
README.md CHANGED
@@ -362,8 +362,8 @@ Click the expand button below to see the full list of corpora included in the tr
|[MultiUN](https://opus.nlpl.eu/MultiUN/corpus/version/MultiUN) | |fr | |
|[News-Commentary](https://opus.nlpl.eu/News-Commentary/corpus/version/News-Commentary) | |fr | |
|[NLLB](https://opus.nlpl.eu/NLLB/corpus/version/NLLB) |bg,da,el,en,et,fi,fr,gl,hu,it,lt,lv,pt,ro,sk,sl |bg,cs,da,de,el,et,fi,fr,hu,it,lt,lv,nl,pl,pt,ro,sk,sl,sv| bg,cs,cy,da,de,el,et,fi,fr,ga,hr,hu,it,lt,lv,mt,nl,no,oc,pl,pt,ro,ru,sk,sl,sr,sv,uk|
-|[NÓS Corpus](https://zenodo.org/records/7675110) | | | gl |
-|[NÓS
+|[NÓS Authentic Corpus](https://zenodo.org/records/7675110) | | | gl |
+|[NÓS Synthetic Corpus](https://zenodo.org/records/7685180) | | | gl |
|[NTEU](https://www.elrc-share.eu/repository/search/?q=NTEU) | |bg,cs,da,de,el,en,et,fi,fr,ga,hr,hu,it,lt,lv,mt,nl,pl,pt,ro,sk,sl,sv | da,et,ga,hr,lt,lv,mt,ro,sk,sl,sv |
|[OpenSubtitles](https://opus.nlpl.eu/OpenSubtitles/corpus/version/OpenSubtitles) |bg,cs,da,de,el,et,eu,fi,gl,hr,hu,lt,lv,nl,pl,pt,ro,sk,sl,sv |da,de,fi,fr,hr,hu,it,lv,nl | bg,cs,de,el,et,hr,fi,fr,hr,hu,no,sl,sr|
|[OPUS-100](https://opus.nlpl.eu/opus-100.php) | en | | gl |
@@ -470,7 +470,8 @@ Click the expand button below to see the full list of tasks included in the fine
| Context-Aware Translation | [TowerBlocks](https://huggingface.co/datasets/Unbabel/TowerBlocks-v0.2): [MT-GenEval](https://github.com/amazon-science/machine-translation-gender-eval) | en-de | 558 |
|**Total** | | | **135,404** |

-The non-public portion of this dataset was jointly created by the [ILENIA](https://proyectoilenia.es/) partners BSC, HiTZ
+The non-public portion of this dataset was jointly created by the [ILENIA](https://proyectoilenia.es/) partners BSC, [HiTZ](http://hitz.ehu.eus/es),
+and [CiTIUS](https://citius.gal/es/). For further information regarding the instruction-tuning data,
please contact <langtech@bsc.es>.

</details>
@@ -498,7 +499,11 @@ please contact <langtech@bsc.es>.

## Evaluation

-Below are the evaluation results on the [Flores+200 devtest set](https://huggingface.co/datasets/openlanguagedata/flores_plus),
+Below are the evaluation results on the [Flores+200 devtest set](https://huggingface.co/datasets/openlanguagedata/flores_plus),
+compared against the state-of-the-art MADLAD400-7B model ([Kudugunta, S., et al.](https://arxiv.org/abs/2309.04662)) and the SalamandraTA-7b-base model.
+These results cover the CA-XX, ES-XX and EN-XX translation directions, as well as XX-CA, XX-ES and XX-EN.
+The metrics have been computed excluding Asturian, Aranese and Aragonese, as we report them separately.
+The evaluation was conducted using [MT Lens](https://github.com/langtech-bsc/mt-evaluation) following the standard setting (beam search with beam size 5, limiting the translation length to 500 tokens). We report the following metrics:

<details>
<summary>Click to show metrics details</summary>
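The decoding configuration quoted above (beam search with beam size 5, at most 500 generated tokens) can be approximated outside MT Lens with a plain `transformers` generation call. The sketch below only illustrates that setting: the model identifier, prompt wording and language pair are assumptions, and the exact prompts and post-processing used by MT Lens are not shown in this diff.

```python
# Minimal sketch, NOT the MT Lens pipeline: beam-search decoding with the
# settings quoted above (num_beams=5, at most 500 new tokens).
# The model id, prompt wording and language pair are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BSC-LT/salamandraTA-7b-instruct"  # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Hypothetical translation prompt; the official template may differ.
messages = [{
    "role": "user",
    "content": "Translate the following text from Spanish into Catalan.\n"
               "Spanish: El clima de hoy es soleado.\nCatalan:",
}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids,
    num_beams=5,         # beam size 5, as stated in the evaluation setting
    max_new_tokens=500,  # translation length limit, as stated
    do_sample=False,     # pure beam search, no sampling
)
print(tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True))
```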
@@ -639,7 +644,9 @@ This section presents the evaluation metrics for Basque translation tasks.

### Low-Resource Languages of Spain

-The tables below summarize the performance metrics for English, Spanish, and Catalan to Asturian, Aranese and Aragonese compared
+The tables below summarize the performance metrics for English, Spanish, and Catalan to Asturian, Aranese and Aragonese, compared
+against [Transducens/IbRo-nllb](https://huggingface.co/Transducens/IbRo-nllb) [(Galiano Jimenez et al.)](https://aclanthology.org/2024.wmt-1.85/),
+NLLB-3.3B ([Costa-jussà et al., 2022](https://arxiv.org/abs/2207.04672)) and [SalamandraTA-2B](https://huggingface.co/BSC-LT/salamandraTA-2B).

<details>
<summary>English evaluation</summary>
@@ -674,18 +681,18 @@ The tables below summarize the performance metrics for English, Spanish, and Cat
| SalamandraTA-7b-instruct | es | ast | **21.28** | **68.11** | **52.73** |
| SalamandraTA-7b-base | es | ast | 17.65 | 75.78 | 51.05 |
| Transducens/IbRo-nllb | es | ast | 16.79 | 76.36 | 50.89 |
-
+| SalamandraTA-2B | es | ast | 16.68 | 77.29 | 49.46 |
| nllb-3.3B | es | ast | 11.85 | 100.86 | 40.27 |
| | | | | | |
| SalamandraTA-7b-base | es | arn | **29.19** | **71.85** | **49.42** |
| Transducens/IbRo-nllb | es | arn | 28.45 | 72.56 | 49.28 |
| SalamandraTA-7b-instruct | es | arn | 26.82 | 74.04 | 47.55 |
-
+| SalamandraTA-2B | es | arn | 25.41 | 74.71 | 47.33 |
| | | | | | |
| Transducens/IbRo-nllb | es | arg | **59.75** | **28.01** | **78.73** |
| SalamandraTA-7b-base | es | arg | 53.96 | 31.51 | 76.08 |
| SalamandraTA-7b-instruct | es | arg | 47.54 | 36.57 | 72.38 |
-
+| SalamandraTA-2B | es | arg | 44.57 | 37.93 | 71.32 |

</details>

@@ -701,19 +708,19 @@ The tables below summarize the performance metrics for English, Spanish, and Cat
|:---------------------------------|:---------|:---------|-------:|-------:|-------:|
| SalamandraTA-7b-instruct | ca | ast | **27.86** | **58.19** | 57.98 |
| SalamandraTA-7b-base | ca | ast | 26.11 | 63.63 | **58.08** |
-
+| SalamandraTA-2B | ca | ast | 25.32 | 62.59 | 55.98 |
| Transducens/IbRo-nllb | ca | ast | 24.77 | 61.60 | 57.49 |
| nllb-3.3B | ca | ast | 17.17 | 91.47 | 45.83 |
| | | | | | |
| SalamandraTA-7b-base | ca | arn | **17.77** | **80.88** | **42.12** |
| Transducens/IbRo-nllb | ca | arn | 17.51 | 81.18 | 41.91 |
| SalamandraTA-7b-instruct | ca | arn | 16.45 | 82.01 | 41.04 |
-
+| SalamandraTA-2B | ca | arn | 15.37 | 82.76 | 40.53 |
| | | | | | |
| Transducens/IbRo-nllb | ca | arg | **24.44** | **60.79** | **55.51** |
| SalamandraTA-7b-base | ca | arg | 22.53 | 62.37 | 54.32 |
| SalamandraTA-7b-instruct | ca | arg | 21.62 | 63.38 | 53.01 |
-
+| SalamandraTA-2B | ca | arg | 18.6 | 65.82 | 51.21 |

</details>

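As a rough guide to how score columns like those in the tables above might be reproduced, the snippet below computes corpus-level BLEU, TER and ChrF with `sacrebleu`. Treating the three unlabeled columns as BLEU, TER and ChrF is an assumption here; the exact metric definitions are documented in the collapsed metrics-details section of the README, which this diff does not show.

```python
# Hedged sketch: corpus-level MT metrics with sacrebleu. Assumes the three
# unlabeled score columns correspond to BLEU (higher is better),
# TER (lower is better) and ChrF (higher is better).
import sacrebleu

hypotheses = ["the weather today is sunny"]      # toy system outputs
references = [["today the weather is sunny"]]    # one list per reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
ter = sacrebleu.corpus_ter(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references)

print(f"BLEU {bleu.score:.2f}  TER {ter.score:.2f}  ChrF {chrf.score:.2f}")
```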
@@ -725,7 +732,8 @@ With regard to MT models, no specific analysis has yet been carried out in order
accuracy across different languages, dialects, or domains. However, we recognize the importance of identifying and addressing any harmful stereotypes,
cultural inaccuracies, or systematic performance discrepancies that may arise in Machine Translation. As such, we plan to perform more analyses as soon
as we have implemented the necessary metrics and methods within our evaluation framework [MT Lens](https://github.com/langtech-bsc/mt-evaluation).
-Note that the model has only undergone preliminary instruction tuning.
+Note that the model has only undergone preliminary instruction tuning.
+We urge developers to consider potential limitations and conduct safety testing and tuning tailored to their specific applications.

## Additional information