Update README.md

</details>

<details>

### Gender Aware Translation

Below are the evaluation results for gender-aware translation, evaluated on the [MT-GenEval](https://github.com/amazon-science/machine-translation-gender-eval?tab=readme-ov-file#mt-geneval) dataset ([Currey et al., 2022](https://github.com/amazon-science/machine-translation-gender-eval?tab=readme-ov-file#mt-geneval)).
Results were calculated for translation from English into German, Spanish, French, Italian, Portuguese and Russian, and are compared against MADLAD400-7B, TowerInstruct-7B-v0.2 and the SalamandraTA-7b-base model.
Evaluation was conducted using [MT-Lens](https://github.com/langtech-bsc/mt-evaluation) and is reported as accuracy, computed with the metric provided with MT-GenEval.
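
For intuition, MT-GenEval scores gender accuracy against contrastive references: each English source has a masculine and a feminine reference translation, and a hypothesis is counted as incorrect when it contains gendered words that appear only in the wrong-gender reference. The sketch below illustrates that idea; it is not the MT-Lens implementation, and the function name and example data are ours.

```python
# Simplified sketch of MT-GenEval-style gender accuracy (illustrative only):
# a hypothesis counts as correct when it contains none of the words that
# appear exclusively in the wrong-gender reference.

def geneval_accuracy(hypotheses, correct_refs, contrastive_refs):
    """Fraction of hypotheses free of wrong-gender-only words."""
    n_correct = 0
    for hyp, ref, contrast in zip(hypotheses, correct_refs, contrastive_refs):
        hyp_words = set(hyp.lower().split())
        # Words unique to the wrong-gender reference, e.g. "doctor"/"cansado"
        # when the correct (feminine) reference uses "doctora"/"cansada".
        wrong_gender_only = set(contrast.lower().split()) - set(ref.lower().split())
        if not hyp_words & wrong_gender_only:
            n_correct += 1
    return n_correct / len(hypotheses)

# Spanish toy example: the feminine rendering is the correct one here.
hyps = ["la doctora estaba cansada ."]
refs = ["la doctora estaba cansada ."]      # correct (feminine) reference
contrasts = ["el doctor estaba cansado ."]  # wrong-gender (masculine) reference
print(geneval_accuracy(hyps, refs, contrasts))  # -> 1.0
```
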
| Model | Source | Target | Masc | Fem | Pair |
|:---------------------------------|:---------|:---------|-------:|-------:|-------:|
| SalamandraTA-7b-instruct | en | de | **0.883** | **0.883** | **0.773** |
| SalamandraTA-7b-base | en | de | 0.857 | 0.770 | 0.660 |
| MADLAD400-7B | en | de | 0.877 | 0.823 | 0.713 |
| TowerInstruct-7B-v0.2 | en | de | 0.863 | 0.840 | 0.727 |
| | | | | | |
| SalamandraTA-7b-instruct | en | es | 0.867 | **0.850** | **0.737** |
| SalamandraTA-7b-base | en | es | **0.890** | 0.733 | 0.643 |
| MADLAD400-7B | en | es | 0.887 | 0.780 | 0.687 |
| TowerInstruct-7B-v0.2 | en | es | 0.850 | 0.823 | 0.693 |
| | | | | | |
| SalamandraTA-7b-instruct | en | fr | **0.900** | 0.820 | **0.737** |
| SalamandraTA-7b-base | en | fr | 0.887 | 0.710 | 0.617 |
| MADLAD400-7B | en | fr | 0.873 | 0.777 | 0.663 |
| TowerInstruct-7B-v0.2 | en | fr | 0.880 | **0.823** | 0.717 |
| | | | | | |
| SalamandraTA-7b-instruct | en | it | 0.900 | **0.763** | 0.683 |
| SalamandraTA-7b-base | en | it | 0.893 | 0.593 | 0.513 |
| MADLAD400-7B | en | it | 0.907 | 0.663 | 0.597 |
| TowerInstruct-7B-v0.2 | en | it | **0.947** | 0.747 | **0.713** |
| | | | | | |
| SalamandraTA-7b-instruct | en | pt | 0.920 | **0.770** | **0.707** |
| SalamandraTA-7b-base | en | pt | **0.923** | 0.650 | 0.597 |
| MADLAD400-7B | en | pt | **0.923** | 0.687 | 0.627 |
| TowerInstruct-7B-v0.2 | en | pt | 0.907 | 0.730 | 0.670 |
| | | | | | |
| SalamandraTA-7b-instruct | en | ru | **0.950** | **0.837** | **0.793** |
| SalamandraTA-7b-base | en | ru | 0.933 | 0.713 | 0.653 |
| MADLAD400-7B | en | ru | 0.940 | 0.797 | 0.740 |
| TowerInstruct-7B-v0.2 | en | ru | 0.933 | 0.797 | 0.733 |

<img src="./images/geneval.png"/>

</details>

## Ethical Considerations and Limitations

Detailed information on the work done to examine the presence of unwanted social and cognitive biases in the base model can be found
at the [Salamandra-7B model card](https://huggingface.co/BSC-LT/salamandra-7b).
With regard to MT models, the only bias-related analysis we have conducted so far is the MT-GenEval evaluation described above.
No specific analysis has yet been carried out to evaluate potential biases or limitations in translation
accuracy across different languages, dialects, or domains. However, we recognize the importance of identifying and addressing any harmful stereotypes,
cultural inaccuracies, or systematic performance discrepancies that may arise in Machine Translation. As such, we plan to perform further analyses
as we implement the necessary metrics and methods within our evaluation framework, [MT-Lens](https://github.com/langtech-bsc/mt-evaluation).
Note that the model has only undergone preliminary instruction tuning.
We urge developers to consider potential limitations and conduct safety testing and tuning tailored to their specific applications.
### Acknowledgements

The success of this project has been made possible thanks to the invaluable contributions of numerous research centers, teams, and projects that provided access to their data.
Their efforts have been instrumental in advancing our work, and we sincerely appreciate their support.
We would like to thank, among others:
[CENID](https://cenid.es/), [CiTIUS](https://citius.gal/es/), [Gaitu proiektua](https://gaitu.eus/), [Helsinki NLP](https://github.com/Helsinki-NLP), [HiTZ](http://hitz.ehu.eus/es), [Institut d'Estudis Aranesi](http://www.institutestudisaranesi.cat/), [MaCoCu Project](https://macocu.eu/), [Machine Translate Foundation](https://machinetranslate.org/about), [NTEU Project](https://nteu.eu/), [Orai NLP technologies](https://huggingface.co/orai-nlp), [Proxecto Nós](https://nos.gal/es/proxecto-nos), [Softcatalà](https://www.softcatala.org/), [Tatoeba Project](https://tatoeba.org/), [TILDE Project](https://tilde.ai/tildelm/), [Transducens - Departament de Llenguatges i Sistemes Informàtics, Universitat d'Alacant](https://transducens.dlsi.ua.es/), [Unbabel](https://huggingface.co/Unbabel).
### Disclaimer
|