jramompichel committed on
Commit
f998837
1 Parent(s): 40015b0

Update README.md

Files changed (1)
  1. README.md +60 -0
README.md CHANGED
@@ -1,3 +1,63 @@
1
  ---
2
  license: mit
3
  ---
4
+
5
+ **Model description**
6
+
7
+ Model built with OpenNMT for the Spanish-Galician pair using a transformer architecture.
8
+
9
+ **How to use**
10
+
11
+ + Open a bash terminal
12
+ + Install [Python 3.9](https://www.python.org/downloads/release/python-390/)
13
+ + Install the [OpenNMT-py toolkit v2.2](https://github.com/OpenNMT/OpenNMT-py)
14
+ + Translate an input_text file using the NOS-MT-es-gl model with the following command:
15
+
16
+ ```bash
17
+ onmt_translate -src input_text -model NOS-MT-es-gl -output ./output_file.txt -replace_unk -phrase_table phrase_table-es-gl.txt -gpu 0
18
+ ```
19
+ + The translation output will be written to the path given by the `-output` flag.
20
+
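The steps above can be sketched as a small script. This is only an illustration under stated assumptions: the two sample sentences are hypothetical, the input file is assumed to hold one sentence per line, and the model file `NOS-MT-es-gl` and `phrase_table-es-gl.txt` are assumed to sit in the current directory. The flags are the ones from the command above; if the tool or the files are missing, the script prints the command instead of running it.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sample input: one Spanish sentence per line.
printf '%s\n' 'Hola, ¿cómo estás?' 'El modelo traduce del español al gallego.' > input_text

# Same flags as the command above, kept in an array so each argument stays intact.
cmd=(onmt_translate -src input_text -model NOS-MT-es-gl
     -output ./output_file.txt -replace_unk
     -phrase_table phrase_table-es-gl.txt -gpu 0)

if command -v onmt_translate >/dev/null 2>&1 \
   && [ -f NOS-MT-es-gl ] && [ -f phrase_table-es-gl.txt ]; then
    "${cmd[@]}"
    # Show the first translated lines.
    head -n 5 ./output_file.txt
else
    # Tool or model files not found: print the command instead of running it.
    printf 'Would run: %s\n' "${cmd[*]}"
fi
```

Building the command as an array keeps the invocation readable and makes it easy to change a single flag, for example the `-gpu` device index.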
21
+ **Training**
22
+
23
+ Data used for training
24
+
25
+ For fine-tuning we used the Softcatalà Catalan-German parallel corpus, with sentences deduplicated and filtered by the GEnCaTa quality filter.
26
+
27
+ Authentic and synthetic (transliteration) [add paper]
28
+
29
+ **Training procedure**
30
+
31
+ Tokenization
32
+
33
+ The original m2m100_418M model's sentencepiece tokenizer was used.
34
+
35
+ BPE
36
+
37
+ **Hyperparameters**
38
+
39
+ The model was trained for 2 epochs with the default parameters and $LR = 2\mathrm{e}{-5}$.
40
+
41
+ Add the YAML for each of the pairs
42
+
43
+ **Evaluation**
44
+
45
+
46
+ **Additional information**
47
+
48
+ Licensing information
49
+
50
+ Apache License, Version 2.0
51
+
52
+ **Funding**
53
+
54
+ This work was funded by the Departament de la Vicepresidència i de Polítiques Digitals i Territori de la Generalitat de Catalunya within the framework of Projecte AINA.
55
+
56
+ Citation Information
57
+
58
+ @article{garriga2022catalan,
59
+ title={A Catalan-German machine translation system based on the M2M-100 multilingual model},
60
+ author={Garriga Riba, Pol},
61
+ year={2022},
62
+ url={https://repositori.upf.edu/bitstream/handle/10230/54301/GarrigaRiba_2022.pdf?sequence=1&isAllowed=y}
63
+ }