jramompichel committed on
Commit
bd4ec79
1 Parent(s): a02febf

Update README.md

  1. README.md +37 -21
README.md CHANGED
@@ -1,15 +1,21 @@
  ---
  license: mit
- ---
  ---
  license: mit
  ---
 
- **Descrición do Modelo**
 
- Model built with OpenNMT for the Spanish-Galician pair using a transformer architecture.
 
- **Como utilizar**
 
  + Open a bash terminal
  + Install [Python 3.9](https://www.python.org/downloads/release/python-390/)
@@ -17,17 +23,17 @@ Modelo feito con OpenNMT para o par español-galego utilizando unha arquitectura
  + Translate an input_text using the NOS-MT-en-gl model with the following command:
 
  ```bash
- onmt_translate -src input_text -model NOS-MT-es-gl -output ./output_file.txt -replace_unk -phrase_table phrase_table-es-gl.txt -gpu 0
  ```
  + The translation result will be written to the path given by the -output flag.
 
- **Adestramento**
 
  Data used for training
 
  Authentic and synthetic (transliteration) [paper to be added]
 
- **Procedemento de adestramento**
 
  + Datasets tokenized with the Linguakit tokenizer: https://github.com/citiususc/Linguakit
 
@@ -40,11 +46,11 @@ onmt_build_vocab -config bpe-en-gl_emb.yaml -n_sample 100000
  onmt_train -config bpe-en-gl_emb.yaml
  ```
 
- **Hiperparámetros**
 
  The parameters used to develop the model can be found in the same .yaml file, bpe-en-gl_emb.yaml
 
- **Avaliación**
  The models are evaluated on a mix of internally developed test sets
  (gold1, gold2, test-suite) and other datasets available in Galician (Flores).
 
@@ -54,11 +60,29 @@ A avalación dos modelos é feita cunha mistura de tests desenvolvidos intername
 
 
 
- **Información adicional**
 
- Licensing information
 
- Apache License, Version 2.0
 
  **Financiamento / Funding**
 
@@ -66,12 +90,4 @@ Esta investigación foi financiada polo proxecto "Nós: o galego na sociedade e
 
  This research was funded by the project "Nós: Galician in the society and economy of artificial intelligence", agreement between Xunta de Galicia and University of Santiago de Compostela, and grant ED431G2019/04 by the Galician Ministry of Education, University and Professional Training, and the European Regional Development Fund (ERDF/FEDER program), and Groups of Reference: ED431C 2020/21.
 
-
- **Citation Information**
-
- @article{,
- title={},
- author={},
- year={2022},
- url={}
- }
 
  ---
  license: mit
+ language:
+ - gl
+ metrics:
+ - bleu (Gold1): 36.8
+ - bleu (Gold2): 47.1
+ - bleu (Flores): 32.3
+ - bleu (Test-suite): 42.7
  ---
  license: mit
  ---
 
+ **Descrición do Modelo / Model Description**
 
+ Model built with OpenNMT for the English-Galician pair using a transformer architecture.
 
+ **Como traducir / How to translate**
 
  + Open a bash terminal
  + Install [Python 3.9](https://www.python.org/downloads/release/python-390/)
 
@@ -17,17 +23,17 @@ Modelo feito con OpenNMT para o par español-galego utilizando unha arquitectura
  + Translate an input_text using the NOS-MT-en-gl model with the following command:
 
  ```bash
+ onmt_translate -src input_text -model NOS-MT-en-gl -output ./output_file.txt -replace_unk -phrase_table phrase_table-en-gl.txt -gpu 0
  ```
  + The translation result will be written to the path given by the -output flag.
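
The translation command in this hunk hard-codes its input and output paths. As a minimal sketch (not part of the README), the hypothetical helper `build_translate_cmd` assembles the same `onmt_translate` invocation for other paths and only prints it, so the command line can be inspected before it is run. The function name and the example file names are assumptions; the flags are copied unchanged from the README command.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository: prints the onmt_translate
# command line for a given source file and output file without running it.
# Paths are assumed to contain no spaces.
build_translate_cmd() {
  local src="$1" out="$2"
  printf 'onmt_translate -src %s -model NOS-MT-en-gl -output %s -replace_unk -phrase_table phrase_table-en-gl.txt -gpu 0\n' \
    "$src" "$out"
}

# Prints the command exactly as it appears in the README.
build_translate_cmd input_text ./output_file.txt
```

Piping the printed line to `bash` would run the translation, assuming OpenNMT-py is installed and the model and phrase-table files are in the working directory.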
 
+ **Adestramento / Training**
 
  Data used for training
 
  Authentic and synthetic (transliteration) [paper to be added]
 
+ **Procedemento de adestramento / Training process**
 
  + Datasets tokenized with the Linguakit tokenizer: https://github.com/citiususc/Linguakit
 
 
@@ -40,11 +46,11 @@ onmt_build_vocab -config bpe-en-gl_emb.yaml -n_sample 100000
  onmt_train -config bpe-en-gl_emb.yaml
  ```
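
The two training steps visible in this diff (vocabulary building, named in the hunk header, followed by `onmt_train`) could be chained in a small script. This is a sketch under assumptions: the `run` and `train_pipeline` helpers and the `DRY_RUN` and `CONFIG` variables are hypothetical and not part of the repository, and the lines elided by the diff may contain further steps. Only the two commands and their flags come from the README.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the two OpenNMT commands shown in the README.

# Config file name taken from the README; can be overridden via CONFIG.
CONFIG="${CONFIG:-bpe-en-gl_emb.yaml}"

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# train_pipeline: build the vocabulary first, then train; stop on first failure.
train_pipeline() {
  run onmt_build_vocab -config "$CONFIG" -n_sample 100000 || return 1
  run onmt_train -config "$CONFIG"
}
```

Calling `DRY_RUN=1 train_pipeline` prints both commands without executing them; calling `train_pipeline` alone runs them, assuming OpenNMT-py is installed and the config file is present.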
 
+ **Hiperparámetros / Hyper-parameters**
 
  The parameters used to develop the model can be found in the same .yaml file, bpe-en-gl_emb.yaml
 
+ **Avaliación / Evaluation**
  The models are evaluated on a mix of internally developed test sets
  (gold1, gold2, test-suite) and other datasets available in Galician (Flores).
 
 
@@ -54,11 +60,29 @@ A avalación dos modelos é feita cunha mistura de tests desenvolvidos intername
 
 
 
+ **Licenzas do Modelo / Licensing information**
+
+ MIT License
+
+ Copyright (c) 2023 Proxecto Nós
 
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
 
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
 
  **Financiamento / Funding**
 
 
@@ -66,12 +90,4 @@ Esta investigación foi financiada polo proxecto "Nós: o galego na sociedade e
 
  This research was funded by the project "Nós: Galician in the society and economy of artificial intelligence", agreement between Xunta de Galicia and University of Santiago de Compostela, and grant ED431G2019/04 by the Galician Ministry of Education, University and Professional Training, and the European Regional Development Fund (ERDF/FEDER program), and Groups of Reference: ED431C 2020/21.
 
+ **Citation Information**