Update README.md
README.md
ADDED
@@ -0,0 +1,212 @@
---
language: gmw
tags:
- translation

license: apache-2.0
---

### gmw-gmw

* source languages: gmw
* target languages: gmw
* OPUS readme: [gmw-gmw](https://github.com/Helsinki-NLP/Tatoeba-Challenge/tree/master/models/gmw-gmw/README.md)

* dataset: opus
* model: transformer
* source language(s): afr ang_Latn deu eng enm_Latn frr fry gos gsw ksh ltz nds nld pdc sco stq swg yid
* target language(s): afr ang_Latn deu eng enm_Latn frr fry gos gsw ksh ltz nds nld pdc sco stq swg yid
* pre-processing: normalization + SentencePiece (spm32k,spm32k)
* a sentence-initial language token is required, in the form `>>id<<` (id = valid target language ID)
* download original weights: [opus-2020-07-27.zip](https://object.pouta.csc.fi/Tatoeba-MT-models/gmw-gmw/opus-2020-07-27.zip)
* test set translations: [opus-2020-07-27.test.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/gmw-gmw/opus-2020-07-27.test.txt)
* test set scores: [opus-2020-07-27.eval.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/gmw-gmw/opus-2020-07-27.eval.txt)
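
Because the model is multilingual on the target side, each input sentence has to start with the `>>id<<` token described above. The following is a minimal sketch of how that prefix could be built; the helper name `add_target_token` and the single space between the token and the sentence are assumptions made for illustration and are not part of the released model.

```python
# Target language IDs copied from the "target language(s)" line of this card.
TARGET_LANGS = frozenset(
    "afr ang_Latn deu eng enm_Latn frr fry gos gsw ksh "
    "ltz nds nld pdc sco stq swg yid".split()
)


def add_target_token(sentence: str, target_lang: str) -> str:
    """Prefix a sentence with the >>id<< token (hypothetical helper).

    Raises ValueError when target_lang is not one of the IDs listed above.
    """
    if target_lang not in TARGET_LANGS:
        raise ValueError(f"unsupported target language ID: {target_lang!r}")
    # Assumption: one space separates the token from the sentence text.
    return f">>{target_lang}<< {sentence.strip()}"


if __name__ == "__main__":
    print(add_target_token("How are you?", "nld"))
```

The prefixed string is what would then be passed to the pre-processing and translation steps; rejecting unknown IDs early avoids sending the model a token it was never trained on.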

## Benchmarks

| testset | BLEU | chr-F |
|-----------------------|-------|-------|
| newssyscomb2009-deueng.deu.eng | 25.3 | 0.527 |
| newssyscomb2009-engdeu.eng.deu | 19.0 | 0.502 |
| news-test2008-deueng.deu.eng | 23.7 | 0.515 |
| news-test2008-engdeu.eng.deu | 19.2 | 0.491 |
| newstest2009-deueng.deu.eng | 23.1 | 0.514 |
| newstest2009-engdeu.eng.deu | 18.6 | 0.495 |
| newstest2010-deueng.deu.eng | 25.8 | 0.545 |
| newstest2010-engdeu.eng.deu | 20.3 | 0.505 |
| newstest2011-deueng.deu.eng | 23.7 | 0.523 |
| newstest2011-engdeu.eng.deu | 18.9 | 0.490 |
| newstest2012-deueng.deu.eng | 24.4 | 0.529 |
| newstest2012-engdeu.eng.deu | 19.2 | 0.489 |
| newstest2013-deueng.deu.eng | 27.2 | 0.545 |
| newstest2013-engdeu.eng.deu | 22.4 | 0.514 |
| newstest2014-deen-deueng.deu.eng | 27.0 | 0.546 |
| newstest2015-ende-deueng.deu.eng | 28.4 | 0.552 |
| newstest2015-ende-engdeu.eng.deu | 25.3 | 0.541 |
| newstest2016-ende-deueng.deu.eng | 33.2 | 0.595 |
| newstest2016-ende-engdeu.eng.deu | 29.8 | 0.578 |
| newstest2017-ende-deueng.deu.eng | 29.0 | 0.557 |
| newstest2017-ende-engdeu.eng.deu | 23.9 | 0.534 |
| newstest2018-ende-deueng.deu.eng | 35.9 | 0.607 |
| newstest2018-ende-engdeu.eng.deu | 34.8 | 0.609 |
| newstest2019-deen-deueng.deu.eng | 32.1 | 0.579 |
| newstest2019-ende-engdeu.eng.deu | 31.0 | 0.579 |
| Tatoeba-test.afr-ang.afr.ang | 0.0 | 0.065 |
| Tatoeba-test.afr-deu.afr.deu | 46.8 | 0.668 |
| Tatoeba-test.afr-eng.afr.eng | 58.5 | 0.728 |
| Tatoeba-test.afr-enm.afr.enm | 13.4 | 0.357 |
| Tatoeba-test.afr-fry.afr.fry | 5.3 | 0.026 |
| Tatoeba-test.afr-gos.afr.gos | 3.5 | 0.228 |
| Tatoeba-test.afr-ltz.afr.ltz | 1.6 | 0.131 |
| Tatoeba-test.afr-nld.afr.nld | 55.4 | 0.715 |
| Tatoeba-test.afr-yid.afr.yid | 3.4 | 0.008 |
| Tatoeba-test.ang-afr.ang.afr | 3.1 | 0.096 |
| Tatoeba-test.ang-deu.ang.deu | 2.6 | 0.188 |
| Tatoeba-test.ang-eng.ang.eng | 5.4 | 0.211 |
| Tatoeba-test.ang-enm.ang.enm | 1.7 | 0.197 |
| Tatoeba-test.ang-gos.ang.gos | 6.6 | 0.186 |
| Tatoeba-test.ang-ltz.ang.ltz | 5.3 | 0.072 |
| Tatoeba-test.ang-yid.ang.yid | 0.9 | 0.131 |
| Tatoeba-test.deu-afr.deu.afr | 52.7 | 0.699 |
| Tatoeba-test.deu-ang.deu.ang | 0.8 | 0.133 |
| Tatoeba-test.deu-eng.deu.eng | 43.5 | 0.621 |
| Tatoeba-test.deu-enm.deu.enm | 6.9 | 0.245 |
| Tatoeba-test.deu-frr.deu.frr | 0.8 | 0.200 |
| Tatoeba-test.deu-fry.deu.fry | 15.1 | 0.367 |
| Tatoeba-test.deu-gos.deu.gos | 2.2 | 0.279 |
| Tatoeba-test.deu-gsw.deu.gsw | 1.0 | 0.176 |
| Tatoeba-test.deu-ksh.deu.ksh | 0.6 | 0.208 |
| Tatoeba-test.deu-ltz.deu.ltz | 12.1 | 0.274 |
| Tatoeba-test.deu-nds.deu.nds | 18.8 | 0.446 |
| Tatoeba-test.deu-nld.deu.nld | 48.6 | 0.669 |
| Tatoeba-test.deu-pdc.deu.pdc | 4.6 | 0.198 |
| Tatoeba-test.deu-sco.deu.sco | 12.0 | 0.340 |
| Tatoeba-test.deu-stq.deu.stq | 3.2 | 0.240 |
| Tatoeba-test.deu-swg.deu.swg | 0.5 | 0.179 |
| Tatoeba-test.deu-yid.deu.yid | 1.7 | 0.160 |
| Tatoeba-test.eng-afr.eng.afr | 55.8 | 0.730 |
| Tatoeba-test.eng-ang.eng.ang | 5.7 | 0.157 |
| Tatoeba-test.eng-deu.eng.deu | 36.7 | 0.584 |
| Tatoeba-test.eng-enm.eng.enm | 2.0 | 0.272 |
| Tatoeba-test.eng-frr.eng.frr | 6.1 | 0.246 |
| Tatoeba-test.eng-fry.eng.fry | 15.3 | 0.378 |
| Tatoeba-test.eng-gos.eng.gos | 1.2 | 0.242 |
| Tatoeba-test.eng-gsw.eng.gsw | 0.9 | 0.164 |
| Tatoeba-test.eng-ksh.eng.ksh | 0.9 | 0.170 |
| Tatoeba-test.eng-ltz.eng.ltz | 13.7 | 0.263 |
| Tatoeba-test.eng-nds.eng.nds | 17.1 | 0.410 |
| Tatoeba-test.eng-nld.eng.nld | 49.6 | 0.673 |
| Tatoeba-test.eng-pdc.eng.pdc | 5.1 | 0.218 |
| Tatoeba-test.eng-sco.eng.sco | 34.8 | 0.587 |
| Tatoeba-test.eng-stq.eng.stq | 2.1 | 0.322 |
| Tatoeba-test.eng-swg.eng.swg | 1.7 | 0.192 |
| Tatoeba-test.eng-yid.eng.yid | 1.7 | 0.173 |
| Tatoeba-test.enm-afr.enm.afr | 13.4 | 0.397 |
| Tatoeba-test.enm-ang.enm.ang | 0.7 | 0.063 |
| Tatoeba-test.enm-deu.enm.deu | 41.5 | 0.514 |
| Tatoeba-test.enm-eng.enm.eng | 21.3 | 0.483 |
| Tatoeba-test.enm-fry.enm.fry | 0.0 | 0.058 |
| Tatoeba-test.enm-gos.enm.gos | 10.7 | 0.354 |
| Tatoeba-test.enm-ksh.enm.ksh | 7.0 | 0.161 |
| Tatoeba-test.enm-nds.enm.nds | 18.6 | 0.316 |
| Tatoeba-test.enm-nld.enm.nld | 38.3 | 0.524 |
| Tatoeba-test.enm-yid.enm.yid | 0.7 | 0.128 |
| Tatoeba-test.frr-deu.frr.deu | 4.1 | 0.219 |
| Tatoeba-test.frr-eng.frr.eng | 14.1 | 0.186 |
| Tatoeba-test.frr-fry.frr.fry | 3.1 | 0.129 |
| Tatoeba-test.frr-gos.frr.gos | 3.6 | 0.226 |
| Tatoeba-test.frr-nds.frr.nds | 12.4 | 0.145 |
| Tatoeba-test.frr-nld.frr.nld | 9.8 | 0.209 |
| Tatoeba-test.frr-stq.frr.stq | 2.8 | 0.142 |
| Tatoeba-test.fry-afr.fry.afr | 0.0 | 1.000 |
| Tatoeba-test.fry-deu.fry.deu | 30.1 | 0.535 |
| Tatoeba-test.fry-eng.fry.eng | 28.0 | 0.486 |
| Tatoeba-test.fry-enm.fry.enm | 16.0 | 0.262 |
| Tatoeba-test.fry-frr.fry.frr | 5.5 | 0.160 |
| Tatoeba-test.fry-gos.fry.gos | 1.6 | 0.307 |
| Tatoeba-test.fry-ltz.fry.ltz | 30.4 | 0.438 |
| Tatoeba-test.fry-nds.fry.nds | 8.1 | 0.083 |
| Tatoeba-test.fry-nld.fry.nld | 41.4 | 0.616 |
| Tatoeba-test.fry-stq.fry.stq | 1.6 | 0.217 |
| Tatoeba-test.fry-yid.fry.yid | 1.6 | 0.159 |
| Tatoeba-test.gos-afr.gos.afr | 6.3 | 0.318 |
| Tatoeba-test.gos-ang.gos.ang | 6.2 | 0.058 |
| Tatoeba-test.gos-deu.gos.deu | 11.7 | 0.363 |
| Tatoeba-test.gos-eng.gos.eng | 14.9 | 0.322 |
| Tatoeba-test.gos-enm.gos.enm | 9.1 | 0.398 |
| Tatoeba-test.gos-frr.gos.frr | 3.3 | 0.117 |
| Tatoeba-test.gos-fry.gos.fry | 13.1 | 0.387 |
| Tatoeba-test.gos-ltz.gos.ltz | 3.1 | 0.154 |
| Tatoeba-test.gos-nds.gos.nds | 2.4 | 0.206 |
| Tatoeba-test.gos-nld.gos.nld | 13.9 | 0.395 |
| Tatoeba-test.gos-stq.gos.stq | 2.1 | 0.209 |
| Tatoeba-test.gos-yid.gos.yid | 1.7 | 0.147 |
| Tatoeba-test.gsw-deu.gsw.deu | 10.5 | 0.350 |
| Tatoeba-test.gsw-eng.gsw.eng | 10.7 | 0.299 |
| Tatoeba-test.ksh-deu.ksh.deu | 12.0 | 0.373 |
| Tatoeba-test.ksh-eng.ksh.eng | 3.2 | 0.225 |
| Tatoeba-test.ksh-enm.ksh.enm | 13.4 | 0.308 |
| Tatoeba-test.ltz-afr.ltz.afr | 37.4 | 0.525 |
| Tatoeba-test.ltz-ang.ltz.ang | 2.8 | 0.036 |
| Tatoeba-test.ltz-deu.ltz.deu | 40.3 | 0.596 |
| Tatoeba-test.ltz-eng.ltz.eng | 31.7 | 0.490 |
| Tatoeba-test.ltz-fry.ltz.fry | 36.3 | 0.658 |
| Tatoeba-test.ltz-gos.ltz.gos | 2.9 | 0.209 |
| Tatoeba-test.ltz-nld.ltz.nld | 38.8 | 0.530 |
| Tatoeba-test.ltz-stq.ltz.stq | 5.8 | 0.165 |
| Tatoeba-test.ltz-yid.ltz.yid | 1.0 | 0.159 |
| Tatoeba-test.multi.multi | 36.4 | 0.568 |
| Tatoeba-test.nds-deu.nds.deu | 35.0 | 0.573 |
| Tatoeba-test.nds-eng.nds.eng | 29.6 | 0.495 |
| Tatoeba-test.nds-enm.nds.enm | 3.7 | 0.194 |
| Tatoeba-test.nds-frr.nds.frr | 6.6 | 0.133 |
| Tatoeba-test.nds-fry.nds.fry | 4.2 | 0.087 |
| Tatoeba-test.nds-gos.nds.gos | 2.0 | 0.243 |
| Tatoeba-test.nds-nld.nds.nld | 41.4 | 0.618 |
| Tatoeba-test.nds-swg.nds.swg | 0.6 | 0.178 |
| Tatoeba-test.nds-yid.nds.yid | 8.3 | 0.238 |
| Tatoeba-test.nld-afr.nld.afr | 59.4 | 0.759 |
| Tatoeba-test.nld-deu.nld.deu | 49.9 | 0.685 |
| Tatoeba-test.nld-eng.nld.eng | 54.1 | 0.699 |
| Tatoeba-test.nld-enm.nld.enm | 5.0 | 0.250 |
| Tatoeba-test.nld-frr.nld.frr | 2.4 | 0.224 |
| Tatoeba-test.nld-fry.nld.fry | 19.4 | 0.446 |
| Tatoeba-test.nld-gos.nld.gos | 2.5 | 0.273 |
| Tatoeba-test.nld-ltz.nld.ltz | 13.8 | 0.292 |
| Tatoeba-test.nld-nds.nld.nds | 21.3 | 0.457 |
| Tatoeba-test.nld-sco.nld.sco | 14.7 | 0.423 |
| Tatoeba-test.nld-stq.nld.stq | 1.9 | 0.257 |
| Tatoeba-test.nld-swg.nld.swg | 4.2 | 0.162 |
| Tatoeba-test.nld-yid.nld.yid | 2.6 | 0.186 |
| Tatoeba-test.pdc-deu.pdc.deu | 39.7 | 0.529 |
| Tatoeba-test.pdc-eng.pdc.eng | 25.0 | 0.427 |
| Tatoeba-test.sco-deu.sco.deu | 28.4 | 0.428 |
| Tatoeba-test.sco-eng.sco.eng | 41.8 | 0.595 |
| Tatoeba-test.sco-nld.sco.nld | 36.4 | 0.565 |
| Tatoeba-test.stq-deu.stq.deu | 7.7 | 0.328 |
| Tatoeba-test.stq-eng.stq.eng | 21.1 | 0.428 |
| Tatoeba-test.stq-frr.stq.frr | 2.0 | 0.118 |
| Tatoeba-test.stq-fry.stq.fry | 6.3 | 0.255 |
| Tatoeba-test.stq-gos.stq.gos | 1.4 | 0.244 |
| Tatoeba-test.stq-ltz.stq.ltz | 4.4 | 0.204 |
| Tatoeba-test.stq-nld.stq.nld | 10.7 | 0.371 |
| Tatoeba-test.stq-yid.stq.yid | 1.4 | 0.105 |
| Tatoeba-test.swg-deu.swg.deu | 9.5 | 0.343 |
| Tatoeba-test.swg-eng.swg.eng | 15.1 | 0.306 |
| Tatoeba-test.swg-nds.swg.nds | 0.7 | 0.196 |
| Tatoeba-test.swg-nld.swg.nld | 11.6 | 0.308 |
| Tatoeba-test.swg-yid.swg.yid | 0.9 | 0.186 |
| Tatoeba-test.yid-afr.yid.afr | 100.0 | 1.000 |
| Tatoeba-test.yid-ang.yid.ang | 0.6 | 0.079 |
| Tatoeba-test.yid-deu.yid.deu | 16.7 | 0.372 |
| Tatoeba-test.yid-eng.yid.eng | 15.8 | 0.344 |
| Tatoeba-test.yid-enm.yid.enm | 1.3 | 0.166 |
| Tatoeba-test.yid-fry.yid.fry | 5.6 | 0.157 |
| Tatoeba-test.yid-gos.yid.gos | 2.2 | 0.160 |
| Tatoeba-test.yid-ltz.yid.ltz | 2.1 | 0.238 |
| Tatoeba-test.yid-nds.yid.nds | 14.4 | 0.365 |
| Tatoeba-test.yid-nld.yid.nld | 20.9 | 0.397 |
| Tatoeba-test.yid-stq.yid.stq | 3.7 | 0.165 |
| Tatoeba-test.yid-swg.yid.swg | 1.8 | 0.156 |
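
The test set names in the benchmark rows appear to end in `.<source>.<target>`, so a row can be split into its language pair and scores with a few lines of code. The sketch below assumes that naming pattern holds for every row; `BenchmarkRow` and `parse_row` are hypothetical names introduced only for this example.

```python
from typing import NamedTuple


class BenchmarkRow(NamedTuple):
    testset: str   # name without the trailing language codes
    source: str    # source language ID
    target: str    # target language ID
    bleu: float
    chrf: float


def parse_row(line: str) -> BenchmarkRow:
    """Parse one Markdown table row of the form | name | BLEU | chr-F |."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    name, bleu, chrf = cells
    # Assumption: the last two dot-separated fields are source and target.
    testset, source, target = name.rsplit(".", 2)
    return BenchmarkRow(testset, source, target, float(bleu), float(chrf))


if __name__ == "__main__":
    row = parse_row("| Tatoeba-test.afr-deu.afr.deu | 46.8 | 0.668 |")
    print(row.source, row.target, row.bleu, row.chrf)
```

Parsed rows can then be filtered by language pair, which is useful because scores vary widely between well-resourced pairs and pairs with very small test sets.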