aneuraz committed on
Commit
a9b9277
1 Parent(s): 051a599

Create README.md

Files changed (1)
  1. README.md +33 -0
README.md ADDED
@@ -0,0 +1,33 @@
+ ---
+ language:
+ - de
+ - fr
+ - en
+ - ro
+ - zh
+ thumbnail:
+ tags:
+ - sentence alignment
+ license: bsd-3-clause
+ ---
+
+ # AWESOME: Aligning Word Embedding Spaces of Multilingual Encoders
+
+ This model comes from the following GitHub repository: [https://github.com/neulab/awesome-align](https://github.com/neulab/awesome-align)
+
+ It corresponds to this paper: [https://arxiv.org/abs/2101.08231](https://arxiv.org/abs/2101.08231)
+
+ Please cite the original paper if you decide to use the model:
+
+ ```
+ @inproceedings{dou2021word,
+ title={Word Alignment by Fine-tuning Embeddings on Parallel Corpora},
+ author={Dou, Zi-Yi and Neubig, Graham},
+ booktitle={Conference of the European Chapter of the Association for Computational Linguistics (EACL)},
+ year={2021}
+ }
+ ```
+
+
+ `awesome-align` is a tool that extracts word alignments from multilingual BERT (mBERT) ([demo](https://colab.research.google.com/drive/1205ubqebM0OsZa1nRgbGJBtitgHqIVv6?usp=sharing)) and lets you fine-tune mBERT on parallel corpora for better alignment quality (see the paper for more details).
+
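
The extraction step that the README mentions can be illustrated without loading the model. The following is a minimal sketch, assuming the softmax-based procedure described in the paper: a source-target similarity matrix is normalized in both directions, and a word pair is kept only when both probabilities exceed a threshold. The function name `extract_alignments`, the toy similarity matrix, and the threshold of 0.5 are illustrative assumptions, not part of the `awesome-align` API; in practice the similarities would come from mBERT hidden states.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def extract_alignments(sim, threshold):
    """Return (source_index, target_index) pairs that pass the threshold
    in both directions.

    sim[i][j] is the similarity between source word i and target word j
    (hypothetical values here, not real mBERT outputs).
    """
    n_src, n_tgt = len(sim), len(sim[0])
    # Source-to-target: normalize each row over the target words.
    src_to_tgt = [softmax(row) for row in sim]
    # Target-to-source: normalize each column over the source words.
    tgt_to_src = [softmax([sim[i][j] for i in range(n_src)]) for j in range(n_tgt)]
    # Keep only the pairs on which both directions agree.
    return [
        (i, j)
        for i in range(n_src)
        for j in range(n_tgt)
        if src_to_tgt[i][j] > threshold and tgt_to_src[j][i] > threshold
    ]


# Toy 3x3 similarity matrix with a clear diagonal (illustrative only).
toy_sim = [
    [9.0, 1.0, 0.5],
    [1.0, 8.0, 1.5],
    [0.5, 2.0, 7.0],
]
alignments = extract_alignments(toy_sim, threshold=0.5)
print(alignments)
```

Requiring agreement in both directions trades recall for precision: a pair that looks likely only from the source side, or only from the target side, is dropped.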