|
---
license: apache-2.0
---
|
|
|
Pretrained models for our paper, ["Modeling Context With Linear Attention for Scalable Document-Level Translation"](https://arxiv.org/abs/2210.08431) (Findings of EMNLP 2022).
|
```bibtex
@inproceedings{wu-etal-2022-modeling,
    title = "Modeling Context With Linear Attention for Scalable Document-Level Translation",
    author = "Wu, Zhaofeng and Peng, Hao and Pappas, Nikolaos and Smith, Noah A.",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2022",
    month = dec,
    year = "2022",
    publisher = "Association for Computational Linguistics",
}
```
|
|
|
Please see the "Files and versions" tab for the models. We release our IWSLT models, which are early-stopped based on BLEU, and our OpenSubtitles models, which are early-stopped based on consistency scores. The `c` part of a checkpoint name is the number of context sentences the model uses; it equals the sliding window size (the `L` in our paper) minus one.
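As a concrete illustration of this naming convention, here is a minimal Python sketch that recovers `c` from a checkpoint file name and converts it to the window size `L`. The example file name `iwslt_c3.pt` is an assumption for illustration; check the actual names under "Files and versions".

```python
import re


def context_sentences_from_name(checkpoint_name: str) -> int:
    """Extract the number of context sentences (the `c` value) from a
    checkpoint name. The `c<N>` pattern is an assumption based on the
    naming convention described above."""
    match = re.search(r"c(\d+)", checkpoint_name)
    if match is None:
        raise ValueError(f"no `c<N>` part found in {checkpoint_name!r}")
    return int(match.group(1))


def window_size(num_context_sentences: int) -> int:
    """Sliding window size L = number of context sentences + 1
    (the current sentence plus its preceding context)."""
    return num_context_sentences + 1


# Hypothetical checkpoint name with 3 context sentences:
c = context_sentences_from_name("iwslt_c3.pt")
print(c, window_size(c))  # -> 3 4
```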
|
|