Migrate model card from transformers-repo
Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/allenai/longformer-base-4096/README.md
# longformer-base-4096

[Longformer](https://arxiv.org/abs/2004.05150) is a transformer model for long documents.

`longformer-base-4096` is a BERT-like model started from the RoBERTa checkpoint and pretrained for MLM on long documents. It supports sequences of length up to 4,096.

Longformer uses a combination of a sliding window (local) attention and global attention. Global attention is user-configured based on the task to allow the model to learn task-specific representations.
Please refer to the examples in `modeling_longformer.py` and the paper for more details on how to set global attention.
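As a rough illustration of the idea, the sketch below builds a 0/1 global-attention mask (1 = global attention, 0 = local sliding-window attention). The helper name `make_global_attention_mask` is our own, and the commented-out `transformers` usage at the end assumes the Hugging Face `LongformerModel`/`LongformerTokenizer` API with its `global_attention_mask` argument; consult `modeling_longformer.py` for the authoritative details.

```python
def make_global_attention_mask(seq_len, global_positions):
    """Return a 0/1 mask of length seq_len with 1 at task-specific global positions."""
    mask = [0] * seq_len
    for pos in global_positions:
        mask[pos] = 1
    return mask

# For classification, global attention is typically placed on the
# <s>/[CLS] token at position 0; for QA, on all question tokens.
mask = make_global_attention_mask(seq_len=10, global_positions=[0])
print(mask)  # [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

# Hypothetical usage with transformers (downloads the checkpoint):
# import torch
# from transformers import LongformerModel, LongformerTokenizer
# tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")
# model = LongformerModel.from_pretrained("allenai/longformer-base-4096")
# inputs = tokenizer("A long document ...", return_tensors="pt")
# global_mask = torch.zeros_like(inputs["input_ids"])
# global_mask[:, 0] = 1  # global attention on the <s> token
# outputs = model(**inputs, global_attention_mask=global_mask)
```

Everything outside the global positions attends only within its sliding window, which is what keeps memory linear in sequence length.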
### Citing

If you use `Longformer` in your research, please cite [Longformer: The Long-Document Transformer](https://arxiv.org/abs/2004.05150).

```
@article{Beltagy2020Longformer,
  title={Longformer: The Long-Document Transformer},
  author={Iz Beltagy and Matthew E. Peters and Arman Cohan},
  journal={arXiv:2004.05150},
  year={2020},
}
```
`Longformer` is an open-source project developed by [the Allen Institute for Artificial Intelligence (AI2)](http://www.allenai.org).
AI2 is a non-profit institute with the mission to contribute to humanity through high-impact AI research and engineering.