PeymanHosseini committed on
Commit
07e64f4
1 Parent(s): 2699f75

Update README.md


Adds citation info

Files changed (1)
  1. README.md +11 -0
README.md CHANGED
@@ -27,3 +27,14 @@ The model consists of 1.1 Billion parameters with the following specifications:
 
 The Attention Mechanism used is based on our newly proposed Efficient Attention from our paper, *You Need to Pay Better Attention: Rethinking the Mathematics of Attention Mechanism* ([arXiv:2403.01643](https://arxiv.org/abs/2403.01643)). We have chosen the number of heads to be 1 as an interesting case study since all current LMs use multiple heads.
 
+If you use Efficient Attention or Hummingbird, please cite our paper:
+
+```
+@article{Hosseinis24BetterAttention,
+  title   = {You Need to Pay Better Attention: Rethinking the Mathematics of Attention Mechanism},
+  author  = {Hosseini, Mehran and Hosseini, Peyman},
+  journal = {arXiv preprint arXiv:2403.01643},
+  year    = {2024}
+}
+```
+
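For context on the single-head setup the diff's paragraph refers to: below is a minimal NumPy sketch of *standard* single-head scaled dot-product attention, shown only to illustrate what "number of heads = 1" means. It is an assumption-laden illustration, not the Efficient Attention formulation from arXiv:2403.01643; the function name `single_head_attention`, the dimensions, and the weight matrices are all hypothetical placeholders.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def single_head_attention(X, W_q, W_k, W_v):
    """Standard single-head scaled dot-product attention (illustrative only).

    X: (seq_len, d_model); W_q, W_k: (d_model, d_k); W_v: (d_model, d_v).
    Returns a (seq_len, d_v) array.
    """
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    # Attention weights: each row is a distribution over positions.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores) @ V

# Toy usage with random weights (shapes are arbitrary choices).
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))
W_q = rng.standard_normal((8, 8))
W_k = rng.standard_normal((8, 8))
W_v = rng.standard_normal((8, 8))
out = single_head_attention(X, W_q, W_k, W_v)
print(out.shape)  # (4, 8)
```

A multi-head variant would split `d_model` across several such maps and concatenate the results; the model above the diff deliberately uses just one.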