PeymanHosseini committed
Commit 2699f75
1 Parent(s): 115e671

Update README.md


Updates README (Fixes Grammar Errors)

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -16,14 +16,14 @@ This version of Hummingbird is only meant to demonstrate Efficient Attention for
 
 ## Model Details
 
-The models consists of 1.1 Billion parameters with the following specifications:
+The model consists of 1.1 Billion parameters with the following specifications:
 
 | Parameter | size |
-| -------------------- | ---- |
+| :------------------- | :--- |
 | # Transformer Blocks | 10 |
 | Model Dimension | 3072 |
 | # Heads | 1 |
 
 
-The Attention Mechanism used is based on our newly proposed Efficient Attention from our paper, *You Need to Pay Better Attention: Rethinking the Mathematics of Attention Mechanism* ([arXiv:2403.01643](https://arxiv.org/abs/2403.01643)). We have chosen the number of heeads to be 1 as an interesting case study, since all current LMs use multiple heads.
+The Attention Mechanism used is based on our newly proposed Efficient Attention from our paper, *You Need to Pay Better Attention: Rethinking the Mathematics of Attention Mechanism* ([arXiv:2403.01643](https://arxiv.org/abs/2403.01643)). We have chosen the number of heads to be 1 as an interesting case study since all current LMs use multiple heads.
 
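For context on the specification being documented above, here is a minimal sketch of what the table pins down (10 blocks, model dimension 3072, a single attention head), using standard single-head scaled dot-product attention as a stand-in. The constant names, weight initialisation, and toy usage are illustrative assumptions, not the released Hummingbird code; the actual model replaces this standard mechanism with the Efficient Attention formulation given in arXiv:2403.01643.

```python
# Hypothetical sketch of the configuration described in the README diff above.
# Standard single-head attention shown as a baseline; Hummingbird itself uses
# Efficient Attention (arXiv:2403.01643) in place of this mechanism.
import math
import torch
import torch.nn.functional as F

D_MODEL = 3072   # "Model Dimension" from the table
N_HEADS = 1      # "# Heads" from the table
N_BLOCKS = 10    # "# Transformer Blocks" from the table


def single_head_attention(x, w_q, w_k, w_v):
    """Baseline scaled dot-product attention for a single head.

    x: (seq_len, D_MODEL) input activations.
    w_q, w_k, w_v: (D_MODEL, D_MODEL) projection weights (illustrative).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = (q @ k.T) / math.sqrt(q.size(-1))  # (seq_len, seq_len)
    return F.softmax(scores, dim=-1) @ v        # (seq_len, D_MODEL)


# Toy usage: one attention call at the model's width.
x = torch.randn(8, D_MODEL)
w_q = torch.randn(D_MODEL, D_MODEL) * D_MODEL ** -0.5
w_k = torch.randn(D_MODEL, D_MODEL) * D_MODEL ** -0.5
w_v = torch.randn(D_MODEL, D_MODEL) * D_MODEL ** -0.5
out = single_head_attention(x, w_q, w_k, w_v)   # shape: (8, 3072)
```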