vpj committed on
Commit
838b2f7
·
1 Parent(s): 9f73f7c

Update README.md

Files changed (1)
  1. README.md +1 -2
README.md CHANGED
@@ -18,8 +18,7 @@ by [Georges Hark](https://twitter.com/ghark) and [Varuna Jayasiri](https://twitt
  in addition to using relative positions in the attention score calculation by RoPE embeddings,
  adds relative positional information explicitly to value embeddings.
  Specifically, it incorporates the relative positions of the tokens paid attention to.
- RoPER gives better performance in algorithmic tasks.
- Results have shown an improvement over RoPE in a language modeling setting on a 3 billion parameter transformer.
+ RoPER has given better performance in some algorithmic tasks, and seems comparable to RoPE in language modeling.
 
  ## Model details
 
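
The context lines of this hunk describe RoPER as adding relative positional information to the value embeddings, on top of the relative positions RoPE already gives the attention scores. The README text shown here does not spell out the mechanism, so the following is only a minimal sketch under an assumption: values are rotated by their absolute position before the weighted sum, and the result is rotated back by the query position, which leaves each value rotated by its offset from the query. The names `rotate`, `softmax`, and `roper_attention` are hypothetical, and each feature pair is represented as one complex number for brevity.

```python
import cmath
import math


def rotate(x, pos, base=10000.0):
    """RoPE-style rotation: feature pair k (a complex number) turns by pos * base**(-k/d)."""
    d = len(x)
    return [x[k] * cmath.exp(1j * pos * base ** (-k / d)) for k in range(d)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def roper_attention(q, k, v, offset=0):
    """Causal single-head attention with rotated queries, keys, and values.

    q, k, v are lists of per-token vectors; each vector is a list of complex
    numbers (one per feature pair). This is an assumed illustration, not the
    model's actual code.
    """
    n, d = len(q), len(q[0])
    pos = [offset + i for i in range(n)]
    qr = [rotate(q[i], pos[i]) for i in range(n)]
    kr = [rotate(k[i], pos[i]) for i in range(n)]
    vr = [rotate(v[i], pos[i]) for i in range(n)]  # assumed value rotation
    out = []
    for i in range(n):
        # Re(a * conj(b)) is the dot product of the two underlying real pairs;
        # after rotation it depends only on pos[i] - pos[j].
        scores = [
            sum((a * b.conjugate()).real for a, b in zip(qr[i], kr[j]))
            / math.sqrt(2 * d)
            for j in range(i + 1)
        ]
        w = softmax(scores)
        mixed = [sum(w[j] * vr[j][c] for j in range(i + 1)) for c in range(d)]
        # Rotating back by the query position leaves value j rotated by pos[j] - pos[i].
        out.append(rotate(mixed, -pos[i]))
    return out
```

Under this assumption the output depends only on relative offsets, so shifting every position by the same amount (the `offset` argument) should leave the result unchanged up to floating-point error.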