Update README.md
README.md CHANGED

@@ -66,7 +66,9 @@ model-index:
 
 </div>
 
-
+<p align="center" width="100%">
+<img src="https://raw.githubusercontent.com/CStanKonrad/long_llama/main/assets/results.png" alt="LongLLaMA" style="width: 70%; min-width: 300px; display: block; margin: auto;">
+</p>
 
 ## TLDR
 This repository contains the research preview of **LongLLaMA, a large language model capable of handling long contexts of 256k tokens or even more**.
@@ -84,10 +86,6 @@ LongLLaMA Code is built upon the foundation of [Code Llama](https://huggingface.
 with three layers used for context extension. **Crucially, LongLLaMA is able to extrapolate much beyond the context length seen in training: 8k. E.g., in the passkey retrieval task, it can handle inputs of length 256k**.
 **LongLLaMA Code** is a [Code Llama](https://huggingface.co/codellama/CodeLlama-7b-hf) model finetuned with the FoT method.
 
-<p align="center" width="100%">
-<img src="https://raw.githubusercontent.com/CStanKonrad/long_llama/main/assets/results.png" alt="LongLLaMA" style="width: 70%; min-width: 300px; display: block; margin: auto;">
-</p>
-
 
 <div align="center">
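The README text above cites the passkey retrieval task as the benchmark where LongLLaMA handles 256k-token inputs. As a rough illustration of what that task looks like, here is a minimal sketch of a prompt generator for it: a random passkey is hidden inside long filler text and the model is asked to recall it. The function name, filler sentences, and prompt wording are assumptions for illustration, not the actual LongLLaMA evaluation harness.

```python
import random


def make_passkey_prompt(n_garbage: int, seed: int = 0):
    """Build an illustrative passkey-retrieval prompt.

    A random 5-digit passkey is embedded at a random position inside
    `n_garbage` repetitions of filler text; returns the prompt and the
    expected answer. (Hypothetical sketch, not the official harness.)
    """
    rng = random.Random(seed)
    passkey = rng.randint(10000, 99999)
    filler = (
        "The grass is green. The sky is blue. "
        "The sun is yellow. Here we go. There and back again. "
    )
    info = f"The pass key is {passkey}. Remember it. {passkey} is the pass key. "
    pos = rng.randint(0, n_garbage)  # where the passkey is hidden
    prompt = (
        "There is important info hidden inside a lot of irrelevant text. "
        "Find it and memorize it.\n"
        + filler * pos
        + info
        + filler * (n_garbage - pos)
        + "What is the pass key? The pass key is"
    )
    return prompt, str(passkey)


prompt, answer = make_passkey_prompt(n_garbage=100)
```

Sweeping `n_garbage` upward pushes the prompt length toward and beyond the 8k training context, which is how extrapolation claims like the 256k figure are typically probed.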