Update README.md
README.md CHANGED
@@ -1,5 +1,7 @@
 ---
 license: mit
+datasets:
+- DAMO-NLP-SG/LongCorpus-2.5B
 ---
 # CLEX: Continuous Length Extrapolation for Large Language Models
 This repo stores the checkpoint of CLEX-Mixtral-8x7B-32K.
@@ -21,10 +23,10 @@ If you have any questions, feel free to contact us. (Emails: guanzzh.chen@gmail.
 |:-----|:-----|:-----------|:-----------|:-----------|:-----------|:------:|
 | CLEX-LLaMA-2-7B-16K | base | LLaMA-2-7B | [Redpajama-Book](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T) | 16K | 64K | [link](https://huggingface.co/DAMO-NLP-SG/CLEX-7B-16K) |
 | CLEX-LLaMA-2-7B-Chat-16K | chat | CLEX-7B-16K | [UltraChat](https://github.com/thunlp/UltraChat) | 16K | 64K | [link](https://huggingface.co/DAMO-NLP-SG/CLEX-7B-Chat-16K) |
-| CLEX-LLaMA-2-7B-64K | base | LLaMA-2-7B | [Redpajama-Book](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T) | 64k | 256K |
-| CLEX-Phi-2-7B-32K | base | Phi-2-2.7B | [LongCorpus-2.5B](https://huggingface.co/datasets/DAMO-NLP-SG/LongCorpus-2.5B) | 32k | 128K |
-| CLEX-Mixtral-8x7B-32K | base | Mixtral-8x7B-v0.1 | [LongCorpus-2.5B](https://huggingface.co/datasets/DAMO-NLP-SG/LongCorpus-2.5B) | 32k | >128K |
-| CLEX-Mixtral-8x7B-Chat-32k | chat | CLEX-Mixtral-8x7B-32K | [Ultrachat 200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) | 32k | >128K |
+| CLEX-LLaMA-2-7B-64K | base | LLaMA-2-7B | [Redpajama-Book](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T) | 64k | 256K | [link](https://huggingface.co/DAMO-NLP-SG/CLEX-LLaMA-2-7B-64K) |
+| CLEX-Phi-2-7B-32K | base | Phi-2-2.7B | [LongCorpus-2.5B](https://huggingface.co/datasets/DAMO-NLP-SG/LongCorpus-2.5B) | 32k | 128K | [link](https://huggingface.co/DAMO-NLP-SG/CLEX-Phi-2-2.7B-32K) |
+| CLEX-Mixtral-8x7B-32K | base | Mixtral-8x7B-v0.1 | [LongCorpus-2.5B](https://huggingface.co/datasets/DAMO-NLP-SG/LongCorpus-2.5B) | 32k | >128K | [link](https://huggingface.co/DAMO-NLP-SG/CLEX-Mixtral-8x7B-32K) |
+| CLEX-Mixtral-8x7B-Chat-32k | chat | CLEX-Mixtral-8x7B-32K | [Ultrachat 200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) | 32k | >128K | [link](https://huggingface.co/DAMO-NLP-SG/CLEX-Mixtral-8x7B-Chat-32K) |
 </div>
 
 
@@ -72,5 +74,4 @@ If you find our project useful, hope you can star our repo and cite our paper as
   journal = {arXiv preprint arXiv:2310.16450},
   url = {https://arxiv.org/abs/2310.16450}
 }
-```
-
+```
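For context beyond the diff itself: the checkpoint table above now links to this repo's model, CLEX-Mixtral-8x7B-32K. Below is a minimal sketch of loading that checkpoint with Hugging Face `transformers`. This is an assumption about usage, not something stated in the diff: CLEX repos ship custom modeling code, so `trust_remote_code=True` is presumably required, and the authors' model card may give a different canonical snippet.

```python
# Hedged sketch: load the CLEX-Mixtral-8x7B-32K checkpoint from the table above.
# Assumptions (not stated in this diff): the repo follows the standard
# transformers loading path and ships custom CLEX modeling code, hence
# trust_remote_code=True. device_map="auto" requires the `accelerate` package.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DAMO-NLP-SG/CLEX-Mixtral-8x7B-32K"  # link from the checkpoint table

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the dtype stored in the checkpoint
    device_map="auto",    # shard the MoE weights across available GPUs
    trust_remote_code=True,
)

prompt = "Continuous length extrapolation lets a 32K-trained model"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

`device_map="auto"` is used because Mixtral-8x7B is far too large for a single consumer GPU; for single-device setups one would swap in an explicit device placement.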