Update README.md
README.md
CHANGED
@@ -21,6 +21,8 @@ license_link: https://falconllm.tii.ae/falcon-mamba-7b-terms-and-conditions.html
 
 Falcon Mamba 7B - pre-decay checkpoint for continuous pretraining.
 
+Paper link: https://hf.co/papers/2410.05355
+
 # TL;DR
 
 # Model Details
@@ -237,11 +239,14 @@ Falcon-Mamba-7B was trained on an internal distributed training codebase, Gigatr
 
 # Citation
 
-*Paper coming soon* 😊. In the meanwhile, you can use the following information to cite:
 ```
-@
-
-
-
+@misc{zuo2024falconmambacompetitiveattentionfree,
+      title={Falcon Mamba: The First Competitive Attention-free 7B Language Model},
+      author={Jingwei Zuo and Maksim Velikanov and Dhia Eddine Rhaiem and Ilyas Chahed and Younes Belkada and Guillaume Kunsch and Hakim Hacid},
+      year={2024},
+      eprint={2410.05355},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2410.05355},
 }
 ```
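For readers who want to pick up this pre-decay checkpoint for continued pretraining, here is a minimal, non-authoritative sketch of loading it with the `transformers` library. The repo id `tiiuae/falcon-mamba-7b-pre-decay` and the use of `AutoModelForCausalLM` are assumptions based on the model family, not something stated in this diff.

```python
# Minimal sketch, not from the model card: assumes the checkpoint lives at
# "tiiuae/falcon-mamba-7b-pre-decay" and that a transformers release with
# FalconMamba support (>= 4.44) is installed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-mamba-7b-pre-decay"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

# As a pre-decay checkpoint, this is intended as a starting point for continued
# pretraining (e.g. inside a Trainer loop), not as a finished instruct model.
print(model.config.model_type)  # expected: "falcon_mamba"
```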