nvidia/Hymba-1.5B-Base

pmolchanov committed on Nov 26, 2024

Commit f8f6a65 · verified · 1 parent: ac59e23

Update README.md

Files changed (1)

README.md +9 -8

@@ -82,7 +82,7 @@ docker run --gpus all -v /home/$USER:/home/$USER -it ghcr.io/tilmto/hymba:v1 bas
 ### Step 2: Chat with Hymba-1.5B-Base
 After setting up the environment, you can use the following script to chat with our Model
-```
+```py
 from transformers import LlamaTokenizer, AutoModelForCausalLM, AutoTokenizer, AutoModel
 import torch
@@ -117,11 +117,12 @@ Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.
 ## Citation
 ```
-@article{hymba2024,
-      title={A Hybrid-head Architecture for Small Language Models},
-      author={Xin Dong and Yonggan Fu and Shizhe Diao and Wonmin Byeon and Zijia Chen and Ameya Sunil Mahabaleshwarkar and Shih-Yang Liu and Matthijs Van Keirsbilck and Min-Hung Chen and Yoshi Suhara and Yingyan Celine Lin and Jan Kautz and Pavlo Molchanov},
-      journal={arXiv preprint arXiv:xxxx},
+@misc{dong2024hymbahybridheadarchitecturesmall,
+      title={Hymba: A Hybrid-head Architecture for Small Language Models},
+      author={Xin Dong and Yonggan Fu and Shizhe Diao and Wonmin Byeon and Zijia Chen and Ameya Sunil Mahabaleshwarkar and Shih-Yang Liu and Matthijs Van Keirsbilck and Min-Hung Chen and Yoshi Suhara and Yingyan Lin and Jan Kautz and Pavlo Molchanov},
       year={2024},
-      url={https://arxiv.org/abs/xxxx},
-}
-```
+      eprint={2411.13676},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2411.13676},
+}