license: mit
# TinyLlama-NoPE-HeadScale8k
## Citation
``` | |
@misc{wang2024length, | |
title={Length Generalization of Causal Transformers without Position Encoding}, | |
author={Jie Wang and Tao Ji and Yuanbin Wu and Hang Yan and Tao Gui and Qi Zhang and Xuanjing Huang and Xiaoling Wang}, | |
year={2024}, | |
eprint={2404.12224}, | |
archivePrefix={arXiv}, | |
primaryClass={cs.CL} | |
} | |
```