THUDM/LongReward-glm4-9b-DPO


NeoZ123 committed on Oct 29, 2024

Commit 5cd4f53 · verified · 1 Parent(s): 96f64b0

Update README_zh.md

Files changed (1)

README_zh.md +2 -2

@@ -3,7 +3,7 @@
 Read this in [English](README.md)
 <p align="center">
-  🤗 <a href="https://huggingface.co/datasets/THUDM/LongReward-10k" target="_blank">[LongReward Dataset] </a> • 💻 <a href="https://github.com/THUDM/LongReward" target="_blank">[Github Repo]</a> • 📃 <a href="https://arxiv.org/abs/" target="_blank">[LongReward Paper]</a>
+  🤗 <a href="https://huggingface.co/datasets/THUDM/LongReward-10k" target="_blank">[LongReward Dataset] </a> • 💻 <a href="https://github.com/THUDM/LongReward" target="_blank">[Github Repo]</a> • 📃 <a href="https://arxiv.org/abs/2410.21252" target="_blank">[LongReward Paper]</a>
 </p>
 LongReward-glm4-9b-DPO 是 [LongReward-glm4-9b-SFT](https://huggingface.co/THUDM/LongReward-glm4-9b-SFT) 的 DPO 版本，支持最多
@@ -66,7 +66,7 @@ print(tokenizer.decode(out[0][input_len:], skip_special_tokens=True))
   title = {LongReward: Improving Long-context Large Language Models
 with AI Feedback}
   author={Jiajie Zhang and Zhongni Hou and Xin Lv and Shulin Cao and Zhenyu Hou and Yilin Niu and Lei Hou and Lei Hou and Yuxiao Dong and Ling Feng and Juanzi Li},
-  journal={arXiv preprint arXiv:},
+  journal={arXiv preprint arXiv:2410.21252},
   year={2024}
 }
 ```