voidism/SelfCite-8B


voidism committed about 1 month ago

Commit f80f6e2 · 1 Parent(s): e452aff

add readme

Files changed (2)

README.md +36 -0
all_results.json +0 -9

README.md CHANGED

@@ -1,3 +1,39 @@
 ---
 license: llama3.1
 ---

+# SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models
+[![License: MIT](https://img.shields.io/badge/License-MIT-g.svg)](https://opensource.org/licenses/MIT)
+[![Arxiv](https://img.shields.io/badge/arXiv-2502.09604-B21A1B)](https://arxiv.org/abs/2502.09604)
+[![Hugging Face Transformers](https://img.shields.io/badge/%F0%9F%A4%97-Transformers-blue)](https://github.com/huggingface/transformers)
+[![Tweet](https://img.shields.io/twitter/url/http/shields.io.svg?style=social)](https://twitter.com/YungSungChuang/)
+[![GitHub Stars](https://img.shields.io/github/stars/voidism/SelfCite?style=social)](https://github.com/voidism/SelfCite/stargazers)
+Paper: https://arxiv.org/abs/2502.09604
+Authors: [Yung-Sung Chuang](https://people.csail.mit.edu/yungsung/)$^\dagger$, [Benjamin Cohen-Wang](https://bencw99.github.io/)$^\dagger$, [Shannon Zejiang Shen](https://www.szj.io/)$^\dagger$, [Zhaofeng Wu](https://zhaofengwu.github.io/)$^\dagger$, [Hu Xu](https://howardhsu.github.io/)$^\ddagger$, [Xi Victoria Lin](https://victorialin.org/)$^\ddagger$, [James Glass](https://people.csail.mit.edu/jrg/)$^\dagger$, [Shang-Wen Li](https://swdanielli.github.io/)$^\ddagger$, [Wen-tau Yih](https://scottyih.org/)$^\ddagger$
+$^\dagger$ Massachusetts Institute of Technology, $^\ddagger$ Meta AI
+![main-fig](SelfCite.png)
+This model is LongCite-8B fine-tuned with SimPO.
+Please refer to our GitHub repository for usage and more details: https://github.com/voidism/SelfCite
+## Citation
+Please cite our paper as well as LongCite if they are helpful to your work!
+```bibtex
+@inproceedings{chuang2025selfcite,
+  title={SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models},
+  author={Yung-Sung Chuang and Benjamin Cohen-Wang and Shannon Zejiang Shen and Zhaofeng Wu and Hu Xu and Xi Victoria Lin and James Glass and Shang-Wen Li and Wen-tau Yih},
+  journal={arXiv preprint arXiv:2502.09604},
+  year={2025}
+}
+@article{zhang2024longcite,
+  title={LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA},
+  author={Jiajie Zhang and Yushi Bai and Xin Lv and Wanjun Gu and Danqing Liu and Minhao Zou and Shulin Cao and Lei Hou and Yuxiao Dong and Ling Feng and Juanzi Li},
+  journal={arXiv preprint arXiv:2409.02897},
+  year={2024}
+}
+```

all_results.json DELETED

@@ -1,9 +0,0 @@
-{
-    "epoch": 1.0,
-    "total_flos": 0.0,
-    "train_loss": 1.8338732454511855,
-    "train_runtime": 6099.2148,
-    "train_samples": 2013,
-    "train_samples_per_second": 0.33,
-    "train_steps_per_second": 0.041
-}
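
The deleted metrics are internally consistent, which can be checked with a few lines of Python. This is only a sketch: it assumes `train_runtime` is reported in seconds and that `train_samples_per_second` is `train_samples` divided by `train_runtime`, rounded. The names `deleted_results` and `derived_throughput` are illustrative and do not come from the repository.

```python
import json

# Values copied verbatim from the deleted all_results.json in this commit.
deleted_results = json.loads("""
{
    "epoch": 1.0,
    "total_flos": 0.0,
    "train_loss": 1.8338732454511855,
    "train_runtime": 6099.2148,
    "train_samples": 2013,
    "train_samples_per_second": 0.33,
    "train_steps_per_second": 0.041
}
""")


def derived_throughput(results):
    """Recompute samples per second, assuming train_runtime is in seconds."""
    return results["train_samples"] / results["train_runtime"]


samples_per_second = derived_throughput(deleted_results)
runtime_hours = deleted_results["train_runtime"] / 3600  # assumed seconds -> hours

print(f"{samples_per_second:.3f} samples/s over {runtime_hours:.2f} h")
```

Under those assumptions, the recomputed throughput rounds to the reported `train_samples_per_second`, and the run lasted a little under two hours for a single epoch.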