miulab/llama2-7b-ultrafeedback-rm

Text Classification


hank0316 committed on Oct 3

Commit 9649702 (1 parent: 6ff46cd)

Update README.md

Files changed (1)

README.md +24 -3


@@ -1,3 +1,24 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+datasets:
+- argilla/ultrafeedback-binarized-preferences-cleaned
+language:
+- en
+base_model:
+- meta-llama/Llama-2-7b-hf
+pipeline_tag: text-classification
+---
+This is the base reward model used in the paper "[DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging](https://arxiv.org/abs/2407.01470)".
+For detailed information about this model, please refer to our paper.
+If you find this model useful, please cite our paper:
+```
+@article{lin2024dogerm,
+  title={DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging},
+  author={Lin, Tzu-Han and Li, Chen-An and Lee, Hung-yi and Chen, Yun-Nung},
+  journal={arXiv preprint arXiv:2407.01470},
+  year={2024}
+}
+```