Update README.md
README.md
CHANGED
@@ -9,7 +9,9 @@ base_model:
 pipeline_tag: text-classification
 ---
 
-This is the base reward model used in the paper "[DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging](https://arxiv.org/abs/2407.01470)".
+This is the base reward model ("LLaMA-2 RM") used in the paper "[DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging](https://arxiv.org/abs/2407.01470)".
+
+The detailed training/evaluation information can be found at https://api.wandb.ai/links/merge_exp/g56s1tul.
 
 For the detailed information about this model, please refer to our paper.
 