chuyi777 committed on
Commit
595c2d5
1 Parent(s): 9e5db35

Update README.md

Files changed (1)
  1. README.md +9 -1
README.md CHANGED
@@ -1 +1,9 @@
- The Llama3-8b-based Reward Model was trained using OpenRLHF and a combination of datasets available at https://huggingface.co/datasets/OpenLLMAI/preference_dataset_mixture2_and_safe_pku
+ The Llama3-8b-based Reward Model was trained using OpenRLHF and a combination of datasets available at https://huggingface.co/datasets/OpenLLMAI/preference_dataset_mixture2_and_safe_pku.
+
+ ```
+ Cosine Scheduler
+ Learning Rate: 9e-6
+ Warmup Ratio: 0.03
+ Batch Size: 256
+ Epoch: 1
+ ```
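
For readers unfamiliar with how these settings interact, the following is a minimal sketch of a cosine learning-rate schedule with warmup, using the learning rate and warmup ratio from the README. It is an illustration only, not the OpenRLHF implementation: the linear warmup shape, the decay to a `min_lr` of 0, the function name `lr_at_step`, and the step count in the usage note are all assumptions, since the commit does not state them.

```python
import math

PEAK_LR = 9e-6       # "Learning Rate" from the README
WARMUP_RATIO = 0.03  # "Warmup Ratio" from the README


def lr_at_step(step, total_steps, peak_lr=PEAK_LR,
               warmup_ratio=WARMUP_RATIO, min_lr=0.0):
    """Return the learning rate at a given optimizer step.

    Assumed shape (not confirmed by the source): linear warmup from 0
    to peak_lr, then cosine decay from peak_lr down to min_lr.
    """
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, clamped to [0, 1].
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    # Cosine decay phase.
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

With a hypothetical run of 1000 optimizer steps (the real count depends on the dataset size divided by the batch size of 256, which the commit does not give), `lr_at_step(30, 1000)` sits at the peak of 9e-6 and `lr_at_step(1000, 1000)` has decayed to the assumed minimum of 0.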