Create README.md

Files changed (1)

README.md ADDED

	@@ -0,0 +1 @@


+ This Llama3-8b-based reward model was trained with OpenRLHF on a mixture of preference datasets available at https://huggingface.co/datasets/OpenLLMAI/preference_dataset_mixture2_and_safe_pku