RLHFlow's Collections
Standard-format-preference-dataset
Updated May 8
We collect open-source preference datasets and process them into a standard format.
RLHFlow/UltraFeedback-preference-standard • Updated Apr 27 • 340k • 1.06k • 6
RLHFlow/Helpsteer-preference-standard • Updated Apr 27 • 37.1k • 796 • 2
RLHFlow/HH-RLHF-Helpful-standard • Updated Apr 27 • 115k • 299 • 1
RLHFlow/Orca-distibalel-standard • Updated Apr 28 • 6.93k • 259 • 1
RLHFlow/Capybara-distibalel-Filter-standard • Updated Apr 28 • 14.8k • 243
RLHFlow/CodeUltraFeedback-standard • Updated Apr 27 • 50.2k • 237 • 3
RLHFlow/UltraInteract-filtered-standard • Updated Apr 28 • 162k • 231 • 2
RLHFlow/PKU-SafeRLHF-30K-standard • Updated Apr 29 • 26.9k • 241 • 2
RLHFlow/Argilla-Math-DPO-standard • Updated Apr 30 • 2.42k • 238 • 2
RLHFlow/Prometheus2-preference-standard • Updated May 5 • 200k • 236 • 2
RLHFlow/SHP-standard • Updated May 9 • 93.3k • 244
RLHFlow/HH-RLHF-Harmless-and-RedTeam-standard • Updated May 8 • 42.3k • 241 • 2