Flow Judge v0.1 held-out test datasets Collection This collection contains held-out splits for testing Flow-Judge-v0.1. • 4 items • Updated Sep 14, 2024
Flow Judge Datasets v0 Collection Synthetic datasets produced for training our open LM judge • 11 items • Updated Sep 2, 2024
Flow LM Judge Evaluation Datasets Collection Collection of datasets used for evaluating our Flow LM Judge • 10 items • Updated Aug 27, 2024