MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs Paper • arXiv:2410.04698 • Published Oct 7
cornfieldrm/pair-preference-dataset-700K_subset-4-of-4_gemma-2b_1of4_iter1_bs128_lr1e-5_conf-0.7_slic Viewer • Updated Jul 25 • 72.1k • 32
cornfieldrm/pair-preference-dataset-700K_subset-3-of-4_gemma-2b_1of4_iter1_bs128_lr1e-5_conf-0.7_slic Viewer • Updated Jul 25 • 71.9k • 31
cornfieldrm/pair-preference-dataset-700K_subset-2-of-4_gemma-2b_1of4_iter1_bs128_lr1e-5_conf-0.7_slic Viewer • Updated Jul 25 • 72.4k • 34
cornfieldrm/pair-preference-dataset-700K_subset-4-of-4_gemma-2b_1of4_iter1_bs128_lr1e-5_conf-0.7 Viewer • Updated Jul 25 • 72.1k • 32
cornfieldrm/pair-preference-dataset-700K_subset-3-of-4_gemma-2b_1of4_iter1_bs128_lr1e-5_conf-0.7 Viewer • Updated Jul 25 • 71.9k • 33
cornfieldrm/pair-preference-dataset-700K_subset-2-of-4_gemma-2b_1of4_iter1_bs128_lr1e-5_conf-0.7 Viewer • Updated Jul 25 • 72.4k • 32
cornfieldrm/pair-preference-dataset-700K_subset-15-out-of-16_llama3-8b-it_1-of-16_iter_4_slic Viewer • Updated Jun 13 • 398k • 32
cornfieldrm/pair-preference-dataset-700K_subset-15-out-of-16_llama3-8b-it_1-of-16_iter_4 Viewer • Updated Jun 13 • 398k • 46
cornfieldrm/pair-preference-dataset-700K_subset-1-out-of-16_slic Viewer • Updated Jun 10 • 43.7k • 35
cornfieldrm/pair-preference-dataset-700K_subset-15-out-of-16_standard Viewer • Updated Jun 10 • 656k • 50
cornfieldrm/pair-preference-dataset-700K_subset-1-out-of-16_standard Viewer • Updated Jun 10 • 43.7k • 35
cornfieldrm/pair-preference-dataset-700K_subset-7-out-of-8_gemma-2b-it_1-of-8_slic Viewer • Updated Jun 10 • 68.6k • 34
cornfieldrm/pair-preference-dataset-700K_subset-7-out-of-8_gemma-2b-it_1-of-8 Viewer • Updated Jun 10 • 68.6k • 33
cornfieldrm/pair-preference-dataset-700K_subset-1-out-of-2_direct_slic Viewer • Updated Jun 10 • 350k • 35
cornfieldrm/pair-preference-dataset-700K_subset-1-out-of-8_direct_slic Viewer • Updated Jun 10 • 87.4k • 35
cornfieldrm/pair-preference-dataset-700K_subset-15-out-of-16_gemma-2b-it_1-of-16_slic Viewer • Updated Jun 10 • 686 • 32
cornfieldrm/pair-preference-dataset-700K_subset-2-of-2_gemma-2b-it_1-of-2_slic Viewer • Updated Jun 10 • 155k • 35