GReaTer: Gradients over Reasoning Makes Smaller Language Models Strong Prompt Optimizers • Paper • 2412.09722 • Published 27 days ago • 5
VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception of Geometric Information • Paper • 2412.00947 • Published Dec 1, 2024 • 7
AAAR-1.0: Assessing AI's Potential to Assist Research • Paper • 2410.22394 • Published Oct 29, 2024 • 14
IMDb data from Two Generations, from 1979 to 2019; Part one, Dataset Introduction and Preliminary Analysis • Paper • 2005.14147 • Published May 28, 2020
Evaluating LLMs at Detecting Errors in LLM Responses • Paper • 2404.03602 • Published Apr 4, 2024 • 2
DocMath-Eval: Evaluating Numerical Reasoning Capabilities of LLMs in Understanding Long Documents with Tabular Data • Paper • 2311.09805 • Published Nov 16, 2023 • 3
WiCE: Real-World Entailment for Claims in Wikipedia • Paper • 2303.01432 • Published Mar 2, 2023 • 2
Is Prompt All You Need? No. A Comprehensive and Broader View of Instruction Learning • Paper • 2303.10475 • Published Mar 18, 2023 • 2
MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following • Paper • 2312.02436 • Published Dec 5, 2023 • 1
UMIE: Unified Multimodal Information Extraction with Instruction Tuning • Paper • 2401.03082 • Published Jan 5, 2024 • 1
Adaptive Chameleon or Stubborn Sloth: Unraveling the Behavior of Large Language Models in Knowledge Clashes • Paper • 2305.13300 • Published May 22, 2023 • 2
TravelPlanner: A Benchmark for Real-World Planning with Language Agents • Paper • 2402.01622 • Published Feb 2, 2024 • 34