GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering
Paper • 2409.06595 • Published 16 days ago • 37