HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of Large Multimodal Models Through Coding Tasks • Paper 2410.12381 • Published Oct 16, 2024 • 42
Coffee-Gym: An Environment for Evaluating and Improving Natural Language Feedback on Erroneous Code • Paper 2409.19715 • Published Sep 29, 2024 • 8
NoTeeline: Supporting Real-Time Notetaking from Keypoints with Large Language Models • Paper 2409.16493 • Published Sep 24, 2024 • 9
The Prompt Report: A Systematic Survey of Prompting Techniques • Paper 2406.06608 • Published Jun 6, 2024 • 56
Multimodal Foundation Models: From Specialists to General-Purpose Assistants • Paper 2309.10020 • Published Sep 18, 2023 • 40