Submitted by yuangpeng · 52 · DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation · 10 authors · 3
Submitted by ellisbrown · 45 · Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs · 14 authors · 3
Submitted by terryyz · 39 · BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions · 33 authors · 8
Submitted by Royir · 33 · Evaluating D-MERIT of Partial-annotation on Information Retrieval · 7 authors · 2
Submitted by zlzheng · 22 · VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models · 5 authors · 2
Submitted by YiDuo1999 · 18 · Efficient Continual Pre-training by Mitigating the Stability Gap · 5 authors · 1
Submitted by Kthyeon · 16 · Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters · 5 authors · 1
Submitted by zlzheng · 15 · Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers · 4 authors · 1
Submitted by ShengdingHu · 13 · Beyond the Turn-Based Game: Enabling Real-Time Conversations with Duplex Models · 9 authors · 2
Submitted by jlko · 13 · Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs · 6 authors · 1
Submitted by yongzx · 12 · Preference Tuning For Toxicity Mitigation Generalizes Across Languages · 3 authors · 1
Submitted by CCCCCC · 10 · AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models · 9 authors · 2
Submitted by sherzod-hakimov · 9 · How Many Parameters Does it Take to Change a Light Bulb? Evaluating Performance in Self-Play of Conversational Games as a Function of Model Characteristics · 4 authors · 1
Submitted by cydhsieh01 · 6 · Found in the Middle: Calibrating Positional Attention Bias Improves Long Context Utilization · 11 authors · 1
Submitted by cattana · 5 · Can Few-shot Work in Long-Context? Recycling the Context to Generate Demonstrations · 11 authors · 1
Submitted by BrianatCambridge · 5 · video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models · 10 authors · 1
Submitted by nicozilber · 4 · Repulsive Score Distillation for Diverse Sampling of Diffusion Models · 3 authors · 2
Submitted by SinclairWang · 3 · OlympicArena Medal Ranks: Who Is the Most Intelligent AI So Far? · 4 authors · 2