Large Language Monkeys: Scaling Inference Compute with Repeated Sampling Paper • 2407.21787 • Published Jul 31 • 10
Affordance-Aware Object Insertion via Mask-Aware Dual Diffusion Paper • 2412.14462 • Published 4 days ago • 15
Flowing from Words to Pixels: A Framework for Cross-Modality Evolution Paper • 2412.15213 • Published 3 days ago • 19
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks Paper • 2412.15204 • Published 3 days ago • 27
No More Adam: Learning Rate Scaling at Initialization is All You Need Paper • 2412.11768 • Published 6 days ago • 37
ModernBERT Collection Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated 3 days ago • 78
Learning from Massive Human Videos for Universal Humanoid Pose Control Paper • 2412.14172 • Published 4 days ago • 10
LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer Paper • 2412.13871 • Published 4 days ago • 17
Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation Paper • 2412.14015 • Published 4 days ago • 11
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment Paper • 2412.13746 • Published 4 days ago • 8
FashionComposer: Compositional Fashion Image Generation Paper • 2412.14168 • Published 4 days ago • 16
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks Paper • 2412.14161 • Published 4 days ago • 41
VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation Paper • 2412.10704 • Published 8 days ago • 14
OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain Paper • 2412.13018 • Published 5 days ago • 39
Multi-Dimensional Insights: Benchmarking Real-World Personalization in Large Multimodal Models Paper • 2412.12606 • Published 5 days ago • 40
Proposer-Agent-Evaluator(PAE): Autonomous Skill Discovery For Foundation Model Internet Agents Paper • 2412.13194 • Published 5 days ago • 10
Qwen2-VL Collection Vision-language model series based on Qwen2 • 16 items • Updated 16 days ago • 180