PIG: Physics-Informed Gaussians as Adaptive Parametric Mesh Representations Paper • 2412.05994 • Published 14 days ago • 16
Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions Paper • 2412.08737 • Published 11 days ago • 51
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions Paper • 2412.09596 • Published 10 days ago • 90
Large Action Models: From Inception to Implementation Paper • 2412.10047 • Published 10 days ago • 26
Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning Paper • 2412.11974 • Published 6 days ago • 7
SplineGS: Robust Motion-Adaptive Spline for Real-Time Dynamic 3D Gaussians from Monocular Video Paper • 2412.09982 • Published 10 days ago • 7
GaussianProperty: Integrating Physical Properties to 3D Gaussians with LMMs Paper • 2412.11258 • Published 7 days ago • 12
RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation Paper • 2412.11919 • Published 6 days ago • 33
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published 10 days ago • 69
Proposer-Agent-Evaluator(PAE): Autonomous Skill Discovery For Foundation Model Internet Agents Paper • 2412.13194 • Published 5 days ago • 10
Emergence of Abstractions: Concept Encoding and Decoding Mechanism for In-Context Learning in Transformers Paper • 2412.12276 • Published 6 days ago • 14
Multi-Dimensional Insights: Benchmarking Real-World Personalization in Large Multimodal Models Paper • 2412.12606 • Published 6 days ago • 40
AntiLeak-Bench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge Paper • 2412.13670 • Published 5 days ago • 4
Learning from Massive Human Videos for Universal Humanoid Pose Control Paper • 2412.14172 • Published 4 days ago • 10
Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces Paper • 2412.14171 • Published 4 days ago • 20
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks Paper • 2412.14161 • Published 4 days ago • 41
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published 5 days ago • 93