Submitted by UglyToilet 65 Controllable Text Generation for Large Language Models: A Survey · 11 authors 2
Submitted by akhaliq 57 Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications · 39 authors 2
Submitted by akhaliq 51 Show-o: One Single Transformer to Unify Multimodal Understanding and Generation · 10 authors 2
Submitted by akhaliq 36 xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations · 19 authors 5
Submitted by Liuff23 30 DreamCinema: Cinematic Transfer with Free Camera and 3D Character · 6 authors 2
Submitted by IAMJB 24 The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design · 5 authors 1
Submitted by IAMJB 23 Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese · 8 authors 2
Submitted by HenryCai1129 15 Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search · 8 authors 2
Submitted by topyun 14 SPARK: Multi-Vision Sensor Perception and Reasoning Benchmark for Large-scale Vision-Language Models · 4 authors 3
Submitted by IAMJB 12 ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM · 9 authors 1
Submitted by yyyin 12 SEA: Supervised Embedding Alignment for Token-Level Visual-Textual Integration in MLLMs · 10 authors 2
Submitted by YunxinLi 8 Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation · 8 authors 2
Submitted by akhaliq 7 Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound · 4 authors 2
Submitted by amanchadha 6 Evidence-backed Fact Checking using RAG and Few-Shot In-Context Learning with LLMs · 5 authors 3