3DIS-FLUX: simple and efficient multi-instance generation with DiT rendering Paper • 2501.05131 • Published 9 days ago • 32
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published 4 days ago • 257
MangaNinja: Line Art Colorization with Precise Reference Following Paper • 2501.08332 • Published 4 days ago • 48
YuLan-Mini: An Open Data-efficient Language Model Paper • 2412.17743 • Published 26 days ago • 64
MMFactory: A Universal Solution Search Engine for Vision-Language Tasks Paper • 2412.18072 • Published 25 days ago • 17
Molar: Multimodal LLMs with Collaborative Filtering Alignment for Enhanced Sequential Recommendation Paper • 2412.18176 • Published 25 days ago • 15
Deliberation in Latent Space via Differentiable Cache Augmentation Paper • 2412.17747 • Published 26 days ago • 29
SCOPE: Optimizing Key-Value Cache Compression in Long-context Generation Paper • 2412.13649 • Published Dec 18, 2024 • 20
Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis Paper • 2412.15322 • Published 30 days ago • 18
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks Paper • 2412.14161 • Published about 1 month ago • 50
AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials Paper • 2412.09605 • Published Dec 12, 2024 • 28
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases Paper • 2412.04862 • Published Dec 6, 2024 • 50
CompCap: Improving Multimodal Large Language Models with Composite Captions Paper • 2412.05243 • Published Dec 6, 2024 • 18
PanoDreamer: 3D Panorama Synthesis from a Single Image Paper • 2412.04827 • Published Dec 6, 2024 • 10
Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis Paper • 2412.01819 • Published Dec 2, 2024 • 35
LLM Teacher-Student Framework for Text Classification With No Manually Annotated Data: A Case Study in IPTC News Topic Classification Paper • 2411.19638 • Published Nov 29, 2024 • 6