HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems Paper • 2411.02959 • Published Nov 5 • 64
GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details Paper • 2411.03047 • Published Nov 5 • 8
MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D Paper • 2411.02336 • Published Nov 4 • 23
HelloMeme: Integrating Spatial Knitting Attentions to Embed High-Level and Fidelity-Rich Conditions in Diffusion Models Paper • 2410.22901 • Published Oct 30 • 8
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders Paper • 2410.22366 • Published Oct 28 • 77
DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation Paper • 2410.18666 • Published Oct 24 • 19
Hymba: A Hybrid-head Architecture for Small Language Models Paper • 2411.13676 • Published Nov 20 • 39
FlipSketch: Flipping Static Drawings to Text-Guided Sketch Animations Paper • 2411.10818 • Published Nov 16 • 24
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19 • 47
BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices Paper • 2411.10640 • Published Nov 16 • 44
AnimateAnything: Consistent and Controllable Animation for Video Generation Paper • 2411.10836 • Published Nov 16 • 23
SlimLM: An Efficient Small Language Model for On-Device Document Assistance Paper • 2411.09944 • Published Nov 15 • 12
FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on Paper • 2411.10499 • Published Nov 15 • 13
StableV2V: Stablizing Shape Consistency in Video-to-Video Editing Paper • 2411.11045 • Published Nov 17 • 11
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement Paper • 2411.06558 • Published Nov 10 • 34
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization Paper • 2411.10442 • Published Nov 15 • 68
From CISC to RISC: language-model guided assembly transpilation Paper • 2411.16341 • Published Nov 25 • 11
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training Paper • 2411.15124 • Published Nov 22 • 56
GraPE: A Generate-Plan-Edit Framework for Compositional T2I Synthesis Paper • 2412.06089 • Published 22 days ago • 4
Mogo: RQ Hierarchical Causal Transformer for High-Quality 3D Human Motion Generation Paper • 2412.07797 • Published 25 days ago • 11
No More Adam: Learning Rate Scaling at Initialization is All You Need Paper • 2412.11768 • Published 14 days ago • 41
RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation Paper • 2412.11919 • Published 14 days ago • 33