Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders Paper • 2410.22366 • Published Oct 28 • 77
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 603
MobileVLM V2: Faster and Stronger Baseline for Vision Language Model Paper • 2402.03766 • Published Feb 6 • 12
Gemini vs GPT-4V: A Preliminary Comparison and Combination of Vision-Language Models Through Qualitative Cases Paper • 2312.15011 • Published Dec 22, 2023 • 15
Boundary Attention: Learning to Find Faint Boundaries at Any Resolution Paper • 2401.00935 • Published Jan 1 • 17
Context Tuning for Retrieval Augmented Generation Paper • 2312.05708 • Published Dec 9, 2023 • 17
HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis Paper • 2311.12454 • Published Nov 21, 2023 • 30
PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction Paper • 2311.12024 • Published Nov 20, 2023 • 19
GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning Paper • 2311.12631 • Published Nov 21, 2023 • 13
Make Pixels Dance: High-Dynamic Video Generation Paper • 2311.10982 • Published Nov 18, 2023 • 67
Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression Paper • 2311.10794 • Published Nov 17, 2023 • 24
Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning Paper • 2311.10709 • Published Nov 17, 2023 • 24
UnifiedVisionGPT: Streamlining Vision-Oriented AI through Generalized Multimodal Framework Paper • 2311.10125 • Published Nov 16, 2023 • 4
SoundCam: A Dataset for Finding Humans Using Room Acoustics Paper • 2311.03517 • Published Nov 6, 2023 • 10
Levels of AGI: Operationalizing Progress on the Path to AGI Paper • 2311.02462 • Published Nov 4, 2023 • 34
The Generative AI Paradox: "What It Can Create, It May Not Understand" Paper • 2311.00059 • Published Oct 31, 2023 • 18
Towards Understanding Sycophancy in Language Models Paper • 2310.13548 • Published Oct 20, 2023 • 4