Submitted by akhaliq 60 Understanding LLMs: A Comprehensive Overview from Training to Inference · 21 authors 2
Submitted by akhaliq 30 Instruct-Imagen: Image Generation with Multi-modal Instruction · 12 authors 3
Submitted by akhaliq 25 Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation · 3 authors 2
Submitted by akhaliq 12 What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs · 8 authors 1
Submitted by akhaliq 11 LLaVA-$\phi$: Efficient Multi-Modal Assistant with Small Language Model · 6 authors 1
Submitted by akhaliq 9 ICE-GRT: Instruction Context Enhancement by Generative Reinforcement based Transformers · 7 authors 1
Submitted by akhaliq 6 Improving Diffusion-Based Image Synthesis with Context Prediction · 8 authors 1
Submitted by akhaliq 6 FMGS: Foundation Model Embedded 3D Gaussian Splatting for Holistic 3D Scene Understanding · 5 authors 1
Submitted by akhaliq 4 Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers · 3 authors 1