Article LLaVA-o1: Let Vision Language Models Reason Step-by-Step By mikelabs • Nov 19, 2024 • 11
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 111
Article How to run Gemini Nano locally in your browser By Xenova • Jul 11, 2024 • 43
Sparsh Collection Models and datasets for Sparsh: Self-supervised touch representations for vision-based tactile sensing • 15 items • Updated Oct 24, 2024 • 12
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 101
Article Advanced Flux Dreambooth LoRA Training with 🧨 diffusers By linoyts • Oct 21, 2024 • 32
Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention Paper • 2410.10774 • Published Oct 14, 2024 • 25
MonoFormer: One Transformer for Both Diffusion and Autoregression Paper • 2409.16280 • Published Sep 24, 2024 • 17
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated 28 days ago • 551
3D Collection Stability AI's suite of models for 3D generation • 5 items • Updated Aug 9, 2024 • 33
Tora: Trajectory-oriented Diffusion Transformer for Video Generation Paper • 2407.21705 • Published Jul 31, 2024 • 27
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated 28 days ago • 637