nanoLLaVA-1.5 is here! Same size (1B), noticeably better performance than v1.0. Try it out now on HF Spaces: qnguyen3/nanoLLaVA. Model: qnguyen3/nanoLLaVA-1.5
Introducing nanoLLaVA, a multimodal AI model that packs a 1B-parameter vision-language model into just 5GB of VRAM. That small footprint makes it an ideal choice for edge devices, bringing cutting-edge visual understanding and generation to your devices like never before.
Under the hood, nanoLLaVA pairs vilm/Quyen-SE-v0.1 (my Qwen1.5-0.5B finetune) as the language backbone with Google's google/siglip-so400m-patch14-384 as the vision encoder. The model is trained with a data-centric approach to squeeze the best performance out of its small size.
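For anyone who wants to script against the model rather than use the Space, here is a minimal loading sketch with Hugging Face Transformers. It is an assumption-laden sketch, not the official snippet: the need for trust_remote_code=True, the LLaVA-style <image> placeholder, and the image-handling step are all guesses based on how similar custom multimodal checkpoints are typically published, so defer to the model card for the canonical usage.

```python
# Minimal sketch (assumed usage, not the official snippet): load nanoLLaVA-1.5
# with Hugging Face Transformers. The repo ships custom multimodal modeling code,
# so trust_remote_code=True is assumed to be required.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "qnguyen3/nanoLLaVA-1.5"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision keeps the model inside the ~5GB VRAM budget
    device_map="auto",
    trust_remote_code=True,
)

# LLaVA-style prompt with an <image> placeholder (an assumption about the chat format).
messages = [{"role": "user", "content": "<image>\nDescribe this image."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

image = Image.open("example.jpg")
# How `image` is turned into pixel tensors and passed to model.generate() is defined
# by the repo's remote code; see the model card for the end-to-end example.
```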
In the spirit of transparency and collaboration, all code and model weights are open-sourced under the Apache 2.0 license.
Current LLMs are highly susceptible to generating toxic, harmful, and even dangerous content, and they can also produce outputs with gender or racial biases. The Biden-Harris Executive Order (https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence) sets forth guidelines on what is considered a safe AI system. Following these guidelines, we present the world's first open-source Biden-Harris Executive Order red-teamed multilingual language model: Aurora-M. Inspired by BigScience, the model is trained on five languages: English, Hindi, Japanese, Vietnamese, and Finnish.
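If you want to poke at the model yourself, a plain causal-LM loading sketch with Transformers is below. The repository id is a placeholder assumption for illustration only, so swap in the actual checkpoint name from the Aurora-M release.

```python
# Minimal sketch: load Aurora-M as a standard causal LM with Transformers.
# NOTE: the repo id below is a placeholder assumption, not confirmed by this post;
# use the actual checkpoint name from the Aurora-M release.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aurora-m/aurora-m-base"  # placeholder repo id for illustration

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Multilingual prompt: the model covers English, Hindi, Japanese, Vietnamese, and Finnish.
prompt = "Translate to Finnish: The weather is nice today."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```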