---
base_model:
- meta-llama/Llama-3.2-1B
pipeline_tag: text-generation
library_name: transformers
---
# Llama-3.2-1B-Vision

A vision-enhanced version of the Llama-3.2-1B language model, capable of understanding and describing images while maintaining the base model's language capabilities.

## Model Details

- **Base Model**: Llama-3.2-1B
- **Model Type**: Vision-Language Model
- **Last Updated**: December ?, 2024
- **Model Architecture**: Llama architecture with SigLIP vision encoder |