license: llama3.2 | |
datasets: | |
- osunlp/Multimodal-Mind2Web | |
base_model: | |
- meta-llama/Llama-3.2-11B-Vision-Instruct | |
This is a finetuned Llama-3.2-11B-Vision-Instruct model, the dataset used is Multimodal-Mind2Web dataset. | |
Step by step guide: https://github.com/roywei/llama-3-2-vision-finetune |