This repository contains the model for LOVA3: Learning to Visual Question Answering, Asking and Assessment. LOVA3 is a framework designed to equip multimodal large language models (MLLMs) with the ability to answer, ask, and assess questions about images.

Code: https://github.com/showlab/LOVA3
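
To fetch the checkpoint files locally before using them with the code above, a minimal sketch follows. It assumes the `huggingface_hub` package is installed and that this model's Hub id is `ZechenBai/LOVA3-llava-v1.5-phi1.5-baseline`; both are assumptions, and `download_checkpoint` is a hypothetical helper name used only for illustration. The sketch downloads files only; loading and inference are handled by the GitHub repository and are not reproduced here.

```python
from pathlib import Path
from typing import Optional

# Assumed Hub id for this checkpoint; change it if the model lives elsewhere.
REPO_ID = "ZechenBai/LOVA3-llava-v1.5-phi1.5-baseline"


def download_checkpoint(repo_id: str = REPO_ID, local_dir: Optional[str] = None) -> Path:
    """Download every file in the model repository and return the local folder.

    Hypothetical helper: a thin wrapper around huggingface_hub.snapshot_download.
    """
    # Imported lazily so this module can be imported without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    # With local_dir=None the files go to the default Hugging Face cache.
    path = snapshot_download(repo_id=repo_id, local_dir=local_dir)
    return Path(path)
```

Calling `download_checkpoint()` needs network access and returns the directory that holds the downloaded weights and configuration files.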

🎓 Citation

If you find LOVA3 useful, please cite using this BibTeX:

@inproceedings{
    zhao2024lova,
    title={{LOVA}3: Learning to Visual Question Answering, Asking and Assessment},
    author={Hengyuan Zhao and Pan Zhou and Difei Gao and Zechen Bai and Mike Zheng Shou},
    booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
    year={2024},
    url={https://openreview.net/forum?id=vIOKLMl6wu}
}

Model size: 1.72B params (Safetensors, tensor type BF16)

Collection including ZechenBai/LOVA3-llava-v1.5-phi1.5-baseline