---
license: apache-2.0
task_categories:
- image-text-to-text
---

This repository contains the data for [LOVA3: Learning to Visual Question Answering, Asking and Assessment](https://huggingface.co/papers/2405.14974).

LOVA3 is a framework designed to equip MLLMs with the capabilities to answer, ask, and assess questions in the context of images.

Code: https://github.com/showlab/LOVA3

## Citation

If you find LOVA3 useful, please cite using this BibTeX:

```bibtex
@inproceedings{
zhao2024lova,
title={{LOVA}3: Learning to Visual Question Answering, Asking and Assessment},
author={Hengyuan Zhao and Pan Zhou and Difei Gao and Zechen Bai and Mike Zheng Shou},
booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
year={2024},
url={https://openreview.net/forum?id=vIOKLMl6wu}
}
```