---
license: mit
datasets:
- YuukiAsuna/VietnameseTableVQA
language:
- vi
base_model:
- 5CD-AI/Vintern-1B-v2
pipeline_tag: document-question-answering
library_name: transformers
---

# Model Card for Vintern-1B-v2-ViTable-docvqa

Vintern-1B-v2-ViTable-docvqa is a fine-tuned version of the 5CD-AI/Vintern-1B-v2 multimodal model for Vietnamese document visual question answering (DocVQA) on table data.

## Benchmarks

To be added later.

## Quickstart

To be added later.

**Citation:**

```bibtex
@misc{doan2024vintern1befficientmultimodallarge,
      title={Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese},
      author={Khang T. Doan and Bao G. Huynh and Dung T. Hoang and Thuc D. Pham and Nhat H. Pham and Quan T. M. Nguyen and Bang Q. Vo and Suong N. Hoang},
      year={2024},
      eprint={2408.12480},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2408.12480},
}
```