KerwinJob committed
Commit 055199f
Parent: 13954e4

Update README.md

Files changed (1): README.md (+10, -4)
README.md CHANGED
````diff
@@ -1,6 +1,6 @@
 # MAmmoTH-VL-8B
 
-[🏠 Homepage](https://mammoth-vl.github.io/) | [🤖 MAmmoTH-VL-8B](https://huggingface.co/MMSFT/MAmmoTH-VL-8B) | [💻 Code](https://github.com/MAmmoTH-VL/MAmmoTH-VL) | [📄 Arxiv](https://arxiv.org/abs/2410.16153) | [📕 PDF](https://arxiv.org/pdf/2410.16153) | [🖥️ Demo](https://huggingface.co/spaces/MMSFT/MAmmoTH-VL-8B)
+[🏠 Homepage](https://mammoth-vl.github.io/) | [🤖 MAmmoTH-VL-8B](https://huggingface.co/MAmmoTH-VL/MAmmoTH-VL-8B) | [💻 Code](https://github.com/orgs/MAmmoTH-VL/MAmmoTH-VL) | [📄 Arxiv](https://arxiv.org/abs/2412.05237) | [📕 PDF](https://arxiv.org/pdf/2412.05237) | [🖥️ Demo](https://huggingface.co/spaces/paralym/MAmmoTH-VL-8B)
 
 # Abstract
 Open-source multimodal large language models (MLLMs) have shown significant potential in a broad range of multimodal tasks. However, their reasoning capabilities remain constrained by existing instruction-tuning datasets, which were predominately repurposed from academic datasets such as VQA, AI2D, and ChartQA. These datasets target simplistic tasks, and only provide phrase-level answers without any intermediate rationales.
@@ -26,8 +26,14 @@ We highlight different groups of models with different colors: <span style="back
 
 ## Citing the Model
 
-**BibTeX Citation:**
-
 ```
-xxx
+@article{guo2024mammothvlelicitingmultimodalreasoning,
+  title={MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale},
+  author={Jarvis Guo and Tuney Zheng and Yuelin Bai and Bo Li and Yubo Wang and King Zhu and Yizhi Li and Graham Neubig and Wenhu Chen and Xiang Yue},
+  year={2024},
+  eprint={2412.05237},
+  archivePrefix={arXiv},
+  primaryClass={cs.CL},
+  url={https://arxiv.org/abs/2412.05237},
+}
 ```
````
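
Since the substance of this commit is repointing the README's Homepage, model, code, paper, and demo links, a quick way to sanity-check the change is to confirm that each new target resolves. Below is a minimal, stdlib-only Python sketch; it is not part of the commit, the URL list is copied verbatim from the updated README line, and the user-agent string is an arbitrary placeholder:

```python
# Link-check sketch for the URLs introduced by this commit (hypothetical
# helper, not part of the repository). Standard library only.
import urllib.error
import urllib.request

# Copied verbatim from the updated README link line.
LINKS = {
    "Homepage": "https://mammoth-vl.github.io/",
    "Model": "https://huggingface.co/MAmmoTH-VL/MAmmoTH-VL-8B",
    "Code": "https://github.com/orgs/MAmmoTH-VL/MAmmoTH-VL",
    "Arxiv": "https://arxiv.org/abs/2412.05237",
    "PDF": "https://arxiv.org/pdf/2412.05237",
    "Demo": "https://huggingface.co/spaces/paralym/MAmmoTH-VL-8B",
}


def status(url: str) -> int:
    """Return the HTTP status for `url` (redirects are followed)."""
    req = urllib.request.Request(
        url,
        method="HEAD",
        headers={"User-Agent": "readme-link-check"},  # arbitrary UA string
    )
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Broken links (and hosts that reject HEAD) land here; the status
        # code itself tells you which case you hit.
        return err.code


if __name__ == "__main__":
    for name, url in LINKS.items():
        print(f"{name:8} {status(url):3d} {url}")
```

HEAD requests keep the check lightweight; a host that refuses HEAD will surface as an error status in the printed report rather than crashing the loop.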