KerwinJob committed
Commit 055199f
Parent: 13954e4

Update README.md

Files changed (1): README.md (+10, -4)
README.md CHANGED
````diff
@@ -1,6 +1,6 @@
 # MAmmoTH-VL-8B
 
-[🏠 Homepage](https://mammoth-vl.github.io/) | [🤖 MAmmoTH-VL-8B](https://huggingface.co/MMSFT/MAmmoTH-VL-8B) | [💻 Code](https://github.com/MAmmoTH-VL/MAmmoTH-VL) | [📄 Arxiv](https://arxiv.org/abs/2410.16153) | [📕 PDF](https://arxiv.org/pdf/2410.16153) | [🖥️ Demo](https://huggingface.co/spaces/MMSFT/MAmmoTH-VL-8B)
+[🏠 Homepage](https://mammoth-vl.github.io/) | [🤖 MAmmoTH-VL-8B](https://huggingface.co/MAmmoTH-VL/MAmmoTH-VL-8B) | [💻 Code](https://github.com/orgs/MAmmoTH-VL/MAmmoTH-VL) | [📄 Arxiv](https://arxiv.org/abs/2412.05237) | [📕 PDF](https://arxiv.org/pdf/2412.05237) | [🖥️ Demo](https://huggingface.co/spaces/paralym/MAmmoTH-VL-8B)
 
 # Abstract
 Open-source multimodal large language models (MLLMs) have shown significant potential in a broad range of multimodal tasks. However, their reasoning capabilities remain constrained by existing instruction-tuning datasets, which were predominately repurposed from academic datasets such as VQA, AI2D, and ChartQA. These datasets target simplistic tasks, and only provide phrase-level answers without any intermediate rationales.
@@ -26,8 +26,14 @@ We highlight different groups of models with different colors: <span style="back
 
 ## Citing the Model
 
-**BibTeX Citation:**
-
 ```
-xxx
+@article{guo2024mammothvlelicitingmultimodalreasoning,
+  title={MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale},
+  author={Jarvis Guo and Tuney Zheng and Yuelin Bai and Bo Li and Yubo Wang and King Zhu and Yizhi Li and Graham Neubig and Wenhu Chen and Xiang Yue},
+  year={2024},
+  eprint={2412.05237},
+  archivePrefix={arXiv},
+  primaryClass={cs.CL},
+  url={https://arxiv.org/abs/2412.05237},
+}
 ```
````
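
Since the substance of this commit is repointing the README's Homepage, model, code, paper, and demo links, a quick way to sanity-check the change is to confirm that each new target resolves. Below is a minimal, stdlib-only Python sketch; it is not part of the commit, the URL list is copied verbatim from the updated README line, and the user-agent string is an arbitrary placeholder:

```python
# Link-check sketch for the URLs introduced by this commit (hypothetical
# helper, not part of the repository). Standard library only.
import urllib.error
import urllib.request

# Copied verbatim from the updated README link line.
LINKS = {
    "Homepage": "https://mammoth-vl.github.io/",
    "Model": "https://huggingface.co/MAmmoTH-VL/MAmmoTH-VL-8B",
    "Code": "https://github.com/orgs/MAmmoTH-VL/MAmmoTH-VL",
    "Arxiv": "https://arxiv.org/abs/2412.05237",
    "PDF": "https://arxiv.org/pdf/2412.05237",
    "Demo": "https://huggingface.co/spaces/paralym/MAmmoTH-VL-8B",
}


def status(url: str) -> int:
    """Return the HTTP status for `url` (redirects are followed)."""
    req = urllib.request.Request(
        url,
        method="HEAD",
        headers={"User-Agent": "readme-link-check"},  # arbitrary UA string
    )
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Broken links (and hosts that reject HEAD) land here; the status
        # code itself tells you which case you hit.
        return err.code


if __name__ == "__main__":
    for name, url in LINKS.items():
        print(f"{name:8} {status(url):3d} {url}")
```

HEAD requests keep the check lightweight; a host that refuses HEAD will surface as an error status in the printed report rather than crashing the loop.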