Update README.md
README.md
CHANGED
@@ -1,6 +1,6 @@
 # MAmmoTH-VL-8B

-[Homepage](https://mammoth-vl.github.io/) | [MAmmoTH-VL-8B](https://huggingface.co/MAmmoTH-VL/MAmmoTH-VL-8B) | [Code](https://github.com/
+[Homepage](https://mammoth-vl.github.io/) | [MAmmoTH-VL-8B](https://huggingface.co/MAmmoTH-VL/MAmmoTH-VL-8B) | [Code](https://github.com/MAmmoTH-VL/MAmmoTH-VL) | [Arxiv](https://arxiv.org/abs/2412.05237) | [PDF](https://arxiv.org/pdf/2412.05237) | [Demo](https://huggingface.co/spaces/paralym/MAmmoTH-VL-8B)

 # Abstract
 Open-source multimodal large language models (MLLMs) have shown significant potential in a broad range of multimodal tasks. However, their reasoning capabilities remain constrained by existing instruction-tuning datasets, which were predominately repurposed from academic datasets such as VQA, AI2D, and ChartQA. These datasets target simplistic tasks, and only provide phrase-level answers without any intermediate rationales.