---
pipeline_tag: image-text-to-text
---
|
<br> |
|
<br> |
|
|
|
# Math-LLaVA-13B Model Card |
|
|
|
## Model details |
|
|
|
**Model type:** |
|
Math-LLaVA is an open-source multimodal large language model (MLLM) obtained by fine-tuning LLaVA-1.5-13B on the [MathV360K](https://huggingface.co/datasets/Zhiqiang007/MathV360K/tree/main) dataset, which combines selected multimodal math data with GPT-4-Vision-assisted synthesized data.
|
|
|
**Model date:** |
|
Math-LLaVA-13B was trained in June 2024. |
|
|
|
**Paper or resources for more information:** |
|
[[Paper](http://arxiv.org/abs/2406.17294)] [[Code](https://github.com/HZQ950419/Math-LLaVA)] |
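
For quick experimentation, the checkpoint can be queried through the Hugging Face `transformers` LLaVA classes. The snippet below is a minimal sketch, assuming the released weights (or a converted copy of them) load with `LlavaForConditionalGeneration`; the original release targets the LLaVA-1.5 codebase, so the repo id, prompt template, and image URL shown here are assumptions or placeholders, not a verified recipe.

```python
# Minimal inference sketch (assumes a transformers-compatible Math-LLaVA checkpoint;
# the original release follows the LLaVA-1.5 codebase, so conversion may be required).
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "Zhiqiang007/Math-LLaVA"  # assumed repo id; replace with your local or converted path

processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# LLaVA-1.5-style prompt template with a single image placeholder.
prompt = (
    "USER: <image>\nWhat is the area of the shaded region? "
    "Answer the question step by step. ASSISTANT:"
)
image = Image.open(requests.get("https://example.com/diagram.png", stream=True).raw)  # placeholder URL

inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```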
|
|
|
## License |
|
Llama 2 is licensed under the LLAMA 2 Community License, |
|
Copyright (c) Meta Platforms, Inc. All Rights Reserved. |
|
|
|
## Intended use |
|
**Primary intended uses:** |
|
The primary use of Math-LLaVA is research on multimodal large language models, multimodal reasoning, and question answering.
|
|
|
**Primary intended users:** |
|
The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence. |
|
|
|
## Training dataset |
|
- MathV360K instruction-tuning data |
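
To inspect or reuse the training data, the MathV360K repository can be fetched with `huggingface_hub`. This is a minimal sketch under the assumption that the dataset repo contains JSON annotation files alongside image archives; specific file names are not assumed.

```python
# Sketch: download the MathV360K instruction-tuning data from the Hugging Face Hub.
import json
import os
from huggingface_hub import snapshot_download

# Fetch the whole dataset repo (JSON annotations plus zipped images).
local_dir = snapshot_download(repo_id="Zhiqiang007/MathV360K", repo_type="dataset")
print("Downloaded to:", local_dir)

# Peek at whichever JSON annotation file is present (file names are not assumed here).
for name in os.listdir(local_dir):
    if name.endswith(".json"):
        with open(os.path.join(local_dir, name)) as f:
            samples = json.load(f)
        print(name, "->", len(samples), "records")
        break
```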
|
|
|
## Evaluation dataset |
|
A collection of three benchmarks: two multimodal mathematical reasoning benchmarks and one multi-discipline multimodal reasoning benchmark.
|
|