Update README.md
Browse files
README.md
CHANGED
@@ -23,7 +23,7 @@ pipeline_tag: image-text-to-text
|
|
23 |
<!-- ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/66e93d483745423cbb14c5ff/fNxjr3en_onzbOv0sghpE.jpeg) -->
|
24 |
|
25 |
# EraX-VL-7B-V1
|
26 |
-
|
27 |
|
28 |
We are excited to introduce the EraX-VL-7B-v1 model, a robust multimodal model for OCR (optical character recognition) and VQA (visual question-answering) that excels in various languages, with a particular focus on Vietnamese. The EraX-VL-7B model stands out for its precise recognition capabilities across a range of documents, including medical forms, invoices, bills of sale, quotes, and medical records. This functionality is expected to be highly beneficial for hospitals, clinics, insurance companies, and other similar applications. Built on the solid foundation of the Qwen/Qwen2-VL-7B-Instruct, which we found to be of high quality and fluent in Vietnamese, EraX-VL-7B has been fine-tuned to enhance its performance. We plan to continue improving and releasing new versions for free, along with sharing performance benchmarks in the near future.
|
29 |
|
@@ -36,7 +36,7 @@ EraX-VL-7B-V1 is a young member of our EraX's LànhGPT repository of LLM models.
|
|
36 |
- **License:** apache-2.0
|
37 |
- **Finetuned from model:** Qwen/Qwen2-VL-7B-Instruct
|
38 |
|
39 |
-
|
40 |
Here we show a code snippet to show you how to use the chat model with `transformers` and `qwen_vl_utils`:
|
41 |
|
42 |
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1CnSxtWDLG48-NQh7wk9_z8WI7J4OY_Ci?usp=sharing)
|
@@ -155,12 +155,22 @@ Hình ảnh là một biểu đồ thể hiện mối quan hệ giữa chỉ s
|
|
155 |
|
156 |
|
157 |
## Citation
|
158 |
-
|
159 |
-
- title={EraX-VL-7B-V1: A Highly Efficient Multimodal LLM for Vietnamese, especially for medical forms and bills.},
|
160 |
- author={Nguyễn Anh Nguyên and Nguyễn Hồ Nam (BCG) and Dũng Hoàng and Thục Phạm and Nhật Phạm},
|
161 |
- helpers={Khang Đoàn and AAA JS Company},
|
162 |
- contact={nguyen@erax.ai},
|
163 |
-
- organization={EraX}
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
164 |
|
165 |
## References
|
166 |
[1] Yang, An, et al. "Qwen2 technical report." arXiv preprint arXiv:2407.10671 (2024).
|
|
|
23 |
<!-- ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/66e93d483745423cbb14c5ff/fNxjr3en_onzbOv0sghpE.jpeg) -->
|
24 |
|
25 |
# EraX-VL-7B-V1
|
26 |
+
## Introduction
|
27 |
|
28 |
We are excited to introduce the EraX-VL-7B-v1 model, a robust multimodal model for OCR (optical character recognition) and VQA (visual question-answering) that excels in various languages, with a particular focus on Vietnamese. The EraX-VL-7B model stands out for its precise recognition capabilities across a range of documents, including medical forms, invoices, bills of sale, quotes, and medical records. This functionality is expected to be highly beneficial for hospitals, clinics, insurance companies, and other similar applications. Built on the solid foundation of the Qwen/Qwen2-VL-7B-Instruct, which we found to be of high quality and fluent in Vietnamese, EraX-VL-7B has been fine-tuned to enhance its performance. We plan to continue improving and releasing new versions for free, along with sharing performance benchmarks in the near future.
|
29 |
|
|
|
36 |
- **License:** apache-2.0
|
37 |
- **Finetuned from model:** Qwen/Qwen2-VL-7B-Instruct
|
38 |
|
39 |
+
## Quickstart
|
40 |
Here we show a code snippet to show you how to use the chat model with `transformers` and `qwen_vl_utils`:
|
41 |
|
42 |
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1CnSxtWDLG48-NQh7wk9_z8WI7J4OY_Ci?usp=sharing)
|
|
|
155 |
|
156 |
|
157 |
## Citation
|
158 |
+
<!-- - title={EraX-VL-7B-V1: A Highly Efficient Multimodal LLM for Vietnamese, especially for medical forms and bills.},
|
|
|
159 |
- author={Nguyễn Anh Nguyên and Nguyễn Hồ Nam (BCG) and Dũng Hoàng and Thục Phạm and Nhật Phạm},
|
160 |
- helpers={Khang Đoàn and AAA JS Company},
|
161 |
- contact={nguyen@erax.ai},
|
162 |
+
- organization={EraX} -->
|
163 |
+
|
164 |
+
```
|
165 |
+
@article{EraX-VL-7B-V1,
|
166 |
+
title={EraX-VL-7B-V1: A Highly Efficient Multimodal LLM for Vietnamese, especially for medical forms and bills},
|
167 |
+
author={Nguyễn Anh Nguyên and Nguyễn Hồ Nam (BCG) and Dũng Hoàng and Thục Phạm and Nhật Phạm},
|
168 |
+
helpers={Khang Đoàn and AAA JS Company},
|
169 |
+
contact={nguyen@erax.ai},
|
170 |
+
organization={EraX}
|
171 |
+
year={2024}
|
172 |
+
}
|
173 |
+
```
|
174 |
|
175 |
## References
|
176 |
[1] Yang, An, et al. "Qwen2 technical report." arXiv preprint arXiv:2407.10671 (2024).
|