alexchen4ai committed e91dda6 (1 parent: 48fbc9a)
Update README.md

README.md CHANGED
@@ -10,10 +10,10 @@ tags:
 
 ## Introduction
 
-Omnivision is a compact, sub-billion (968M) multimodal model for processing both visual and text inputs, optimized for edge devices.
+Omnivision is a compact, sub-billion (968M) multimodal model for processing both visual and text inputs, optimized for edge devices. Improved on LLaVA's architecture, it features:
 
 - **9x Token Reduction**: Reduces image tokens from 729 to 81, cutting latency and computational cost.
-- **
+- **Trustworthy result**: Reduces hallucinations using **DPO** training from trustworthy data.
 
 **Quick Links:**
 1. Interactive Demo in our [Hugging Face Space](https://huggingface.co/spaces/NexaAIDev/omnivlm-dpo-demo).
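
The "9x Token Reduction" bullet in this change gives only the counts (729 image tokens in, 81 out). As a rough illustration of that arithmetic, the sketch below folds each run of 9 consecutive token vectors into one wider vector. The grouping scheme, the `fold_tokens` helper, and the toy `hidden_size` are assumptions made for illustration; the diff does not say how the model performs the reduction.

```python
# Hypothetical sketch: fold every 9 consecutive image tokens into one wider token.
# The grouping scheme and hidden size are assumptions, not taken from the README.

def fold_tokens(tokens, group=9):
    """Concatenate each run of `group` token vectors into a single vector."""
    if len(tokens) % group != 0:
        raise ValueError("token count must be divisible by the group size")
    folded = []
    for start in range(0, len(tokens), group):
        merged = []
        for vec in tokens[start:start + group]:
            merged.extend(vec)  # widen the vector instead of discarding values
        folded.append(merged)
    return folded

hidden_size = 4  # toy value for illustration only
image_tokens = [[float(i)] * hidden_size for i in range(729)]
reduced = fold_tokens(image_tokens)

print(len(image_tokens), "->", len(reduced))  # 729 -> 81
print(len(image_tokens) // len(reduced))      # 9
print(len(reduced[0]))                        # 36
```

Under this assumed scheme the sequence gets 9 times shorter while each remaining token gets 9 times wider, so no values are dropped. The shorter sequence is what would reduce the work done by the language model downstream.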