HV-Khurdula
committed on
Update README.md
README.md
CHANGED
@@ -16,6 +16,9 @@ tags:
 
 # Dua-Vision-Base
 
+
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f0cf1adcac1f99adbabb56/FZOLSnkBj_xPbaNQBqbU5.png)
+
 A Vision Encoder-Decoder model that doesn’t just caption images but generates questions and possible answers based on what it “sees.” Using ViT as the encoder and BART as the decoder, it’s built for image-based QA without the fluff.
 
 Translation: feed it an image, and get back a useful question-answer pair. Perfect for creating and synthesizing data in image QA tasks. It’s one model, two tasks, and a lot of potential!