HV-Khurdula committed · Commit d84472f · verified · 1 Parent(s): e50c6a8

Update README.md

Files changed (1)
  1. README.md +3 -0
README.md CHANGED
@@ -16,6 +16,9 @@ tags:
 
 # Dua-Vision-Base
 
+
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f0cf1adcac1f99adbabb56/FZOLSnkBj_xPbaNQBqbU5.png)
+
 A Vision Encoder-Decoder model that doesn’t just caption images but generates questions and possible answers based on what it “sees.” Using ViT as the encoder and BART as the decoder, it’s built for image-based QA without the fluff.
 
 Translation: feed it an image, and get back a useful question-answer pair. Perfect for creating and synthesizing data in image QA tasks. It’s one model, two tasks, and a lot of potential!