- LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing
  Paper • 2311.00571 • Published • 41
- LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
  Paper • 2311.05437 • Published • 48
- Ziya-VL: Bilingual Large Vision-Language Model via Multi-Task Instruction Tuning
  Paper • 2310.08166 • Published • 1
- Reformulating Vision-Language Foundation Models and Datasets Towards Universal Multimodal Assistants
  Paper • 2310.00653 • Published • 3
Joy Rimchala
joytafty
AI & ML interests
NER
Organizations
Collections 3
models 3
datasets 6
joytafty/icdar2023vqabd-small-tables-val
Viewer • Updated • 19 • 30
joytafty/icdar2023vqabd-small-tables-train
Viewer • Updated • 244 • 31
joytafty/denoising-dirty-documents-test
Viewer • Updated • 72 • 35
joytafty/denoising-dirty-documents-train
Viewer • Updated • 144 • 39
joytafty/denoising-dirty-documents-trained_cleaned
Viewer • Updated • 144 • 35
joytafty/denoising-dirty-documents-cleaned
Viewer • Updated • 144 • 33