Qwen2.5-VL • Collection • Vision-language model series based on Qwen2.5 • 3 items • Updated 18 days ago • 340
limuyu011/mc-sft-llava_next_8b-mcqa_v3_12_25_277k-12_13-A100-c8-e1-b8-a1 • Image-Text-to-Text • Updated Dec 16, 2024 • 18
limuyu011/mc-vsft-sft-llama3_llava_next_8b-mcvqa_v4_11_21_80k-12-15-A100-c8-e1-b4-a4 • Image-Text-to-Text • Updated Dec 16, 2024 • 1
ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting • Paper • 2410.17856 • Published Oct 23, 2024 • 49