Introducing ConTextual: How well can your Multimodal model jointly reason over text and image in text-rich scenes? Mar 5, 2024 • 4
hbXNov/llama3.1-8b_train_gpt_4o_verifications_e3_lr5e-7-add-special-true-len3072-19233-merged Updated 22 days ago • 6
hbXNov/llama3.1-8b_train_gpt_4o_verifications_e3_lr5e-7-add-special-true-31389-merged Updated 23 days ago • 8