Unifying Vision, Text, and Layout for Universal Document Processing Paper • 2212.02623 • Published Dec 5, 2022 • 10
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions Paper • 2308.09936 • Published Aug 19, 2023 • 1