Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion Paper • 2412.04424 • Published Dec 5, 2024 • 59
ColFlor: Towards BERT-Size Vision-Language Document Retrieval Models Article • By ahmed-masry • Oct 18, 2024 • 16