Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search Paper • 2412.18319 • Published 11 days ago • 33
Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives Paper • 2404.11317 • Published Apr 17, 2024 • 1
Improving the Consistency in Cross-Lingual Cross-Modal Retrieval with 1-to-K Contrastive Learning Paper • 2406.18254 • Published Jun 26, 2024 • 1
EasyRAG: Efficient Retrieval-Augmented Generation Framework for Automated Network Operations Paper • 2410.10315 • Published Oct 14, 2024 • 2
When Text Embedding Meets Large Language Model: A Comprehensive Survey Paper • 2412.09165 • Published 23 days ago • 1
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models Paper • 2410.09732 • Published Oct 13, 2024 • 54
GTE models Collection General Text Embedding Models Released by Tongyi Lab of Alibaba Group • 19 items • Updated 15 days ago • 19
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 16 items • Updated 22 days ago • 143
Article GaLore: Advancing Large Model Training on Consumer-grade Hardware Mar 20, 2024 • 26
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated May 6, 2024 • 91