EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation Paper • 2411.08380 • Published 6 days ago • 23
Chain of Ideas: Revolutionizing Research in Novel Idea Development with LLM Agents Paper • 2410.13185 • Published Oct 17 • 6
LLaVA-Video Collection Models focus on video understanding (previously known as LLaVA-NeXT-Video). • 6 items • Updated Oct 5 • 53
LLaVA-Critic Collection as a general evaluator for assessing model performance • 6 items • Updated Oct 6 • 8
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss Paper • 2410.17243 • Published 28 days ago • 88
view article Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 By manu • Jul 5 • 160
SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages Paper • 2407.19672 • Published Jul 29 • 55