π_0: A Vision-Language-Action Flow Model for General Robot Control Paper • 2410.24164 • Published Oct 31, 2024 • 6
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations Paper • 2412.07626 • Published Dec 10, 2024 • 22
Cosmos Tokenizer Collection A suite of image and video tokenizers • 13 items • Updated about 21 hours ago • 40
AMD-OLMo Collection AMD-OLMo is a series of 1-billion-parameter language models trained by AMD on AMD Instinct™ MI250 GPUs, based on OLMo. • 4 items • Updated Oct 31, 2024 • 18
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated Feb 20 • 251
C4AI Aya Expanse Collection Aya Expanse is an open-weight research release of a model with highly advanced multilingual capabilities. • 4 items • Updated 25 days ago • 38
Article Training and Finetuning Embedding Models with Sentence Transformers v3 May 28, 2024 • 201
Awesome Document AI Collection A collection of open-source document AI • 27 items • Updated Mar 11, 2024 • 80
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding Paper • 2407.12594 • Published Jul 17, 2024 • 19
Article Llama can now see and run on your device - welcome Llama 3.2 Sep 25, 2024 • 185