AIMv2 Collection A collection of AIMv2 vision encoders that support a range of resolutions, native-resolution inputs, and a distilled checkpoint. • 19 items • Updated about 9 hours ago • 6
Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state-of-the-art open post-training recipes. (See the dataset-loading sketch after this list.) • 32 items • Updated about 15 hours ago • 13
Tulu 3 Models Collection All models released with Tulu 3 -- state-of-the-art open post-training recipes. • 7 items • Updated about 15 hours ago • 10
BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices Paper • 2411.10640 • Published 6 days ago • 38
LLM2CLIP Collection LLM2CLIP pushes SOTA pretrained CLIP models even further. • 7 items • Updated 3 days ago • 37
Daily Picks in Interpretability & Analysis of LMs Collection Outstanding research in interpretability and evaluation of language models, summarized. • 83 items • Updated about 8 hours ago • 91
AgentInstruct: Toward Generative Teaching with Agentic Flows Paper • 2407.03502 • Published Jul 3 • 48
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5. (See the generation sketch after this list.) • 40 items • Updated 4 days ago • 227
Ichigo v0.3 Collection An experimental model family designed to train LLMs to understand sound natively. • 6 items • Updated 11 days ago • 17
llama.vim Collection Recommended models for the llama.vim plugin. • 3 items • Updated 4 days ago • 3
Article Recipe: Preparing Multilingual Speech Datasets for TTS Training • By PHBJT • 18 days ago • 14
AMD-OLMo Collection AMD-OLMo is a series of 1-billion-parameter language models trained by AMD on AMD Instinct™ MI250 GPUs, based on OLMo. • 4 items • Updated 21 days ago • 16
Lina-Speech: Gated Linear Attention is a Fast and Parameter-Efficient Learner for text-to-speech synthesis Paper • 2410.23320 • Published 23 days ago • 6
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 8 items • Updated 15 days ago • 95
LayerSkip Collection Models continually pretrained using LayerSkip - https://arxiv.org/abs/2404.16710 (see the self-speculative decoding sketch after this list). • 8 items • Updated about 18 hours ago • 43
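For the Tulu 3 Datasets collection above, a minimal loading sketch using the `datasets` library. The repo id and the `messages` column name are assumptions based on the collection's naming; check the collection page for the exact entries.

```python
from datasets import load_dataset

# Repo id is an assumption -- see the Tulu 3 Datasets collection for the
# exact dataset names.
ds = load_dataset("allenai/tulu-3-sft-mixture", split="train")

# Post-training mixtures are typically chat-formatted; the "messages"
# column name is likewise an assumption.
print(ds[0]["messages"])
```

For the Qwen2.5-Coder collection, a sketch of chat-style code generation with `transformers`. The checkpoint id is an assumption; substitute whichever size you pull from the collection.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Checkpoint id is an assumption -- any instruct model from the
# Qwen2.5-Coder collection should work the same way.
model_id = "Qwen/Qwen2.5-Coder-7B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Ask the code model for a small function via the chat template.
messages = [{"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```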
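Finally, for the LayerSkip collection, a sketch of the early-exit self-speculative decoding the paper describes: draft tokens from an early layer, then verify them with the full model. The checkpoint id, the exit layer, and the `assistant_early_exit` argument (exposed in recent `transformers` releases) are all assumptions; without that argument the checkpoint still loads as an ordinary causal LM.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Checkpoint id is an assumption based on the collection's naming scheme.
model_id = "facebook/layerskip-llama2-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("Early-exit decoding works by", return_tensors="pt").to(model.device)

# Self-speculate: draft at layer 4, verify with the full forward pass.
# `assistant_early_exit` requires a recent transformers release; drop it
# to fall back to plain autoregressive generation.
output = model.generate(**inputs, max_new_tokens=64, assistant_early_exit=4)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```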
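Because drafting and verification share one set of weights, this scheme avoids loading a separate draft model, which is the design choice that makes LayerSkip attractive for memory-constrained inference.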