Adapting Large Language Models via Reading Comprehension Paper • 2309.09530 • Published Sep 18, 2023 • 77
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6, 2024 • 185
Qwen 2.5 Coder Collection Complete collection of the code-specific Qwen2.5 model series in bnb 4-bit, 16-bit and GGUF formats. • 35 items • Updated 14 days ago • 25
Qwen QVQ + QwQ Collection Collection Qwen's reasoning models including QVQ (72B) & QwQ (32B) in formats: GGUF, 4-bit bnb and 16-bit original versions. • 6 items • Updated 6 days ago • 2
Phi-4 (All Versions) Collection Microsoft's new Phi-4 model in all formats. Includes GGUF, 4-bit bnb and original versions. Includes Unsloth's bug fixes. • 4 items • Updated 14 days ago • 41
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful open-source reasoning model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 10 days ago • 184
Unsloth 4-bit Dynamic Quants Collection Unsloth's Dynamic 4-bit Quants selectively skip quantizing certain parameters, greatly improving accuracy while only using <10% more VRAM than BnB 4-bit • 22 items • Updated 2 days ago • 46
Personal Favorites Collection Recommended models I use often or like for any reason. I recommend reading their cards for more details. • 10 items • Updated Dec 24, 2024 • 80
🚂 SD-XL Training Suite Collection All the steps to train your own SD-XL custom model • 9 items • Updated 4 days ago • 21
abliterated-v3 Collection Latest gen of the abliterated models I've produced • 17 items • Updated Jun 3, 2024 • 111
Don't tell me no... Collection Models designed to provide fewer refusals. • 3 items • Updated Apr 19, 2024 • 4
Flavors of Flora Collection A collection of the different flavors of Flora. • 3 items • Updated Apr 19, 2024 • 2