Adapting Large Language Models via Reading Comprehension Paper • 2309.09530 • Published Sep 18, 2023 • 77
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6, 2024 • 186
Qwen 2.5 Coder Collection Complete collection of Code-specific model series for Qwen2.5 in bnb 4bit, 16bit and GGUF formats. • 35 items • Updated 6 days ago • 27
Qwen QwQ-32B Collection Collection Qwen's reasoning models including QwQ (32B) & QVQ (72B) in formats: GGUF, dynamic 4-bit and 16-bit original versions. • 13 items • Updated 6 days ago • 3
Phi-4 (All Versions) Collection Microsoft's new Phi-4 models including mini in all formats. Includes GGUF, 4-bit bnb and original versions. Includes Unsloth's bug fixes. • 8 items • Updated 6 days ago • 46
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 6 days ago • 210
Unsloth 4-bit Dynamic Quants Collection Unsloths Dynamic 4bit Quants selectively skips quantizing certain parameters; greatly improving accuracy while only using <10% more VRAM than BnB 4bit • 26 items • Updated 4 days ago • 62
Personal Favorites Collection Recommended models I use often or like for any reason. I recommend reading their cards for more details. • 10 items • Updated Dec 24, 2024 • 81
🚂 SD-XL Training Suite Collection All the steps to train your own SD-XL custom model • 9 items • Updated Feb 14 • 22
abliterated-v3 Collection Latest gen of the abliterated models I've produced • 17 items • Updated Jun 3, 2024 • 115
Don't tell me no... Collection Models designed to provide fewer refusals. • 3 items • Updated Apr 19, 2024 • 4
Flavors of Flora Collection A collection of the different flavors of Flora. • 3 items • Updated Apr 19, 2024 • 2