LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models Paper • 2310.08659 • Published Oct 12, 2023 • 24
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations Paper • 2405.18392 • Published May 28 • 12
BitNet: Scaling 1-bit Transformers for Large Language Models Paper • 2310.11453 • Published Oct 17, 2023 • 96
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated 16 days ago • 695
view article Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Apr 15 • 170
view article Article Overview of natively supported quantization schemes in 🤗 Transformers Sep 12, 2023 • 11
view article Article A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes Aug 17, 2022 • 66
view article Article Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA May 24, 2023 • 100
view article Article Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval Mar 22 • 69