SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training Paper • 2412.09619 • Published 26 days ago • 20
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases Paper • 2402.14905 • Published Feb 22, 2024 • 127
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 101
RoLoRA Collection [EMNLP2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization • 3 items • Updated Sep 26, 2024 • 3
RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization Paper • 2407.08044 • Published Jul 10, 2024 • 1
LLM-FP4: 4-Bit Floating-Point Quantized Transformers Paper • 2310.16836 • Published Oct 25, 2023 • 13
Llama 2: Open Foundation and Fine-Tuned Chat Models Paper • 2307.09288 • Published Jul 18, 2023 • 243