Quantization, Pruning, Distillation
- EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
  Paper • 2407.11062 • Published • 8
- PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs
  Paper • 2410.05265 • Published • 29
- OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models
  Paper • 2308.13137 • Published • 17