Should We Really Edit Language Models? On the Evaluation of Edited Language Models Paper • 2410.18785 • Published Oct 24 • 5
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads Paper • 2410.10819 • Published Oct 14 • 6
PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs Paper • 2410.05265 • Published Oct 7 • 29
view article Article LLM Data Engineering 3——Data Collection Magic: Acquiring Top Training Data By JessyTsu1 • Jun 4 • 4
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated Aug 18 • 203
Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for Large Language Models Paper • 2406.02924 • Published Jun 5 • 2
view article Article Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA May 24, 2023 • 96
Shortened LLaMA: A Simple Depth Pruning for Large Language Models Paper • 2402.02834 • Published Feb 5 • 14
Model Merging Collection Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it! • 30 items • Updated Jun 12 • 217