Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models • arXiv:2402.16438 • Published Feb 26, 2024
AtP*: An efficient and scalable method for localizing LLM behaviour to components • arXiv:2403.00745 • Published Mar 1, 2024
Rethinking LLM Language Adaptation: A Case Study on Chinese Mixtral • arXiv:2403.01851 • Published Mar 4, 2024
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect • arXiv:2403.03853 • Published Mar 6, 2024
ROME: Memorization Insights from Text, Probability and Hidden State in Large Language Models • arXiv:2403.00510 • Published Mar 1, 2024
Large Language Models Struggle to Learn Long-Tail Knowledge • arXiv:2211.08411 • Published Nov 15, 2022
How Do Large Language Models Acquire Factual Knowledge During Pretraining? • arXiv:2406.11813 • Published Jun 17, 2024
Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models • arXiv:2406.12649 • Published Jun 18, 2024
Why Does the Effective Context Length of LLMs Fall Short? • arXiv:2410.18745 • Published Oct 24, 2024
Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse • arXiv:2410.21333 • Published Oct 28, 2024