DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations Paper • 2410.18860 • Published Oct 24, 2024 • 9
Analysing the Residual Stream of Language Models Under Knowledge Conflicts Paper • 2410.16090 • Published Oct 21, 2024 • 7
Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering Paper • 2410.15999 • Published Oct 21, 2024 • 19
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 Paper • 2408.05147 • Published Aug 9, 2024 • 38
Article Improving Hugging Face Training Efficiency Through Packing with Flash Attention Aug 21, 2024 • 25
🔍 Daily Picks in Interpretability & Analysis of LMs Collection Outstanding research in interpretability and evaluation of language models, summarized • 93 items • Updated about 15 hours ago • 95
Article Introducing RWKV — An RNN with the advantages of a transformer May 15, 2023 • 14
Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation Paper • 2406.13663 • Published Jun 19, 2024 • 7
A Simple and Effective L_2 Norm-Based Strategy for KV Cache Compression Paper • 2406.11430 • Published Jun 17, 2024 • 22
Article The Hallucinations Leaderboard, an Open Effort to Measure Hallucinations in Large Language Models Jan 29, 2024 • 17