Neel Nanda

NeelNanda

https://neelnanda.io

AI & ML interests

Mechanistic Interpretability

Recent Activity

authored a paper about 1 month ago

Open Problems in Mechanistic Interpretability

authored a paper 4 months ago

Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models

updated a model 4 months ago

NeelNanda/crosscoders-gpt2-small

NeelNanda's activity

upvoted a paper about 1 year ago

AtP*: An efficient and scalable method for localizing LLM behaviour to components

Paper • 2403.00745 • Published Mar 1, 2024 • 14

upvoted a paper over 1 year ago

Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla

Paper • 2307.09458 • Published Jul 18, 2023 • 11