Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations Paper • 2408.10920 • Published Aug 20, 2024
ScoNe: Benchmarking Negation Reasoning in Language Models With Fine-Tuning and In-Context Learning Paper • 2305.19426 • Published May 30, 2023
CEBaB: Estimating the Causal Effects of Real-World Concepts on NLP Model Behavior Paper • 2205.14140 • Published May 27, 2022
Rigorously Assessing Natural Language Explanations of Neurons Paper • 2309.10312 • Published Sep 19, 2023
Linear Representations of Sentiment in Large Language Models Paper • 2310.15154 • Published Oct 23, 2023
Neural Natural Language Inference Models Partially Embed Theories of Lexical Entailment and Negation Paper • 2004.14623 • Published Apr 30, 2020
A Reply to Makelov et al. (2023)'s "Interpretability Illusion" Arguments Paper • 2401.12631 • Published Jan 23, 2024
pyvene: A Library for Understanding and Improving PyTorch Models via Interventions Paper • 2403.07809 • Published Mar 12, 2024
Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations Paper • 2303.02536 • Published Mar 5, 2023
RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations Paper • 2402.17700 • Published Feb 27, 2024
Interpretability at Scale: Identifying Causal Mechanisms in Alpaca Paper • 2305.08809 • Published May 15, 2023