Gabriele Sarti

gsarti · https://gsarti.com

AI & ML interests

Interpretability for generative language models

Recent Activity

liked a model 2 days ago

Butanium/gemma-2-2b-crosscoder-l13-mu4.1e-02-lr1e-04

updated a dataset 2 days ago

gsarti/qe4pe

commented a paper 2 days ago

Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models

gsarti's activity

upvoted 2 papers 2 days ago

Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models

Paper • 2411.14257 • Published 3 days ago • 8

Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models

Paper • 2411.12580 • Published 5 days ago • 2

upvoted a paper 3 days ago

Controllable Context Sensitivity and the Knob Behind It

Paper • 2411.07404 • Published 13 days ago • 1

upvoted 2 papers 7 days ago

Features that Make a Difference: Leveraging Gradients for Improved Dictionary Learning

Paper • 2411.10397 • Published 9 days ago • 1

Counterfactual Generation from Language Models

Paper • 2411.07180 • Published 13 days ago • 5

upvoted a collection 24 days ago

SmolLM2

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 10 items • Updated 3 days ago • 177

upvoted a paper 25 days ago

The Geometry of Concepts: Sparse Autoencoder Feature Structure

Paper • 2410.19750 • Published Oct 10 • 1

upvoted 2 papers 26 days ago

Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders

Paper • 2410.20526 • Published 28 days ago • 1

Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics

Paper • 2410.21272 • Published 27 days ago • 1

upvoted a paper 27 days ago

Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering

Paper • 2410.15999 • Published Oct 21 • 19

upvoted 3 papers about 1 month ago

Automatically Interpreting Millions of Features in Large Language Models

Paper • 2410.13928 • Published Oct 17 • 1

Decomposing The Dark Matter of Sparse Autoencoders

Paper • 2410.14670 • Published Oct 18 • 1

How Do Multilingual Models Remember? Investigating Multilingual Factual Recall Mechanisms

Paper • 2410.14387 • Published Oct 18 • 1

upvoted 4 papers about 2 months ago

Towards Interpreting Visual Information Processing in Vision-Language Models

Paper • 2410.07149 • Published Oct 9 • 1

What Matters for Model Merging at Scale?

Paper • 2410.03617 • Published Oct 4 • 8

Geometric Signatures of Compositionality Across a Language Model's Lifetime

Paper • 2410.01444 • Published Oct 2 • 1

Instruction Following without Instruction Tuning

Paper • 2409.14254 • Published Sep 21 • 27

upvoted a collection 2 months ago

ITA-Bench: Italian Benchmarks for LLMs

A collection of Italian benchmarks for Large Language Models. See also our Github repo: https://github.com/SapienzaNLP/ita-bench • 19 items • Updated Sep 23 • 6

upvoted a paper 2 months ago

A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders

Paper • 2409.14507 • Published Sep 22 • 1

upvoted an article 3 months ago

Selective fine-tuning of Language Models with Spectrum

Article • Sep 3 • 29