---
license: cc-by-4.0
library_name: saelens
---

⚠️ WARNING: This repo has some minor labelling issues, and some SAEs appear twice.

# 1. Gemma Scope

Gemma Scope is a comprehensive, open suite of sparse autoencoders (SAEs) for Gemma 2 9B and 2B. Sparse autoencoders are a "microscope" of sorts that can help us break down a model’s internal activations into the underlying concepts, just as biologists use microscopes to study the individual cells of plants and animals.

See our [landing page](https://huggingface.co/google/gemma-scope) for details on the whole suite. This repo contains one specific set of SAEs from that suite.

# 2. What Is `gemma-scope-2b-pt-res`?

- `gemma-scope-`: See section 1.
- `2b-pt-`: These SAEs were trained on the Gemma 2 2B base model.
- `res`: These SAEs were trained on the model's residual stream.
- We also include experimental SAEs trained on token embeddings, in the `./embedding` folder.


# 3. Which SAE is in the [Neuronpedia demo](https://www.neuronpedia.org/gemma-scope)?

https://huggingface.co/google/gemma-scope-2b-pt-res/tree/main/layer_20/width_16k/average_l0_71
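
The path above appears to encode the layer, the SAE width, and the average L0 of the SAE. As a minimal sketch, assuming other folders in this repo follow the same `layer_<n>/width_<w>/average_l0_<k>` pattern (an inference from this one path, not a documented guarantee), the components can be pulled out as follows. `SaePath` and `parse_sae_path` are hypothetical helper names, not part of SAELens.

```python
import re
from typing import NamedTuple


class SaePath(NamedTuple):
    """Components of an SAE folder path (hypothetical helper type)."""
    layer: int
    width: str
    average_l0: int


# Assumed naming pattern, inferred from the demo path above.
_SAE_PATH_PATTERN = re.compile(r"layer_(\d+)/width_(\w+)/average_l0_(\d+)")


def parse_sae_path(path: str) -> SaePath:
    """Extract layer, width and average L0 from a path or URL."""
    match = _SAE_PATH_PATTERN.search(path)
    if match is None:
        raise ValueError(f"Path does not match the assumed pattern: {path!r}")
    return SaePath(
        layer=int(match.group(1)),
        width=match.group(2),
        average_l0=int(match.group(3)),
    )


demo = parse_sae_path(
    "https://huggingface.co/google/gemma-scope-2b-pt-res/tree/main/"
    "layer_20/width_16k/average_l0_71"
)
print(demo)  # SaePath(layer=20, width='16k', average_l0=71)
```

IDs that end in `canonical` rather than `average_l0_<k>` do not match this pattern, so the helper raises `ValueError` for them.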

See also section 4.

# 4. How can I use these SAEs straight away?

```python
from sae_lens import SAE  # pip install sae-lens

# Load a pretrained SAE by release name and SAE ID.
sae, cfg_dict, sparsity = SAE.from_pretrained(
    release="gemma-scope-2b-pt-res-canonical",
    sae_id="layer_0/width_16k/canonical",
)
```

This uses **canonical** SAEs, those with average L0 closest to 100, which we expect to be reasonably useful for most tasks. The exact set of canonical SAEs is determined by this file in the SAELens repo, snapshotted on 22nd October 2024: https://github.com/jbloomAus/SAELens/blob/a470460/sae_lens/pretrained_saes.yaml#L1667
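
To illustrate the "average L0 closest to 100" rule, here is a minimal sketch of how one might pick a canonical SAE from a list of IDs. It is only an illustration: the authoritative mapping is the YAML file linked above, `pick_canonical` is a hypothetical helper (not a SAELens function), and the candidate IDs use made-up L0 values rather than a listing of this repo.

```python
def average_l0_of(sae_id: str) -> int:
    """Read the trailing average L0 from an ID such as '.../average_l0_71'."""
    marker = "average_l0_"
    if marker not in sae_id:
        raise ValueError(f"No average L0 found in SAE ID: {sae_id!r}")
    return int(sae_id.rsplit(marker, 1)[1])


def pick_canonical(sae_ids: list[str], target_l0: int = 100) -> str:
    """Return the ID whose average L0 is closest to target_l0.

    Ties resolve to the earliest ID in the list; that tie-break is an
    assumption of this sketch, not something the source specifies.
    """
    return min(sae_ids, key=lambda sae_id: abs(average_l0_of(sae_id) - target_l0))


# Made-up L0 values for illustration only; not the actual repo contents.
candidates = [
    "layer_0/width_16k/average_l0_20",
    "layer_0/width_16k/average_l0_90",
    "layer_0/width_16k/average_l0_250",
]

chosen = pick_canonical(candidates)
print(chosen)  # layer_0/width_16k/average_l0_90
```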

See https://github.com/jbloomAus/SAELens for more details on this library.

# 5. Point of Contact

Point of contact: Arthur Conmy

Contact by email:

```python
''.join(list('moc.elgoog@ymnoc')[::-1])
```

HuggingFace account:
https://huggingface.co/ArthurConmyGDM

# 6. Citation

Paper: https://arxiv.org/abs/2408.05147