Update README.md
README.md
CHANGED
@@ -10,7 +10,7 @@ base_model:
 
 Official code and weights for the Paper [**Scar: Sparse Conditioned Autoencoders for Concept Detection and Steering in LLMs**](https://arxiv.org/abs/2411.07122). The code is located in this [Repository](https://github.com/ml-research/SCAR).
 
-This repo contains the code to apply supervised SAEs
+This repo contains the code to apply supervised SAEs to LLMs. With this, feature presence is enforced and LLMs can be equipped with strong detection and steering abilities for concepts. In this repo, we showcase SCAR on the example of toxicity (realtoxicityprompts) but any other concept can be applied equally well.
 
 # Usage
 