---
title: Activate Love
emoji: ❤️
colorFrom: purple
colorTo: red
sdk: gradio
sdk_version: 4.31.5
app_file: app.py
pinned: true
license: mit
short_description: Steering AI Text Generation
---
# Activate Love ❤️

A [Gradio app][gradio-url] that replicates results from the paper [»Activation Addition: Steering Language Models Without Optimization«][paper-url], hosted on a [Hugging Face Space][hugging-face-spaces-url].

## Demo

Check it out at https://huggingface.co/spaces/janraasch/activate-love 🎯.
## Raison d'être

This is my final project for the [AI Safety Fundamentals][ai-safety-fundamentals-url] course on [AI Alignment][ai-safety-fundamentals-alignment-url].

When we covered *Mechanistic Interpretability* in session six, my cohort's instructor mentioned [the paper on activation addition][paper-url], published in late 2023. I found activation addition to be an enjoyable and interesting way to play around with the inner workings of a model without any training or optimization.

The authors kindly provide [a notebook on Google Colab][notebook-url] so that everyone can replicate their results. Still, I felt it would be useful to offer an even more user-friendly, non-technical interface that lowers the barrier to interacting with these low-level workings of the model.

Hence this app, https://huggingface.co/spaces/janraasch/activate-love, exists so that *everyone* can steer and play with [GPT-2 XL][gpt2-xl-url].
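
For readers who want a feel for what "steering without optimization" means, here is a minimal sketch of the arithmetic behind activation addition as described in the paper: take the activations of two contrasting prompts at some layer, scale their difference by a coefficient, and add the result to the first token positions of the user prompt's activations at that same layer. Everything in the block is illustrative and assumed: the function names `steering_vector` and `add_steering`, the plain-list representation, and the toy numbers are hypothetical and are not taken from the paper's notebook or from this app's `app.py`. In a real setup the activations would come from a forward pass through GPT-2 XL, and the contrasting prompts would be padded to the same token length.

```python
from typing import List

Vector = List[float]
Activations = List[Vector]  # one activation vector per token position


def steering_vector(
    positive: Activations, negative: Activations, coefficient: float
) -> Activations:
    """Return coefficient * (positive - negative), position by position.

    Assumes both prompts were already padded to the same number of tokens.
    """
    if len(positive) != len(negative):
        raise ValueError("prompts must have the same number of token positions")
    return [
        [coefficient * (p - n) for p, n in zip(pos_vec, neg_vec)]
        for pos_vec, neg_vec in zip(positive, negative)
    ]


def add_steering(residual: Activations, steering: Activations) -> Activations:
    """Add the steering vectors to the first len(steering) token positions.

    Positions beyond the steering prompt are left untouched. The input is
    not modified; a new nested list is returned.
    """
    steered = [vec[:] for vec in residual]
    for i, steer_vec in enumerate(steering[: len(steered)]):
        steered[i] = [r + s for r, s in zip(steered[i], steer_vec)]
    return steered


if __name__ == "__main__":
    # Toy 2-dimensional "activations" for single-token prompts (made-up numbers).
    love = [[1.0, 2.0]]
    hate = [[0.0, 4.0]]
    prompt = [[1.0, 1.0], [2.0, 2.0]]  # two token positions

    steer = steering_vector(love, hate, coefficient=5.0)
    print(steer)                        # [[5.0, -10.0]]
    print(add_steering(prompt, steer))  # [[6.0, -9.0], [2.0, 2.0]]
```

The model's weights never change in this scheme. Only the intermediate activations of a single forward pass are shifted, which is why no training or optimization is needed.
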
## Development

```bash
# Create virtual environment
python3 -m venv gradio-env
source gradio-env/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run app locally
gradio app.py
```
## License

[MIT License](https://en.wikipedia.org/wiki/MIT_License) © [Jan Raasch](https://www.janraasch.com)

[ai-safety-fundamentals-alignment-url]: https://aisafetyfundamentals.com/alignment
[ai-safety-fundamentals-url]: https://aisafetyfundamentals.com
[gpt2-xl-url]: https://huggingface.co/openai-community/gpt2-xl
[gradio-url]: https://www.gradio.app
[hugging-face-spaces-url]: https://huggingface.co/spaces/launch
[paper-url]: https://arxiv.org/abs/2308.10248
[notebook-url]: http://tinyurl.com/actadd