---
title: Activate Love
emoji: ❤️
colorFrom: purple
colorTo: red
sdk: gradio
sdk_version: 4.31.5
app_file: app.py
pinned: true
license: mit
short_description: Steering AI Text Generation
---

# Activate Love ❤️

A [Gradio App][gradio-url] replicating results of the paper [»Activation Addition: Steering Language Models Without Optimization«][paper-url] on a [Hugging Face Space][hugging-face-spaces-url].

## Demo

Check it out at https://huggingface.co/spaces/janraasch/activate-love 🎯.

## Raison d'être

This is my final project for the [AI Safety Fundamentals][ai-safety-fundamentals-url] course on [AI Alignment][ai-safety-fundamentals-alignment-url]. When we covered *Mechanistic Interpretability* in session six, my cohort's instructor mentioned [the paper on activation addition][paper-url], published in late 2023. I found it an enjoyable and interesting way to play around with the inner workings of a model without any training or optimization.

The authors kindly provide [a notebook on Google Colab][notebook-url] so that everyone can replicate their results. Still, I felt it would be useful to offer an even more user-friendly, non-technical interface that lowers the barrier to interacting with these low-level workings of the model. Hence this app exists at https://huggingface.co/spaces/janraasch/activate-love, so that *everyone* may steer and play with [GPT-2 XL][gpt2-xl-url].

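For readers who want a feel for what "activation addition" means before opening the app, here is a minimal toy sketch of the idea. It is not the paper's code and not this app's implementation. The helper names (`toy_activations`, `steering_vectors`, `add_steering`), the hidden size, and the hand-made activation values are all hypothetical stand-ins. A real run would read and modify GPT-2 XL's residual stream at a chosen layer. The sketch only shows the arithmetic: take the difference between the activations of a contrasting prompt pair, scale it by a coefficient, and add it to the first positions of the user's prompt.

```python
# Toy illustration of activation addition. Everything here is a hypothetical
# stand-in: a real run would use GPT-2 XL's residual stream at one layer
# instead of these hand-made vectors.

DIM = 4  # toy hidden size (assumption, far smaller than any real model)


def toy_activations(tokens):
    """Return one DIM-sized vector per token (stand-in for a layer's activations)."""
    acts = []
    for tok in tokens:
        seed = sum(ord(ch) for ch in tok)  # deterministic, no randomness
        acts.append([((seed * (i + 1)) % 7) - 3.0 for i in range(DIM)])
    return acts


def steering_vectors(plus_tokens, minus_tokens, coeff):
    """Compute coeff * (activations(plus) - activations(minus)) per position."""
    plus = toy_activations(plus_tokens)
    minus = toy_activations(minus_tokens)
    if len(plus) != len(minus):
        # This sketch simply requires equal lengths for the prompt pair.
        raise ValueError("prompt pair must have the same number of tokens")
    return [
        [coeff * (p - m) for p, m in zip(p_vec, m_vec)]
        for p_vec, m_vec in zip(plus, minus)
    ]


def add_steering(prompt_acts, steer):
    """Add the steering vectors to the first len(steer) positions of the prompt."""
    steered = [vec[:] for vec in prompt_acts]  # copy so the input stays untouched
    for pos, s_vec in enumerate(steer[: len(steered)]):
        steered[pos] = [a + s for a, s in zip(steered[pos], s_vec)]
    return steered


# Steer a prompt towards "Love" and away from "Hate" (coefficient is illustrative).
prompt = ["I", "hate", "you", "because"]
steer = steering_vectors(["Love"], ["Hate"], coeff=5.0)
original = toy_activations(prompt)
steered = add_steering(original, steer)
```

Only the leading position changes here because the steering pair is one token long. The remaining positions keep their original activations, and no weights are updated at any point.
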
## Development

```bash
# Create virtual environment
python3 -m venv gradio-env
source gradio-env/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run app locally
gradio app.py
```

## License

[MIT License](https://en.wikipedia.org/wiki/MIT_License) © [Jan Raasch](https://www.janraasch.com)

[ai-safety-fundamentals-alignment-url]: https://aisafetyfundamentals.com/alignment
[ai-safety-fundamentals-url]: https://aisafetyfundamentals.com
[gpt2-xl-url]: https://huggingface.co/openai-community/gpt2-xl
[gradio-url]: https://www.gradio.app
[hugging-face-spaces-url]: https://huggingface.co/spaces/launch
[paper-url]: https://arxiv.org/abs/2308.10248
[notebook-url]: http://tinyurl.com/actadd