Steered Llama-v2-7b towards Effective Arguments for Liberal Readers
This is the steered Llama-v2-7b-chat-hf model.
We used the processed debateorg dataset to create the steering vectors:
- We first extracted the hidden states of effective and ineffective arguments.
- For each of layers 18-20, we calculated the median of the hidden vectors for each group.
- We subtracted the median of the effective arguments from the median of the ineffective arguments.
- We added the result to the activations of the corresponding layer.
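
The steps above can be sketched in plain Python on toy data. This is a minimal illustration, not the code used to build this model. It assumes that "18-20" is an inclusive range, that each argument is already reduced to one hidden vector per layer, and that the median is taken per hidden dimension. The function names (`elementwise_median`, `steering_vector`, `apply_steering`) and the toy values are hypothetical. The subtraction order follows the wording of the list literally.

```python
from statistics import median

# Assumption: "18-20" in the description is an inclusive layer range.
STEERED_LAYERS = (18, 19, 20)


def elementwise_median(vectors):
    """Median across examples, computed separately for each hidden dimension."""
    return [median(dim_values) for dim_values in zip(*vectors)]


def steering_vector(effective, ineffective):
    """Median of ineffective minus median of effective, as the list words it."""
    med_eff = elementwise_median(effective)
    med_ineff = elementwise_median(ineffective)
    return [i - e for e, i in zip(med_eff, med_ineff)]


def apply_steering(activation, vector):
    """Add the steering vector to one activation vector of the same layer."""
    return [a + v for a, v in zip(activation, vector)]


# Toy hidden states (hypothetical): 3 arguments per group, hidden size 2.
effective = {
    layer: [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]] for layer in STEERED_LAYERS
}
ineffective = {
    layer: [[0.0, 1.0], [1.0, 1.0], [2.0, 5.0]] for layer in STEERED_LAYERS
}

# One steering vector per steered layer.
vectors = {
    layer: steering_vector(effective[layer], ineffective[layer])
    for layer in STEERED_LAYERS
}

# Apply the layer-18 vector to a toy activation from that layer.
steered = apply_steering([10.0, 10.0], vectors[18])
```

With a real model, the same arithmetic would typically run on tensors, with the addition applied to each steered layer's output during the forward pass.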