OAS before finetuning?

#1
by zypcastles - opened

Is the model finetuned before orthogonalization, or vice versa? Is orthogonalization before finetuning more reasonable?

NeverSleep org

It was done after (so on top of our finetuning).
I think it's better to do it after than before, but I didn't have two otherwise-identical models with OAS before and after, so I can't be 100% sure. Still, using the same dataset and the same script with the same config on Lumimaid 8B and Llama3-8B-Instruct gives me two different results: Lumimaid gets more uncensored than the base, and I suppose that's because we already uncensored it a ton with our training data, since it's based on ERP haha
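For readers unfamiliar with the technique being discussed: orthogonalization (OAS/abliteration) removes a "refusal direction" from the model's weights so their outputs can no longer write along that direction. This is a minimal NumPy sketch of the core projection step, not the actual script used for Lumimaid; in practice the refusal direction `r` is estimated from mean activation differences between harmful and harmless prompts, and the projection is applied to every matrix that writes into the residual stream.

```python
import numpy as np

def orthogonalize(W: np.ndarray, r: np.ndarray) -> np.ndarray:
    """Project the refusal direction r out of the output space of W.

    W: weight matrix of shape (d_out, d_in) that writes into the
       residual stream (e.g. an MLP down-projection).
    r: refusal direction of length d_out (need not be normalized).
    Returns (I - r_hat r_hat^T) @ W, whose outputs are orthogonal to r.
    """
    r_hat = r / np.linalg.norm(r)
    return W - np.outer(r_hat, r_hat) @ W

# Toy check: after orthogonalization, no input can produce an output
# with a component along the refusal direction.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))        # hypothetical small weight matrix
r = rng.normal(size=8)             # hypothetical refusal direction
W_oas = orthogonalize(W, r)
x = rng.normal(size=4)
print(abs((r / np.linalg.norm(r)) @ (W_oas @ x)))  # ~0.0
```

The same projection is rank-1, so it barely changes the matrix overall, which is why an abliterated model keeps most of its capabilities; the before-vs-after-finetuning question discussed in this thread is about how training interacts with that edit.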

Thanks for the explanation! However, my wild guess is that the gradients will focus more on the contradictions between the model's norms and the finetuning dataset, such as the refusals it still produces, so learning the roleplay knowledge would be disturbed even though the refusals get suppressed. Doing OAS before finetuning may help the model focus on improvement during finetuning. Actually, I found that all previous "uncensored" models degraded from the originals in capabilities like logic.

Another wild guess is that OAS is like surgery, and the LLM needs recovery after surgery. Llama-3 70B with OAS alone is quite a capable model to start with. However, I really don't know whether it introduces side effects for finetuning, as I don't have large GPUs to test with; I hope someone can test it someday.
