Collection of LoRA files for my finetune. The training pipeline was: first, DPO on adamo1139/rawrr_v2-2_stage1 for 1 epoch; then SFT on adamo1139/AEZAKMI_v3-7 for 0.5 epochs; finally, ORPO on adamo1139/toxic-dpo-natural-v5 for 1 epoch.
I like the resulting model so far; it feels very natural and uncensored. The ORPO training turned out really nicely and was also super quick. Exciting for future models for sure!
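
For reference, here is a minimal sketch of what the final ORPO stage could look like using TRL's `ORPOTrainer` with a PEFT LoRA adapter. This is an assumption about tooling, not the exact script used; the base checkpoint name, LoRA rank, batch size, learning rate, and `beta` below are placeholders, and the dataset may need to be mapped into the prompt/chosen/rejected format ORPO expects.

```python
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

# Hypothetical checkpoint produced by the earlier DPO and SFT stages
model_name = "base-model-after-dpo-and-sft"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# ORPO trains on preference pairs (prompt, chosen, rejected);
# a mapping step may be needed depending on the dataset's column names
dataset = load_dataset("adamo1139/toxic-dpo-natural-v5", split="train")

# LoRA adapter config; rank and target modules are placeholder values
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

training_args = ORPOConfig(
    output_dir="orpo-lora-out",
    num_train_epochs=1,              # matches the single ORPO epoch described above
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-5,              # placeholder
    beta=0.1,                        # ORPO odds-ratio loss weight; placeholder
)

trainer = ORPOTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,             # newer TRL versions use processing_class=
    peft_config=peft_config,
)
trainer.train()
trainer.save_model("orpo-lora-out")
```

Since only the LoRA adapter weights are trained, the output directory holds adapter files like the ones in this collection rather than a full model checkpoint.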