Collection of LoRA files for my finetune. The training pipeline was: first, DPO on adamo1139/rawrr_v2-2_stage1 for 1 epoch; then SFT on adamo1139/AEZAKMI_v3-7 for 0.5 epochs; finally, ORPO on adamo1139/toxic-dpo-natural-v5 for 1 epoch.
I like the resulting model so far; it feels very natural and uncensored. The ORPO training turned out really nicely and was also super quick. Exciting for future models for sure!
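
For reference, here is a minimal sketch of what the final ORPO stage could look like using TRL's `ORPOTrainer` with a PEFT LoRA adapter. This is an assumption about tooling, not the exact script used; the base checkpoint name, LoRA rank, batch size, learning rate, and `beta` below are placeholders, and the dataset may need to be mapped into the prompt/chosen/rejected format ORPO expects.

```python
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

# Hypothetical checkpoint produced by the earlier DPO and SFT stages
model_name = "base-model-after-dpo-and-sft"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# ORPO trains on preference pairs (prompt, chosen, rejected);
# a mapping step may be needed depending on the dataset's column names
dataset = load_dataset("adamo1139/toxic-dpo-natural-v5", split="train")

# LoRA adapter config; rank and target modules are placeholder values
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

training_args = ORPOConfig(
    output_dir="orpo-lora-out",
    num_train_epochs=1,              # matches the single ORPO epoch described above
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-5,              # placeholder
    beta=0.1,                        # ORPO odds-ratio loss weight; placeholder
)

trainer = ORPOTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,             # newer TRL versions use processing_class=
    peft_config=peft_config,
)
trainer.train()
trainer.save_model("orpo-lora-out")
```

Since only the LoRA adapter weights are trained, the output directory holds adapter files like the ones in this collection rather than a full model checkpoint.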