---
license: other
license_name: other
license_link: LICENSE
datasets:
- adamo1139/rawrr_v2
- adamo1139/AEZAKMI_v3-6
- unalignment/toxic-dpo-v0.1
---

## Model description

Yi-34B 200K XLCTX base model fine-tuned on the RAWrr_v2 (DPO), AEZAKMI-3-6 (SFT) and unalignment/toxic-dpo-0.1 (DPO) datasets. Training took around 20-30 hours total on a single RTX 3090 Ti; all fine-tuning was done locally.

It's like airoboros but with less gptslop, no refusals and less of the typical language used by RLHFed OpenAI models, with extra spiciness.

Say goodbye to "It's important to remember"!

Prompt format is standard ChatML. Don't expect it to be good at math or riddles, or to be crazy smart. My end goal with AEZAKMI is to create a cozy free chatbot.

The cost of this fine-tune is about $5-$10 in electricity.

The base model used for fine-tuning was the Yi-34B-200K model shared by 01.ai, specifically the newer version with improved long-context needle-in-a-haystack retrieval. They didn't give it a new name, and adding a version number would mess up the AEZAKMI naming scheme with a second number, so I will be calling it XLCTX.
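For the curious, here's a rough sketch of what the unsloth side of a fine-tune like this can look like. The path and LoRA hyperparameters below are illustrative placeholders, not my exact training script (which I can upload on request, as mentioned below):

```python
# Illustrative sketch only - placeholder path and LoRA hyperparameters,
# not the exact training script.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="models/Yi-34B-200K-XLCTX",  # hypothetical local path to the base model
    max_seq_length=4096,                    # lowered context used during training (see below)
    load_in_4bit=True,                      # 4-bit base so a 34B model fits on 24 GB
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,                                   # placeholder LoRA rank
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
# The SFT and DPO stages then run on top of this, e.g. with trl's
# SFTTrainer and DPOTrainer, using the datasets listed above.
```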

I had to lower max_position_embeddings in config.json and model_max_length in tokenizer_config.json for training to start; otherwise I was OOMing straight away.

This attempt had both max_position_embeddings and model_max_length set to 4096, which worked perfectly fine. I then reverted both to 200000 when uploading.
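For reference, here's a minimal sketch of that tweak using transformers, assuming a local checkout of the base model at a hypothetical path (editing config.json and tokenizer_config.json by hand works just as well):

```python
# Minimal sketch: lower both context limits before training, then restore
# them to 200000 before uploading. The local path is hypothetical.
from transformers import AutoConfig, AutoTokenizer

base_path = "models/Yi-34B-200K-XLCTX"

config = AutoConfig.from_pretrained(base_path)
config.max_position_embeddings = 4096  # was 200000; training OOMed otherwise
config.save_pretrained(base_path)

tokenizer = AutoTokenizer.from_pretrained(base_path)
tokenizer.model_max_length = 4096
tokenizer.save_pretrained(base_path)
```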

I think it should keep the long-context capabilities of the base model.

In my testing it seems less unhinged than adamo1139/Yi-34b-200K-AEZAKMI-RAW-TOXIC-2702 and maybe a touch less uncensored, but it's still very much uncensored even with the default system prompt "A chat."

If you want to see the training scripts, let me know and I will upload them. LoRAs are uploaded here: [adamo1139/Yi-34B-200K-AEZAKMI-XLCTX-v3-LoRA](https://huggingface.co/adamo1139/Yi-34B-200K-AEZAKMI-XLCTX-v3-LoRA)

## Quants!

EXL2 quants are coming soon; I think I will start by uploading a 4bpw quant in a few days.

## Prompt Format

I recommend using the ChatML format, as this is what was used during fine-tuning. \
Here's the prompt format you should use. You can set a different system message; the model was trained on the SystemChat dataset, so it should respect system prompts fine.

```
<|im_start|>system
A chat.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```
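If the tokenizer ships this template as its chat_template, you can also render prompts programmatically instead of building the string by hand. A small sketch, with the repo id assumed from the LoRA repo naming:

```python
# Sketch: render the ChatML prompt via the tokenizer's chat template.
# The repo id is an assumption based on the LoRA repo name.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("adamo1139/Yi-34B-200K-AEZAKMI-XLCTX-v3")

messages = [
    {"role": "system", "content": "A chat."},
    {"role": "user", "content": "Why is the sky blue?"},
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)  # should match the template above, ending with <|im_start|>assistant
```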

## Intended uses & limitations

Use is limited by the Yi license. \
Some of the datasets that were used prohibit commercial use, so please use it non-commercially only.

## Known Issues

It has more of an assistant feel than a human feel, at least with the system prompt "A chat." \
Long context hasn't been tested yet; it should work fine though - feel free to give me feedback about it.

## Credits

Thanks to the unsloth and Hugging Face teams for providing the software packages used during fine-tuning. \
Thanks to Jon Durbin, abacusai, huggingface, sandex, NobodyExistsOnTheInternet and Nous-Research for open-sourcing the datasets I included in the AEZAKMI dataset. \
AEZAKMI is basically a mix of open-source datasets I found on HF, so without them this would not be possible at all.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" alt="made with Unsloth" width="400" height="64"/>](https://github.com/unslothai/unsloth)