adamo1139 committed
Commit 359db93
1 Parent(s): 64f1296

Update README.md

Files changed (1): README.md +2 -2
README.md CHANGED
@@ -22,9 +22,9 @@ To get this model, first, I fine-tuned Yi-34B-200K (xlctx, as in second version
 Once I had a good base model, I fine-tuned it on the [HESOYAM 0.2](https://huggingface.co/datasets/adamo1139/HESOYAM_v0.2) dataset. It's a collection of single-turn conversations from around 10 subreddits and multi-turn conversations from the board /x/. There's also PIPPA in there. All samples have system prompts that tell the model where the discussion is taking place; this is useful when you are deciding where you want your sandbox discussion to happen. Here, I used classic SFT with GaLore and Unsloth. I wanted some results quickly, so it's trained for just 0.4 epochs. The adapter from that part of the fine-tuning can be found [here](https://huggingface.co/adamo1139/Yi-34B-200K-XLCTX-HESOYAM-RAW-0905-GaLore-PEFT).
 
 
-
+[Conversation samples](https://huggingface.co/datasets/adamo1139/misc/blob/main/benchmarks/yi-34b-200k-xlctx-hesoyam-raw-0905/hesoyam_0905_samples.txt) - I put in a seed prompt and let the model generate the rest of the conversation.
 
-
+[Results on my base benchmarks](https://huggingface.co/datasets/adamo1139/misc/blob/main/benchmarks/yi-34b-200k-xlctx-hesoyam-raw-0905/benchmark_prompts.txt) - The responses suggest it still has some general assistant capabilities. I don't really want that; maybe I should raise the learning rate for the next run so that it stays in character more.
 
 
 ## Prompt template
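
The venue convention described above (every sample carries a system prompt naming where the conversation takes place) can be sketched like this. This is a hypothetical illustration: the exact system-prompt wording and the ChatML-style markers are assumptions, so check the HESOYAM 0.2 dataset and the actual prompt template for the real strings.

```python
# Hypothetical sketch of the venue-based system-prompt convention.
# The wording of the system prompt and the ChatML-style markers are
# assumptions, not the card's confirmed format.

def make_system_prompt(venue: str) -> str:
    """Build a system prompt placing the sandbox discussion in `venue`
    (e.g. a subreddit name or '/x/')."""
    return f"A chat on {venue}."

def build_prompt(system: str, turns: list[tuple[str, str]]) -> str:
    """Render a conversation in a ChatML-style format and leave the
    prompt open for the model to continue as the assistant."""
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for role, text in turns:
        parts.append(f"<|im_start|>{role}\n{text}<|im_end|>")
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

# Example: seed a conversation the way the samples above were generated,
# i.e. one opening post, then let the model write the rest.
prompt = build_prompt(
    make_system_prompt("/x/"),
    [("user", "anyone else hear the hum at night?")],
)
```

Feeding a prompt like this to the model (after applying its real chat template) mirrors how the conversation samples linked above were produced: a single seed message, with the system prompt fixing the venue.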