adamo1139 committed
Commit 359db93
1 Parent(s): 64f1296

Update README.md

Files changed (1): README.md +2 -2
README.md CHANGED
@@ -22,9 +22,9 @@ To get this model, first, I fine-tuned Yi-34B-200K (xlctx, as in second version
 Once I had a good base model, I fine-tuned it on the [HESOYAM 0.2](https://huggingface.co/datasets/adamo1139/HESOYAM_v0.2) dataset. It's a collection of single-turn conversations from around 10 subreddits and multi-turn conversations from the board /x/. There's also PIPPA in there. All samples have system prompts that tell the model where the discussion is taking place; this is useful when you are deciding where you want your sandbox discussion to happen. Here, I used classic SFT with GaLore and Unsloth. I wanted some results quickly, so it's trained for just 0.4 epochs. The adapter from that part of the fine-tuning can be found [here](https://huggingface.co/adamo1139/Yi-34B-200K-XLCTX-HESOYAM-RAW-0905-GaLore-PEFT).
 
 
-
+[Conversation samples](https://huggingface.co/datasets/adamo1139/misc/blob/main/benchmarks/yi-34b-200k-xlctx-hesoyam-raw-0905/hesoyam_0905_samples.txt) - I put in a seed prompt and let the model generate the rest of the conversation.
 
-
+[Results on my base benchmarks](https://huggingface.co/datasets/adamo1139/misc/blob/main/benchmarks/yi-34b-200k-xlctx-hesoyam-raw-0905/benchmark_prompts.txt) - The responses suggest it still has some general assistant capabilities. I don't really want that; maybe I should raise the learning rate for the next run so that it stays in character more.
 
 
 ## Prompt template
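
The venue convention described above (every sample carries a system prompt naming where the conversation takes place) can be sketched like this. This is a hypothetical illustration: the exact system-prompt wording and the ChatML-style markers are assumptions, so check the HESOYAM 0.2 dataset and the actual prompt template for the real strings.

```python
# Hypothetical sketch of the venue-based system-prompt convention.
# The wording of the system prompt and the ChatML-style markers are
# assumptions, not the card's confirmed format.

def make_system_prompt(venue: str) -> str:
    """Build a system prompt placing the sandbox discussion in `venue`
    (e.g. a subreddit name or '/x/')."""
    return f"A chat on {venue}."

def build_prompt(system: str, turns: list[tuple[str, str]]) -> str:
    """Render a conversation in a ChatML-style format and leave the
    prompt open for the model to continue as the assistant."""
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for role, text in turns:
        parts.append(f"<|im_start|>{role}\n{text}<|im_end|>")
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

# Example: seed a conversation the way the samples above were generated,
# i.e. one opening post, then let the model write the rest.
prompt = build_prompt(
    make_system_prompt("/x/"),
    [("user", "anyone else hear the hum at night?")],
)
```

Feeding a prompt like this to the model (after applying its real chat template) mirrors how the conversation samples linked above were produced: a single seed message, with the system prompt fixing the venue.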