grimjim committed on
Commit
99dbcd2
1 Parent(s): bdc8bef

Update README.md


Updated with links to GGUF and fp16.

Files changed (1)
  1. README.md +5 -1
README.md CHANGED
@@ -12,7 +12,11 @@ license: cc-by-nc-4.0
 
 This is an 8.0bpw h8 exl2 quant of a merge in which two similar models with strong reasoning, hopefully resulting in "dense" encoding of said reasoning, were merged with a model targeting roleplay.
 
-I've tested with ChatML prompts at temperature=1.1 and minP=0.03. The model itself supports Alpaca format prompts. The model claims a context length of 32K, but I've only found it stable up to 8K in testing. I recommend sticking with 8.0bpw h8 exl2 or Q8_0 GGUF to maintain coherence.
+I've tested with ChatML prompts at temperature=1.1 and minP=0.03. The model itself supports Alpaca format prompts. The model claims a context length of 32K, but I found it lost coherence after 8K in informal testing. I prefer to stick with 8.0bpw h8 exl2 or Q8_0 GGUF for maximum coherence.
+
+Alternative downloads:
+[GGUF quants courtesy of Lewdiculous](https://huggingface.co/Lewdiculous/kukulemon-7B-GGUF-IQ-Imatrix)
+[fp16 safetensors](https://huggingface.co/grimjim/kukulemon-7B)
 
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
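The ChatML format and sampler settings mentioned in the diff can be sketched in plain Python. This is a minimal illustration, not code from the model card: the helper name `build_chatml_prompt` is hypothetical, and the settings dict simply mirrors the temperature=1.1 and minP=0.03 values the author reports testing with (pass them to whichever inference backend you use, e.g. as `temperature` and `min_p`).

```python
# Hypothetical helper illustrating the standard ChatML prompt template
# referenced in the README. Not part of the model repository.

def build_chatml_prompt(system: str, user: str) -> str:
    """Wrap a system message and a user turn in ChatML delimiters,
    ending with an open assistant turn for the model to complete."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# Sampler values the author reports testing with (key names may differ
# per backend; llama.cpp-style backends call these temperature / min_p).
SAMPLER_SETTINGS = {"temperature": 1.1, "min_p": 0.03}

prompt = build_chatml_prompt("You are a helpful assistant.", "Hello!")
print(prompt)
```

Note the prompt deliberately ends with an unclosed `<|im_start|>assistant` turn, which is what cues the model to generate the reply.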