Commit 865d43c (parent 129b6fd): Update README.md

README.md CHANGED
@@ -5,11 +5,21 @@ language: en
 library_name: adapter-transformers
 pipeline_tag: text-generation
 ---
-
+ehartford's merge of Mistral 7B 0.1 with his Dolphin 2.1 dataset:
+https://huggingface.co/ehartford/dolphin-2.1-mistral-7b
+
+LIMA RP dataset applied as a LoRA at 0.5 weight:
+https://huggingface.co/lemonilia/limarp-llama2-v2/
 
-
+The purpose of the model is to be RP-focused, smart, fast, and lightweight for users with low VRAM.
 
-
+I've already built the exl2 4bpw quant (linked below). It will run 8k ctx in around 6GB of VRAM and respond to a full context at roughly 30 tps (tested on my 3060) if the exl2_hf loader is used with FA2 enabled.
+
+The model has been tested by several users on the SillyTavern Discord server, and run on Horde for a full day, with good results.
+
+https://huggingface.co/RossAscends/Mistral7B_Dolphin2.1_LIMARP0.5_4bpw_exl2
+
+Both the Mistral and ChatML context presets are possible.
 
 full weights:
 https://huggingface.co/RossAscends/Mistral_7B_Dolphin2.1_LIMA0.5_fp16
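The "LoRA at 0.5 weight" step in the README can be sketched as plain matrix arithmetic: the adapter's low-rank delta is scaled by the merge weight before being folded into the base weights. This is a minimal numpy illustration only, not the actual merge script used for this model; `merge_lora` and the toy 8-dim shapes are hypothetical (real Mistral 7B layers are 4096-dim, and in practice the merge is done with a tool such as peft's `merge_and_unload`).

```python
import numpy as np

def merge_lora(base_w, lora_a, lora_b, alpha, rank, weight=0.5):
    """Fold a LoRA delta into a base weight matrix at a given merge weight.

    delta = (alpha / rank) * (B @ A); the 0.5 merge weight scales the
    delta before it is added, halving the adapter's influence.
    (Hypothetical helper for illustration only.)
    """
    delta = (alpha / rank) * (lora_b @ lora_a)
    return base_w + weight * delta

# Toy shapes for illustration; real layers are much larger.
rng = np.random.default_rng(0)
base = rng.standard_normal((8, 8))
a = rng.standard_normal((2, 8))   # rank-2 down-projection
b = rng.standard_normal((8, 2))   # rank-2 up-projection

merged = merge_lora(base, a, b, alpha=16, rank=2, weight=0.5)
```

At `weight=1.0` this reduces to a standard full-strength LoRA merge; `0.5` simply halves how far the RP adapter pulls the merged weights away from the Dolphin base.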