Made with exllamav2 0.2.3 with the default dataset.
Exl2 quants can be used with Nvidia RTX 2xxx or newer GPUs on Windows/Linux, or with AMD GPUs on Linux.
This model format works best when the model fits entirely in your GPU's VRAM; otherwise it's better to use GGUF versions.
For example, with an RTX 3060 (12 GB) I could fit the 4.5bpw/5bpw quants with Q6 cache and 16k context.
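As a rough sanity check, you can estimate whether a given quant fits your card from the bits-per-weight and the KV cache size. This is a back-of-envelope sketch, not exllamav2's actual allocator; the layer count and per-layer KV width below are assumptions for a 14B Qwen2.5-class model, and real usage adds overhead for activations and fragmentation.

```python
def estimate_vram_gb(params_b, bpw, cache_bits, context, n_layers, kv_dim):
    """Rough VRAM estimate in GB: quantized weights + quantized KV cache."""
    # Weights: parameter count (billions) * bits-per-weight, converted to GB
    weights_gb = params_b * 1e9 * bpw / 8 / 1e9
    # KV cache: 2 tensors (K and V) per layer, one kv_dim-wide row per token
    cache_gb = 2 * n_layers * context * kv_dim * cache_bits / 8 / 1e9
    return weights_gb + cache_gb

# Assumed figures for a ~14B model: 48 layers, 1024-wide KV per tensor
total = estimate_vram_gb(params_b=14.8, bpw=4.5, cache_bits=6,
                         context=16384, n_layers=48, kv_dim=1024)
print(f"~{total:.1f} GB")  # comfortably under 12 GB, leaving headroom
```

With these assumed numbers the estimate lands around 9.5 GB, which is consistent with the 4.5bpw + Q6 cache + 16k context fitting a 12 GB card.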

Use with Text-Generation-WebUI, TabbyAPI, or other apps that have an exllamav2 loader.

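For TabbyAPI, the quant, cache quantization, and context length are set in its `config.yml`. This is a minimal sketch following TabbyAPI's sample config; the model name and directory are placeholders, so adjust them to wherever you downloaded the quant.

```yaml
# Minimal TabbyAPI config.yml sketch (model name/path are placeholders)
model:
  model_dir: models
  model_name: Qwen2.5-14B-Instruct-abliterated-v2-exl2
  max_seq_len: 16384   # 16k context, as in the example above
  cache_mode: Q6       # quantized KV cache to save VRAM
```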
# Original model card
# huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2