tsumeone
/

wizard-vicuna-13b-4bit-128g-cuda

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

tsumeone commited on May 4, 2023

Commit

51d9bc6

•

1 Parent(s): 69d5df9

Create README.md

Files changed (1) hide show

README.md +9 -0

README.md ADDED Viewed

	@@ -0,0 +1,9 @@

+---
+library_name: transformers
+pipeline_tag: text-generation
+---
+Quant of https://huggingface.co/junelee/wizard-vicuna-13b tested working with Occam's KoboldAI/GPTQ.
+Someone made a Triton quant already here, but it will not work with Occam's KoboldAI/GPTQ fork: https://huggingface.co/fbjr/wizard-vicuna-13b-4bit-128g
+```python llama.py ./wizard-vicuna-13b c4 --wbits 4 --true-sequential --groupsize 128 --save_safetensors 4bit-128g.safetensors```