Update README.md
@@ -5,7 +5,7 @@ GPT4-X-Alpasta-30b working with Oobabooga's Text Generation Webui and KoboldAI.
 <p><strong>What's included</strong></p>
 
 <P>GPTQ: 2 quantized versions. One quantized using --true-sequential and act-order optimizations, and the other quantized using --true-sequential --groupsize 128 optimizations (coming soon)</P>
-<P>GGML:
+<P>GGML: 2 quantized versions. One quantized using q4_1, and the other quantized using q5_0.</P>
 
 <p><strong>GPU/GPTQ Usage</strong></p>
 <p>To use with your GPU using GPTQ, pick one of the .safetensors files along with all of the .json and .model files.</p>
@@ -31,10 +31,10 @@ GPT4-X-Alpasta-30b working with Oobabooga's Text Generation Webui and KoboldAI.
 
 <p><strong><font size="4">--true-sequential --groupsize 128</font></strong></p>
 
-<strong>Wikitext2</strong>:
+<strong>Wikitext2</strong>: 4.70257568359375
 
-<strong>Ptb-New</strong>:
+<strong>Ptb-New</strong>: 9.323467254638672
 
-<strong>C4-New</strong>:
+<strong>C4-New</strong>: 7.041860580444336
 
 <strong>Note</strong>: This version uses <i>--groupsize 128</i>, resulting in better evaluations. However, it consumes more VRAM.
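For the GGML files, a minimal sketch of running one locally with the llama-cpp-python bindings, as an alternative to the KoboldAI/webui frontends this README targets. The filename and the Alpaca-style prompt template are hypothetical stand-ins:

```python
from llama_cpp import Llama

# Hypothetical filename: point this at whichever GGML file you downloaded.
# q4_1 is smaller; q5_0 trades a larger file for lower quantization loss.
llm = Llama(model_path="gpt4-x-alpasta-30b.ggml.q5_0.bin", n_ctx=2048)

out = llm(
    "### Instruction:\nName three uses for alpaca wool.\n\n### Response:\n",
    max_tokens=128,
    stop=["### Instruction:"],
)
print(out["choices"][0]["text"])
```

Note that q4_1 and q5_0 are pre-GGUF quantization formats, so these files load only with llama.cpp (and binding) builds from that era; current releases expect GGUF.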
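Likewise, for the GPTQ .safetensors checkpoint, a sketch of loading it outside the webui with the AutoGPTQ library, assuming the repo's .json and .model files sit in the same folder as the weights; the directory name is hypothetical:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_dir = "GPT4-X-Alpasta-30b-GPTQ"  # hypothetical local folder holding the repo files

tokenizer = AutoTokenizer.from_pretrained(model_dir, use_fast=False)
model = AutoGPTQForCausalLM.from_quantized(
    model_dir,
    use_safetensors=True,  # the repo ships .safetensors weights
    device="cuda:0",
)

prompt = "### Instruction:\nSummarize what GPTQ quantization does.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```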
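The Wikitext2, Ptb-New, and C4-New figures above are perplexities: the exponential of the mean per-token negative log-likelihood over fixed-length windows of each test set, which is how GPTQ-for-LLaMa-style eval scripts report them. A rough sketch of the Wikitext2 case for a standard transformers causal LM (the dataset id and 2048-token window are the conventional choices, not stated in this README):

```python
import torch
from datasets import load_dataset

def wikitext2_perplexity(model, tokenizer, seqlen=2048):
    test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
    ids = tokenizer("\n\n".join(test["text"]), return_tensors="pt").input_ids
    n_windows = ids.size(1) // seqlen
    total_nll = 0.0
    for i in range(n_windows):
        window = ids[:, i * seqlen : (i + 1) * seqlen].to(model.device)
        with torch.no_grad():
            # labels=window gives the mean next-token NLL for this window
            # (ignoring the one-token shift at the window edge)
            total_nll += model(window, labels=window).loss.float().item() * seqlen
    return torch.exp(torch.tensor(total_nll / (n_windows * seqlen))).item()
```

Lower is better, which is why the Note in the diff above recommends the --groupsize 128 build when the extra VRAM is available.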