https://github.com/qwopqwop200/GPTQ-for-LLaMa
 
LoRA credit to https://huggingface.co/baseten/alpaca-30b

# Update 2023-03-29
There is also a non-groupsize quantized model that is 1GB smaller in size, which should allow running at max context tokens with 24GB VRAM. The evaluations are better on the 128 groupsize version, but the tradeoff is not being able to run it at full context without offloading or a GPU with more VRAM.

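For intuition on where the roughly 1GB difference comes from, here is a back-of-the-envelope sketch (the parameter count and per-group storage layout are assumptions, not an exact accounting of either file):

```python
# Rough size estimate (assumptions: ~32.5B real parameters for "30B" LLaMA,
# one fp16 scale + one fp16 zero stored per 128-weight group; the actual
# files also carry embeddings, norms, and metadata).
params = 32.5e9

packed_weights_gb = params * 4 / 8 / 1e9        # 4-bit packed weights
groupsize_overhead_gb = params / 128 * 4 / 1e9  # 2-byte scale + 2-byte zero per group

print(f"4-bit weights:      ~{packed_weights_gb:.1f} GB")
print(f"128g extra tensors: ~{groupsize_overhead_gb:.1f} GB")  # roughly the 1GB gap
```
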
# Update 2023-03-27
New weights have been added. The old .pt version is no longer supported and has been replaced by a 128 groupsize safetensors file. Update to the latest GPTQ version/webui to use it.

Evals - Groupsize 128 + True Sequential
-----
**alpaca-30b-4bit-128g.safetensors** [4805cc2]

**c4-new** - 6.398105144500732

**wikitext2** - 4.402845859527588

Evals - Default + True Sequential
-----
**alpaca-30b-4bit.safetensors** [6958004]

**c4-new** - 6.592941761016846

**ptb-new** - 8.718379974365234

**wikitext2** - 4.635514736175537

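These figures are perplexities (lower is better) from GPTQ-for-LLaMa's evaluation. Below is a minimal sketch of how a wikitext2 perplexity of this kind is typically computed, using non-overlapping 2048-token windows over the joined test set; assume it approximates, rather than reproduces, the repo's exact procedure:

```python
# Illustrative wikitext2 perplexity eval (assumed procedure, not the exact
# GPTQ-for-LLaMa benchmark script that produced the numbers above).
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "path/to/model"  # placeholder checkpoint directory
tok = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir).cuda().eval()

text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
ids = tok(text, return_tensors="pt").input_ids.cuda()

seqlen, nlls = 2048, []
for i in range(ids.size(1) // seqlen):
    chunk = ids[:, i * seqlen : (i + 1) * seqlen]
    with torch.no_grad():
        loss = model(chunk, labels=chunk).loss  # mean token NLL for this window
    nlls.append(loss * seqlen)

print("perplexity:", torch.exp(torch.stack(nlls).sum() / (len(nlls) * seqlen)).item())
```
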
# Usage
1. Run manually through GPTQ (see the loading sketch after this list)
2. (More setup but better UI) - Use the [text-generation-webui](https://github.com/oobabooga/text-generation-webui/wiki/LLaMA-model#4-bit-mode). Make sure to follow the installation steps [here](https://github.com/oobabooga/text-generation-webui#installation) before adding GPTQ support.
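
For option 1, here is a minimal Python sketch of loading the checkpoint through the llama.py helpers in a GPTQ-for-LLaMa checkout (the load_quant signature has changed between revisions, so treat every argument below as an assumption and check your local copy):

```python
# Hypothetical loading sketch; load_quant lives in llama.py of the
# GPTQ-for-LLaMa checkout and its signature varies between revisions.
from transformers import AutoTokenizer
from llama import load_quant

base = "path/to/llama-30b-hf"  # placeholder: dir with the fp16 config/tokenizer

# wbits=4; groupsize=128 for the -128g file, -1 for alpaca-30b-4bit.safetensors
model = load_quant(base, "alpaca-30b-4bit-128g.safetensors", 4, 128).cuda().eval()

tok = AutoTokenizer.from_pretrained(base)
prompt = "### Instruction:\nWrite a haiku about llamas.\n\n### Response:\n"
ids = tok(prompt, return_tensors="pt").input_ids.cuda()
print(tok.decode(model.generate(ids, max_new_tokens=64)[0]))
```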