TheBloke committed on
Commit 84bb01a
1 Parent(s): dd97ca3

Initial GPTQ model commit

  1. README.md +5 -5
README.md CHANGED
@@ -58,7 +58,7 @@ from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
  import argparse

  model_name_or_path = "TheBloke/airoboros-13B-gpt4-1.2-GPTQ"
- model_basename = "airoboros-13b-gpt4-1.2-GPTQ-4bit-128g.act.order"
+ model_basename = "airoboros-13b-gpt4-1.2-GPTQ-4bit-128g.no-act.order"

  use_triton = False

@@ -104,17 +104,17 @@ print(pipe(prompt_template)[0]['generated_text'])

  ## Provided files

- **airoboros-13b-gpt4-1.2-GPTQ-4bit-128g.act.order.safetensors**
+ **airoboros-13b-gpt4-1.2-GPTQ-4bit-128g.no-act.order.safetensors**

  This will work with AutoGPTQ and CUDA versions of GPTQ-for-LLaMa. There are reports of issues with Triton mode of recent GPTQ-for-LLaMa. If you have issues, please use AutoGPTQ instead.

+ It was created with group_size 128 to increase inference accuracy, but without --act-order (desc_act) to increase compatibility and improve inference speed.

-
- * `airoboros-13b-gpt4-1.2-GPTQ-4bit-128g.act.order.safetensors`
+ * `airoboros-13b-gpt4-1.2-GPTQ-4bit-128g.no-act.order.safetensors`
  * Works with AutoGPTQ in CUDA or Triton modes.
  * Works with GPTQ-for-LLaMa in CUDA mode. May have issues with GPTQ-for-LLaMa Triton mode.
  * Works with text-generation-webui, including one-click-installers.
- * Parameters: Groupsize = 128. Act Order / desc_act = True.
+ * Parameters: Groupsize = 128. Act Order / desc_act = False.

  <!-- footer start -->
  ## Discord
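
Note on the change above: the commit renames `model_basename` and flips `desc_act` from True to False, and the README has to stay consistent in three places (the basename in the example script, the file name under "Provided files", and the "Parameters" bullet). The sketch below illustrates that relationship using only the standard library. `QuantParams` and `weights_filename` are hypothetical helpers introduced here for illustration; they are not part of AutoGPTQ, and the `bits=4` value is an assumption read off the "4bit" part of the file name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantParams:
    """Hypothetical container mirroring the README's "Parameters" bullet."""
    bits: int         # assumed from "4bit" in the file name
    group_size: int   # "Groupsize = 128" in the README
    desc_act: bool    # "Act Order / desc_act" in the README


def weights_filename(model_basename: str) -> str:
    """Derive the safetensors file name from the basename used in the script."""
    return f"{model_basename}.safetensors"


# Values taken from the new side of the diff.
model_basename = "airoboros-13b-gpt4-1.2-GPTQ-4bit-128g.no-act.order"
params = QuantParams(bits=4, group_size=128, desc_act=False)

# The README's "Provided files" entry should match the derived file name.
provided_file = "airoboros-13b-gpt4-1.2-GPTQ-4bit-128g.no-act.order.safetensors"
print(weights_filename(model_basename) == provided_file)
print(params)
```

A check like this could catch a README where the example script still points at the old `act.order` basename after the weights file has been replaced.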