---
license: apache-2.0
---

#### Quantization config

```json
{
  "zero_point": true,
  "q_group_size": 128,
  "w_bit": 4,
  "version": "GEMM"
}
```

#### Script for AWQ quantization

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = 'PATH_TO_Poro-34B'
quant_path = 'Poro-34B-AWQ'

# AWQ settings: 4-bit weights, group size 128, GEMM kernel
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load model
model = AutoAWQForCausalLM.from_pretrained(model_path, safetensors=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Quantize
model.quantize(tokenizer, quant_config=quant_config)

# Save quantized model
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```

#### Work supported by https://datacrunch.io/

##### Quantized by: gradjitta
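
For reference, below is a minimal inference sketch (not part of the original card). It assumes the quantized model was saved to `Poro-34B-AWQ` by the script above, that a CUDA GPU is available, and uses an arbitrary example prompt.

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

quant_path = 'Poro-34B-AWQ'  # assumes the output path from the script above

# Load the quantized weights; fuse_layers enables fused kernels for faster inference
model = AutoAWQForCausalLM.from_quantized(quant_path, fuse_layers=True)
tokenizer = AutoTokenizer.from_pretrained(quant_path, trust_remote_code=True)

# Example prompt (placeholder); move input ids to the GPU where the model lives
tokens = tokenizer("Suomi on", return_tensors="pt").input_ids.cuda()

# Generate a short completion
output = model.generate(tokens, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```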