---
tags:
- gptq
- 4bit
- int4
- gptqmodel
- modelcloud
- llama-3.1
- 8b
- instruct
license: llama3.1
---
This model has been quantized using [GPTQModel](https://github.com/ModelCloud/GPTQModel).

- **bits**: 4
- **group_size**: 128
- **desc_act**: true
- **static_groups**: false
- **sym**: true
- **lm_head**: false
- **damp_percent**: 0.005
- **true_sequential**: true
- **model_name_or_path**: ""
- **model_file_base_name**: "model"
- **quant_method**: "gptq"
- **checkpoint_format**: "gptq"
- **meta**:
  - **quantizer**: "gptqmodel:0.9.9-dev0"

## Example:

```python
from transformers import AutoTokenizer
from gptqmodel import GPTQModel

model_name = "ModelCloud/Meta-Llama-3.1-8B-Instruct-gptq-4bit"

# Chat-style prompt: a list of role/content messages
prompt = [{"role": "user", "content": "I am in Shanghai, preparing to visit the natural history museum. Can you tell me the best way to"}]

# Load the tokenizer and the quantized model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = GPTQModel.from_quantized(model_name)

# Apply the chat template and tokenize the prompt into input ids
input_tensor = tokenizer.apply_chat_template(prompt, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(input_ids=input_tensor.to(model.device), max_new_tokens=100)

# Decode only the newly generated tokens, skipping the prompt
result = tokenizer.decode(outputs[0][input_tensor.shape[1]:], skip_special_tokens=True)

print(result)
```

## lm-eval benchmark

```
|                 Tasks                 |Version|Filter|n-shot|  Metric  |   |Value |   |Stderr|
|---------------------------------------|------:|------|-----:|----------|---|-----:|---|-----:|
|arc_challenge                          |      1|none  |     0|acc       |↑  |0.5171|±  |0.0146|
|                                       |       |none  |     0|acc_norm  |↑  |0.5290|±  |0.0146|
|arc_easy                               |      1|none  |     0|acc       |↑  |0.8068|±  |0.0081|
|                                       |       |none  |     0|acc_norm  |↑  |0.7837|±  |0.0084|
|boolq                                  |      2|none  |     0|acc       |↑  |0.8232|±  |0.0067|
|hellaswag                              |      1|none  |     0|acc       |↑  |0.5787|±  |0.0049|
|                                       |       |none  |     0|acc_norm  |↑  |0.7765|±  |0.0042|
|lambada_openai                         |      1|none  |     0|acc       |↑  |0.7091|±  |0.0063|
|                                       |       |none  |     0|perplexity|↓  |3.6297|±  |0.0805|
|mmlu                                   |      1|none  |      |acc       |↑  |0.6421|±  |0.0039|
| - humanities                          |      1|none  |      |acc       |↑  |0.5932|±  |0.0069|
|  - formal_logic                       |      0|none  |     0|acc       |↑  |0.4206|±  |0.0442|
|  - high_school_european_history       |      0|none  |     0|acc       |↑  |0.7030|±  |0.0357|
|  - high_school_us_history             |      0|none  |     0|acc       |↑  |0.8039|±  |0.0279|
|  - high_school_world_history          |      0|none  |     0|acc       |↑  |0.8228|±  |0.0249|
|  - international_law                  |      0|none  |     0|acc       |↑  |0.7686|±  |0.0385|
|  - jurisprudence                      |      0|none  |     0|acc       |↑  |0.7685|±  |0.0408|
|  - logical_fallacies                  |      0|none  |     0|acc       |↑  |0.7914|±  |0.0319|
|  - moral_disputes                     |      0|none  |     0|acc       |↑  |0.7110|±  |0.0244|
|  - moral_scenarios                    |      0|none  |     0|acc       |↑  |0.4536|±  |0.0167|
|  - philosophy                         |      0|none  |     0|acc       |↑  |0.6913|±  |0.0262|
|  - prehistory                         |      0|none  |     0|acc       |↑  |0.7037|±  |0.0254|
|  - professional_law                   |      0|none  |     0|acc       |↑  |0.4739|±  |0.0128|
|  - world_religions                    |      0|none  |     0|acc       |↑  |0.7953|±  |0.0309|
| - other                               |      1|none  |      |acc       |↑  |0.7036|±  |0.0079|
|  - business_ethics                    |      0|none  |     0|acc       |↑  |0.6400|±  |0.0482|
|  - clinical_knowledge                 |      0|none  |     0|acc       |↑  |0.7094|±  |0.0279|
|  - college_medicine                   |      0|none  |     0|acc       |↑  |0.6358|±  |0.0367|
|  - global_facts                       |      0|none  |     0|acc       |↑  |0.3400|±  |0.0476|
|  - human_aging                        |      0|none  |     0|acc       |↑  |0.6457|±  |0.0321|
|  - management                         |      0|none  |     0|acc       |↑  |0.8544|±  |0.0349|
|  - marketing                          |      0|none  |     0|acc       |↑  |0.8761|±  |0.0216|
|  - medical_genetics                   |      0|none  |     0|acc       |↑  |0.7300|±  |0.0446|
|  - miscellaneous                      |      0|none  |     0|acc       |↑  |0.8148|±  |0.0139|
|  - nutrition                          |      0|none  |     0|acc       |↑  |0.7092|±  |0.0260|
|  - professional_accounting            |      0|none  |     0|acc       |↑  |0.5071|±  |0.0298|
|  - professional_medicine              |      0|none  |     0|acc       |↑  |0.7316|±  |0.0269|
|  - virology                           |      0|none  |     0|acc       |↑  |0.5000|±  |0.0389|
| - social sciences                     |      1|none  |      |acc       |↑  |0.7390|±  |0.0077|
|  - econometrics                       |      0|none  |     0|acc       |↑  |0.4561|±  |0.0469|
|  - high_school_geography              |      0|none  |     0|acc       |↑  |0.8333|±  |0.0266|
|  - high_school_government_and_politics|      0|none  |     0|acc       |↑  |0.8808|±  |0.0234|
|  - high_school_macroeconomics         |      0|none  |     0|acc       |↑  |0.6231|±  |0.0246|
|  - high_school_microeconomics         |      0|none  |     0|acc       |↑  |0.7437|±  |0.0284|
|  - high_school_psychology             |      0|none  |     0|acc       |↑  |0.8404|±  |0.0157|
|  - human_sexuality                    |      0|none  |     0|acc       |↑  |0.7481|±  |0.0381|
|  - professional_psychology            |      0|none  |     0|acc       |↑  |0.6814|±  |0.0189|
|  - public_relations                   |      0|none  |     0|acc       |↑  |0.6455|±  |0.0458|
|  - security_studies                   |      0|none  |     0|acc       |↑  |0.7143|±  |0.0289|
|  - sociology                          |      0|none  |     0|acc       |↑  |0.8259|±  |0.0268|
|  - us_foreign_policy                  |      0|none  |     0|acc       |↑  |0.8200|±  |0.0386|
| - stem                                |      1|none  |      |acc       |↑  |0.5601|±  |0.0085|
|  - abstract_algebra                   |      0|none  |     0|acc       |↑  |0.3500|±  |0.0479|
|  - anatomy                            |      0|none  |     0|acc       |↑  |0.6370|±  |0.0415|
|  - astronomy                          |      0|none  |     0|acc       |↑  |0.7566|±  |0.0349|
|  - college_biology                    |      0|none  |     0|acc       |↑  |0.7639|±  |0.0355|
|  - college_chemistry                  |      0|none  |     0|acc       |↑  |0.4800|±  |0.0502|
|  - college_computer_science           |      0|none  |     0|acc       |↑  |0.5000|±  |0.0503|
|  - college_mathematics                |      0|none  |     0|acc       |↑  |0.3200|±  |0.0469|
|  - college_physics                    |      0|none  |     0|acc       |↑  |0.4020|±  |0.0488|
|  - computer_security                  |      0|none  |     0|acc       |↑  |0.7600|±  |0.0429|
|  - conceptual_physics                 |      0|none  |     0|acc       |↑  |0.5574|±  |0.0325|
|  - electrical_engineering             |      0|none  |     0|acc       |↑  |0.6345|±  |0.0401|
|  - elementary_mathematics             |      0|none  |     0|acc       |↑  |0.4921|±  |0.0257|
|  - high_school_biology                |      0|none  |     0|acc       |↑  |0.7710|±  |0.0239|
|  - high_school_chemistry              |      0|none  |     0|acc       |↑  |0.5665|±  |0.0349|
|  - high_school_computer_science       |      0|none  |     0|acc       |↑  |0.7000|±  |0.0461|
|  - high_school_mathematics            |      0|none  |     0|acc       |↑  |0.4074|±  |0.0300|
|  - high_school_physics                |      0|none  |     0|acc       |↑  |0.4172|±  |0.0403|
|  - high_school_statistics             |      0|none  |     0|acc       |↑  |0.5278|±  |0.0340|
|  - machine_learning                   |      0|none  |     0|acc       |↑  |0.4732|±  |0.0474|
|openbookqa                             |      1|none  |     0|acc       |↑  |0.3360|±  |0.0211|
|                                       |       |none  |     0|acc_norm  |↑  |0.4220|±  |0.0221|
|piqa                                   |      1|none  |     0|acc       |↑  |0.7943|±  |0.0094|
|                                       |       |none  |     0|acc_norm  |↑  |0.7965|±  |0.0094|
|rte                                    |      1|none  |     0|acc       |↑  |0.6968|±  |0.0277|
|truthfulqa_mc1                         |      2|none  |     0|acc       |↑  |0.3439|±  |0.0166|
|winogrande                             |      1|none  |     0|acc       |↑  |0.7364|±  |0.0124|

|      Groups      |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|------------------|------:|------|------|------|---|-----:|---|-----:|
|mmlu              |      1|none  |      |acc   |↑  |0.6421|±  |0.0039|
| - humanities     |      1|none  |      |acc   |↑  |0.5932|±  |0.0069|
| - other          |      1|none  |      |acc   |↑  |0.7036|±  |0.0079|
| - social sciences|      1|none  |      |acc   |↑  |0.7390|±  |0.0077|
| - stem           |      1|none  |      |acc   |↑  |0.5601|±  |0.0085|
```
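
When comparing these scores against another checkpoint, the `Stderr` column can be read as a rough uncertainty band around each `Value`. The sketch below assumes a normal approximation (value ± 1.96 × stderr) for a 95% interval. lm-eval reports only the standard error, and the helper name `confidence_interval` is hypothetical, introduced here for illustration. The three entries are copied from the table above.

```python
# Approximate 95% confidence intervals from a reported value and its stderr.
# Assumption: normal approximation with z = 1.96; this is not computed by lm-eval itself.

def confidence_interval(value: float, stderr: float, z: float = 1.96) -> tuple[float, float]:
    """Return (low, high) bounds for value +/- z * stderr."""
    margin = z * stderr
    return value - margin, value + margin

# (task, metric) -> (value, stderr), copied from the benchmark table in this card
reported = {
    ("arc_challenge", "acc"): (0.5171, 0.0146),
    ("mmlu", "acc"): (0.6421, 0.0039),
    ("winogrande", "acc"): (0.7364, 0.0124),
}

for (task, metric), (value, stderr) in reported.items():
    low, high = confidence_interval(value, stderr)
    print(f"{task:<14} {metric:<4} {value:.4f}  95% CI [{low:.4f}, {high:.4f}]")
```

Under this approximation the interval is narrow for the aggregate `mmlu` score but wide for individual subtasks: `college_computer_science`, with a stderr of 0.0503, spans roughly ±0.10, so small differences on a single subtask should be read with care.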