license: llama3
inference: false

Description

4-bit quantization of meta-llama/Meta-Llama-3-8B-Instruct using GPTQ. We use the config below for quantization and evaluation, with HuggingFaceH4/ultrachat_200k as the calibration data. The code is available in this repository.

bits: 4
damp_percent: 0.01
desc_act: true
exllama_config:
 version: 2
group_size: 128
quant_method: gptq
static_groups: false
sym: true
true_sequential: true
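
To make the `bits`, `group_size`, and `sym` settings concrete, here is a minimal sketch of symmetric 4-bit group-wise quantization in plain Python. It uses simple round-to-nearest and is not the GPTQ procedure itself, which additionally corrects quantization error using calibration data. The function names (`quantize_group`, `quantize_row`, `dequantize_row`) are hypothetical and chosen only for illustration; the fixed midpoint zero point is one common convention for symmetric quantization and is an assumption here.

```python
import math


def quantize_group(weights, bits=4):
    """Symmetric round-to-nearest quantization of one group of weights."""
    maxq = 2 ** bits - 1                    # 15 for 4-bit codes
    absmax = max(abs(w) for w in weights)
    scale = (2 * absmax / maxq) or 1.0      # fall back to 1.0 for an all-zero group
    zero = (maxq + 1) // 2                  # fixed midpoint (8), assumed for sym: true
    codes = [min(maxq, max(0, round(w / scale) + zero)) for w in weights]
    return codes, scale, zero


def quantize_row(row, bits=4, group_size=128):
    """Split a weight row into groups; each group gets its own scale."""
    return [
        quantize_group(row[start:start + group_size], bits)
        for start in range(0, len(row), group_size)
    ]


def dequantize_row(groups):
    """Map integer codes back to approximate floating-point weights."""
    restored = []
    for codes, scale, zero in groups:
        restored.extend(scale * (c - zero) for c in codes)
    return restored


if __name__ == "__main__":
    # Illustrative data only: 256 synthetic weights, so two groups of 128.
    row = [math.sin(0.1 * i) * (1 + i / 256) for i in range(256)]
    groups = quantize_row(row)
    restored = dequantize_row(groups)
    worst = max(abs(a - b) for a, b in zip(row, restored))
    print(f"groups: {len(groups)}, worst absolute error: {worst:.4f}")
```

Because every group of 128 weights carries its own scale, an outlier in one group does not widen the quantization step of the others.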

Evaluations

Below is a comprehensive evaluation of this model and a comparison with casperhansen/llama-3-8b-instruct-awq, both run with the awesome mosaicml/llm-foundry.

| model_name | core_average | world_knowledge | commonsense_reasoning | language_understanding | symbolic_problem_solving | reading_comprehension |
|---|---|---|---|---|---|---|
| ISTA-DASLab/Llama-3-8B-Instruct-GPTQ-4bit | 0.552944 | 0.584061 | 0.547598 | 0.663904 | 0.431017 | 0.538141 |
| casperhansen/llama-3-8b-instruct-awq | 0.531504 | 0.557663 | 0.528201 | 0.657211 | 0.391476 | 0.522971 |

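
The reported `core_average` values are numerically consistent with a plain mean of the five category scores. The sketch below checks this using only the numbers from the summary above; it is an observation about these figures, not a statement about how llm-foundry computes the category scores themselves. The names `CATEGORY_SCORES`, `REPORTED_CORE_AVERAGE`, and `core_average` are introduced here for illustration.

```python
from statistics import mean

# Category scores in the order: world_knowledge, commonsense_reasoning,
# language_understanding, symbolic_problem_solving, reading_comprehension.
CATEGORY_SCORES = {
    "ISTA-DASLab/Llama-3-8B-Instruct-GPTQ-4bit": [0.584061, 0.547598, 0.663904, 0.431017, 0.538141],
    "casperhansen/llama-3-8b-instruct-awq": [0.557663, 0.528201, 0.657211, 0.391476, 0.522971],
}

REPORTED_CORE_AVERAGE = {
    "ISTA-DASLab/Llama-3-8B-Instruct-GPTQ-4bit": 0.552944,
    "casperhansen/llama-3-8b-instruct-awq": 0.531504,
}


def core_average(scores):
    """Unweighted mean of the per-category scores."""
    return mean(scores)


if __name__ == "__main__":
    for name, scores in CATEGORY_SCORES.items():
        computed = core_average(scores)
        print(f"{name}: computed {computed:.6f}, reported {REPORTED_CORE_AVERAGE[name]:.6f}")
```
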

| Category | Benchmark | Subtask | Accuracy (GPTQ) | Accuracy (AWQ) | Few-shot setting |
|---|---|---|---|---|---|
| symbolic_problem_solving | gsm8k | | 0.721759 | 0.59818 | 0-shot |
| commonsense_reasoning | copa | | 0.85 | 0.84 | 0-shot |
| commonsense_reasoning | commonsense_qa | | 0.78706 | 0.782146 | 0-shot |
| commonsense_reasoning | piqa | | 0.784004 | 0.781828 | 0-shot |
| commonsense_reasoning | bigbench_strange_stories | | 0.764368 | 0.752874 | 0-shot |
| commonsense_reasoning | bigbench_strategy_qa | | 0.680647 | 0.659677 | 0-shot |
| language_understanding | lambada_openai | | 0.716476 | 0.717834 | 0-shot |
| language_understanding | hellaswag | | 0.750647 | 0.753137 | 0-shot |
| reading_comprehension | coqa | | 0.198797 | 0.109733 | 0-shot |
| reading_comprehension | boolq | | 0.8263 | 0.836391 | 0-shot |
| world_knowledge | triviaqa_sm_sub | | 0.590667 | 0.511333 | 3-shot |
| world_knowledge | jeopardy | Average | 0.4975 | 0.489451 | 3-shot |
| world_knowledge | jeopardy | american_history | 0.535109 | 0.544794 | 3-shot |
| world_knowledge | jeopardy | literature | 0.622449 | 0.626531 | 3-shot |
| world_knowledge | jeopardy | science | 0.420168 | 0.390756 | 3-shot |
| world_knowledge | jeopardy | word_origins | 0.293151 | 0.271233 | 3-shot |
| world_knowledge | jeopardy | world_history | 0.616622 | 0.613941 | 3-shot |
| world_knowledge | bigbench_qa_wikidata | | 0.684366 | 0.644358 | 3-shot |
| world_knowledge | arc_easy | | 0.808923 | 0.808081 | 3-shot |
| world_knowledge | arc_challenge | | 0.571672 | 0.571672 | 3-shot |
| commonsense_reasoning | siqa | | 0.827533 | 0.814227 | 3-shot |
| language_understanding | winograd | | 0.871795 | 0.860806 | 3-shot |
| symbolic_problem_solving | bigbench_operators | | 0.547619 | 0.552381 | 3-shot |
| reading_comprehension | squad | | 0.581552 | 0.58789 | 3-shot |
| symbolic_problem_solving | svamp | | 0.68 | 0.57 | 5-shot |
| world_knowledge | mmlu | Average | 0.668279 | 0.645874 | 5-shot |
| world_knowledge | mmlu | abstract_algebra | 0.29 | 0.33 | 5-shot |
| world_knowledge | mmlu | anatomy | 0.681481 | 0.651852 | 5-shot |
| world_knowledge | mmlu | astronomy | 0.703947 | 0.671053 | 5-shot |
| world_knowledge | mmlu | business_ethics | 0.67 | 0.68 | 5-shot |
| world_knowledge | mmlu | clinical_knowledge | 0.750943 | 0.701887 | 5-shot |
| world_knowledge | mmlu | college_biology | 0.784722 | 0.729167 | 5-shot |
| world_knowledge | mmlu | college_chemistry | 0.47 | 0.46 | 5-shot |
| world_knowledge | mmlu | college_computer_science | 0.56 | 0.54 | 5-shot |
| world_knowledge | mmlu | college_mathematics | 0.36 | 0.28 | 5-shot |
| world_knowledge | mmlu | college_medicine | 0.653179 | 0.635838 | 5-shot |
| world_knowledge | mmlu | college_physics | 0.5 | 0.431373 | 5-shot |
| world_knowledge | mmlu | computer_security | 0.78 | 0.75 | 5-shot |
| world_knowledge | mmlu | conceptual_physics | 0.548936 | 0.557447 | 5-shot |
| world_knowledge | mmlu | econometrics | 0.45614 | 0.482456 | 5-shot |
| world_knowledge | mmlu | electrical_engineering | 0.668966 | 0.586207 | 5-shot |
| world_knowledge | mmlu | elementary_mathematics | 0.439153 | 0.417989 | 5-shot |
| world_knowledge | mmlu | formal_logic | 0.47619 | 0.412698 | 5-shot |
| world_knowledge | mmlu | global_facts | 0.37 | 0.41 | 5-shot |
| world_knowledge | mmlu | high_school_biology | 0.790323 | 0.754839 | 5-shot |
| world_knowledge | mmlu | high_school_chemistry | 0.581281 | 0.507389 | 5-shot |
| world_knowledge | mmlu | high_school_computer_science | 0.71 | 0.74 | 5-shot |
| world_knowledge | mmlu | high_school_european_history | 0.745455 | 0.775758 | 5-shot |
| world_knowledge | mmlu | high_school_geography | 0.823232 | 0.823232 | 5-shot |
| world_knowledge | mmlu | high_school_government_and_politics | 0.917098 | 0.875648 | 5-shot |
| world_knowledge | mmlu | high_school_macroeconomics | 0.635897 | 0.620513 | 5-shot |
| world_knowledge | mmlu | high_school_mathematics | 0.407407 | 0.392593 | 5-shot |
| world_knowledge | mmlu | high_school_microeconomics | 0.726891 | 0.714286 | 5-shot |
| world_knowledge | mmlu | high_school_physics | 0.423841 | 0.410596 | 5-shot |
| world_knowledge | mmlu | high_school_psychology | 0.842202 | 0.838532 | 5-shot |
| world_knowledge | mmlu | high_school_statistics | 0.592593 | 0.513889 | 5-shot |
| world_knowledge | mmlu | high_school_us_history | 0.852941 | 0.852941 | 5-shot |
| world_knowledge | mmlu | high_school_world_history | 0.843882 | 0.831224 | 5-shot |
| world_knowledge | mmlu | human_aging | 0.717489 | 0.713004 | 5-shot |
| world_knowledge | mmlu | human_sexuality | 0.763359 | 0.70229 | 5-shot |
| world_knowledge | mmlu | international_law | 0.793388 | 0.77686 | 5-shot |
| world_knowledge | mmlu | jurisprudence | 0.814815 | 0.768519 | 5-shot |
| world_knowledge | mmlu | logical_fallacies | 0.754601 | 0.773006 | 5-shot |
| world_knowledge | mmlu | machine_learning | 0.553571 | 0.508929 | 5-shot |
| world_knowledge | mmlu | management | 0.84466 | 0.834951 | 5-shot |
| world_knowledge | mmlu | marketing | 0.92735 | 0.888889 | 5-shot |
| world_knowledge | mmlu | medical_genetics | 0.81 | 0.78 | 5-shot |
| world_knowledge | mmlu | miscellaneous | 0.825032 | 0.799489 | 5-shot |
| world_knowledge | mmlu | moral_disputes | 0.739884 | 0.722543 | 5-shot |
| world_knowledge | mmlu | moral_scenarios | 0.437989 | 0.38324 | 5-shot |
| world_knowledge | mmlu | nutrition | 0.764706 | 0.735294 | 5-shot |
| world_knowledge | mmlu | philosophy | 0.733119 | 0.713826 | 5-shot |
| world_knowledge | mmlu | prehistory | 0.719136 | 0.719136 | 5-shot |
| world_knowledge | mmlu | professional_accounting | 0.475177 | 0.485816 | 5-shot |
| world_knowledge | mmlu | professional_law | 0.480443 | 0.449153 | 5-shot |
| world_knowledge | mmlu | professional_medicine | 0.709559 | 0.676471 | 5-shot |
| world_knowledge | mmlu | professional_psychology | 0.694444 | 0.676471 | 5-shot |
| world_knowledge | mmlu | public_relations | 0.7 | 0.6 | 5-shot |
| world_knowledge | mmlu | security_studies | 0.730612 | 0.718367 | 5-shot |
| world_knowledge | mmlu | sociology | 0.830846 | 0.845771 | 5-shot |
| world_knowledge | mmlu | us_foreign_policy | 0.86 | 0.85 | 5-shot |
| world_knowledge | mmlu | virology | 0.542169 | 0.518072 | 5-shot |
| world_knowledge | mmlu | world_religions | 0.812865 | 0.795322 | 5-shot |
| symbolic_problem_solving | bigbench_dyck_languages | | 0.086 | 0.045 | 5-shot |
| language_understanding | winogrande | | 0.764009 | 0.759274 | 5-shot |
| symbolic_problem_solving | agi_eval_lsat_ar | | 0.3 | 0.278261 | 5-shot |
| symbolic_problem_solving | simple_arithmetic_nospaces | | 0.466 | 0.458 | 5-shot |
| symbolic_problem_solving | simple_arithmetic_withspaces | | 0.502 | 0.496 | 5-shot |
| reading_comprehension | agi_eval_lsat_rc | | 0.731343 | 0.708955 | 5-shot |
| reading_comprehension | agi_eval_lsat_lr | | 0.554902 | 0.560784 | 5-shot |
| reading_comprehension | agi_eval_sat_en | | 0.81068 | 0.805825 | 5-shot |
| world_knowledge | arc_challenge | | 0.582765 | 0.591297 | 25-shot |
| commonsense_reasoning | openbook_qa | | 0.478 | 0.468 | 10-shot |
| language_understanding | hellaswag | | 0.769468 | 0.771062 | 10-shot |
| | bigbench_cs_algorithms | | 0.715151 | 0.687879 | 10-shot |
| symbolic_problem_solving | bigbench_elementary_math_qa | | 0.533569 | 0.530922 | 1-shot |
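
To see where the two quantized models differ most, the per-benchmark results can be ranked by the accuracy gap (GPTQ minus AWQ). The sketch below does this for a handful of rows copied from the results above; the selection of rows is arbitrary and the helper name `rank_by_gap` is hypothetical. Extending `ROWS` with the remaining benchmarks works the same way.

```python
# (benchmark, few-shot setting, GPTQ accuracy, AWQ accuracy), copied from the results above.
ROWS = [
    ("gsm8k", "0-shot", 0.721759, 0.59818),
    ("coqa", "0-shot", 0.198797, 0.109733),
    ("boolq", "0-shot", 0.8263, 0.836391),
    ("triviaqa_sm_sub", "3-shot", 0.590667, 0.511333),
    ("squad", "3-shot", 0.581552, 0.58789),
    ("svamp", "5-shot", 0.68, 0.57),
    ("mmlu (average)", "5-shot", 0.668279, 0.645874),
    ("arc_challenge", "25-shot", 0.582765, 0.591297),
]


def rank_by_gap(rows):
    """Return (benchmark, shots, gptq - awq) tuples, largest GPTQ advantage first."""
    gaps = [(name, shots, gptq - awq) for name, shots, gptq, awq in rows]
    return sorted(gaps, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    for name, shots, gap in rank_by_gap(ROWS):
        print(f"{name:<18} {shots:<8} {gap:+.6f}")
```

A positive gap means the GPTQ model scored higher on that benchmark, a negative gap means the AWQ model did.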