cicdatopea committed: Update README.md
## Model Details

This model is an int4 model with group_size 128 and symmetric quantization of [falcon-three-7b]() generated by [intel/auto-round](https://github.com/intel/auto-round). Load the model with revision `` to use the AutoGPTQ format, or with revision `e9aa317` to use the AutoAWQ format.
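As a quick reference, the format-to-revision mapping above can be sketched as a small helper. This is purely illustrative and not part of the repository; the names `REVISIONS` and `revision_for` are hypothetical:

```python
# Hypothetical helper (not part of this repo): map a serialization format
# name to the repo revision documented in the model card above.
REVISIONS = {
    "autogptq": "",          # default revision serves the AutoGPTQ format
    "autoawq": "e9aa317",    # this revision serves the AutoAWQ format
}

def revision_for(fmt: str) -> str:
    """Return the `revision` value to pass to from_pretrained for a format."""
    return REVISIONS[fmt.lower()]

print(revision_for("AutoAWQ"))  # -> e9aa317
```

Passing the returned string as the `revision` argument of `from_pretrained` selects the matching branch of the repository.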
## How To Use

### INT4 Inference (CPU/HPU/CUDA)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: replace with this model's repo id or a local path.
quantized_model_dir = "<model repo id>"

tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir)
model = AutoModelForCausalLM.from_pretrained(
    quantized_model_dir,
    device_map="auto",
    # revision="",         # AutoGPTQ format
    # revision="e9aa317",  # AutoAWQ format
)
text = "How many r in strawberry? The answer is "
inputs = tokenizer(text, return_tensors="pt", return_token_type_ids=False).to(model.device)
```