cicdatopea committed: Update README.md
## Model Details

This model is an int4 model with group_size 128 and symmetric quantization of [falcon-three-7b]() generated by [intel/auto-round](https://github.com/intel/auto-round). Load the model with revision `` to use the AutoGPTQ format, or with revision `e9aa317` to use the AutoAWQ format.
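As a quick reference, the format-to-revision mapping above can be sketched as a small helper. This is purely illustrative and not part of the repository; the names `REVISIONS` and `revision_for` are hypothetical:

```python
# Hypothetical helper (not part of this repo): map a serialization format
# name to the repo revision documented in the model card above.
REVISIONS = {
    "autogptq": "",          # default revision serves the AutoGPTQ format
    "autoawq": "e9aa317",    # this revision serves the AutoAWQ format
}

def revision_for(fmt: str) -> str:
    """Return the `revision` value to pass to from_pretrained for a format."""
    return REVISIONS[fmt.lower()]

print(revision_for("AutoAWQ"))  # -> e9aa317
```

Passing the returned string as the `revision` argument of `from_pretrained` selects the matching branch of the repository.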
## How To Use

### INT4 Inference (CPU/HPU/CUDA)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: replace with this model's repo id or a local path.
quantized_model_dir = "<model repo id>"

tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir)
model = AutoModelForCausalLM.from_pretrained(
    quantized_model_dir,
    device_map="auto",
    # revision="",         # AutoGPTQ format
    # revision="e9aa317",  # AutoAWQ format
)
text = "How many r in strawberry? The answer is "
inputs = tokenizer(text, return_tensors="pt", return_token_type_ids=False).to(model.device)
```