fbaldassarri/modello-italia-9b-autoround-w4g128-cpu


fbaldassarri committed on Jun 21

Commit 43551de, 1 parent: 5492af4

Update README

Files changed (1)

README.md +7 -1


@@ -22,7 +22,13 @@ This model has been quantized in INT4, group-size 128, and optimized for inferen
 This model has been quantized using Intel [auto-round](https://github.com/intel/auto-round), based on [SignRound technique](https://arxiv.org/pdf/2309.05516v4).
 ```
-python3 ./examples/language-modeling/main.py \
+git clone https://github.com/fbaldassarri/model-conversion.git
+```
+Then,
+```
+python3 main.py \
  --model_name  ./models/sapienzanlp_modello-italia-9b \
  --device 0 \
  --group_size 128 \