Upload fixed ONNX weights #3
by Xenova
Makes the weights compatible with the Transformers.js format and ensures both files are under 2 GB.
Code used to generate them:
wget https://huggingface.co/schmuell/phi3-int4/resolve/main/onnx/decoder_model_merged.onnx
wget https://huggingface.co/schmuell/phi3-int4/resolve/main/onnx/decoder_model_merged.onnx.data
import onnx

# Load the original merged decoder model (its weights are referenced from the .data file)
with open("decoder_model_merged.onnx", "rb") as f:
    onnx_model = onnx.load(f)

# Re-save with weights stored as external data so the protobuf file stays under the 2 GB limit
file_name = 'model_q4.onnx'
onnx.save(
    onnx_model,
    file_name,
    save_as_external_data=True,
    convert_attribute=False,
    location=file_name + '_data',
    all_tensors_to_one_file=True,
    size_threshold=10_000_000,  # only tensors larger than ~10 MB are externalized
)
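As a quick sanity check (not part of the original snippet), one could verify that both output files stay under the 2 GB limit; the file names below assume the script above was run as-is:

import os

# Hypothetical verification: both files must stay below the 2 GB protobuf limit.
LIMIT_BYTES = 2 * 1024**3
for path in ("model_q4.onnx", "model_q4.onnx_data"):
    size = os.path.getsize(path)
    print(f"{path}: {size / 1024**3:.2f} GiB")
    assert size < LIMIT_BYTES, f"{path} exceeds the 2 GB limit"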
Xenova changed pull request status to merged