How to convert "Phi-3-mini-4k-instruct" to "Phi-3-mini-4k-instruct-onnx"?
It looks like "Phi-3-mini-4k-instruct-onnx" doesn't support fine-tuning, so is it possible to fine-tune "Phi-3-mini-4k-instruct" and then convert it to ONNX format?
If yes, is there a guide on how to do the conversion?
You can use ONNX Runtime GenAI's model builder to quickly convert your fine-tuned Phi-3-mini-4k-instruct
model to optimized and quantized ONNX models. This example should work for your scenario.
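For reference, a typical model builder invocation for this scenario might look like the following; the folder paths are placeholders you would replace with your fine-tuned model location and desired output directory, and -p / -e select the precision and execution provider:

```bash
python3 -m onnxruntime_genai.models.builder \
  -i path_to_fine_tuned_model \
  -o path_to_output_folder \
  -p int4 \
  -e cpu \
  -c cache_dir_to_store_temp_files
```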
I fine-tuned ‘Phi-3-mini-4k-instruct’ (using LoRA) and it works pretty well. However, when I quantize it with ONNX Runtime GenAI's model builder (I used the following command: python3 -m onnxruntime_genai.models.builder -i path_to_local_folder_on_disk -o path_to_output_folder -p int4 -e cpu -c cache_dir_to_store_temp_files) and run inference on it, the output of the quantized model is complete nonsense. Does anyone else have this problem? Or a tip on how to tackle it?
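For anyone who wants to reproduce this, a minimal inference sketch along these lines should show the garbled output. The model folder path and prompt below are placeholders, and it assumes the onnxruntime_genai Python API from around the 0.2.x releases, so the generator calls may differ slightly in other versions:

```python
import onnxruntime_genai as og

# Load the quantized ONNX model produced by the model builder
model = og.Model("path_to_output_folder")
tokenizer = og.Tokenizer(model)
tokenizer_stream = tokenizer.create_stream()

# Phi-3 chat template: wrap the user prompt in <|user|> ... <|assistant|> tags
prompt = "<|user|>\nWhat is the capital of France? <|end|>\n<|assistant|>"
input_tokens = tokenizer.encode(prompt)

params = og.GeneratorParams(model)
params.set_search_options(max_length=256)
params.input_ids = input_tokens

# Generate token by token and stream the decoded text
generator = og.Generator(model, params)
while not generator.is_done():
    generator.compute_logits()
    generator.generate_next_token()
    new_token = generator.get_next_tokens()[0]
    print(tokenizer_stream.decode(new_token), end="", flush=True)
print()
```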
The model builder does not currently support LoRA, but support will be coming soon.
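In the meantime, one possible workaround (not verified in this thread) is to merge the LoRA adapter into the base model weights first and then point the model builder at the merged checkpoint. A minimal sketch, assuming a standard Hugging Face PEFT LoRA setup with the adapter saved locally; the adapter and output folder names are hypothetical:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "microsoft/Phi-3-mini-4k-instruct"
adapter_dir = "path_to_lora_adapter"   # hypothetical: your LoRA training output folder
merged_dir = "path_to_merged_model"    # hypothetical: folder the model builder will read

# Load the base model and attach the LoRA adapter
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", trust_remote_code=True)
model = PeftModel.from_pretrained(base, adapter_dir)

# Fold the adapter weights into the base weights so the result is a plain HF checkpoint
merged = model.merge_and_unload()
merged.save_pretrained(merged_dir)
AutoTokenizer.from_pretrained(base_id).save_pretrained(merged_dir)
```

The merged folder can then be passed to -i in the model builder command above instead of the raw LoRA output.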