
This repo contains a current version of Phi-3 quantized to AWQ using AutoAWQ. Hosting it via the TGI Docker image currently fails because TGI falls back to AutoModel, which is not compatible with AWQ. Hosting on vLLM is recommended instead (but see the warning below).
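
For reference, here is a minimal sketch of how a checkpoint is typically quantized with AutoAWQ. The base model id and the quantization settings shown are illustrative assumptions, not a record of the exact configuration used for this repo:

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

# Assumption: quantizing from the Phi-3 mini instruct checkpoint.
model_path = "microsoft/Phi-3-mini-4k-instruct"
quant_path = "phi3-awq"

# Common AutoAWQ defaults: 4-bit weights, group size 128.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Run AWQ calibration and quantize the weights.
model.quantize(tokenizer, quant_config=quant_config)

model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```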

To run the model you need to set the trust-remote-code flag (or its equivalent in your serving stack). Although the remote code comes from Microsoft (see the LICENSE information in the file), you should validate it yourself before deployment.
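
For example, a minimal sketch of loading the model with transformers (assumes the autoawq package is installed and a CUDA device is available; the prompt is purely illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jsincn/phi3-awq-test"

# trust_remote_code=True executes the modeling code shipped with the repo;
# review that code before enabling the flag in a deployment.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    device_map="auto",
)

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```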

!!!! Warning: hosting on vLLM is currently not supported, because Microsoft changed the rope scaling type in the original model !!!!
