Phi-3 MoE mini 4k instruct raw GGUF

This is a GGUF version of https://huggingface.co/PhilipMay/Phi-3-mini-4k-instruct-LLaMAfied-8xMoE-raw

The source model is an 8x MoE version of microsoft/Phi-3-mini-4k-instruct. It is based on Gan Feng's LLaMAfied version, vonjack/Phi-3-mini-4k-instruct-LLaMAfied.

It was created with the help of mergekit.
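
For reference, mergekit's MoE mode is driven by a small YAML config passed to the mergekit-moe CLI. Below is a hypothetical sketch of an 8-expert merge; the gate_mode, expert list, and output directory are illustrative assumptions, not the actual configuration used for this model.

```python
# Hypothetical sketch of an 8-expert MoE merge with mergekit-moe.
# All model names and the gate_mode choice are assumptions for
# illustration; the real merge config is not reproduced here.
import pathlib
import subprocess

config_lines = [
    "base_model: vonjack/Phi-3-mini-4k-instruct-LLaMAfied",
    "gate_mode: random   # untrained router weights (assumed meaning of '-raw')",
    "dtype: bfloat16",
    "experts:",
]
# Eight copies of the same expert, purely for illustration.
config_lines += ["  - source_model: vonjack/Phi-3-mini-4k-instruct-LLaMAfied"] * 8

pathlib.Path("moe-config.yml").write_text("\n".join(config_lines) + "\n")

# mergekit-moe <config> <output dir> is the documented CLI form.
subprocess.run(["mergekit-moe", "moe-config.yml", "Phi-3-8xMoE-raw"], check=True)
```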

I have included the gguf-imat.py script and the imatrix\imatrix.txt configuration used for the conversion. The script is based on FantasiaFoundry/GGUF-Quantization-Script, tweaked to pad the vocabulary so that conversion works with this model.
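
For anyone reproducing this without the bundled script, here is a minimal sketch of the llama.cpp imatrix quantization flow it automates. The binary names (llama-imatrix, llama-quantize) match current llama.cpp releases, but the file names and quantization target are placeholders, and the vocab-padding tweak happens earlier, in the HF-to-GGUF conversion step.

```python
# Minimal sketch of imatrix-based quantization with llama.cpp tools.
# Paths are placeholders; F16_GGUF is assumed to be the output of the
# (vocab-padded) HF-to-GGUF conversion step described above.
import subprocess

F16_GGUF = "phi-3-moe-raw-f16.gguf"
CALIBRATION = r"imatrix\imatrix.txt"  # calibration text shipped in this repo

# 1. Compute an importance matrix from the calibration text.
subprocess.run(
    ["llama-imatrix", "-m", F16_GGUF, "-f", CALIBRATION, "-o", "imatrix.dat"],
    check=True,
)

# 2. Quantize using the importance matrix (Q4_K_M as an example target).
subprocess.run(
    ["llama-quantize", "--imatrix", "imatrix.dat",
     F16_GGUF, "phi-3-moe-raw-Q4_K_M.gguf", "Q4_K_M"],
    check=True,
)
```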

This model has been tested and works with LLamaSharp, so it should be compatible with any llama.cpp-based solution.
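
As a quick smoke test from another llama.cpp-based runtime, here is a minimal llama-cpp-python example; the GGUF file name is a placeholder for whichever quantization you downloaded.

```python
# Load a downloaded quantization and run one chat turn via
# llama-cpp-python; the model_path is a placeholder.
from llama_cpp import Llama

llm = Llama(model_path="phi-3-moe-raw-Q4_K_M.gguf", n_ctx=4096)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```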

Model size: 20.7B params
Architecture: llama
Available GGUF quantizations: 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, and 16-bit
