GGUF Inference API

Currently running zephyr-7B-alpha(Q4_K_M)

llama-cpp-python