GGUF Inference API
Currently running zephyr-7B-alpha(Q4_K_M)
llama-cpp-python
/docs