GGML format

#1
by s3nh - opened

Hi, thanks for making this architecture available, it's really cool. I made a GGML conversion (q4-q8 quantizations) and it runs pretty solidly even on CPU inference.
https://huggingface.co/s3nh/StableBeluga-13B-GGML
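
For anyone wanting to try it, here is a minimal sketch of CPU inference with one of the quantized files via llama-cpp-python (a version that still supports GGML, i.e. before the GGUF switch). The exact filename and generation parameters are assumptions; check the repo for the actual file names.

```python
# Minimal sketch: CPU inference on a GGML quantized model with llama-cpp-python.
# Assumes a llama-cpp-python release that still loads GGML files (pre-GGUF).
from llama_cpp import Llama

llm = Llama(
    model_path="StableBeluga-13B.ggmlv3.q4_0.bin",  # hypothetical filename, see repo
    n_ctx=2048,   # context window
    n_threads=8,  # CPU threads to use
)

prompt = "### User:\nWhat is GGML?\n\n### Assistant:\n"
output = llm(prompt, max_tokens=128, stop=["### User:"])
print(output["choices"][0]["text"])
```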