Add quantized GGUFs?
#2
by
MoonRide
- opened
Could you please add GGUFs for the F16, Q8_0, Q6_K, Q5_K_M, and Q4_K_M quants? That way people won't have to download this F32 monster GGUF and quantize it themselves.
UPDATE: Okay, I've found some quants here: https://huggingface.co/ggml-org. It would still be nice to have them from the original source, though, and also tested that the model works as intended after conversion (I've encountered multiple GGUF models with tokenizer issues caused by a missing added_tokens.json during conversion, for example).
You can also download the float16 from the correct revision (and then convert it) if you don't want to download the big one for now. Will make sure to upload smaller ones next time! added_tokens.json is deprecated and only kept for legacy reasons.
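For anyone landing here before official quants are uploaded, the do-it-yourself conversion discussed above can be done with llama.cpp's `llama-quantize` tool. A minimal sketch; the repo and file names below are placeholders, not the actual ones for this model:

```shell
# Download the full-precision GGUF (placeholder repo/file names;
# substitute the real ones for the model in question).
huggingface-cli download some-org/some-model-gguf model-f32.gguf --local-dir .

# llama.cpp's quantize tool: input GGUF, output GGUF, quant type.
./llama-quantize model-f32.gguf model-Q4_K_M.gguf Q4_K_M

# Quick sanity check that the quantized model still generates sensibly
# (helps catch tokenizer/conversion issues like the ones mentioned above).
./llama-cli -m model-Q4_K_M.gguf -p "Hello" -n 16
```

The same `llama-quantize` invocation works for the other quant types (Q8_0, Q6_K, Q5_K_M); only the type argument and output name change.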
MoonRide
changed discussion status to
closed