Any chance of an int4 or quantised version?

#3
by smcleod

I'd try to create one myself, but the bf16/fp16 weights are too big for me to process 🤣

@smcleod maybe you have the ability to quantize it, like https://huggingface.co/inarikami/DeepSeek-V3-int4-TensorRT ?
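For anyone wanting to attempt this, here is a minimal sketch of on-the-fly 4-bit (NF4) quantization via the transformers/bitsandbytes integration. The model ID is an assumption for illustration, and note this still requires enough memory to stream the full-precision shards, which is exactly the problem mentioned above:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# NF4 4-bit quantization; bf16 compute keeps the matmuls reasonably accurate.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,  # also quantize the quantization scales
)

model_id = "deepseek-ai/DeepSeek-V3"  # assumption: the bf16 repo being discussed
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # shard across available GPUs and CPU RAM
    trust_remote_code=True,
)
```

This quantizes at load time rather than producing a standalone int4 checkpoint like the TensorRT repo linked above; for a shareable artifact you would save the quantized model afterwards or use a dedicated toolchain.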

You have to be crazy/desperate to want any quant lower than int4. For a model to be comparable at all to the non-quantized version, even int4 is something I would personally call too low...
