Any chance of an int4 or quantised version?

#3
by smcleod

I'd try to create one myself, but the bf16/fp16 weights are too big for me to process 🤣

@smcleod maybe you have the ability to quantize it, like https://huggingface.co/inarikami/DeepSeek-V3-int4-TensorRT ?
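For anyone wanting to attempt this, here is a minimal sketch of on-the-fly 4-bit (NF4) quantization via the transformers/bitsandbytes integration. The model ID is an assumption for illustration, and note this still requires enough memory to stream the full-precision shards, which is exactly the problem mentioned above:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# NF4 4-bit quantization; bf16 compute keeps the matmuls reasonably accurate.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,  # also quantize the quantization scales
)

model_id = "deepseek-ai/DeepSeek-V3"  # assumption: the bf16 repo being discussed
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # shard across available GPUs and CPU RAM
    trust_remote_code=True,
)
```

This quantizes at load time rather than producing a standalone int4 checkpoint like the TensorRT repo linked above; for a shareable artifact you would save the quantized model afterwards or use a dedicated toolchain.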

You have to be crazy/desperate to want any quant lower than int4. For a model to be comparable at all to the non-quantized version, even int4 is something I would personally call too low...
