GPTQ or GGUF
Any chance of a GPTQ or GGUF conversion for this?
llama.cpp has issues with Llama 3.1, so everybody is currently waiting for those to be fixed before quantizing. With luck, it will be fixed in a day.
That is true for the 8B and 70B versions of this model, but the 12B version is based on Mistral Nemo, not Llama 3.1, and Nemo is already fully supported by llama.cpp. The same goes for the 123B version, which is based on Mistral Large.
You're right, the Mistral quants can be done; I'll probably do some today.
L3.1 will need to wait.
Thank you, I really appreciate all the work you guys have put into this. I'm looking forward to trying this out. 👍
It looks quite promising, especially given how good Nemo is to start with.
I look forward to the release. I really hope you can also publish recommended sampler settings.
L3.1 support got merged, so I'm going to do some static quants for the 4 models.
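For anyone who wants to roll their own quants in the meantime, here is a minimal sketch of the usual llama.cpp convert-then-quantize flow, assuming it is run from a llama.cpp checkout. The model directory and output filenames are placeholders, and the exact script/binary names (convert_hf_to_gguf.py, llama-quantize) have changed between llama.cpp versions, so check your checkout.

```python
# Minimal sketch of a static GGUF quant with llama.cpp, run from a
# llama.cpp checkout. MODEL_DIR is a placeholder for a local copy of
# the Hugging Face repo, not a specific model from this thread.
import subprocess

MODEL_DIR = "path/to/hf-model"     # hypothetical local HF model directory
F16_GGUF = "model-f16.gguf"
QUANT_GGUF = "model-Q4_K_M.gguf"

# Step 1: convert the Hugging Face checkpoint to a full-precision GGUF.
subprocess.run(
    ["python", "convert_hf_to_gguf.py", MODEL_DIR,
     "--outfile", F16_GGUF, "--outtype", "f16"],
    check=True,
)

# Step 2: produce a static quant (no importance matrix) from the f16 GGUF.
subprocess.run(
    ["./llama-quantize", F16_GGUF, QUANT_GGUF, "Q4_K_M"],
    check=True,
)
```

Q4_K_M is just one common quant type; running llama-quantize without arguments prints the full list of supported types.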