Word of thanks and suggestions for more 120B models
Hello there!
I recently found your collection and I love it! Thanks for all your great work.
Might I ask you to consider adding the new IQ2_XS (2-bit) quants for Goliath-120B and Tess-XL-v1.0-120B? These models seem to be even better, judging by both benchmarks and anecdotal evidence posted by others, and 120B models also appear more resistant to the quality loss from 2-bit quantization. Thanks in advance and keep up the good work :)
I wasn't planning to make many more models. I'm currently trying to upload a miqu 120b model, and I also want to do bigweave, but that's pretty much it. There are now others doing this kind of quantization too :)
Okay, that's fine. Do you create these quants locally or in the cloud? And could you tell me how (and on what data) you generated the imatrix files? :)
On a local workstation, CPU only, with an i9 that's already 3 years old. I don't think the RAM requirements are high, though. With 100 chunks, computing an imatrix on the 20k random words dataset takes anywhere from 10 hours up to a day, depending on model size; the quantization itself takes 1-2 hours.
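For reference, that workflow looks roughly like this with llama.cpp's imatrix and quantize tools. This is a minimal sketch, not the exact commands used here: the file names are placeholders, and binary names and flags can differ between llama.cpp versions.

```sh
# Compute the importance matrix from a calibration text file,
# processing only the first 100 chunks of the dataset.
./imatrix -m goliath-120b-f16.gguf \
          -f 20k_random_data.txt \
          --chunks 100 \
          -o goliath-120b.imatrix

# Quantize the f16 model down to IQ2_XS, using the imatrix so the
# quantizer preserves the weights that matter most for activations.
./quantize --imatrix goliath-120b.imatrix \
           goliath-120b-f16.gguf \
           goliath-120b-IQ2_XS.gguf IQ2_XS
```

The imatrix pass is the expensive part since it runs full inference over the calibration data on CPU, which matches the 10-hours-to-a-day timing above; the quantize pass only rewrites the weights.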
I simply took the file 20k_random_data.txt from this discussion.
Appreciated, thanks!