Much appreciated
#1
by
sometimesanotion
- opened
While you clearly have this automated, thank you; it's appreciated! My quantization of the model has stayed between q3_k_m and q4_k_m, with some experiments in keeping output tensors as high as q6_k. I appreciate the expert work here.
Thanks for your kind words :) Model selection is still done manually, but yes, it's automated from then on, except when things go wrong (we even have a status display at http://hf.tst.eu/status.html). And the expert work here mostly boils down to using the defaults of llama.cpp, since there are essentially no useful tunable knobs anymore. But making an imatrix and lots of quants is indeed a lot of busywork if it's not automated.
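For readers curious what the imatrix-and-quants busywork looks like, here is a minimal sketch using llama.cpp's command-line tools. The file names (`model-f16.gguf`, `calibration.txt`) are hypothetical placeholders, not the actual pipeline used here:

```shell
# Compute an importance matrix from a calibration text file
# (file names are placeholders for illustration).
./llama-imatrix -m model-f16.gguf -f calibration.txt -o imatrix.dat

# Quantize with the imatrix, relying on llama.cpp defaults otherwise;
# --output-tensor-type keeps the output tensor at higher precision,
# as in the q6_k experiments mentioned above.
./llama-quantize --imatrix imatrix.dat --output-tensor-type q6_K \
    model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```

Repeating the second step once per quant type is what makes automation worthwhile.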
mradermacher
changed discussion status to
closed