Performance comparison of the new checkpoints

#8 opened by Davidliudev

Unlike what the author's notes suggest, for me the checkpoint with a group size looks worse than the one without, given the same parameter setup. Not sure if I got something wrong.
Does anyone know the reason?

I'm also curious whether the new model's quality is better or worse than the previous .pt model.

In the evaluations, lower = better, so the group size 128 checkpoint is marginally better than the un-grouped version.
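For context, "evaluations" here presumably means a perplexity-style score on held-out text, where lower is better. A minimal sketch of such a check, assuming a transformers-style causal LM (this is not the exact eval harness behind the posted numbers):

```python
import torch

@torch.no_grad()
def perplexity(model, input_ids: torch.Tensor) -> float:
    # For causal LMs, passing labels=input_ids yields the mean
    # cross-entropy loss; perplexity is just its exponential.
    out = model(input_ids=input_ids, labels=input_ids)
    return torch.exp(out.loss).item()
```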

Or are you referring to actual inference results on your end?

Also, both the grouped and un-grouped checkpoints should be better than the original model due to the implementation of true sequential quantization. Unfortunately, I did not save the original evals, so I can't provide a comparison.
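For intuition, here is a minimal sketch of what "true sequential" quantization means, with a toy placeholder standing in for real GPTQ calibration (`toy_quantize_layer` below just rounds weights and is purely illustrative):

```python
import torch

def toy_quantize_layer(layer: torch.nn.Linear, calib_x: torch.Tensor) -> None:
    # Placeholder for real GPTQ calibration, which solves for quantized
    # weights using the calibration activations; here we just round coarsely.
    layer.weight.data = torch.round(layer.weight.data * 8) / 8

def true_sequential_quantize(layers, calib_x: torch.Tensor):
    # "True sequential": quantize layers in order, feeding each layer the
    # activations produced by the already-quantized layers before it, so
    # later layers get calibrated against the accumulated quantization error.
    x = calib_x
    for layer in layers:
        toy_quantize_layer(layer, x)
        x = layer(x)
    return layers

layers = [torch.nn.Linear(16, 16) for _ in range(3)]
true_sequential_quantize(layers, torch.randn(4, 16))
```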

Thanks. It turns out max new tokens affected the quality: I accidentally set it to 2000 and the results were garbled.
I changed it to 200 for both models and inference seems fine now.
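For reference, a generic transformers-style generation call with the cap applied; the model path is a placeholder and the actual loading path for these GPTQ checkpoints may differ:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# "path/to/model" is a placeholder, not the real checkpoint id.
tok = AutoTokenizer.from_pretrained("path/to/model")
model = AutoModelForCausalLM.from_pretrained("path/to/model", device_map="auto")

inputs = tok("The quick brown fox", return_tensors="pt").to(model.device)
# Capping max_new_tokens (e.g. 200 instead of 2000) bounds both the
# generation length and the KV-cache growth that consumes VRAM.
out = model.generate(**inputs, max_new_tokens=200)
print(tok.decode(out[0], skip_special_tokens=True))
```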

Another thing I noticed: with the un-grouped and the original model I never get an OOM error (I use a 4090 with 24 GB of VRAM), but with the grouped one I sometimes do.
Maybe it eats more VRAM...
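One way to check is to measure peak allocation around a generation call; a minimal sketch, assuming a CUDA build of PyTorch:

```python
import torch

# Compare the grouped and un-grouped checkpoints by peak VRAM usage.
torch.cuda.empty_cache()
torch.cuda.reset_peak_memory_stats()

# ... run model.generate(...) here ...

peak_gib = torch.cuda.max_memory_allocated() / 1024**3
print(f"peak VRAM allocated: {peak_gib:.2f} GiB")
```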

Anyway, thanks for your clarification, elinas.

Yes. As mentioned in the README (and reflected in the file size itself), it will use 1 GB more VRAM by default, so 18 GB without any context.

elinas changed discussion status to closed
