DavidAU/L3-Dark-Planet-8B-GGUF

Utochi

Dec 21, 2024

Just a simple question: what's the difference between the two types in the files?

Utochi

Dec 21, 2024

In simpler terms than what's in the model card, please... for us kinda clueless people.

DavidAU

Owner Dec 22, 2024

The "max cpu" versions offload part of the model onto the CPU; this results in less VRAM usage, but also fewer tokens per second.
This optional quant also uses the CPU for "math", which is slightly more accurate than "GPU" (video card) math.
The result is slightly better instruction following and output generation.
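
To make the VRAM trade-off concrete, here is a rough back-of-the-envelope sketch. It is not how any loader actually accounts for memory: it assumes every layer is the same size and ignores the context cache and other overhead. The function name `split_memory` and the 8 GiB file size are hypothetical, chosen only for illustration.

```python
def split_memory(model_gib: float, n_layers: int, gpu_layers: int) -> tuple[float, float]:
    """Estimate (vram_gib, ram_gib) for a model split between GPU and CPU.

    Assumption: all layers are equally sized, and nothing else
    (context cache, scratch buffers) takes up memory.
    """
    if not 0 <= gpu_layers <= n_layers:
        raise ValueError("gpu_layers must be between 0 and n_layers")
    per_layer = model_gib / n_layers
    vram = per_layer * gpu_layers
    return vram, model_gib - vram


# Hypothetical numbers: an 8 GiB file with 32 layers.
for gpu_layers in (32, 24, 16):
    vram, ram = split_memory(8.0, 32, gpu_layers)
    print(f"{gpu_layers} layers on GPU: ~{vram:.1f} GiB VRAM, ~{ram:.1f} GiB system RAM")
```

The fewer layers that sit on the video card, the less VRAM is needed, but whatever moves to system RAM is processed by the CPU, which is where the speed drop comes from.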

For creative usage:
This results in greater nuance, as well as stronger connections between concepts, details, character and "world".

For problem solving:
Greater chance the model will both understand your problem better and craft a better, more accurate answer.

Max cpu/D_AU?