GGUF models struggle with the ChatML format in LM Studio for me

#3
by tculler91 - opened

Just curious if anyone else is experiencing issues using ChatML format with this model?

Despite updating LM Studio and trying different GGUF uploads, this model in 8-bit GGUF generates random nonsense in ChatML mode.
For me, it only produces normal text after switching to the Llama-3 format.

The same settings and ChatML format work fine when I switch to Hermes 2 Pro, though, which makes me wonder if this is a bug.
This is the first time I've experienced this with a Dolphin model.
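
For reference, the ChatML format these models advertise wraps each turn like this (the system and user text here are just placeholders):

```
<|im_start|>system
You are Dolphin, a helpful AI assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
```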

[Screenshot: dolphin 2.9.1.png]

Cognitive Computations org

Yeah, we're investigating this. You can use the original Llama-3 template in LM Studio for now.
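
For anyone comparing, the stock Llama-3 instruct template looks roughly like this (the text in braces is a placeholder):

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{user message}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

```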

It seems to be affecting more than just this model; a few other models I've tried that are trained on ChatML can't handle it once quantized down, but they do fine in bf16. New model quirks 😶‍🌫️
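
If anyone wants to check whether the ChatML markers survived quantization, here's a minimal sketch using llama-cpp-python (the GGUF filename below is just an example, not a real download):

```python
# Minimal sketch, assuming llama-cpp-python is installed; the model
# path is a placeholder, swap in your own quantized GGUF.
from llama_cpp import Llama

# vocab_only=True loads just the tokenizer/vocab, not the weights.
llm = Llama(model_path="dolphin-2.9.1-llama-3-8b.Q8_0.gguf", vocab_only=True)

for marker in ("<|im_start|>", "<|im_end|>"):
    ids = llm.tokenize(marker.encode("utf-8"), add_bos=False, special=True)
    # A healthy ChatML vocab maps each marker to a single token id;
    # if it splits into several ids, ChatML prompts will misbehave.
    status = "ok (single token)" if len(ids) == 1 else "BROKEN (split into pieces)"
    print(f"{marker} -> {ids} {status}")

# Also worth checking which token the GGUF declares as EOS:
print("eos token id:", llm.token_eos())
```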

@saishf I concur.

Kearm changed discussion status to closed
