<0x0A>

#7
by Omidh - opened

Not sure what I am doing wrong, but while running the model with Ollama I get a lot of <0x0A>'s in the output.
Would appreciate any hint.

Right now I replace <0x0A> with \n in my code, but is this a config error or an inconsistency in the training data?
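For reference, a minimal sketch of the string replacement described above, assuming the raw Ollama response is already available as a Python string; the function name is illustrative, not part of any library:

```python
def clean_output(text: str) -> str:
    # Workaround: replace the literal "<0x0A>" byte-token text
    # that leaks into the output with an actual newline.
    return text.replace("<0x0A>", "\n")

print(clean_output("Hello<0x0A>World"))  # -> "Hello\nWorld"
```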

Take a look at the vocabulary in the tokenizer. That hex code is one of the byte tokens near the start of the model vocabulary:

https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0/blob/main/tokenizer.json

So maybe your output is biased toward returning that token ID due to a coding error or some other issue. You can check what the token maps to, as in the sketch below.
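A minimal sketch of that check, assuming the `transformers` library is installed; it looks up "<0x0A>" in the TinyLlama tokenizer and confirms it decodes to a newline (0x0A is the byte value of \n):

```python
from transformers import AutoTokenizer

# Load the tokenizer from the repo linked above.
tok = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

# "<0x0A>" is a byte-fallback token in Llama-style vocabularies.
token_id = tok.convert_tokens_to_ids("<0x0A>")
print("<0x0A> token id:", token_id)

# Decoding that single ID should yield a plain newline character.
print(repr(tok.decode([token_id])))
```

If decoding the ID gives a plain "\n", the token itself is fine, and the literal "<0x0A>" text appearing in the output points to how the tokens are being detokenized or rendered on the serving side rather than to the training data.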
