Add the special token '<|im_end|>' to the tokenizer to fix non-stop generation when the model emits <|im_end|>.

#16

When using vLLM to run inference with 'Llama3-ChatQA-1.5-8B', generation continues past the special token '<|im_end|>', as shown in the figure below. This PR adds <|im_end|> to the tokenizer; the corresponding mapping also needs to be added to generation_config.json.

(Screenshot: 8e4f01f676a0de25c1412b10172cfa9.png)
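
A minimal sketch of what this change amounts to, using the transformers API. The model ID, and the exact eos_token_id list written to generation_config.json, are assumptions based on this discussion rather than the merged files:

```python
from transformers import AutoTokenizer, GenerationConfig

# Assumed repo ID for this model; adjust if loading from a local path.
model_id = "nvidia/Llama3-ChatQA-1.5-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Register <|im_end|> as an additional special token (the tokenizer-side change in this PR).
tokenizer.add_special_tokens({"additional_special_tokens": ["<|im_end|>"]})
im_end_id = tokenizer.convert_tokens_to_ids("<|im_end|>")

# Mirror the mapping in generation_config.json so generation stops on either
# the original EOS token or <|im_end|>.
gen_config = GenerationConfig.from_pretrained(model_id)
gen_config.eos_token_id = [tokenizer.eos_token_id, im_end_id]

# Save the patched tokenizer and generation config to a local directory.
tokenizer.save_pretrained("Llama3-ChatQA-1.5-8B-patched")
gen_config.save_pretrained("Llama3-ChatQA-1.5-8B-patched")
```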

@zjyhf To be clear, are you saying this model incorrectly maps token id 128010 to the string "<|reserved_special_token_5|>"? If the mapping is correct, you can instead use vLLM's "stop" parameter to pass extra strings to stop on, in addition to EOS.
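
A minimal sketch of that suggestion, assuming the same model ID as above; the prompt here is only a placeholder, not the ChatQA prompt format:

```python
from vllm import LLM, SamplingParams

llm = LLM(model="nvidia/Llama3-ChatQA-1.5-8B")

sampling_params = SamplingParams(
    max_tokens=256,
    # Stop on the extra string in addition to the model's normal EOS token,
    # without modifying the tokenizer or generation_config.json.
    stop=["<|im_end|>"],
)

outputs = llm.generate(["What is the capital of France?"], sampling_params)
print(outputs[0].outputs[0].text)
```

This sidesteps the repo change, but it has to be repeated by every caller, whereas fixing the tokenizer and generation_config.json makes the default behavior correct.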

