Hello world example gives wrong output
#7 · opened by AIRDGempoll
Hi, I am running the hello-world example from your docs, but I think the output is wrong.
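For reference, the cells before this snippet did the setup along these lines (a sketch assuming the reader-lm-0.5b checkpoint from the README; adjust if you load a different one):

# Setup from the earlier cells (sketch). The checkpoint name is an
# assumption; substitute whichever reader-lm model you actually load.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "jina-ai/reader-lm-0.5b"  # assumed
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)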
# in ipython
html_content = "<html><body><h1>Hello, world!</h1></body></html>"

# wrap the HTML in a chat message and render the prompt with the chat template
messages = [{"role": "user", "content": html_content}]
input_text = tokenizer.apply_chat_template(messages, tokenize=False)

print(input_text)

# greedy decoding (do_sample=False); temperature=0 is ignored, see warnings below
inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
outputs = model.generate(inputs, max_new_tokens=1024, temperature=0, do_sample=False, repetition_penalty=1.08)

print(tokenizer.decode(outputs[0]))
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
<html><body><h1>Hello, world!</h1></body></html><|im_end|>
/home/ai/mambaforge/envs/readerlm/lib/python3.12/site-packages/transformers/generation/configuration_utils.py:567: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`.
warnings.warn(
/home/ai/mambaforge/envs/readerlm/lib/python3.12/site-packages/transformers/generation/configuration_utils.py:572: UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.8` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`.
warnings.warn(
/home/ai/mambaforge/envs/readerlm/lib/python3.12/site-packages/transformers/generation/configuration_utils.py:589: UserWarning: `do_sample` is set to `False`. However, `top_k` is set to `20` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_k`.
warnings.warn(
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
<html><body><h1>Hello, world!</h1></body></html><|im_end|>
<|im_start|>assistant
Hello, world!<|im_end|>
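(The UserWarnings above only mean that temperature, top_p and top_k are ignored under greedy decoding; the top_p/top_k values presumably come from the model's default generation_config rather than from my call. A call like the sketch below drops the unused temperature flag and should produce the same greedy output:)

outputs = model.generate(
    inputs,
    max_new_tokens=1024,
    do_sample=False,          # greedy decoding; sampling flags are unused
    repetition_penalty=1.08,
)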
I am expecting the assistant's output to be "# Hello, world!" (a Markdown heading), not the plain "Hello, world!" shown above.
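To make the comparison unambiguous, decoding only the newly generated tokens isolates the assistant's answer (a minimal sketch continuing from the snippet above):

# Decode only the tokens generated after the prompt, so the answer can be
# checked directly against the expected Markdown "# Hello, world!".
generated = outputs[0][inputs.shape[-1]:]
answer = tokenizer.decode(generated, skip_special_tokens=True)
print(answer)  # expected to contain "# Hello, world!"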