Missing 'chat_template' in tokenizer_config.json leads to incorrect generation.
#4, opened by devymex
Fix:
...
"add_prefix_space": null,
"chat_template": "{% set loop_messages = messages %}{% for message in loop_messages %}{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %}{% if loop.index0 == 0 %}{% set content = bos_token + content %}{% endif %}{{ content }}{% endfor %}{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}",
...
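If you would rather not edit tokenizer_config.json on disk, the same template can also be assigned to the loaded tokenizer at runtime before calling apply_chat_template. A minimal sketch (the Jinja string is the same one shown above):

from transformers import LlavaNextProcessor

processor = LlavaNextProcessor.from_pretrained('llava-hf/llava-v1.6-vicuna-7b-hf')
# Assign the Jinja template in memory only; nothing is written back to disk.
processor.tokenizer.chat_template = (
    "{% set loop_messages = messages %}"
    "{% for message in loop_messages %}"
    "{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'"
    " + message['content'] | trim + '<|eot_id|>' %}"
    "{% if loop.index0 == 0 %}{% set content = bos_token + content %}{% endif %}"
    "{{ content }}"
    "{% endfor %}"
    "{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}"
)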
With the above fix, the following code works fine.
import torch
from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration
from PIL import Image

model_dir = 'llava-hf/llava-v1.6-vicuna-7b-hf'
device_name = 'cpu'  # or 'cuda:0'
device = torch.device(device_name)

processor = LlavaNextProcessor.from_pretrained(model_dir)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_dir, torch_dtype=torch.bfloat16, low_cpu_mem_usage=True)
model.to(device)

image1 = Image.open('images/test.jpg')
conversation = [{
    "role": "user",
    "content": "<image>\nWhat's in the image?"
}]
# Render the prompt with the chat template added above.
text_prompt = processor.tokenizer.apply_chat_template(
    conversation,
    tokenize=False,
    add_generation_prompt=True
)
inputs = processor(text=text_prompt, images=[image1], return_tensors='pt').to(device)
output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0], skip_special_tokens=False))
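For reference, the rendered prompt can be checked directly. A small sanity check, assuming the Vicuna tokenizer's bos_token is '<s>' (the template prepends bos_token to the first message):

expected_prompt = (
    "<s><|start_header_id|>user<|end_header_id|>\n\n"
    "<image>\nWhat's in the image?<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)
assert text_prompt == expected_prompt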
Hi, I did this as you suggested, but it still doesn't work.
Judge Input: <s><|start_header_id|>user<|end_header_id|>
Inpainted image: <image>
Masked image: <image>
Caption: a black and red mountain bike parked on the side of a building<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Judge Output: <|start_header_id|>user<|end_header_id|>
Inpainted image:
Masked image:
Caption: a black and red mountain bike parked on the side of a building<|eot_id|><|start_header_id|>assistant<|end_header_id|>
The code is:
# judge
judge_prompt = f'Inpainted image: <image>\nMasked image: <image>\nCaption: {caption}'
judge_chat_history.append({
    'role': 'user',
    'content': judge_prompt
})
judge_input = llava_processor.tokenizer.apply_chat_template(judge_chat_history, tokenize=False)
print(f'Judge Input: {judge_input}')
images = [image, mask_image]
judge_input = llava_processor(text=judge_input, images=images, return_tensors='pt').to(llava_judge.device)
judge_output_id = llava_judge.generate(**judge_input, max_new_tokens=100)
judge_output = llava_processor.decode(judge_output_id[0], skip_special_tokens=True)
print(f'Judge Output: {judge_output}')
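As a side note, the Judge Output above echoes the prompt because generate() returns the prompt tokens followed by the newly generated tokens. A small optional tweak to the snippet above, decoding only the continuation, makes the judge's reply easier to inspect:

prompt_len = judge_input['input_ids'].shape[1]
# Keep only the tokens generated after the prompt before decoding.
judge_reply = llava_processor.decode(judge_output_id[0][prompt_len:], skip_special_tokens=True)
print(f'Judge Reply: {judge_reply}')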
Thanks for opening an issue. I will soon add chat templates to the LLaVa configuration files for easier formatting. In the meantime, your solution is a possible workaround.
I happened to find this amazing repo: https://github.com/chujiezheng/chat_templates/tree/main
I have tried the Vicuna version and it seems to work well.
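In case it helps others, the templates in that repo are plain Jinja files, so one way to use them is to read the file and attach it to the tokenizer. A minimal sketch, assuming the repo has been cloned locally and the Vicuna template lives at chat_templates/chat_templates/vicuna.jinja (the exact path and file name are assumptions; check the repo layout):

from transformers import LlavaNextProcessor

processor = LlavaNextProcessor.from_pretrained('llava-hf/llava-v1.6-vicuna-7b-hf')
# Path is an assumption -- adjust to the actual layout of the cloned repo.
with open('chat_templates/chat_templates/vicuna.jinja') as f:
    processor.tokenizer.chat_template = f.read()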