Do we need BOS token before each turn of chat during finetuning?
Many thanks for the models. I'm confused about how to prepare multi-turn chat data for finetuning.
In your README, there is an example of chat model usage:
You are an AI programming assistant, utilizing the DeepSeek Coder model, developed by DeepSeek Company, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer.
### Instruction:
['content']
### Response:
['content']
<|EOT|>
### Instruction:
['content']
### Response:
ref: https://github.com/deepseek-ai/deepseek-coder#3-chat-model-inference
In this example, there is no BOS token at the front of the second turn.
However, in Llama's chat template, a BOS token is added to the front of each user turn:
"{% if message['role'] == 'user' %}"
"{{ bos_token + '[INST] ' + content.strip() + ' [/INST]' }}"
If I want to finetune the deepseek-ai/deepseek-coder-6.7b-instruct model, do I need to insert a BOS token before every turn of the dialog, as Llama does?
Or can I just follow the inference example in your README (i.e., no BOS before each turn)?
I've checked the finetuning script in the repo, but it only provides a single-turn example.
ref: https://github.com/deepseek-ai/DeepSeek-Coder/blob/791c8e2c2c5f89032041010efa60776eb4306d58/finetune/finetune_deepseekcoder.py#L16
Just follow the inference example in the README; there is no need to add a BOS token before each turn.
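For example, here is a minimal sketch of how multi-turn samples could be concatenated in that format; the helper and data layout are illustrative, not taken from finetune_deepseekcoder.py. The system prompt appears once at the start, every response is followed by <|EOT|>, and no BOS is inserted in front of later turns:

```python
SYSTEM_PROMPT = (
    "You are an AI programming assistant, utilizing the DeepSeek Coder model, "
    "developed by DeepSeek Company, and you only answer questions related to "
    "computer science. For politically sensitive questions, security and privacy "
    "issues, and other non-computer science questions, you will refuse to answer."
)
EOT_TOKEN = "<|EOT|>"

def build_multi_turn_sample(turns):
    """Join a list of {'instruction': ..., 'response': ...} dicts into one
    training string that mirrors the README inference format: system prompt
    once, <|EOT|> after every response, no BOS in front of later turns."""
    parts = [SYSTEM_PROMPT]
    for turn in turns:
        parts.append(f"### Instruction:\n{turn['instruction']}")
        parts.append(f"### Response:\n{turn['response']}\n{EOT_TOKEN}")
    return "\n".join(parts)
```

If the tokenizer is configured to prepend BOS (as Llama-style tokenizers usually are), encoding this string with add_special_tokens=True yields a single BOS for the whole sample, which matches the inference example rather than Llama's per-turn BOS.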
I see. Thanks for the feedback.