Fine-tuning chat template for Llama 3.1: please help

#31
by Cagatayd - opened

I am very confused about how to format the data I give to the model for fine-tuning. I have looked at YouTube, and everyone uses a different format.

For example, for Llama 3.1, I have seen the following:

1- One formats the data this way:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>
{user_question}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
{model_answer}<|eot_id|>

2- Some use <|im_start|>assistant
for both the input and the output when instructing, but <|im_start|> does not even appear on Llama's page.

Link : https://llama.meta.com/docs/model-cards-and-prompt-formats/llama3_1

3- Some use this format:
[INST] instruction context [/INST]

4- Some use:
messages = [
    {"role": "system", "content": "..........."},
    {"role": "user", "content": "............"},
]

5- Unsloth uses this format:
instructions = examples["instruction"]
inputs = examples["input"]
outputs = examples["output"]
texts = []
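
The fragment in item 5 looks like the start of a batch formatting function for instruction/input/output data. A minimal sketch of how such a function might be completed is below. The Alpaca-style template wording, the function name, and the `EOS_TOKEN` value are assumptions for illustration, not taken from the video; in practice the end-of-sequence string would come from the tokenizer.

```python
# Alpaca-style prompt template (assumed wording; check the notebook you follow).
ALPACA_PROMPT = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

# Placeholder end-of-sequence marker; normally read from tokenizer.eos_token.
EOS_TOKEN = "<|end_of_text|>"


def formatting_prompts_func(examples):
    """Turn a batch of instruction/input/output columns into training strings."""
    instructions = examples["instruction"]
    inputs = examples["input"]
    outputs = examples["output"]
    texts = []
    for instruction, input_text, output in zip(instructions, inputs, outputs):
        # Append EOS so the model can learn where the answer stops.
        texts.append(ALPACA_PROMPT.format(instruction, input_text, output) + EOS_TOKEN)
    return {"text": texts}


batch = {
    "instruction": ["Translate to French."],
    "input": ["Good morning"],
    "output": ["Bonjour"],
}
print(formatting_prompts_func(batch)["text"][0])
```

This style produces one plain string per example, so it does not depend on any chat-specific special tokens.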

Every video I watch on YouTube gives the data in a different format.
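
As far as I can tell, formats 1 and 4 may be two views of the same data: the messages list is the structured form, and the header-token string is what a chat template renders it into. A rough sketch of that rendering is below. The function `render_llama31_chat` is hypothetical, and it assumes two newlines after each <|end_header_id|>, which is my reading of the Llama page linked above. The real template shipped with the tokenizer may add more (for example a default system block), so the output of the tokenizer's own `apply_chat_template` should be treated as the reference.

```python
def render_llama31_chat(messages, add_generation_prompt=False):
    """Render a list of {"role", "content"} dicts into one Llama 3.1 style string.

    Simplified sketch: the official template may insert extra text that is
    not reproduced here.
    """
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn is a role header, a blank line, the content, then <|eot_id|>.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content']}<|eot_id|>"
        )
    if add_generation_prompt:
        # At inference time the prompt ends with an open assistant header.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is fine-tuning?"},
    {"role": "assistant", "content": "Training a model further on new data."},
]
print(render_llama31_chat(messages))
```

For training, the full conversation including the assistant answer is rendered; `add_generation_prompt=True` is only meant for inference, where the model writes the answer itself.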

Whether I give the data as instructions, as Q-A pairs, or fine-tune on plain text, which format is easier for the model to understand, and which formats are correct for Llama 3.1 base and instruct?

How do I find out which input data template to apply for fine-tuning on:
I) Instruction-based data
II) Q-A data
III) Plain text

This is for Llama 3.1 base and Llama 3.1 instruct. Please help me find the right template.

Thank you very much in advance

Cagatayd changed discussion status to closed
