How to use in llama.cpp server

#15
by subbur - opened

How do I use this chat template in the llama.cpp server? Should I copy-paste it into the Prompt template box?

NVIDIA org
•
edited May 15

Hi,
Unfortunately, I am not very familiar with the llama.cpp server. I suspect a new chat template needs to be implemented based on the prompt template we provide in the model card. You can also check the sample code for more details on the prompt template.
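Since the server will not know this template on its own, one workaround, as a minimal sketch assuming a default llama.cpp server on port 8080 and an already converted GGUF model (see the next step), is to fill in the template by hand and send the raw string to the server's /completion endpoint. The SYSTEM_TURN / USER_TURN / ASSISTANT_TURN markers below are placeholders, not the model's real tokens; substitute the exact markers from the prompt template in the model card.

```bash
# Minimal sketch: apply the model-card prompt template by hand and
# POST the raw prompt to the llama.cpp server's /completion endpoint.
# SYSTEM_TURN / USER_TURN / ASSISTANT_TURN are placeholders -- replace
# them with the exact markers from the model card's prompt template.
curl http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d '{
        "prompt": "SYSTEM_TURN\nYou are a helpful assistant.\nUSER_TURN\nHello!\nASSISTANT_TURN\n",
        "n_predict": 256
      }'
```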

First of all, you need to convert the model to GGUF format. You can do that with this notebook: https://colab.research.google.com/drive/1P646NEg33BZy4BfLDNpTz0V0lwIU3CHu#scrollTo=fD24jJxq7t3k. Afterwards, you can install llama.cpp and run the model:

```bash
./main -m model.gguf -n 256 --repeat_penalty 1.0 --color -i -r "User:" -f prompts/chat-with-bob.txt
```
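If you prefer to do the conversion locally instead of in the notebook, something like the following should work with a recent llama.cpp checkout. Note that the names have changed over time (./main is now llama-cli, the HTTP server is llama-server, and the conversion script was previously convert-hf-to-gguf.py), so adjust to whatever your build actually provides.

```bash
# Rough sketch, assuming a recent llama.cpp checkout.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt

# Convert the Hugging Face checkpoint to GGUF
# (the script was named convert-hf-to-gguf.py in older checkouts).
python convert_hf_to_gguf.py /path/to/hf-model --outfile model.gguf

# Build the binaries and start the HTTP server.
cmake -B build
cmake --build build --config Release
./build/bin/llama-server -m model.gguf -c 4096 --port 8080
```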
