Remove `tool_call` from special tokens

#6
by mgrella - opened
No description provided.

These are needed. Please add clarification

ricklamers changed pull request status to closed

How embarrassing! I apologize, I'm not sure what happened to my description.

The issue is that when the Groq/Llama-3-Groq-8B-Tool-Use model is run with the Transformers library and served through Inference Endpoints, special tokens are not rendered in the output (they appear as empty strings), as if they were control tokens. This makes it impossible to tell whether they were generated. It happens with the standard Transformers endpoint as well as with the compatible v1/completions and v1/chat/completions endpoints.

As further evidence, after setting the suggested system prompt and then asking a question like "What are the xml tags for function calling?", the response is "The XML tags for function calling are '' and ''. They are used to wrap the function name and arguments, ensuring that the function is called correctly and securely."

In the llama.cpp server, you need to enable the --special flag to display them in the output.

I'm not certain this is the intended behavior for the `<tool_call>` tag, since it is neither a role marker nor another control token that influences generation.

https://huggingface.co/transformers/v2.11.0/main_classes/tokenizer.html

They are indeed needed: we have to make sure the tokenization process does not split them, so that each tag is represented to the model as a single token when tool calls are passed back to it. If you want those XML tags to show up in generated output, you should configure your inference engine accordingly, as in your --special example.
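
To illustrate both points, here is a minimal toy sketch in plain Python. It is not the real tokenizer: the `tokenize` and `decode` helpers and the naive word/punctuation splitting are hypothetical stand-ins. It only mimics two behaviors: a registered special token stays in one piece, and decoding with a skip flag drops it. In Transformers, the comparable switch is the `skip_special_tokens` argument of `tokenizer.decode`; that the endpoint decodes with it enabled is an assumption on my part.

```python
import re

# Hypothetical toy setup: real tokenizers use learned subword vocabularies.
SPECIAL_TOKENS = ("<tool_call>", "</tool_call>")


def tokenize(text, special_tokens=SPECIAL_TOKENS):
    """Split text into pieces, keeping each special token as a single piece."""
    naive = r"\w+|[^\w\s]"  # words, or single punctuation characters
    if not special_tokens:
        return re.findall(naive, text)
    # The capturing group makes re.split keep the special tokens in the result.
    pattern = "(" + "|".join(re.escape(t) for t in special_tokens) + ")"
    pieces = []
    for chunk in re.split(pattern, text):
        if chunk in special_tokens:
            pieces.append(chunk)  # never split a special token
        else:
            pieces.extend(re.findall(naive, chunk))
    return pieces


def decode(pieces, skip_special_tokens, special_tokens=SPECIAL_TOKENS):
    """Join pieces back into text, optionally dropping special tokens."""
    kept = [p for p in pieces
            if not (skip_special_tokens and p in special_tokens)]
    return " ".join(kept)


text = "<tool_call>get_weather</tool_call>"

with_special = tokenize(text)
without_special = tokenize(text, special_tokens=())

print(with_special)     # tags survive as single pieces
print(without_special)  # tags are broken into '<', 'tool_call', '>' ...
print(decode(with_special, skip_special_tokens=True))   # tags vanish
print(decode(with_special, skip_special_tokens=False))  # tags visible
```

Removing the tags from the special tokens would make them visible in decoded output, but the model would then see them as several pieces instead of one. Changing how the output is decoded keeps the single-token guarantee.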
