Granite Model Does Not Generate `<|tool_call|>` Format in Response

#8
by jjaegii - opened

Description:

Hello,
I am attempting to use the Granite-3.0-8b-Instruct model to generate responses in the <|tool_call|> format for external function calls. While the model successfully generates responses in JSON format, it does not include the <|tool_call|> token.


Steps to Reproduce:

  1. The following prompt was used:
<|start_of_role|>available_tools<|end_of_role|>
[{'name': 'calculator.add', 'description': 'Adds two numbers.', 'parameters': {'type': 'object', 'properties': {'num1': {'type': 'integer', 'description': 'The first number'}, 'num2': {'type': 'integer', 'description': 'The second number'}}, 'required': ['num1', 'num2']}}]<|end_of_text|>
<|start_of_role|>system<|end_of_role|>The AI must only use the provided tools to respond.<|end_of_text|>
<|start_of_role|>user<|end_of_role|>What is 123 + 456?<|end_of_text|>
<|start_of_role|>assistant<|end_of_role|>
  2. The model's response was:
<|start_of_role|>assistant<|end_of_role|>{
  "tool": {
    "name": "calculator.add",
    "arguments": {
      "num1": 123,
      "num2": 456
    }
  }
}<|end_of_text|>
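
For anyone trying to reproduce this, the prompt layout above can be assembled programmatically. The following is a minimal sketch that only mirrors the role markers shown in this post; the helper name `build_granite_prompt` is hypothetical, and the layout is not guaranteed to match the model's official chat template.

```python
import json


def build_granite_prompt(tools, system, user):
    """Assemble a raw prompt using the role markers shown in this post."""

    def turn(role, content):
        return f"<|start_of_role|>{role}<|end_of_role|>{content}<|end_of_text|>\n"

    # ensure_ascii=False keeps non-English descriptions readable
    # instead of turning them into \uXXXX escapes.
    tools_json = json.dumps(tools, ensure_ascii=False, indent=4)
    return (
        turn("available_tools", "\n" + tools_json + "\n")
        + turn("system", system)
        + turn("user", user)
        # Left open so the model generates the assistant turn.
        + "<|start_of_role|>assistant<|end_of_role|>"
    )


tools = [
    {
        "name": "calculator.add",
        "description": "Adds two numbers.",
        "parameters": {
            "type": "object",
            "properties": {
                "num1": {"type": "integer", "description": "The first number"},
                "num2": {"type": "integer", "description": "The second number"},
            },
            "required": ["num1", "num2"],
        },
    }
]

prompt = build_granite_prompt(
    tools,
    "The AI must only use the provided tools to respond.",
    "What is 123 + 456?",
)
print(prompt)
```

Unlike the prompt pasted above, this sketch serializes the tool list as JSON (double quotes) rather than as a Python literal, which matches the format used in the later tests in this thread.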

Expected Behavior:

The expected response format was as follows:

<|start_of_role|>assistant<|end_of_role|><|tool_call|>calculator.add({"num1": 123, "num2": 456})<|end_of_text|>

Additional Context:

  • The <|tool_call|> token is defined in special_tokens_map.json, so the model should ideally recognize and generate this token.
  • However, while the model generates valid JSON responses, it does not include the <|tool_call|> token in its output.
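
To double-check the first point, a small script can confirm that the token string really appears in the tokenizer's special-token file. This is only a sketch: `contains_token` is a hypothetical helper, and the sample dictionary is an assumed excerpt, since the real file's layout may differ. The search is recursive so it does not depend on that layout.

```python
import json


def contains_token(node, token="<|tool_call|>"):
    """Recursively search parsed JSON for an exact token string."""
    if isinstance(node, str):
        return node == token
    if isinstance(node, dict):
        return any(contains_token(value, token) for value in node.values())
    if isinstance(node, list):
        return any(contains_token(item, token) for item in node)
    return False


# Assumed excerpt for illustration only. To check the real file, use:
#   with open("special_tokens_map.json", encoding="utf-8") as f:
#       print(contains_token(json.load(f)))
sample = json.loads(
    '{"additional_special_tokens": '
    '["<|start_of_role|>", "<|end_of_role|>", "<|tool_call|>"]}'
)
print(contains_token(sample))
```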

Questions:

  1. Is there any additional configuration required to make <|tool_call|> functional with the Granite model?
  2. Could you provide examples of prompt designs that would effectively guide the model to generate the <|tool_call|> format?
  3. Is the model's inability to generate the <|tool_call|> token a limitation of its training data?

Thank you!
I would appreciate any additional guidance or debugging steps you can provide.

Additional Comment:

Hello,
I wanted to clarify that while the issue was submitted with an English prompt for your convenience, the actual tests were conducted using a Korean prompt. The English prompt in the issue description is a direct translation of the Korean prompt to make it easier for your team to review.

Here is the original Korean prompt I used during testing:

<|start_of_role|>available_tools<|end_of_role|>
[
    {
        "name": "calculator.add",
        "description": "두 숫자의 합을 계산합니다.",
        "parameters": {
            "type": "object",
            "properties": {
                "num1": {
                    "type": "integer",
                    "description": "첫 번째 숫자"
                },
                "num2": {
                    "type": "integer",
                    "description": "두 번째 숫자"
                }
                }
            },
            "required": [
                "num1",
                "num2"
            ]
        }
    }
]
<|end_of_text|>
<|start_of_role|>system<|end_of_role|>AI๋Š” ์ œ๊ณต๋œ ๋„๊ตฌ๋งŒ ์‚ฌ์šฉํ•˜์—ฌ ์งˆ๋ฌธ์— ๋‹ต๋ณ€ํ•ฉ๋‹ˆ๋‹ค.<|end_of_text|>
<|start_of_role|>user<|end_of_role|>123 + 456์€ ์–ผ๋งˆ์ธ๊ฐ€์š”?<|end_of_text|>
<|start_of_role|>assistant<|end_of_role|>

The results I received using this Korean prompt are consistent with the issue described. Thank you for your attention, and please let me know if additional details or clarification are needed!

Additional Comment:

Hello,
I would like to provide additional details regarding the issue. I conducted tests using both English and Korean prompts based on the instructions from the IBM Granite Function Calling tutorial (link).

Here are the results:


English Prompt Test

Prompt:

<|start_of_role|>available_tools<|end_of_role|>
[
    {
        "name": "get_stock_price",
        "description": "Retrieve the current price of a given stock ticker.",
        "parameters": {
            "type": "object",
            "properties": {
                "ticker": {
                    "type": "string",
                    "description": "The stock ticker symbol (e.g., AAPL for Apple Inc.)."
                }
            },
            "required": ["ticker"]
        }
    }
]
<|end_of_text|>
<|start_of_role|>system<|end_of_role|>The AI must only use the provided tools to respond.<|end_of_text|>
<|start_of_role|>user<|end_of_role|>What is the current price of IBM stock?<|end_of_text|>
<|start_of_role|>assistant<|end_of_role|>

Response:

<|start_of_role|>assistant<|end_of_role|><|tool_call|>{"name": "get_stock_price", "arguments": {"ticker": "IBM"}}<|end_of_text|>

Korean Prompt Test

Prompt:

<|start_of_role|>available_tools<|end_of_role|>
[
    {
        "name": "get_stock_price",
        "description": "์ฃผ์–ด์ง„ ์ฃผ์‹ ์‹ฌ๋ณผ์˜ ํ˜„์žฌ ๊ฐ€๊ฒฉ์„ ์กฐํšŒํ•ฉ๋‹ˆ๋‹ค.",
        "parameters": {
            "type": "object",
            "properties": {
                "ticker": {
                    "type": "string",
                    "description": "์ฃผ์‹ ์‹ฌ๋ณผ (์˜ˆ: AAPL์€ Apple Inc.)"
                }
            },
            "required": ["ticker"]
        }
    }
]
<|end_of_text|>
<|start_of_role|>system<|end_of_role|>AI๋Š” ์ œ๊ณต๋œ ๋„๊ตฌ๋งŒ ์‚ฌ์šฉํ•˜์—ฌ ์งˆ๋ฌธ์— ๋‹ต๋ณ€ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.<|end_of_text|>
<|start_of_role|>user<|end_of_role|>IBM์˜ ํ˜„์žฌ ์ฃผ๊ฐ€๋Š” ์–ผ๋งˆ์ธ๊ฐ€์š”?<|end_of_text|>
<|start_of_role|>assistant<|end_of_role|>

Response:

<|start_of_role|>assistant<|end_of_role|>{
  "tool": {
    "name": "get_stock_price",
    "arguments": {
      "ticker": "IBM"
    }
  }
}<|end_of_text|>

Observations

  1. English Prompt:

    • The <|tool_call|> special token was correctly included in the model's response.
    • The response followed the expected format, indicating proper behavior.
  2. Korean Prompt:

    • The <|tool_call|> special token was not included in the model's response.
    • Instead, the response was generated in a plain JSON format without the special token.
  3. Hypothesis:

    • The difference in behavior may be because the model's training data is weighted toward English prompts.
    • The training data may not contain enough Korean examples of <|tool_call|> generation.
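
As a stopgap while the behavior differs by language, a client could accept both response shapes reported above. The sketch below is one possible approach, not a fix for the model itself; `parse_tool_response` is a hypothetical helper, and it assumes the output is either `<|tool_call|>` followed by JSON or a plain JSON object nested under a "tool" key, as in the two responses in this comment.

```python
import json

TOOL_CALL = "<|tool_call|>"
# Wrapper markers that appear around the assistant turn in the raw output.
WRAPPERS = ("<|start_of_role|>assistant<|end_of_role|>", "<|end_of_text|>")


def parse_tool_response(text):
    """Return (has_token, name, arguments) for either output shape."""
    for marker in WRAPPERS:
        text = text.replace(marker, "")
    text = text.strip()

    has_token = text.startswith(TOOL_CALL)
    if has_token:
        text = text[len(TOOL_CALL):]

    payload = json.loads(text)
    # Assumption: if a list of calls is emitted, only the first is used.
    if isinstance(payload, list):
        payload = payload[0]
    # Plain-JSON replies nest the call under a "tool" key.
    call = payload.get("tool", payload)
    return has_token, call["name"], call["arguments"]


english = (
    "<|start_of_role|>assistant<|end_of_role|><|tool_call|>"
    '{"name": "get_stock_price", "arguments": {"ticker": "IBM"}}<|end_of_text|>'
)
korean = (
    "<|start_of_role|>assistant<|end_of_role|>"
    '{"tool": {"name": "get_stock_price", "arguments": {"ticker": "IBM"}}}'
    "<|end_of_text|>"
)

print(parse_tool_response(english))
print(parse_tool_response(korean))
```

The returned `has_token` flag makes it easy to log how often the special token is missing for each prompt language.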

Questions:

  1. Does the Granite model currently support <|tool_call|> generation for non-English prompts, particularly Korean?
  2. If support for non-English prompts is limited, are there plans to improve this functionality in the future?
  3. Are there any specific adjustments or configurations I can make to ensure <|tool_call|> is generated for Korean prompts?

Thank you for your assistance! Please let me know if additional details or clarification are needed.
