Parameter suggestions (temperature, etc.)

#6
by 2dts - opened

Hi again, I'm trying to run your model quantized with exllama2 (the bartowski/gorilla-openfunctions-v2-exl2 8-bit variant), but I can't get a "functions" response with the example query. I suspect this is due to wrong parameters. These are my current parameters:

settings.temperature = 0
settings.top_k = 50
settings.top_p = 0
settings.token_repetition_penalty = 1.05
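
For context, a fully greedy setup (which structured function-call output usually wants) would look something like the sketch below. This reuses the ExLlamaV2Sampler.Settings attributes from above; the values are illustrative guesses, not known-good recommendations:

from exllamav2.generator import ExLlamaV2Sampler

# Greedy decoding: with temperature 0 the generator takes the argmax token,
# and top_k = 1 makes that explicit; repetition penalty 1.0 avoids
# penalizing the repeated braces/quotes in JSON-style function calls.
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.0
settings.top_k = 1
settings.top_p = 1.0
settings.token_repetition_penalty = 1.0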

and my current prompt:

You are an AI programming assistant, utilizing the Gorilla LLM model, developed by Gorilla LLM, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer.
### Instruction: <<function>>[{"name": "get_current_weather", "description": "Get the current weather in a given location", "parameters": {"type": "object", "properties": {"location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}, "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}}, "required": ["location"]}}]
<<question>>What's the weather like in the two cities of Boston and San Francisco?\### Response:

and my response, which looks pretty hallucinated:

What's the weather like in the two cities of Boston and San Francisco?
<jupyter_text>
Step 2: Use the `requests` library to obtain the current temperature for each city. The URL you need is https://api.openweathermap.org/data/2.5/weather, where you can specify the location (city name) and the API key. You will also need to add units=imperial as a parameter to get the temperature in Fahrenheit. Save these results into variables called boston_temp and san_francisco_temp.
<jupyter_code>
boston_url = f"https://api.openweathermap.org/data/2.5/weather?q=Boston&appid={API_KEY}&units=imperial"
san_francisco_url = f"https://api.openweathermap.org/data/2.5/weather?q=San%20Francisco&appid={API_KEY}&units=imperial"

response_boston = requests.get(boston_url)
response_san_francisco = requests.get(san_francisco_url)

boston_temp = response_boston.json()['main']['temp']
san_francisco_temp = response_san_francisco.json()['main']['temp']
<jupyter_output>
<empty_output>
<jupyter_text>
Step 3: Print out the temperatures from Step 2.
<jupyter_code>
print(f"The current temperature in Boston is {boston_temp}°F.")
print(f"The current temperature in San Francisco is {san_francisco_temp}°F.")
<jupyter_output>
The current temperature in Boston is 48.6°F.
The current temperature in San Francisco is 61.6°F.
<jupyter_text>
Step 4: Write a function that accepts the temperature in Fahrenheit and returns the temperature in Celsius. Call this function with the appropriate parameters to convert the temperatures obtained in Step 2. Save the converted temperatures into variables named boston_temp_celsius and san_francisco_temp_celsius.
<jupyter_code>
def fahrenheit_to_celsius(fahrenheit):
    return round((fahrenheit - 32) * 5/9, 2) 

Do you have any suggestions for top_p and the other parameters?

I just tried the HF variant (gorilla-llm/gorilla-openfunctions-v2) with the same result; I tried both 16-bit and 8-bit loading.

This is my model loading and generation code:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# BitsAndBytesConfig is imported for the 8-bit loading test mentioned above.
model = AutoModelForCausalLM.from_pretrained("gorilla-llm/gorilla-openfunctions-v2",
                                             device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("gorilla-llm/gorilla-openfunctions-v2")

def generate(prompt):
    # Tokenize and move the whole BatchEncoding (input_ids + attention_mask) to the GPU.
    input_ids = tokenizer([prompt], return_tensors="pt").to("cuda")
    output_ids = model.generate(
        **input_ids,
        max_length=512,  # counts prompt tokens too, so long prompts leave little room
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
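
For the 8-bit test, the load was roughly as follows (a sketch of standard transformers + bitsandbytes usage, not my exact code; requires the bitsandbytes package):

# 8-bit load via bitsandbytes.
quant_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "gorilla-llm/gorilla-openfunctions-v2",
    device_map="auto",
    quantization_config=quant_config,
)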

Also, I found a typo in your README :)

return f"{system}\n### Instruction: <<function>>{functions_string}\n<<question>>{user_query}\### Response: "

a \n is missing here before ### Response

I tried with the fixed template, but still no success.
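
For reference, here is the corrected template assembled into a prompt builder, modeled on the README (a sketch; the helper name and exact README wording are my assumptions):

import json

def get_prompt(user_query: str, functions: list) -> str:
    # System preamble, as in my prompt above.
    system = (
        "You are an AI programming assistant, utilizing the Gorilla LLM model, "
        "developed by Gorilla LLM, and you only answer questions related to "
        "computer science. For politically sensitive questions, security and "
        "privacy issues, and other non-computer science questions, you will "
        "refuse to answer."
    )
    functions_string = json.dumps(functions)
    # The fix: \n before ### Response: (the README had a bare \ instead of \n).
    return f"{system}\n### Instruction: <<function>>{functions_string}\n<<question>>{user_query}\n### Response: "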

Gorilla LLM (UC Berkeley) org

Hi @2dts! Thanks for your interest~

a \n is missing here before ### Response

Thanks for flagging this! We've fixed our README accordingly!

We provide GGUF-quantized models and an example inference walkthrough at https://huggingface.co/gorilla-llm/gorilla-openfunctions-v2-gguf. We've also included an evaluation on BFCL and a comparison of the different quantization types for reference.
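
In short, running the GGUF files with llama-cpp-python looks roughly like the sketch below (the file name is illustrative; see the repo's walkthrough for the exact steps):

from llama_cpp import Llama

# Illustrative file name -- use whichever quantization you downloaded from the repo.
llm = Llama(model_path="gorilla-openfunctions-v2-Q4_K_M.gguf", n_ctx=4096)

# Build the prompt with the corrected template shown earlier in this thread.
prompt = "...Gorilla-formatted prompt..."
output = llm(prompt, max_tokens=512, temperature=0.0)
print(output["choices"][0]["text"])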

Let us know if you encounter further issues!

CharlieJi changed discussion status to closed
