---
base_model: Qwen/Qwen2-1.5B-Instruct
datasets:
- devanshamin/gem-viggo-function-calling
library_name: peft
license: apache-2.0
pipeline_tag: text-generation
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: Qwen2-1.5B-Instruct-Function-Calling-v1
  results: []
---

# Qwen2-1.5B-Instruct-Function-Calling-v1

This model is a fine-tuned version of [Qwen/Qwen2-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2-1.5B-Instruct) on the [devanshamin/gem-viggo-function-calling](https://huggingface.co/datasets/devanshamin/gem-viggo-function-calling) dataset.

## Basic Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and the tokenizer
model_id = "Qwen2-1.5B-Instruct-Function-Calling-v1"
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

def inference(prompt: str) -> str:
    # Tokenize the prompt and move the tensors to the device the model was loaded on
    model_inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
    generated_ids = model.generate(model_inputs.input_ids, max_new_tokens=512)
    # Drop the prompt tokens so that only the newly generated tokens are decoded
    generated_ids = [output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)]
    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return response

prompt = "What is the meaning of life?"
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
response = inference(prompt)
print(response)
```

## Tool Usage

### Basic

```python
import json

def get_prompt(tool: str, user_input: str) -> str:
    # The tool definition (a JSON string) is embedded in the system prompt
    system = "You are a helpful assistant with access to the following tools. Use them if required - \n```json\n{}\n```"
    messages = [
        {"role": "system", "content": system.format(tool)},
        {"role": "user", "content": 'Extract the information from the following - \n{}'.format(user_input)}
    ]
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    return prompt

tool = {
    "type": "function",
    "function": {
        "name": "get_company_info",
        "description": "Correctly extracted company information with all the required parameters with correct types",
        "parameters": {
            "properties": {
                "name": {"title": "Name", "type": "string"},
                "investors": {
                    "items": {"type": "string"},
                    "title": "Investors",
                    "type": "array"
                },
                "valuation": {"title": "Valuation", "type": "string"},
                "source": {"title": "Source", "type": "string"}
            },
            "required": ["investors", "name", "source", "valuation"],
            "type": "object"
        }
    }
}
input_text = "Founded in 2021, Pluto raised $4 million across multiple seed funding rounds, valuing the company at $12 million (pre-money), according to PitchBook. The startup was backed by investors including Switch Ventures, Caffeinated Capital and Maxime Seguineau."
prompt = get_prompt(json.dumps(tool), input_text)
response = inference(prompt)
print(response)
# ```json
# {
#     "name": "get_company_info",
#     "arguments": {
#         "name": "Pluto",
#         "investors": [
#             "Switch Ventures",
#             "Caffeinated Capital",
#             "Maxime Seguineau"
#         ],
#         "valuation": "pre-money $12M",
#         "source": "PitchBook"
#     }
# }
# ```
```

### Advanced

```python
import re
from enum import Enum

from pydantic import BaseModel, Field  # pip install pydantic
from instructor.function_calls import openai_schema  # pip install instructor

# Reuses `json`, `get_prompt` and `inference` from the examples above

# Define functions using pydantic classes
class PaperCategory(str, Enum):
    TYPE_1_DIABETES = 'Type 1 Diabetes'
    TYPE_2_DIABETES = 'Type 2 Diabetes'

class Classification(BaseModel):
    label: PaperCategory = Field(..., description='Provide the most likely category')
    reason: str = Field(..., description='Give a detailed explanation with quotes from the abstract explaining why the paper is related to the chosen label.')

function_definition = openai_schema(Classification).openai_schema
tool = dict(type='function', function=function_definition)
input_text = "1,25-dihydroxyvitamin D(3) (1,25(OH)(2)D(3)), the biologically active form of vitamin D, is widely recognized as a modulator of the immune system as well as a regulator of mineral metabolism. The objective of this study was to determine the effects of vitamin D status and treatment with 1,25(OH)(2)D(3) on diabetes onset in non-obese diabetic (NOD) mice, a murine model of human type I diabetes. We have found that vitamin D-deficiency increases the incidence of diabetes in female mice from 46% (n=13) to 88% (n=8) and from 0% (n=10) to 44% (n=9) in male mice as of 200 days of age when compared to vitamin D-sufficient animals. Addition of 50 ng of 1,25(OH)(2)D(3)/day to the diet prevented disease onset as of 200 days and caused a significant rise in serum calcium levels, regardless of gender or vitamin D status. Our results indicate that vitamin D status is a determining factor of disease susceptibility and oral administration of 1,25(OH)(2)D(3) prevents diabetes onset in NOD mice through 200 days of age."
prompt = get_prompt(json.dumps(tool), input_text)
output = inference(prompt)
print(output)
# ```json
# {
#     "name": "Classification",
#     "arguments": {
#         "label": "Type 1 Diabetes",
#         "reason": "The study investigated the effect of vitamin D status and treatment with 1,25(OH)(2)D(3) on diabetes onset in non-obese diabetic (NOD) mice. It also concluded that vitamin D deficiency leads to an increase in diabetes incidence and that the addition of 1,25(OH)(2)D(3) can prevent diabetes onset in NOD mice."
#     }
# }
# ```

# Extract the JSON string using regex (re.DOTALL lets `.` match across the newlines in the JSON)
json_str = re.search(r'```json\s*(\{.*?\})\s*```', output, re.DOTALL).group(1)
# Validate the arguments against the pydantic model
output = Classification(**json.loads(json_str)['arguments'])
print(output)
# Classification(label=<PaperCategory.TYPE_1_DIABETES: 'Type 1 Diabetes'>, reason='The study investigated the effect of vitamin D status and treatment with 1,25(OH)(2)D(3) on diabetes onset in non-obese diabetic (NOD) mice. It also concluded that vitamin D deficiency leads to an increase in diabetes incidence and that the addition of 1,25(OH)(2)D(3) can prevent diabetes onset in NOD mice.')
```

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- training_steps: 200

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.4004        | 0.0101 | 20   | 0.4852          |
| 0.3624        | 0.0201 | 40   | 0.3221          |
| 0.2855        | 0.0302 | 60   | 0.2818          |
| 0.2652        | 0.0402 | 80   | 0.2592          |
| 0.2214        | 0.0503 | 100  | 0.2463          |
| 0.2471        | 0.0603 | 120  | 0.2358          |
| 0.2122        | 0.0704 | 140  | 0.2310          |
| 0.2048        | 0.0804 | 160  | 0.2275          |
| 0.2406        | 0.0905 | 180  | 0.2251          |
| 0.2445        | 0.1006 | 200  | 0.2248          |

### Framework versions

```text
peft==0.11.1
transformers==4.42.3
torch==2.3.1+cu121
datasets==2.20.0
tokenizers==0.19.1
```
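
For readers who want a feel for the learning-rate settings listed under Training hyperparameters (`learning_rate: 0.0001`, `lr_scheduler_type: cosine`, `lr_scheduler_warmup_steps: 10`, `training_steps: 200`), the sketch below computes the rate at a given step. It assumes the usual shape of a cosine schedule with warmup: a linear ramp from zero, followed by a half-cosine decay to zero. The function name `cosine_lr_with_warmup` is hypothetical and is not taken from the training script, so treat this as an illustration and not as the exact code used for this run.

```python
import math


def cosine_lr_with_warmup(step: int, base_lr: float = 1e-4,
                          warmup_steps: int = 10, total_steps: int = 200) -> float:
    """Approximate learning rate at an optimizer step (hypothetical helper).

    Assumes linear warmup from 0 to base_lr, then a half-cosine decay to 0.
    Defaults mirror the hyperparameters listed in this model card.
    """
    if step < warmup_steps:
        # Linear warmup phase
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, in [0, 1]
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay, clamped at zero
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Print the rate at the start, mid-warmup, peak, midpoint of decay, and final step
for step in (0, 5, 10, 105, 200):
    print(step, f"{cosine_lr_with_warmup(step):.2e}")
```

Under these assumptions the rate peaks at step 10 and has fallen to roughly half its peak by step 105, which is consistent with the validation loss in the table flattening out over the last few evaluations.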