Fine-Tune Llama 2 Model Using QLoRA for a Custom SQL Dataset

Instruction fine-tuning has become extremely popular since the (accidental) release of LLaMA. The size of these models, together with the peculiarities of training them on instruction-answer pairs, adds complexity and often calls for parameter-efficient fine-tuning techniques such as QLoRA. This model was fine-tuned on the dataset aswin1906/llama2-sql-instruct-2k.
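To give a feel for why parameter-efficient methods matter at this scale, here is a rough back-of-the-envelope sketch. The training configuration for this model is not published here, so the LoRA rank (64) and the choice of adapting two 4096 x 4096 projection matrices per layer are assumptions made purely for illustration. The memory figures cover the weights only and ignore quantization constants, activations, and optimizer state.

```python
# Back-of-the-envelope numbers for a Llama-2-7B-sized model.
# ASSUMPTIONS (not from the model card): LoRA rank 64, two adapted
# 4096 x 4096 projection matrices per transformer layer.
HIDDEN_SIZE = 4096            # hidden dimension of Llama-2-7B
NUM_LAYERS = 32               # transformer layers in Llama-2-7B
BASE_PARAMS = 6_740_000_000   # approximate parameter count of the 7B model


def lora_param_count(rank, d_in=HIDDEN_SIZE, d_out=HIDDEN_SIZE,
                     matrices_per_layer=2, layers=NUM_LAYERS):
    """Trainable parameters added by LoRA.

    Each adapted (d_out x d_in) weight gets two low-rank factors:
    A with shape (rank x d_in) and B with shape (d_out x rank).
    """
    per_matrix = rank * (d_in + d_out)
    return per_matrix * matrices_per_layer * layers


def weight_memory_gb(num_params, bits_per_param):
    """Memory needed to store the weights alone, in gigabytes (1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


trainable = lora_param_count(rank=64)
print(f"LoRA trainable parameters: {trainable:,}")
print(f"Share of base model:       {trainable / BASE_PARAMS:.2%}")
print(f"fp16 weights:              {weight_memory_gb(BASE_PARAMS, 16):.2f} GB")
print(f"4-bit weights:             {weight_memory_gb(BASE_PARAMS, 4):.2f} GB")
```

Under these assumptions, the adapters amount to about half a percent of the base model's parameters, and quantizing the frozen weights to 4 bits cuts their footprint to roughly a quarter of the fp16 size.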

Model Background

Model Inference

Use the code below to run inference with the model:

import re

import torch
from rich import print  # rich's print renders the [color] markup used below
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

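class Training:
    """Loads the fine-tuned model and tokenizer and answers SQL questions."""
    def __init__(self) -> None:
        self.model_name= "meta-llama/Llama-2-7b-chat-hf"  # Llama 2 checkpoint (not used during inference)
        self.dataset= "aswin1906/llama2-sql-instruct-2k"  # training dataset (not used during inference)
        self.model_path= "aswin1906/llama-7b-sql-2k"  # fine-tuned model loaded below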
        self.instruction= 'You are given the following SQL table structure described by CREATE TABLE statement: CREATE TABLE "l" ( "player" text, "no" text, "nationality" text, "position" text, "years_in_toronto" text, "school_club_team" text ); Write an SQL query that provides the solution to the following question: '
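        # Load the fine-tuned weights in half precision; device_map="auto"
        # (requires the accelerate package) places them on the available devices
        self.model = AutoModelForCausalLM.from_pretrained(
            self.model_path,
            load_in_8bit=False,
            torch_dtype=torch.float16,
            device_map="auto"
        )
        self.tokenizer = AutoTokenizer.from_pretrained(self.model_path)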
    
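    def inference(self, prompt):
        """
        Generate an SQL query that answers the given natural-language question.
        """
        # Build a text-generation pipeline around the fine-tuned model
        pipe = pipeline(task="text-generation", model=self.model, tokenizer=self.tokenizer, max_length=200)
        # Wrap the instruction and the question in the Llama 2 [INST] ... [/INST] format
        result = pipe(f'<s>[INST] {self.instruction}"{prompt}". [/INST]')
        # The generated text echoes the prompt, so keep only what follows the closing tag
        response = result[0]['generated_text'].split('[/INST]')[-1]
        return response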
    
train = Training()
# Split the instruction into intro, table definition, and task for display
instruction = re.split(';|by CREATE', train.instruction)
print("[purple4] ------------------------------Instruction--------------------------")
print(f"[medium_spring_green] {instruction[0]}")
print(f"[bold green]CREATE{instruction[1]};")
print(f"[medium_spring_green] {instruction[2]}")
print("[purple4] -------------------------------------------------------------------")
while True:
    # Example question: 'What position does the player who played for butler cc (ks) play?'
    print("[bold blue]#Human: [bold green]", end="")
    try:
        user = input()
    except (EOFError, KeyboardInterrupt):
        # Leave the loop cleanly on Ctrl-D or Ctrl-C
        break
    print('[bold blue]#Response: [bold green]', train.inference(user))
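The prompt wrapping and response extraction used in the inference code can be tried without loading the model. The sketch below pulls those two string operations into standalone helpers. The names `build_prompt` and `extract_response` are hypothetical, the `.strip()` call is an addition the original code does not make, and the SQL in `fake_output` is a hand-written stand-in, not actual model output.

```python
def build_prompt(instruction: str, question: str) -> str:
    """Wrap the instruction and question in the [INST] ... [/INST] format
    used by the inference code above."""
    return f'<s>[INST] {instruction}"{question}". [/INST]'


def extract_response(generated_text: str) -> str:
    """Return the text after the last [/INST] tag.

    The pipeline echoes the prompt in its output, so everything up to the
    closing tag is discarded. Stripping whitespace is an addition here.
    """
    return generated_text.split("[/INST]")[-1].strip()


# Hypothetical, shortened instruction for illustration only
instruction = "Write an SQL query that provides the solution to the following question: "
question = "What position does the player who played for butler cc (ks) play?"

prompt = build_prompt(instruction, question)

# Hand-written stand-in for result[0]['generated_text']; NOT real model output
fake_output = prompt + ' SELECT position FROM l WHERE school_club_team = "butler cc (ks)"'

print(prompt)
print(extract_response(fake_output))
```

Keeping these steps in small functions makes it easy to check the prompt format against the one used during fine-tuning before spending time on generation.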

Contact aswin1906@gmail.com for the model training code.
