---
datasets:
- ekshat/text-2-sql-with-context
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- text-2-sql
- text-generation
- text2sql
---

# Introduction

Our model is a Llama-2 7B model fine-tuned on a Text-2-SQL dataset that follows the Alpaca prompt format described by Stanford. We used QLoRA, bitsandbytes, Accelerate, and the Transformers library to apply parameter-efficient fine-tuning (PEFT). For more information, please visit [github.com/akshayhedaoo1](https://github.com/akshayhedaoo1/Llama-2-7b-chat-finetune-for-text2sql/tree/Data-Science).

# Inference

```python
!pip install transformers accelerate xformers bitsandbytes

from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

tokenizer = AutoTokenizer.from_pretrained("ekshat/Llama-2-7b-chat-finetune-for-text2sql")

# Load the model in 4-bit precision
model = AutoModelForCausalLM.from_pretrained(
    "ekshat/Llama-2-7b-chat-finetune-for-text2sql",
    load_in_4bit=True,
)

context = "CREATE TABLE head (name VARCHAR, born_state VARCHAR, age VARCHAR)"
question = "List the name, born state and age of the heads of departments ordered by age."

# Alpaca-style prompt format used during fine-tuning
prompt = f"""Below is an context that describes a sql query, paired with an question that provides further information. Write an answer that appropriately completes the request.
### Context:
{context}
### Question:
{question}
### Answer:"""

pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=200)
result = pipe(prompt)
print(result[0]["generated_text"])
```

# Model Information

- **model_name = "NousResearch/Llama-2-7b-chat-hf"**
- **dataset_name = "ekshat/text-2-sql-with-context"**

# QLoRA parameters

- **lora_r = 64**
- **lora_alpha = 16**
- **lora_dropout = 0.1**

# BitsAndBytes parameters

- **use_4bit = True**
- **bnb_4bit_compute_dtype = "float16"**
- **bnb_4bit_quant_type = "nf4"**
- **use_nested_quant = False**

# Training Arguments parameters

- **num_train_epochs = 1**
- **fp16 = False**
- **bf16 = False**
- **per_device_train_batch_size = 8**
- **per_device_eval_batch_size = 4**
- **gradient_accumulation_steps = 1**
- **gradient_checkpointing = True**
- **max_grad_norm = 0.3**
- **learning_rate = 2e-4**
- **weight_decay = 0.001**
- **optim = "paged_adamw_32bit"**
- **lr_scheduler_type = "cosine"**
- **max_steps = -1**
- **warmup_ratio = 0.03**
- **group_by_length = True**
- **save_steps = 0**
- **logging_steps = 25**

# SFT parameters

- **max_seq_length = None**
- **packing = False**
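
# Putting the parameters together

The parameter sections above map directly onto `BitsAndBytesConfig`, `LoraConfig`, `TrainingArguments`, and `SFTTrainer`. The sketch below shows how they could be wired together; it is a minimal illustration under stated assumptions, not the exact training script. It assumes the older `trl` `SFTTrainer` signature (which accepts `dataset_text_field`, `max_seq_length`, `tokenizer`, and `packing` directly; newer releases move these into `SFTConfig`), a pre-formatted `text` column in the dataset, and illustrative values for `output_dir`, `bias`, `task_type`, and the saved adapter directory.

```python
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig
from trl import SFTTrainer

model_name = "NousResearch/Llama-2-7b-chat-hf"
dataset_name = "ekshat/text-2-sql-with-context"

# BitsAndBytes parameters: 4-bit NF4 quantization, float16 compute, no nested quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=False,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

# QLoRA parameters; bias and task_type are standard choices, not listed above
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
)

# Training Arguments parameters; output_dir is illustrative
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=1,
    fp16=False,
    bf16=False,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=1,
    gradient_checkpointing=True,
    max_grad_norm=0.3,
    learning_rate=2e-4,
    weight_decay=0.001,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
    max_steps=-1,
    warmup_ratio=0.03,
    group_by_length=True,
    save_steps=0,
    logging_steps=25,
)

dataset = load_dataset(dataset_name, split="train")

# SFT parameters; assumes the dataset exposes a pre-formatted "text" column
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=None,
    tokenizer=tokenizer,
    args=training_args,
    packing=False,
)

trainer.train()

# Save the LoRA adapter weights (directory name is illustrative)
trainer.model.save_pretrained("llama-2-7b-chat-finetune-for-text2sql")
```

With this setup, only the LoRA adapter weights are trained while the 4-bit base model stays frozen, which is what keeps the fine-tune within a single-GPU memory budget.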