---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
model-index:
- name: codify-llama-2-7b
  results: []
---

# codify-llama-2-7b

This model is a fine-tuned version of [Llama-2-7b-hf](https://huggingface.co/NousResearch/Llama-2-7b-hf) on the [CodeAlpaca-20k](https://raw.githubusercontent.com/sahil280114/codealpaca/master/data/code_alpaca_20k.json) dataset.

## Intended uses & limitations

1. Load the model as a Hugging Face pipeline:

```python
from transformers import pipeline

pipe = pipeline('text-generation', model='mohammedaly22/Codify-LLama-2-7b')
```

2. Prepare the instruction template:

```python
from string import Template

prompt_template_inference = Template("""You are a world class software engineer answering coding questions. Below is an instruction that describes a coding task, paired with an optional input that provides further context. Write a response that accurately completes the task if the instruction is code-related, else, you should reponse that you don't know the answer as it is outside the context of coding. Note, you should stop generation after reaching the token.

### Instruction:
$instruction

### Input:
$input

### Response:
""")
```

3. Create an instruction prompt using the above template:

```python
instruction = "Write a Python function that creates a simple 2-layer neural network using Keras for performing binary classification"
input = "input shape of the neural network will be a vector of 200 elements"

prompt = prompt_template_inference.substitute({"instruction": instruction, "input": input})
```

This is the final instruction prompt that will be passed to the pipeline:

```
You are a world class software engineer answering coding questions. Below is an instruction that describes a coding task, paired with an optional input that provides further context. Write a response that accurately completes the task if the instruction is code-related, else, you should reponse that you don't know the answer as it is outside the context of coding. Note, you should stop generation after reaching the token.

### Instruction:
Write a Python function that creates a simple 2-layer neural network using Keras for performing binary classification

### Input:
input shape of the neural network will be a vector of 200 elements

### Response:
```
4. Pass the instruction prompt to the pipeline:

```python
output = pipe(
    prompt,
    do_sample=True,
    return_full_text=False,
    max_new_tokens=200,
    clean_up_tokenization_spaces=True
)
```

Here is the code generated by the model:

```python
def build_simple_neural_network():
    return Model(
        inputs=Input(shape=(200,)),
        outputs=Dense(2, activation="softmax"),
        name="simple_neural_network"
    )
```

## Training procedure

### BitsAndBytes hyperparameters

- use_4bit: True
- bnb_4bit_compute_dtype: "float16"
- bnb_4bit_quant_type: "nf4"
- use_double_nested_quant: False

### LoRA configurations

- lora_r: 64
- lora_alpha: 16
- lora_dropout: 0.1

### Training hyperparameters

The following hyperparameters were used during training:
- num_train_epochs: 1
- fp16: False
- bf16: False
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 4
- gradient_accumulation_steps: 1
- gradient_checkpointing: True
- max_grad_norm: 0.3
- learning_rate: 2e-4
- weight_decay: 0.001
- optim: "paged_adamw_32bit"
- lr_scheduler_type: "cosine"
- max_steps: -1
- warmup_ratio: 0.03
- group_by_length: True
- save_steps: 0
- logging_steps: 50

### Training results

| Step | Training Loss |
|:----:|:-------------:|
| 50   | 1.377900      |
| 100  | 0.368700      |
| 150  | 0.336600      |
| 200  | 0.334800      |
| 250  | 0.332300      |
| 300  | 0.333700      |
| 350  | 0.322100      |
| 400  | 0.317000      |
| 450  | 0.320800      |
| 500  | 0.308400      |
| 550  | 0.321900      |
| 600  | 0.310700      |
| 650  | 0.322100      |
| 700  | 0.327700      |
| 750  | 0.322000      |
| 800  | 0.311300      |
| 850  | 0.321800      |
| 900  | 0.318700      |
| 950  | 0.321600      |
| 1000 | 0.314900      |
| 1050 | 0.321700      |
| 1100 | 0.307600      |
| 1150 | 0.315800      |
| 1200 | 0.316800      |
| 1250 | 0.314200      |
| 1300 | 0.310400      |
| 1350 | 0.308000      |
| 1400 | 0.318600      |
| 1450 | 0.309700      |
| 1500 | 0.307600      |
| 1550 | 0.296800      |
| 1600 | 0.305800      |
| 1650 | 0.307400      |
| 1700 | 0.327400      |
| 1750 | 0.306100      |
| 1800 | 0.309900      |
| 1850 | 0.316300      |
| 1900 | 0.299500      |
| 1950 | 0.315700      |
| 2000 | 0.307600      |
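The quantization and LoRA settings listed above follow a standard QLoRA recipe. The training script itself is not included in this card, but as a rough sketch they could be assembled with `transformers` and `peft` roughly as follows; note that `device_map`, `bias`, and `task_type` below are assumptions rather than values documented here:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig

# 4-bit quantization settings from "BitsAndBytes hyperparameters"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # use_4bit: True
    bnb_4bit_compute_dtype=torch.float16,  # bnb_4bit_compute_dtype: "float16"
    bnb_4bit_quant_type="nf4",             # bnb_4bit_quant_type: "nf4"
    bnb_4bit_use_double_quant=False,       # use_double_nested_quant: False
)

# Base model loaded in 4-bit; device_map is an assumption, not from the card
model = AutoModelForCausalLM.from_pretrained(
    "NousResearch/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA settings from "LoRA configurations";
# bias and task_type are assumed defaults for causal-LM fine-tuning
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
)
```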
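Likewise, the training hyperparameters map directly onto `transformers.TrainingArguments`. The sketch below only mirrors the values listed above; `output_dir` is a placeholder, and the surrounding trainer wiring (for example `trl`'s `SFTTrainer` over the prompt-formatted CodeAlpaca examples) is assumed rather than documented here:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",        # placeholder; not specified in the card
    num_train_epochs=1,
    fp16=False,
    bf16=False,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=1,
    gradient_checkpointing=True,
    max_grad_norm=0.3,
    learning_rate=2e-4,
    weight_decay=0.001,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
    max_steps=-1,
    warmup_ratio=0.03,
    group_by_length=True,
    save_steps=0,
    logging_steps=50,
)
```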