---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
model-index:
- name: codify-llama-2-7b
  results: []
---

# codify-llama-2-7b
This model is a fine-tuned version of [Llama-2-7b-hf](https://huggingface.co/NousResearch/Llama-2-7b-hf) on the [CodeAlpaca 20k](https://raw.githubusercontent.com/sahil280114/codealpaca/master/data/code_alpaca_20k.json) dataset.

## Intended uses & limitations
1. Load the model as a Hugging Face `pipeline`:

```python
from transformers import pipeline

pipe = pipeline('text-generation', model='mohammedaly22/Codify-LLama-2-7b')
```
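
If you have a GPU, you can optionally load the model in half precision and let the weights be placed automatically. These extra arguments are not part of the original card and require `accelerate` to be installed; adjust them to your hardware:

```python
import torch
from transformers import pipeline

# Optional: load the 7B model in float16 and spread it across available devices
pipe = pipeline(
    'text-generation',
    model='mohammedaly22/Codify-LLama-2-7b',
    torch_dtype=torch.float16,
    device_map='auto',
)
```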

2. Prepare the instruction template:
```python
from string import Template

prompt_template_inference = Template("""You are a world class software engineer answering coding questions. Below is an
instruction that describes a coding task, paired with an optional input that
provides further context. Write a response that accurately completes the task if
the instruction is code-related; otherwise, you should respond that you don't know the answer
as it is outside the context of coding. Note: you should stop generation after reaching the <EOG> token.

### Instruction:
$instruction

### Input:
$input

### Response:
""")
```

3. Create an instruction prompt using the above template:
```python
instruction = "Write a Python function that creates a simple 2-layer neural network using Keras for performing binary classification"
input = "input shape of the neural network will be a vector of 200 elements"
prompt = prompt_template_inference.substitute({"instruction": instruction, "input": input})
```

This is the final instruction prompt that will be passed to the pipeline:
```
You are a world class software engineer answering coding questions. Below is an
instruction that describes a coding task, paired with an optional input that
provides further context. Write a response that accurately completes the task if
the instruction is code-related; otherwise, you should respond that you don't know the answer
as it is outside the context of coding. Note: you should stop generation after reaching the <EOG> token.

### Instruction:
Write a Python function that creates a simple 2-layer neural network using Keras for performing binary classification

### Input:
input shape of the neural network will be a vector of 200 elements

### Response:
```

4. Pass the instruction prompt to the pipeline:
```python
output = pipe(
    prompt,
    do_sample=True,
    return_full_text=False,
    max_new_tokens=200,
    clean_up_tokenization_spaces=True
)
```

Here is the code generated by the model:
```python
def build_simple_neural_network(): 
    return Model(
      inputs=Input(shape=(200,)),
      outputs=Dense(2, activation="softmax"),
      name="simple_neural_network"
    )

<EOG>
```
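
Since the prompt asks the model to emit an `<EOG>` marker, you will usually want to truncate the generated text at that token before using it. A minimal sketch (the helper name is illustrative, not part of the model's API):

```python
def extract_code(generated_text: str, eog_token: str = "<EOG>") -> str:
    """Return the text before the first <EOG> marker, stripped of surrounding whitespace."""
    return generated_text.split(eog_token, 1)[0].strip()

# With return_full_text=False, the pipeline returns only the newly generated text
code = extract_code(output[0]["generated_text"])
print(code)
```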

## Training procedure

### BitsAndBytes hyperparameters
- use_4bit: True
- bnb_4bit_compute_dtype: "float16"
- bnb_4bit_quant_type: "nf4"
- use_double_nested_quant: False
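
A minimal sketch of how these settings map onto a `transformers.BitsAndBytesConfig` (the original training script may differ):

```python
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # use_4bit: True
    bnb_4bit_compute_dtype=torch.float16,  # bnb_4bit_compute_dtype: "float16"
    bnb_4bit_quant_type="nf4",             # bnb_4bit_quant_type: "nf4"
    bnb_4bit_use_double_quant=False,       # use_double_nested_quant: False
)
```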

### LoRA configurations
- lora_r: 64
- lora_alpha: 16
- lora_dropout: 0.1
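
A sketch of the corresponding `peft.LoraConfig`; `bias` and `task_type` are assumptions, as they are not stated in the card:

```python
from peft import LoraConfig

peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",            # assumption: not stated in the card
    task_type="CAUSAL_LM",  # assumption: causal language-model fine-tuning
)
```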


### Training hyperparameters
The following hyperparameters were used during training:
- num_train_epochs: 1
- fp16: False
- bf16: False
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 4
- gradient_accumulation_steps: 1
- gradient_checkpointing: True
- max_grad_norm: 0.3
- learning_rate: 2e-4
- weight_decay: 0.001
- optim: "paged_adamw_32bit"
- lr_scheduler_type: "cosine"
- max_steps: -1
- warmup_ratio: 0.03
- group_by_length: True
- save_steps: 0
- logging_steps: 50
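
A minimal sketch of how these values map onto `transformers.TrainingArguments`; `output_dir` is a placeholder and the original training script may differ:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",  # placeholder, not stated in the card
    num_train_epochs=1,
    fp16=False,
    bf16=False,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=1,
    gradient_checkpointing=True,
    max_grad_norm=0.3,
    learning_rate=2e-4,
    weight_decay=0.001,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
    max_steps=-1,
    warmup_ratio=0.03,
    group_by_length=True,
    save_steps=0,
    logging_steps=50,
)
```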


### Training results

| Step  | Training Loss | 
|:-----:|:-------------:|
| 50    | 1.377900      |
| 100   | 0.368700      |
| 150   | 0.336600      |
| 200   | 0.334800      |
| 250   | 0.332300      |
| 300   | 0.333700      |
| 350   | 0.322100      |
| 400   | 0.317000      |
| 450   | 0.320800      |
| 500   | 0.308400      |
| 550   | 0.321900      |
| 600   | 0.310700      |
| 650   | 0.322100      |
| 700   | 0.327700      |
| 750   | 0.322000      |
| 800   | 0.311300      |
| 850   | 0.321800      |
| 900   | 0.318700      |
| 950   | 0.321600      |
| 1000  | 0.314900      |
| 1050  | 0.321700      |
| 1100  | 0.307600      |
| 1150  | 0.315800      |
| 1200  | 0.316800      |
| 1250  | 0.314200      |
| 1300  | 0.310400      |
| 1350  | 0.308000      |
| 1400  | 0.318600      |
| 1450  | 0.309700      |
| 1500  | 0.307600      |
| 1550  | 0.296800      |
| 1600  | 0.305800      |
| 1650  | 0.307400      |
| 1700  | 0.327400      |
| 1750  | 0.306100      |
| 1800  | 0.309900      |
| 1850  | 0.316300      |
| 1900  | 0.299500      |
| 1950  | 0.315700      |
| 2000  | 0.307600      |