---
tags:
- code
- coding
- python
- llama-2
- gptq
model-index:
- name: Llama-2-7b-4bit-python-coder
results: []
license: llama2
language:
- code
datasets:
- iamtarun/python_code_instructions_18k_alpaca
pipeline_tag: text-generation
---
# Llama 2 7b 4-bit Python Coder (GPTQ) 👩‍💻
**Llama-2 7b** fine-tuned on the **python_code_instructions_18k_alpaca** code-instruction dataset using **QLoRA** in 4-bit with the [PEFT](https://github.com/huggingface/peft) library.
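To give a rough sense of why 4-bit weights matter for a 7b model, the arithmetic below compares approximate weight storage at 16 bits and 4 bits per parameter. It is only a back-of-the-envelope sketch: the parameter count is rounded to exactly 7 billion, the helper name `weight_memory_gb` is made up for illustration, and activation memory and quantization metadata are ignored.

```python
# Back-of-the-envelope weight memory for a 7B-parameter model.
# Assumption: exactly 7e9 parameters; real checkpoints differ slightly, and
# quantization metadata (scales, zero points) adds overhead not counted here.

def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 7e9  # rounded parameter count (assumption)

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)
int4_gb = weight_memory_gb(NUM_PARAMS, 4)

print(f"fp16 weights:  {fp16_gb:.1f} GB")
print(f"4-bit weights: {int4_gb:.1f} GB")
```

Under these assumptions the weights shrink by a factor of four, which is what makes single-GPU use of a 7b model more practical.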
## Pretrained description
[Llama-2](https://huggingface.co/meta-llama/Llama-2-7b)
Meta developed and publicly released the Llama 2 family of large language models (LLMs), a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters.
**Model architecture:** Llama 2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
## Training data
[python_code_instructions_18k_alpaca](https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca)
The dataset contains problem descriptions paired with Python code. It is derived from sahil2801/code_instructions_120k, with a prompt column added in Alpaca style.
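As an illustration of what an instruction-style prompt looks like, the sketch below assembles one from an instruction and an optional input. The template is copied from the usage example in this card; whether the dataset's own prompt column uses exactly the same wording is an assumption, and `build_prompt` and `SYSTEM_LINE` are hypothetical names introduced here.

```python
# Hypothetical helper: builds a prompt in the layout used by this card's
# usage example. The dataset's own prompt column may be worded differently.

SYSTEM_LINE = (
    "Use the Task below and the Input given to write the Response, "
    "which is a programming code that can solve the Task."
)


def build_prompt(instruction: str, input_text: str = "") -> str:
    """Return an instruction prompt that ends right before the model's answer."""
    lines = [
        "### Instruction:",
        SYSTEM_LINE,
        "### Task:",
        instruction,
        "### Input:",
    ]
    if input_text:
        # The input section stays empty when the task needs no extra data.
        lines.append(input_text)
    lines.append("### Response:")
    return "\n".join(lines) + "\n"


print(build_prompt("Write a Python function to reverse a list."))
```

Keeping the template in one function makes it easier to use the same layout at inference time as during fine-tuning.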
### Framework versions
- PEFT 0.4.0
### Example of usage
```py
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Load quantized model and tokenizer.
tokenizer = AutoTokenizer.from_pretrained("NurtureAI/llama-2-7b-int4-gptq-python")
model = AutoModelForCausalLM.from_pretrained(
    "NurtureAI/llama-2-7b-int4-gptq-python",
    load_in_4bit=True,
    torch_dtype=torch.float16,
    device_map="auto",  # let accelerate place the weights on the available devices
)
# Prepare the prompt.
instruction = "Write a Python function to display the first and last elements of a list."
prompt = f"""### Instruction:
Use the Task below and the Input given to write the Response, which is a programming code that can solve the Task.
### Task:
{instruction}
### Input:
### Response:
"""
# Generate a response. Inference mode disables gradient tracking.
input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids.cuda()
with torch.inference_mode():
    outputs = model.generate(
        input_ids=input_ids, max_new_tokens=100, do_sample=True, top_p=0.9, temperature=0.5
    )
# Decode, then drop the echoed prompt so only the generated code remains.
decoded = tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True)[0]
print(f"Prompt:\n{prompt}\n")
print(f"Generated response:\n{decoded[len(prompt):]}")
```
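The `generate` call above samples with `temperature=0.5` and `top_p=0.9`. The pure-Python sketch below shows, on a toy list of logits, what those two settings do conceptually: temperature rescales the logits before the softmax, and top-p (nucleus) filtering keeps the smallest set of most likely tokens whose cumulative probability reaches `top_p`. It is a simplified illustration, not the code `transformers` runs internally, and the function names are hypothetical.

```python
import math
import random


def top_p_filter(logits, temperature=0.5, top_p=0.9):
    """Return {token_index: probability} after temperature scaling and nucleus filtering."""
    # Lower temperature sharpens the distribution; higher flattens it.
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the most likely tokens until their cumulative mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Renormalize over the surviving tokens.
    kept_total = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_total for i in kept}


def sample_token(logits, temperature=0.5, top_p=0.9, rng=random):
    """Draw one token index from the filtered distribution."""
    dist = top_p_filter(logits, temperature, top_p)
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


toy_logits = [2.0, 1.0, 0.0, -1.0]  # made-up scores for a 4-token vocabulary
print(top_p_filter(toy_logits))
print(sample_token(toy_logits, rng=random.Random(0)))
```

With these toy logits only the two highest-scoring tokens survive the filter, so unlikely tokens are never sampled.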
### Citation
```
@misc {NurtureAI,
author = { Raymond Hernandez },
title = { NurtureAI/llama-2-7b-int4-gptq-python },
year = { 2023 },
url = { https://huggingface.co/NurtureAI/llama-2-7b-int4-gptq-python },
publisher = { Hugging Face }
}
```